Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 610

Search results for: clustering

70 Career Guidance System Using Machine Learning

Authors: Mane Darbinyan, Lusine Hayrapetyan, Elen Matevosyan

Abstract:

Artificial Intelligence in Education (AIED) has been created to help students get ready for the workforce, and over the past 25 years, it has grown significantly, offering a variety of technologies to support academic, institutional, and administrative services. However, this is still challenging, especially considering the labor market's rapid change. While choosing a career, people face various obstacles because they do not take into consideration their own preferences, which might lead to many other problems like shifting jobs, work stress, occupational infirmity, reduced productivity, and manual error. Besides preferences, people should properly evaluate their technical and non-technical skills, as well as their personalities. Professional counseling has become a difficult undertaking for counselors due to the wide range of career choices brought on by changing technological trends. It is necessary to close this gap by utilizing technology that makes sophisticated predictions about a person's career goals based on their personality. Hence, there is a need to create an automated model that would help in decision-making based on user inputs. Improving career guidance can be achieved by embedding machine learning into the career consulting ecosystem. There are various systems of career guidance that work based on the same logic, such as the classification of applicants, matching applications with appropriate departments or jobs, making predictions, and providing suitable recommendations. Methods such as k-nearest neighbors (KNN), neural networks, K-means clustering, decision trees, and many other advanced algorithms are applied to the collected data to help predict suitable careers. Besides helping users with their career choice, these systems provide numerous opportunities which are very useful while making this hard decision.
They help the candidate to recognize where he/she specifically lacks sufficient skills so that the candidate can improve those skills. They are also capable of offering an e-learning platform, taking into account the user's lack of knowledge. Furthermore, users can be provided with details on a particular job, such as the abilities required to excel in that industry.

Keywords: career guidance system, machine learning, career prediction, predictive decision, data mining, technical and non-technical skills

Procedia PDF Downloads 71

69 Big Data Analysis on the Development of Jinan’s Consumption Centers under the Influence of E-Commerce

Authors: Hang Wang, Xiaoming Gao

Abstract:

The rapid development of e-commerce has significantly transformed consumer behavior and urban consumption patterns worldwide. This study explores the impact of e-commerce on the development and spatial distribution of consumption centers, with a particular focus on Jinan City, China. Traditionally, urban consumption centers are defined by physical commercial spaces, such as shopping malls and markets. However, the rise of e-commerce has introduced a shift towards virtual consumption hubs, with a corresponding impact on physical retail locations. Utilizing Gaode POI (Point of Interest) data, this research aims to provide a comprehensive analysis of the spatial distribution of consumption centers in Jinan, comparing e-commerce-driven virtual consumption hubs with traditional physical consumption centers. The study methodology involves gathering and analyzing POI data, focusing on logistics distribution for e-commerce activities and mobile charging point locations to represent offline consumption behavior. A spatial clustering technique is applied to examine the concentration of commercial activities and to identify emerging trends in consumption patterns. The findings reveal a clear differentiation between e-commerce and physical consumption centers in Jinan. E-commerce activities are dispersed across a wider geographic area, correlating closely with residential zones and logistics centers, while traditional consumption hubs remain concentrated around historical and commercial areas such as Honglou and the old city center. Additionally, the research identifies an ongoing transition within Jinan’s consumption landscape, with online and offline retail coexisting, though at different spatial and functional levels. This study contributes to urban planning by providing insights into how e-commerce is reshaping consumption behaviors and spatial structures in cities like Jinan. 
By leveraging big data analytics, the research offers a valuable tool for urban designers and planners to adapt to the evolving demands of digital commerce and to optimize the spatial layout of city infrastructure to better serve the needs of modern consumers.

Keywords: big data, consumption centers, e-commerce, urban planning, Jinan

Procedia PDF Downloads 22

68 A Risk Assessment Tool for the Contamination of Aflatoxins on Dried Figs Based on Machine Learning Algorithms

Authors: Kottaridi Klimentia, Demopoulos Vasilis, Sidiropoulos Anastasios, Ihara Diego, Nikolaidis Vasileios, Antonopoulos Dimitrios

Abstract:

Aflatoxins are highly poisonous and carcinogenic compounds produced by species of the genus Aspergillus that can infect a variety of agricultural foods, including dried figs. Biological and environmental factors, such as population, pathogenicity, and aflatoxinogenic capacity of the strains, topography, soil, and climate parameters of the fig orchards, are believed to have a strong effect on aflatoxin levels. Existing methods for aflatoxin detection and measurement, such as high performance liquid chromatography (HPLC) and enzyme-linked immunosorbent assay (ELISA), can provide accurate results, but the procedures are usually time-consuming, sample-destructive, and expensive. Predicting aflatoxin levels prior to crop harvest is useful for minimizing the health and financial impact of a contaminated crop. Consequently, there is interest in developing a tool that predicts aflatoxin levels based on topography and soil analysis data of fig orchards. This paper describes the development of a risk assessment tool for the contamination of aflatoxin on dried figs, based on the location and altitude of the fig orchards, the population of the fungus Aspergillus spp. in the soil, and soil parameters such as pH, saturation percentage (SP), electrical conductivity (EC), organic matter, particle size analysis (sand, silt, clay), the concentration of the exchangeable cations (Ca, Mg, K, Na), extractable P, and trace elements (B, Fe, Mn, Zn and Cu), by employing machine learning methods. In particular, our proposed method integrates three machine learning techniques, i.e., dimensionality reduction on the original dataset (principal component analysis), metric learning (Mahalanobis metric for clustering), and the k-nearest neighbors learning algorithm (KNN), into an enhanced model, with a mean performance of 85% in terms of the Pearson correlation coefficient (PCC) between observed and predicted values.
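As a rough illustration of the evaluation described in this abstract, the sketch below shows k-nearest-neighbors regression scored with the Pearson correlation coefficient between observed and predicted values. It is a minimal sketch under stated assumptions, not the authors' model: plain Euclidean distance stands in for the learned Mahalanobis metric, the PCA step is omitted, and the function names and toy numbers are hypothetical.

```python
import math


def knn_predict(train_x, train_y, query, k=3):
    """Average the targets of the k training points closest to `query`.

    Assumption: Euclidean distance is used here; the abstract describes a
    learned Mahalanobis metric applied after PCA instead.
    """
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


# Hypothetical toy data: one feature per orchard, one target level per orchard.
features = [[0.0], [1.0], [10.0]]
levels = [0.0, 2.0, 20.0]
prediction = knn_predict(features, levels, [0.4], k=2)
```

In practice the observed and predicted values from held-out orchards would be passed to `pearson` to obtain a score comparable to the PCC reported in the abstract.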

Keywords: aflatoxins, Aspergillus spp., dried figs, k-nearest neighbors, machine learning, prediction

Procedia PDF Downloads 184

67 Examining Social Connectivity through Email Network Analysis: Study of Librarians' Emailing Groups in Pakistan

Authors: Muhammad Arif Khan, Haroon Idrees, Imran Aziz, Sidra Mushtaq

Abstract:

Social platforms like online discussion and mailing groups are well aligned with academic as well as professional learning spaces. Professional communities are increasingly moving to online forums for sharing and capturing intellectual abilities. This study investigated the dynamics of social connectivity of Yahoo mailing groups of Pakistani Library and Information Science (LIS) professionals using a graph theory technique. Design/Methodology: Social network analysis is a domain of growing interest to scientists seeking to identify whether people grow together through online social interaction or whether they just reflect connectivity. We conducted a longitudinal study using a network graph theory technique to analyze the large data set of email communication. The data were collected from three Yahoo mailing groups using network analysis software over a period of six months, i.e., January to June 2016. Findings of the network analysis were reviewed through focus group discussion with LIS experts and selected respondents of the study. Data were analyzed in Microsoft Excel and network diagrams were visualized using NodeXL and ORA-Net Scene package. Findings: Findings demonstrate that professionals and students exhibit intellectual growth the more they get tied within a network by interacting and participating in communication through online forums. The study reports on the dynamics of the large network by visualizing the email correspondence among group members in a network consisting of vertices (members) and edges (randomized correspondence). The pairwise relationships between group members were modelled to show the characteristics, reasons, and strength of ties. Connectivity of nodes illustrated the frequency of communication among group members, examined in depth through node coupling, diffusion of networks, and node clustering. Network analysis was found to be a useful technique in investigating the dynamics of the large network.

Keywords: emailing networks, network graph theory, online social platforms, yahoo mailing groups

Procedia PDF Downloads 241

66 Ischemic Stroke Detection in Computed Tomography Examinations

Authors: Allan F. F. Alves, Fernando A. Bacchim Neto, Guilherme Giacomini, Marcela de Oliveira, Ana L. M. Pavan, Maria E. D. Rosa, Diana R. Pina

Abstract:

Stroke is a worldwide concern; in Brazil alone, it accounts for 10% of all registered deaths. There are 2 stroke types, ischemic (87%) and hemorrhagic (13%). Early diagnosis is essential to avoid irreversible cerebral damage. Non-enhanced computed tomography (NECT) is one of the main diagnostic techniques used due to its wide availability and rapid diagnosis. Detection depends on the size and severity of lesions and the time spent between the first symptoms and examination. The Alberta Stroke Program Early CT Score (ASPECTS) is a subjective method that increases the detection rate. The aim of this work was to implement an image segmentation system to enhance ischemic stroke regions and to quantify the area of ischemic and hemorrhagic stroke lesions in CT scans. We evaluated 10 patients with NECT examinations diagnosed with ischemic stroke. Analyses were performed in two axial slices, one at the level of the thalamus and basal ganglion and one adjacent to the top edge of the ganglionic structures, with window width between 80 and 100 Hounsfield Units. We used different image processing techniques such as morphological filters, discrete wavelet transform and Fuzzy C-means clustering. Subjective analyses were performed by a neuroradiologist according to the ASPECTS scale to quantify ischemic areas in the middle cerebral artery region. These subjective analysis results were compared with objective analyses performed by the computational algorithm. Preliminary results indicate that the morphological filters improve the visibility of the ischemic areas for subjective evaluations. The comparison between the area of the ischemic region contoured by the neuroradiologist and the area defined by the computational algorithm showed no deviations greater than 12% in any of the 10 examination tests. However, the areas contoured by the neuroradiologist tend to be smaller than those obtained by the algorithm.
These results show the importance of computer-aided diagnosis software to assist neuroradiology decisions, especially in critical situations such as the choice of treatment for ischemic stroke.

Keywords: ischemic stroke, image processing, CT scans, Fuzzy C-means

Procedia PDF Downloads 369

65 Recommendations for Data Quality Filtering of Opportunistic Species Occurrence Data

Authors: Camille Van Eupen, Dirk Maes, Marc Herremans, Kristijn R. R. Swinnen, Ben Somers, Stijn Luca

Abstract:

In ecology, species distribution models are commonly implemented to study species-environment relationships. These models increasingly rely on opportunistic citizen science data when high-quality species records collected through standardized recording protocols are unavailable. While these opportunistic data are abundant, uncertainty is usually high, e.g., due to observer effects or a lack of metadata. Data quality filtering is often used to reduce these types of uncertainty in an attempt to increase the value of studies relying on opportunistic data. However, filtering should not be performed blindly. In this study, recommendations are built for data quality filtering of opportunistic species occurrence data that are used as input for species distribution models. Using an extensive database of 5.7 million citizen science records from 255 species in Flanders, the impact on model performance was quantified by applying three data quality filters, and these results were linked to species traits. More specifically, presence records were filtered based on record attributes that provide information on the observation process or post-entry data validation, and changes in the area under the receiver operating characteristic (AUC), sensitivity, and specificity were analyzed using the Maxent algorithm with and without filtering. Controlling for sample size enabled us to study the combined impact of data quality filtering, i.e., the simultaneous impact of an increase in data quality and a decrease in sample size. Further, the variation among species in their response to data quality filtering was explored by clustering species based on four traits often related to data quality: commonness, popularity, difficulty, and body size. 
Findings show that model performance is affected by i) the quality of the filtered data, ii) the proportional reduction in sample size caused by filtering and the remaining absolute sample size, and iii) a species ‘quality profile’, resulting from a species classification based on the four traits related to data quality. The findings resulted in recommendations on when and how to filter volunteer generated and opportunistically collected data. This study confirms that correctly processed citizen science data can make a valuable contribution to ecological research and species conservation.

Keywords: citizen science, data quality filtering, species distribution models, trait profiles

Procedia PDF Downloads 204

64 An Intelligent Text Independent Speaker Identification Using VQ-GMM Model Based Multiple Classifier System

Authors: Ben Soltane Cheima, Ittansa Yonas Kelbesa

Abstract:

Speaker Identification (SI) is the task of establishing the identity of an individual based on his/her voice characteristics. The SI task is typically achieved by two-stage signal processing: training and testing. The training process calculates speaker-specific feature parameters from the speech and generates speaker models accordingly. In the testing phase, speech samples from unknown speakers are compared with the models and classified. Even though the performance of speaker identification systems has improved due to recent advances in speech processing techniques, there is still room for improvement. In this paper, a Closed-Set Text-Independent Speaker Identification System (CISI) based on a Multiple Classifier System (MCS) is proposed, using Mel Frequency Cepstrum Coefficient (MFCC) as feature extraction and a suitable combination of vector quantization (VQ) and Gaussian Mixture Model (GMM) together with the Expectation Maximization algorithm (EM) for speaker modeling. The use of a Voice Activity Detector (VAD) with a hybrid approach based on Short Time Energy (STE) and Statistical Modeling of Background Noise in the pre-processing step of the feature extraction yields a better and more robust automatic speaker identification system. In addition, using the Linde-Buzo-Gray (LBG) clustering algorithm to initialize the GMM parameters estimated in the EM step improved the convergence rate and system performance. The system also uses a relative index as a confidence measure when the GMM and VQ classifiers disagree on the identification. Simulation results carried out on the voxforge.org speech database using MATLAB highlight the efficacy of the proposed method compared to earlier work.

Keywords: feature extraction, speaker modeling, feature matching, Mel frequency cepstrum coefficient (MFCC), Gaussian mixture model (GMM), vector quantization (VQ), Linde-Buzo-Gray (LBG), expectation maximization (EM), pre-processing, voice activity detection (VAD), short time energy (STE), background noise statistical modeling, closed-set text-independent speaker identification system (CISI)

Procedia PDF Downloads 309

63 Investigating Homicide Offender Typologies Based on Their Clinical Histories and Crime Scene Behaviour Patterns

Authors: Valeria Abreu Minero, Edward Barker, Hannah Dickson, Francois Husson, Sandra Flynn, Jennifer Shaw

Abstract:

Purpose – The purpose of this paper is to identify offender typologies based on aspects of the offenders’ psychopathology and their associations with crime scene behaviours using data derived from the National Confidential Enquiry into Suicide and Safety in Mental Health concerning homicides in England and Wales committed by offenders in contact with mental health services in the year preceding the offence (n=759). Design/methodology/approach – The authors used multiple correspondence analysis to investigate the interrelationships between the variables and hierarchical agglomerative clustering to identify offender typologies. Variables describing the offender’s mental health history, the offender’s mental state at the time of offence, characteristics useful for police investigations, and patterns of crime scene behaviours were included. Findings – Results showed differences in the offenders’ histories in relation to their crime scene behaviours. Further, analyses revealed three homicide typologies: externalising, psychotic and depressive. Practical implications – These typologies may assist the police during homicide investigations by: furthering their understanding of the crime or likely suspect; offering insights into crime patterns; and providing advice as to what an offender’s offence behaviour might signify about his/her mental health background. Findings suggest information concerning offender psychopathology may be useful for offender profiling purposes in cases of homicide offenders with schizophrenia, depression and comorbid diagnosis of personality disorder and alcohol/drug dependence. Originality/value – Empirical studies with an emphasis on offender profiling have almost exclusively focussed on the inference of offender demographic characteristics.
This study provides a first step in the exploration of offender psychopathology and its integration to the multivariate analysis of offence information for the purposes of investigative profiling of homicide by identifying the dominant patterns of mental illness within homicidal behaviour.

Keywords: offender profiling, mental illness, psychopathology, multivariate analysis, homicide, crime scene analysis, crime scene behaviours, investigative advice

Procedia PDF Downloads 130

62 Statistical Pattern Recognition for Biotechnological Process Characterization Based on High Resolution Mass Spectrometry

Authors: S. Fröhlich, M. Herold, M. Allmer

Abstract:

Early stage quantitative analysis of host cell protein (HCP) variations is challenging yet necessary for comprehensive bioprocess development. High resolution mass spectrometry (HRMS) provides a high-end technology for accurate identification alongside quantitative information. Here we describe a flexible HRMS assay platform to quantify HCPs relevant in microbial expression systems such as E. coli in both upstream and downstream development by means of MVDA tools. Cell pellets were lysed and proteins extracted; purified samples were not further treated before applying the SMART tryptic digest kit. Peptide separation was optimized using an RP-UHPLC separation platform. HRMS-MSMS analysis was conducted on an Orbitrap Velos Elite applying CID. Quantification was performed label-free, taking into account ionization properties and physicochemical peptide similarities. Results were analyzed using SIEVE 2.0 (Thermo Fisher Scientific) and SIMCA (Umetrics AG). The developed HRMS platform was applied to an E. coli expression set with varying productivity and the corresponding downstream process. Selected HCPs were successfully quantified within the fmol range. Analysing HCP networks based on pattern analysis facilitated low level quantification and enhanced validity. This approach is of high relevance for high-throughput screening experiments during upstream development, e.g. for titer determination, dynamic HCP network analysis or product characterization. Considering the downstream purification process, physicochemical clustering of identified HCPs is of relevance to adjust buffer conditions accordingly. Overall, the technology provides an innovative approach for label-free MS based quantification relying on statistical pattern analysis and comparison.
Absolute quantification based on physicochemical properties and peptide similarity score provides a technological approach without the need of sophisticated sample preparation strategies and is therefore proven to be straightforward, sensitive and highly reproducible in terms of product characterization.

Keywords: process analytical technology, mass spectrometry, process characterization, MVDA, pattern recognition

Procedia PDF Downloads 252

61 Towards Real-Time Classification of Finger Movement Direction Using Encephalography Independent Components

Authors: Mohamed Mounir Tellache, Hiroyuki Kambara, Yasuharu Koike, Makoto Miyakoshi, Natsue Yoshimura

Abstract:

This study explores the practicality of using electroencephalographic (EEG) independent components to predict eight-direction finger movements in pseudo-real-time. Six healthy participants with individual-head MRI images performed finger movements in eight directions with two different arm configurations. The analysis was performed in two stages. The first stage consisted of using independent component analysis (ICA) to separate the signals representing brain activity from non-brain activity signals and to obtain the unmixing matrix. The resulting independent components (ICs) were checked, and those reflecting brain activity were selected. Finally, the time series of the selected ICs were used to predict eight finger-movement directions using Sparse Logistic Regression (SLR). The second stage consisted of using the previously obtained unmixing matrix, the selected ICs, and the model obtained by applying SLR to classify a different EEG dataset. This method was applied to two different settings, namely the single-participant level and the group-level. For the single-participant level, the EEG dataset used in the first stage and the EEG dataset used in the second stage originated from the same participant. For the group-level, the EEG datasets used in the first stage were constructed by temporally concatenating each combination without repetition of the EEG datasets of five participants out of six, whereas the EEG dataset used in the second stage originated from the remaining participant. The average test classification results across datasets (mean ± S.D.) were 38.62 ± 8.36% for the single-participant level, which was significantly higher than the chance level (12.50 ± 0.01%), and 27.26 ± 4.39% for the group-level, which was also significantly higher than the chance level (12.49 ± 0.01%).
The classification accuracy within [–45°, 45°] of the true direction is 70.03 ± 8.14% for single-participant and 62.63 ± 6.07% for group-level which may be promising for some real-life applications. Clustering and contribution analyses further revealed the brain regions involved in finger movement and the temporal aspect of their contribution to the classification. These results showed the possibility of using the ICA-based method in combination with other methods to build a real-time system to control prostheses.
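The two accuracy figures in this abstract (exact direction, and within [–45°, 45°] of the true direction) can be illustrated with a short sketch. It assumes, as the eight-direction setup suggests but the abstract does not state outright, that the directions are evenly spaced 45° apart and encoded as class indices 0 to 7; the function names and the toy labels are hypothetical.

```python
def angular_error_deg(true_idx, pred_idx, n_dirs=8):
    """Smallest angle in degrees between two direction classes on a circle.

    Assumption: classes 0..n_dirs-1 are evenly spaced, so neighbouring
    classes are 360 / n_dirs degrees apart.
    """
    step = 360 / n_dirs
    diff = abs(true_idx - pred_idx) % n_dirs
    return min(diff, n_dirs - diff) * step


def accuracy_within(true_labels, pred_labels, tol_deg=0.0, n_dirs=8):
    """Fraction of predictions whose angular error is at most tol_deg."""
    hits = sum(
        angular_error_deg(t, p, n_dirs) <= tol_deg
        for t, p in zip(true_labels, pred_labels)
    )
    return hits / len(true_labels)


# Hypothetical labels for four trials.
true_labels = [0, 1, 2, 3]
pred_labels = [0, 2, 6, 3]
exact = accuracy_within(true_labels, pred_labels, tol_deg=0.0)
tolerant = accuracy_within(true_labels, pred_labels, tol_deg=45.0)
chance_exact = 1 / 8  # uniform guessing over eight classes
```

With eight equally likely classes, uniform guessing gives 1/8 = 12.5% exact accuracy, which matches the chance level quoted above; the tolerant score counts a neighbouring direction as correct.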

Keywords: brain-computer interface, electroencephalography, finger motion decoding, independent component analysis, pseudo real-time motion decoding

Procedia PDF Downloads 138

60 RNA-Seq Analysis of the Wild Barley (H. spontaneum) Leaf Transcriptome under Salt Stress

Authors: Ahmed Bahieldin, Ahmed Atef, Jamal S. M. Sabir, Nour O. Gadalla, Sherif Edris, Ahmed M. Alzohairy, Nezar A. Radhwan, Mohammed N. Baeshen, Ahmed M. Ramadan, Hala F. Eissa, Sabah M. Hassan, Nabih A. Baeshen, Osama Abuzinadah, Magdy A. Al-Kordy, Fotouh M. El-Domyati, Robert K. Jansen

Abstract:

Wild salt-tolerant barley (Hordeum spontaneum) is the ancestor of cultivated barley (Hordeum vulgare or H. vulgare). Although the cultivated barley genome is well studied, little is known about genome structure and function of its wild ancestor. In the present study, RNA-Seq analysis was performed on young leaves of wild barley treated with salt (500 mM NaCl) at four different time intervals. Transcriptome sequencing yielded 103 to 115 million reads for all replicates of each treatment, corresponding to over 10 billion nucleotides per sample. Of the total reads, between 74.8 and 80.3% could be mapped and 77.4 to 81.7% of the transcripts were found in the H. vulgare unigene database (unigene-mapped). The unmapped wild barley reads for all treatments and replicates were assembled de novo and the resulting contigs were used as a new reference genome. This resulted in 94.3 to 95.3% of the unmapped reads mapping to the new reference. The number of differentially expressed transcripts was 9277, 3861 of which were unigene-mapped. The annotated unigene- and de novo-mapped transcripts (5100) were utilized to generate expression clusters across time of salt stress treatment. Two-dimensional hierarchical clustering classified differential expression profiles into nine expression clusters, four of which were selected for further analysis. Differentially expressed transcripts were assigned to the main functional categories. The most important groups were ‘response to external stimulus’ and ‘electron-carrier activity’. Highly expressed transcripts are involved in several biological processes, including electron transport and exchanger mechanisms, flavonoid biosynthesis, reactive oxygen species (ROS) scavenging, ethylene production, signaling network and protein refolding. The comparisons demonstrated that mRNA-Seq is an efficient method for the analysis of differentially expressed genes and biological processes under salt stress.

Keywords: electron transport, flavonoid biosynthesis, reactive oxygen species, RNA-Seq

Procedia PDF Downloads 393

59 Optimal Pricing Based on Real Estate Demand Data

Authors: Vanessa Kummer, Maik Meusel

Abstract:

Real estate demand estimates are typically derived from transaction data. However, in regions with excess demand, transactions are driven by supply and therefore do not indicate what people are actually looking for. To estimate the demand for housing in Switzerland, search subscriptions from all important Swiss real estate platforms are used. These data do, however, suffer from missing information—for example, many users do not specify how many rooms they would like or what price they would be willing to pay. In economic analyses, it is often the case that only complete data is used. Usually, however, the proportion of complete data is rather small which leads to most information being neglected. Also, the data might have a strong distortion if it is complete. In addition, the reason that data is missing might itself also contain information, which is however ignored with that approach. An interesting issue is, therefore, if for economic analyses such as the one at hand, there is an added value by using the whole data set with the imputed missing values compared to using the usually small percentage of complete data (baseline). Also, it is interesting to see how different algorithms affect that result. The imputation of the missing data is done using unsupervised learning. Out of the numerous unsupervised learning approaches, the most common ones, such as clustering, principal component analysis, or neural networks techniques are applied. By training the model iteratively on the imputed data and, thereby, including the information of all data into the model, the distortion of the first training set—the complete data—vanishes. In a next step, the performances of the algorithms are measured. This is done by randomly creating missing values in subsets of the data, estimating those values with the relevant algorithms and several parameter combinations, and comparing the estimates to the actual data. 
After having found the optimal parameter set for each algorithm, the missing values are being imputed. Using the resulting data sets, the next step is to estimate the willingness to pay for real estate. This is done by fitting price distributions for real estate properties with certain characteristics, such as the region or the number of rooms. Based on these distributions, survival functions are computed to obtain the functional relationship between characteristics and selling probabilities. Comparing the survival functions shows that estimates which are based on imputed data sets do not differ significantly from each other; however, the demand estimate that is derived from the baseline data does. This indicates that the baseline data set does not include all available information and is therefore not representative for the entire sample. Also, demand estimates derived from the whole data set are much more accurate than the baseline estimation. Thus, in order to obtain optimal results, it is important to make use of all available data, even though it involves additional procedures such as data imputation.
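The evaluation step described in this abstract (randomly creating missing values, estimating them, and comparing the estimates to the actual data) can be sketched as follows. This is a minimal sketch under stated assumptions: a simple mean imputer stands in for the clustering, principal component analysis and neural network approaches the abstract mentions, root-mean-square error is an assumed choice of comparison measure, and all names and numbers are hypothetical.

```python
import math
import random


def mask_values(values, frac, rng):
    """Hide a random fraction of the entries; return the masked list and hidden indices."""
    n_mask = max(1, int(len(values) * frac))
    hidden = set(rng.sample(range(len(values)), n_mask))
    masked = [None if i in hidden else v for i, v in enumerate(values)]
    return masked, sorted(hidden)


def impute_mean(masked):
    """Fill each missing entry with the mean of the observed entries.

    Assumption: this is a placeholder for the unsupervised learners
    named in the abstract, not the authors' imputation method.
    """
    observed = [v for v in masked if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in masked]


def rmse_on_hidden(actual, imputed, hidden):
    """Root-mean-square error computed only on the artificially hidden entries."""
    return math.sqrt(sum((actual[i] - imputed[i]) ** 2 for i in hidden) / len(hidden))


# Hypothetical complete column, e.g. desired number of rooms per subscription.
rooms = [3.0, 3.0, 3.0, 3.0, 3.0]
masked, hidden = mask_values(rooms, frac=0.4, rng=random.Random(0))
imputed = impute_mean(masked)
error = rmse_on_hidden(rooms, imputed, hidden)
```

Repeating this over several random masks and parameter settings gives one way to compare imputation algorithms before applying the chosen one to the truly missing values.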

Keywords: demand estimate, missing-data imputation, real estate, unsupervised learning

Procedia PDF Downloads 289

58 Phenotypic Diversity of the Tomato Germplasm from the Lazio Region in Central Italy, with a Case Study on Molecular Distinctiveness

Authors: Barbara Farinon, Maurizio E. Picarella, Lorenzo Mancini, Andrea Mazzucato

Abstract:

Italy is well known as a secondary center of diversification for cultivated tomatoes (Solanum lycopersicum L.). The study of phenotypic and genetic diversity in landrace collections is important for germplasm conservation and biodiversity protection. Here, we set out to study the germplasm collected in the region of Lazio in Central Italy with a focus on the distinctiveness among landraces and the attribution of membership to unnamed accessions. Our regional collection included 30 accessions belonging to six different locally recognized landraces and 21 unnamed accessions. All accessions were gathered in Lazio and belonged to the collection held at the Regional Agency for the Development and Innovation of Agriculture in Lazio (ARSIAL, in the application of the Regional Act n. 15/2000, funded by Lazio Rural Development Plan 2014 – 2020 Agro-environmental Measure, Action 10.2.1) and at the University of Tuscia. We included 13 control genotypes as references. The collection showed wide phenotypic variability for several traits, such as fruit weight (range 14-277 g), locule number (2-12), shape index (0.54-2.65), yield (0.24-3.08 kg/plant), and soluble solids (3.4-7.5 °B). A few landraces showed uncommon phenotypes, such as potato leaf, colorless fruit epidermis, or delayed ripening. Multivariate analysis of 25 cardinal phenotypic variables grouped the named varieties and allowed some of the unnamed accessions to be assigned to recognized groups. A case study for distinctiveness is presented for the flattened-ribbed types that presented overlapping distribution according to the phenotypic data. Molecular markers retrieved from previous studies revealed differences compared to the phenotypic clustering, indicating that the named varieties “Scatolone di Bolsena” and “Pantano Romanesco” belong to the Marmande group, together with the reference landrace from Tuscany “Costoluto Fiorentino”.
In contrast, the landrace “Spagnoletta di Formia e Gaeta” was clearly distinct from the former at the molecular level. Therefore, a genotypic analysis of the collection appears necessary to better define the molecular distinctiveness among the flattened-ribbed accessions, as well as to properly attribute group membership to the unnamed accessions.

Keywords: distinctiveness, flattened-ribbed fruits, regional landraces, tomato

Procedia PDF Downloads 139

57 Comprehensive Longitudinal Multi-omic Profiling in Weight Gain and Insulin Resistance

Authors: Christine Y. Yeh, Brian D. Piening, Sarah M. Totten, Kimberly Kukurba, Wenyu Zhou, Kevin P. F. Contrepois, Gucci J. Gu, Sharon Pitteri, Michael Snyder

Abstract:

Three million deaths worldwide are attributed to obesity. However, the biomolecular mechanisms that describe the link between adiposity and subsequent disease states are poorly understood. Insulin resistance characterizes approximately half of obese individuals and is a major cause of obesity-mediated diseases such as Type II diabetes, hypertension and other cardiovascular diseases. This study makes use of longitudinal quantitative and high-throughput multi-omics (genomics, epigenomics, transcriptomics, glycoproteomics etc.) methodologies on blood samples to develop multigenic and multi-analyte signatures associated with weight gain and insulin resistance. Participants of this study underwent a 30-day period of weight gain via excessive caloric intake followed by a 60-day period of restricted dieting and return to baseline weight. Blood samples were taken at three different time points per patient: baseline, peak-weight and post weight loss. Patients were characterized as either insulin resistant (IR) or insulin sensitive (IS) before having their samples processed via longitudinal multi-omic technologies. This comparative study revealed a wealth of biomolecular changes associated with weight gain after using methods in machine learning, clustering, network analysis etc. Pathways of interest included those involved in lipid remodeling, acute inflammatory response and glucose metabolism. Some of these biomolecules returned to baseline levels as the patient returned to normal weight whilst some remained elevated. IR patients exhibited key differences in inflammatory response regulation in comparison to IS patients at all time points. These signatures suggest differential metabolism and inflammatory pathways between IR and IS patients. Biomolecular differences associated with weight gain and insulin resistance were identified on various levels: in gene expression, epigenetic change, transcriptional regulation and glycosylation. 
This study was not only able to contribute to new biology that could be of use in preventing or predicting obesity-mediated diseases, but also matured novel biomedical informatics technologies to produce and process data on many comprehensive omics levels.

Keywords: insulin resistance, multi-omics, next generation sequencing, proteogenomics, type ii diabetes

Procedia PDF Downloads 429

56 A Construction Management Tool: Determining a Project Schedule Typical Behaviors Using Cluster Analysis

Authors: Natalia Rudeli, Elisabeth Viles, Adrian Santilli

Abstract:

Delays in the construction industry are a global phenomenon. Many construction projects experience extensive delays that exceed the initially estimated completion time. The main purpose of this study is to identify the typical behaviors of construction projects in order to develop a prognosis and management tool. Knowing a construction project's schedule tendency will enable evidence-based decision-making, so that resolutions can be made before delays occur. This study presents an innovative approach that uses the Cluster Analysis Method to support predictions during Earned Value Analyses. A clustering analysis was used to predict the future behavior of scheduling and of the principal Earned Value Management (EVM) and Earned Schedule (ES) indexes in construction projects. The analysis was made using a database of 90 different construction projects. It was validated with additional data extracted from the literature and with another 15 contrasting projects. For all projects, planned and executed schedules were collected and the principal EVM and ES indexes were calculated. A complete linkage classification method was used. In this method, the distance (or similarity) between two clusters is measured by their most disparate elements, i.e., the distance is given by the maximum span among their components. Finally, through the use of EVM and ES indexes and Tukey and Fisher pairwise comparisons, the statistical dissimilarity was verified and four clusters were obtained. Construction projects show an average delay of 35% of their planned completion time. Furthermore, four typical behaviors were found, and for each of the obtained clusters, the interim milestones and the necessary rhythms of construction were identified.
In general, the detected typical behaviors are: (1) projects that complete 5% of the work in the first two tenths and maintain a constant rhythm until completion (greater than 10% for each remaining tenth), finishing within the initially estimated time; (2) projects that start with an adequate construction rate but suffer minor delays, culminating in a total delay of almost 27% of the planned time; (3) projects that start with a performance below the planned rate and end up with an average delay of 64%; and (4) projects that begin with a poor performance, suffer great delays, and end up with an average delay of 120% of the planned completion time. The obtained clusters compose a tool to identify the behavior of new construction projects by comparing their current work performance to the validated database, thus allowing the correction of initial estimations towards more accurate completion schedules.
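The complete linkage rule described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `complete_linkage`, the two features per project, and all numeric values are hypothetical and chosen only to show that the distance between two clusters is taken as the distance between their most disparate members.

```python
from itertools import combinations


def complete_linkage(points, n_clusters):
    """Agglomerative clustering; cluster distance = largest pairwise member distance."""
    clusters = [[i] for i in range(len(points))]

    def dist(a, b):
        # Euclidean distance between two observations
        return sum((x - y) ** 2 for x, y in zip(points[a], points[b])) ** 0.5

    def cluster_dist(c1, c2):
        # Complete linkage: the most disparate pair defines the distance
        return max(dist(a, b) for a in c1 for b in c2)

    while len(clusters) > n_clusters:
        # Merge the two clusters whose complete-linkage distance is smallest
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: cluster_dist(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical projects: (schedule index at mid-project, final delay as a fraction)
projects = [(1.00, 0.00), (0.98, 0.03), (0.90, 0.25), (0.88, 0.29),
            (0.70, 0.60), (0.68, 0.66), (0.45, 1.15), (0.40, 1.25)]
groups = complete_linkage(projects, 4)
print(groups)
```

With these made-up values the eight projects fall into four pairs, mirroring the four behavior groups reported above only in number, not in content.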

Keywords: cluster analysis, construction management, earned value, schedule

Procedia PDF Downloads 266

55 The Extent of Virgin Olive-Oil Prices' Distribution Revealing the Behavior of Market Speculators

Authors: Fathi Abid, Bilel Kaffel

Abstract:

The olive tree, the olive harvest during the winter season, and the production of olive oil, better known among professionals as the crushing operation, have attracted the interest of institutional traders such as olive-oil offices, private companies such as food-industry firms refining and extracting pomace olive oil, and public and private export-import companies specializing in olive oil. The major problem facing olive-oil producers each winter campaign is, contrary to what might be expected, not whether the harvest will be good but whether the sale price will allow them to cover production costs and achieve a reasonable profit margin. These questions are entirely legitimate given the importance of the issue and the heavy uncertainty and competition, made tougher by a high level of indebtedness and by the experience and expertise of speculators and producers whose objectives sometimes conflict. The aim of this paper is to study the formation mechanism of olive oil prices in order to learn about speculators’ behavior and expectations in the market, how they contribute through their industry knowledge and financial alliances, and the size of the financial challenge that may be involved for them in building private information hoses globally to take advantage. The methodology used in this paper has two stages. In the first stage, we study econometrically the formation mechanisms of the olive oil price in order to understand market participants’ behavior by implementing ARMA, SARMA, GARCH, and stochastic diffusion process models. The second stage is devoted to prediction, for which we use a combined wavelet-ANN approach. Our main findings indicate that olive oil market participants interact with each other in a way that promotes the formation of stylized facts. The unstable behavior of participants creates volatility clustering, non-linear dependence, and cyclicity phenomena.
By imitating each other in some periods of the campaign, different participants contribute to the fat tails observed in the olive oil price distribution. The best prediction model for the olive oil price is based on a back propagation artificial neural network approach with input information based on wavelet decomposition and recent past history.
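The idea of feeding a neural network with a wavelet decomposition plus recent price history can be sketched as follows. The abstract does not name the wavelet family or the network inputs, so the one-level Haar transform, the function names, and the price series here are all assumptions made for illustration.

```python
def haar_step(signal):
    """One level of an (unnormalized) Haar transform: pairwise averages and differences."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def build_features(prices, n_recent=2):
    """Concatenate wavelet coefficients with the most recent prices (hypothetical layout)."""
    approx, detail = haar_step(prices)
    return approx + detail + list(prices[-n_recent:])


# Hypothetical weekly prices, not taken from the paper
prices = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 7.0]
approx, detail = haar_step(prices)
features = build_features(prices)
print(approx, detail, features)
```

The averages carry the slow trend and the differences carry the short-term fluctuations; a feature vector like `features` is the kind of input a back-propagation network could be trained on.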

Keywords: olive oil price, stylized facts, ARMA model, SARMA model, GARCH model, combined wavelet-artificial neural network, continuous-time stochastic volatility model

Procedia PDF Downloads 340

54 The Relationship between Violence against Women in the Family and Common Mental Disorders in Urban Informal Settlements of Mumbai, India: A Cross-Sectional Study

Authors: Abigail Bentley, Audrey Prost, Nayreen Daruwalla, Apoorwa Gupta, David Osrin

Abstract:

BACKGROUND: Intimate partner violence (IPV) can impact a woman’s physical, reproductive and mental health, including common mental disorders such as anxiety and depression. However, people other than an intimate partner may also perpetrate violence against women in the family, particularly in India. This study aims to investigate the relationship between experiences of violence perpetrated by the husband and other members of the wider household and symptoms of common mental disorders in women residing in informal settlement (slum) areas of Mumbai. METHODS: Experiences of violence were assessed through a detailed cross-sectional survey of 598 women, including questions about specific acts of emotional, economic, physical and sexual violence across different time points in the woman’s life and the main perpetrator of each act. Symptoms of common mental disorders were assessed using the 12-item General Health Questionnaire (GHQ-12). The GHQ-12 scores were divided into four groups and the relationship between experiences of each type of violence in the last 12 months and GHQ-12 score group was analyzed using ordinal logistic regression, adjusted for the woman’s age and clustering. RESULTS: 482 (81%) women consented to interview. On average, they were 28.5 years old, had completed 7 years of education and had been married 9 years. 88% were Muslim and 47% lived in joint and 53% in nuclear families. 44% of women had experienced at least one act of violence in their lifetime (33% emotional, 22% economic, 23% physical, 12% sexual). 7% had a high GHQ-12 score (6 or above). For violence experiences in the last 12 months, the odds of being in the highest GHQ-12 score group versus the lower groups combined were 13.1 for emotional violence, 6.5 for economic, 5.7 for physical and 6.3 for sexual (p<0.001 for all outcomes). 
DISCUSSION: The high level of violence reported across the lifetime could be due to the detailed assessment of violent acts at multiple time points and the inclusion of perpetrators within the family other than the husband. Each type of violence was associated with greater odds of a higher GHQ-12 score and therefore more symptoms of common mental disorders. Emotional violence was far more strongly associated with symptoms of common mental disorders than physical or sexual violence. However, it is not possible to attribute causal directionality to the association. Further work to investigate the relationship between differing severity of violence experiences and women’s mental health and the components of emotional violence that make it so strongly associated with symptoms of common mental disorders would be beneficial.

Keywords: common mental disorders, family violence, India, informal settlements, mental health, violence against women

Procedia PDF Downloads 360

53 Identification of Damage Mechanisms in Interlock Reinforced Composites Using a Pattern Recognition Approach of Acoustic Emission Data

Authors: M. Kharrat, G. Moreau, Z. Aboura

Abstract:

The latest advances in the weaving industry, combined with increasingly sophisticated means of materials processing, have made it possible to produce complex 3D composite structures. Mainly used in aeronautics, composite materials with 3D architecture offer better mechanical properties than 2D reinforced composites. Nevertheless, these materials require a good understanding of their behavior. Because of the complexity of such materials, the damage mechanisms are multiple, and the scenario of their appearance and evolution depends on the nature of the applied loading. The Acoustic Emission (AE) technique is a well-established tool for discriminating between the damage mechanisms. Suitable sensors are used during the mechanical test to monitor the structural health of the material. Relevant AE features are then extracted from the recorded signals, followed by a data analysis using pattern recognition techniques. In order to better understand the damage scenarios of interlock composite materials, a multi-instrumentation setup was used in this work to track damage initiation and development, especially in the vicinity of the first significant damage, called macro-damage. The deployed instrumentation includes video-microscopy, Digital Image Correlation, Acoustic Emission, and micro-tomography. In this study, a multi-variable AE data analysis approach was developed to discriminate between the different signal classes representing the different emission sources during testing. An unsupervised classification technique was adopted to perform AE data clustering without a priori knowledge. The multi-instrumentation and the clustered data served to label the different signal families and to build a learning database. The latter is used to construct a supervised classifier for automatic recognition of the AE signals. Several materials with different constituents were tested under various loading conditions in order to feed and enrich the learning database. 
The methodology presented in this work was useful for refining the damage threshold of the new-generation materials. The damage mechanisms around this threshold were highlighted. The obtained signal classes were assigned to the different mechanisms. The isolation of a 'noise' class makes it possible to discriminate between the signals emitted by damage without resorting to spatial filtering or increasing the AE detection threshold. The approach was validated on different material configurations. For the same material and the same type of loading, the identified classes are reproducible and little disturbed. The supervised classifier constructed from the learning database was able to predict the labels of the classified signals.
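The step from a labeled learning database to a supervised classifier can be sketched with a nearest-centroid rule. The abstract does not state which classifier or which AE features were used, so the classifier choice, the two normalized features, the label names `class_A` and `class_B`, and all values below are assumptions for illustration only.

```python
def train_centroids(features, labels):
    """Average the feature vectors of each labeled signal family."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for k, v in enumerate(x):
            acc[k] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def predict(centroids, x):
    """Assign a new signal to the family with the closest centroid."""
    return min(centroids,
               key=lambda y: sum((a - b) ** 2 for a, b in zip(centroids[y], x)))


# Hypothetical learning database: two normalized AE features per signal
features = [(0.20, 0.10), (0.25, 0.15), (0.80, 0.70),
            (0.85, 0.75), (0.05, 0.90), (0.10, 0.95)]
labels = ["class_A", "class_A", "class_B", "class_B", "noise", "noise"]

centroids = train_centroids(features, labels)
print(predict(centroids, (0.30, 0.20)), predict(centroids, (0.00, 1.00)))
```

In practice the labels would come from the clustering step cross-checked against the other instruments, and a dedicated 'noise' family lets unwanted signals be rejected by the classifier itself.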

Keywords: acoustic emission, classifier, damage mechanisms, first damage threshold, interlock composite materials, pattern recognition

Procedia PDF Downloads 156

52 Robust Electrical Segmentation for Zone Coherency Delimitation Base on Multiplex Graph Community Detection

Authors: Noureddine Henka, Sami Tazi, Mohamad Assaad

Abstract:

The electrical grid is a highly intricate system designed to transfer electricity from production areas to consumption areas. The Transmission System Operator (TSO) is responsible for ensuring the efficient distribution of electricity and maintaining the grid's safety and quality. However, due to the increasing integration of intermittent renewable energy sources, there is a growing level of uncertainty, which requires a faster responsive approach. A potential solution involves the use of electrical segmentation, which involves creating coherence zones where electrical disturbances mainly remain within the zone. Indeed, by means of coherent electrical zones, it becomes possible to focus solely on the sub-zone, reducing the range of possibilities and aiding in managing uncertainty. It allows faster execution of operational processes and easier learning for supervised machine learning algorithms. Electrical segmentation can be applied to various applications, such as electrical control, minimizing electrical loss, and ensuring voltage stability. Since the electrical grid can be modeled as a graph, where the vertices represent electrical buses and the edges represent electrical lines, identifying coherent electrical zones can be seen as a clustering task on graphs, generally called community detection. Nevertheless, a critical criterion for the zones is their ability to remain resilient to the electrical evolution of the grid over time. This evolution is due to the constant changes in electricity generation and consumption, which are reflected in graph structure variations as well as line flow changes. One approach to creating a resilient segmentation is to design robust zones under various circumstances. This issue can be represented through a multiplex graph, where each layer represents a specific situation that may arise on the grid. Consequently, resilient segmentation can be achieved by conducting community detection on this multiplex graph. 
The multiplex graph is composed of multiple graphs, and all the layers share the same set of vertices. Our proposal involves a model that uses a unified representation to compute a flattening of all layers. This unified situation can be penalized to obtain K connected components representing the robust electrical segmentation clusters. We compare our robust segmentation to the segmentation based on a single reference situation. The robust segmentation proves its relevance by producing clusters with high intra-electrical perturbation and low variance of electrical perturbation. Our experiments show when, and in which contexts, robust electrical segmentation is beneficial.
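A rough sketch of flattening a multiplex graph and cutting it into K connected components is given below. The abstract does not specify the flattening or the penalization, so averaging edge weights across layers and dropping the weakest edges until K components appear are stand-in assumptions, as are all function names and the toy grid.

```python
def flatten_layers(layers):
    """Average each edge weight over all layers; an edge absent from a layer counts as 0."""
    edges = {}
    for layer in layers:
        for (u, v), w in layer.items():
            key = (min(u, v), max(u, v))
            edges[key] = edges.get(key, 0.0) + w / len(layers)
    return edges


def components(nodes, edges):
    """Connected components via union-find."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for u, v in edges:
        parent[find(u)] = find(v)
    groups = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


def robust_zones(nodes, layers, k):
    """Drop the weakest flattened edges until at least k zones remain (assumed rule)."""
    flat = flatten_layers(layers)
    kept = sorted(flat, key=flat.get, reverse=True)  # strongest edges first
    while kept and len(components(nodes, kept)) < k:
        kept.pop()  # remove the currently weakest edge
    return components(nodes, kept)


# Two hypothetical grid situations (layers) over the same six buses
layer_1 = {(1, 2): 1.0, (2, 3): 0.9, (3, 4): 0.2, (4, 5): 1.0, (5, 6): 0.8}
layer_2 = {(1, 2): 0.8, (2, 3): 1.0, (3, 4): 0.1, (4, 5): 0.9, (5, 6): 1.0, (1, 3): 0.5}
zones = robust_zones([1, 2, 3, 4, 5, 6], [layer_1, layer_2], k=2)
print(zones)
```

Here the line between buses 3 and 4 is weak in both situations, so it is the first to be cut and the two zones are stable across layers.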

Keywords: community detection, electrical segmentation, multiplex graph, power grid

Procedia PDF Downloads 79

51 MicroRNA-1246 Expression Associated with Resistance to Oncogenic BRAF Inhibitors in Mutant BRAF Melanoma Cells

Authors: Jae-Hyeon Kim, Michael Lee

Abstract:

Intrinsic and acquired resistance limits the therapeutic benefits of oncogenic BRAF inhibitors in melanoma. MicroRNAs (miRNAs) regulate the expression of target mRNAs by repressing their translation. Thus, we investigated miRNA expression patterns in melanoma cell lines to identify candidate biomarkers for acquired resistance to BRAF inhibitors. Here, we used the Affymetrix miRNA V3.0 microarray profiling platform to compare miRNA expression levels in three cell lines: BRAF inhibitor-sensitive A375P BRAF V600E cells, their BRAF inhibitor-resistant counterparts (A375P/Mdr), and SK-MEL-2 BRAF-WT cells with intrinsic resistance to BRAF inhibitors. The miRNAs with at least a two-fold change in expression between BRAF inhibitor-sensitive and -resistant cell lines were identified as differentially expressed. Averaged intensity measurements identified 138 and 217 miRNAs that were differentially expressed by 2-fold or more between 1) A375P and A375P/Mdr and 2) A375P and SK-MEL-2, respectively. Hierarchical clustering revealed differences in miRNA expression profiles between BRAF inhibitor-sensitive and -resistant cell lines for miRNAs involved in intrinsic and acquired resistance to BRAF inhibitors. In particular, 43 miRNAs were identified whose expression was consistently altered in the two BRAF inhibitor-resistant cell lines, regardless of whether resistance was intrinsic or acquired. Twenty-five miRNAs were consistently upregulated and 18 downregulated more than 2-fold. Although some discrepancies were detected when miRNA microarray data were compared with qPCR-measured expression levels, qRT-PCR results for five miRNAs (miR-3617, miR-92a1, miR-1246, miR-1936-3p, and miR-17-3p) showed excellent agreement with the microarray experiments. To further investigate the cellular functions of these miRNAs, we examined their effects on cell proliferation. Synthetic oligonucleotide miRNA mimics were transfected into the three cell lines, and proliferation was quantified using a colorimetric assay.
Of the 5 miRNAs tested, only miR-1246 altered cell proliferation of A375P/Mdr cells. The transfection of miR-1246 mimic strongly conferred PLX-4720 resistance to A375P/Mdr cells, implying that miR-1246 upregulation confers acquired resistance to BRAF inhibition. We also found that PLX-4720 caused much greater G2/M arrest in A375P/Mdr cells transfected with miR-1246 mimic than that seen in scrambled RNA-transfected cells. Additionally, miR-1246 mimic partially caused a resistance to autophagy induction by PLX-4720. These results indicate that autophagy does play an essential death-promoting role in PLX-4720-induced cell death. Taken together, these results suggest that miRNA expression profiling in melanoma cells can provide valuable information for a network of BRAF inhibitor resistance-associated miRNAs.
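The two-fold-change criterion used to call miRNAs differentially expressed can be written as a small filter. This is only a sketch of the thresholding step: the function name, the placeholder miRNA names, and the intensity values are invented for illustration and are not data from the study.

```python
import math


def differentially_expressed(sensitive, resistant, min_fold=2.0):
    """Return miRNAs whose intensity changes by at least min_fold in either direction."""
    hits = {}
    for name, base in sensitive.items():
        log2_fc = math.log2(resistant[name] / base)
        if abs(log2_fc) >= math.log2(min_fold):
            hits[name] = "up" if log2_fc > 0 else "down"
    return hits


# Hypothetical averaged intensities (placeholder names, made-up values)
sensitive = {"miR-a": 100.0, "miR-b": 200.0, "miR-c": 150.0}
resistant = {"miR-a": 450.0, "miR-b": 80.0, "miR-c": 180.0}

hits = differentially_expressed(sensitive, resistant)
print(hits)
```

Working on the log2 scale treats a 2-fold increase and a 2-fold decrease symmetrically, which is why the test is on the absolute log ratio.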

Keywords: microRNA, BRAF inhibitor, drug resistance, autophagy

Procedia PDF Downloads 326

50 Applying GIS Geographic Weighted Regression Analysis to Assess Local Factors Impeding Smallholder Farmers from Participating in Agribusiness Markets: A Case Study of Vihiga County, Western Kenya

Authors: Mwehe Mathenge, Ben G. J. S. Sonneveld, Jacqueline E. W. Broerse

Abstract:

Smallholder farmers are important drivers of agricultural productivity, food security, and poverty reduction in Sub-Saharan Africa. However, they face myriad challenges in their efforts to participate in agribusiness markets. How the geographically explicit factors existing at the local level interact to impede smallholder farmers' decision to participate (or not) in agribusiness markets is not well understood. Deconstructing the spatial complexity of the local environment could provide deeper insight into how geographically explicit determinants promote or impede resource-poor smallholder farmers' participation in agribusiness. This paper’s objective was to identify, map, and analyze local spatial autocorrelation in factors that impede poor smallholders from participating in agribusiness markets. Data were collected using geocoded researcher-administered survey questionnaires from 392 households in Western Kenya. Three spatial statistics methods in a geographic information system (GIS) were used to analyze the data: Global Moran’s I, Cluster and Outlier Analysis (Anselin Local Moran’s I), and geographically weighted regression. The results of Global Moran’s I reveal the presence of spatial patterns in the dataset that were not caused by spatial randomness. Subsequently, the Anselin Local Moran’s I results identified statistically significant local spatial clustering (hot spots and cold spots) in factors hindering smallholder participation. Finally, the geographically weighted regression results unearthed the specific geographically explicit factors impeding market participation in the study area. The results confirm that geographically explicit factors are indispensable in influencing smallholder farming decisions, and policymakers should take cognizance of them.
Additionally, this research demonstrated how geospatial explicit analysis conducted at the local level, using geographically disaggregated data, could help in identifying households and localities where the most impoverished and resource-poor smallholder households reside. In designing spatially targeted interventions, policymakers could benefit from geospatial analysis methods in understanding complex geographic factors and processes that interact to influence smallholder farmers' decision-making processes and choices.
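Global Moran's I, the first of the three statistics named in this abstract, can be computed directly from its standard definition, I = (n / W) · Σᵢⱼ wᵢⱼ (xᵢ − x̄)(xⱼ − x̄) / Σᵢ (xᵢ − x̄)². The sketch below assumes a binary neighbor matrix and four households along a road; the function name and the values are hypothetical, and no significance test is included.

```python
def morans_i(values, weights):
    """Global Moran's I for observations `values` and spatial weight matrix `weights`."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_total = sum(sum(row) for row in weights)
    numerator = sum(weights[i][j] * dev[i] * dev[j]
                    for i in range(n) for j in range(n))
    denominator = sum(d * d for d in dev)
    return (n / w_total) * numerator / denominator


# Four hypothetical households in a row; neighbors are adjacent households
chain = [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [0, 0, 1, 0]]

clustered = morans_i([1, 1, 5, 5], chain)    # similar values sit next to each other
alternating = morans_i([1, 5, 1, 5], chain)  # neighbors always differ
print(clustered, alternating)
```

Positive values indicate that similar observations cluster in space and negative values indicate a checkerboard-like pattern, which is the kind of evidence the Global Moran's I step above relies on.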

Keywords: agribusiness markets, GIS, smallholder farmers, spatial statistics, disaggregated spatial data

Procedia PDF Downloads 139

49 Magnetic Navigation in Underwater Networks

Authors: Kumar Divyendra

Abstract:

Underwater Sensor Networks (UWSNs) have wide applications in areas such as water quality monitoring and marine wildlife management. A typical UWSN system consists of a set of sensors deployed randomly underwater, which communicate with each other using acoustic links. RF communication does not work underwater, and GPS is not available underwater either. Additionally, Automated Underwater Vehicles (AUVs) are deployed to collect data from special nodes called Cluster Heads (CHs). These CHs aggregate data from their neighboring nodes and forward them to the AUVs using optical links when an AUV is in range. This helps reduce the number of hops covered by data packets and helps conserve energy. We consider the three-dimensional model of the UWSN. Nodes are initially deployed randomly underwater. They attach themselves to the surface using a rod and can only move upwards or downwards using a pump and bladder mechanism. We use graph theory concepts to maximize the coverage volume while every node maintains connectivity with at least one surface node. We treat the surface nodes as landmarks, and each node finds its hop distance from every surface node. We treat these hop distances as coordinates and use them for AUV navigation. An AUV intending to move closer to a node with given coordinates moves hop by hop through nodes that are closest to it in terms of these coordinates. In the absence of GPS, multiple approaches such as Inertial Navigation System (INS), Doppler Velocity Log (DVL), and computer vision-based navigation have been proposed. These systems have their own drawbacks: INS accumulates error with time, and vision techniques require prior information about the environment. We propose a method that makes use of the earth's magnetic field values for navigation and combines it with other methods that simultaneously increase the coverage volume under the UWSN.
The AUVs are fitted with magnetometers that measure the magnetic intensity (I), horizontal inclination (H), and declination (D). The International Geomagnetic Reference Field (IGRF) is a mathematical model of the earth's magnetic field, which provides the field values for geographical coordinates on earth. Researchers have developed an inverse deep learning model that takes the magnetic field values and predicts the location coordinates. We make use of this model in our work. We combine it with the hop-by-hop movement described earlier so that the AUVs move in a sequence that trains the deep learning predictor as quickly and precisely as possible. We run simulations in MATLAB to demonstrate the effectiveness of our model with respect to other methods described in the literature.
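The hop-distance coordinates and the hop-by-hop movement described in this abstract can be sketched with breadth-first search. The topology, the node names, and the greedy rule (step to the neighbor whose coordinate vector is closest to the target's) are assumptions for illustration; with few landmarks such coordinates can be ambiguous, so a real system would need tie-breaking and loop protection.

```python
from collections import deque


def hop_distances(adjacency, source):
    """Breadth-first search: number of hops from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def hop_coordinates(adjacency, landmarks):
    """Coordinate of a node = tuple of its hop distances to each landmark."""
    tables = [hop_distances(adjacency, s) for s in landmarks]
    return {n: tuple(t[n] for t in tables) for n in adjacency}


def route(adjacency, coords, start, target, max_hops=20):
    """Greedy walk: always step to the neighbor closest to the target in hop coordinates."""
    def gap(n):
        return sum(abs(a - b) for a, b in zip(coords[n], coords[target]))

    path = [start]
    while path[-1] != target and len(path) <= max_hops:
        path.append(min(adjacency[path[-1]], key=gap))
    return path


# Hypothetical network: S1, S2, S3 are surface (landmark) nodes
adjacency = {
    "S1": ["A"], "S2": ["B"], "S3": ["D"],
    "A": ["S1", "B", "C"], "B": ["A", "S2", "D"],
    "C": ["A"], "D": ["B", "S3"],
}
coords = hop_coordinates(adjacency, ["S1", "S2", "S3"])
path = route(adjacency, coords, "C", "D")
print(coords["D"], path)
```

In this toy network node D has coordinates (3, 2, 1), and a vehicle starting at C reaches it through A and B without any absolute position fix.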

Keywords: clustering, deep learning, network backbone, parallel computing

Procedia PDF Downloads 99

48 Retrieving Iconometric Proportions of South Indian Sculptures Based on Statistical Analysis

Authors: M. Bagavandas

Abstract:

Introduction: South Indian stone sculptures are known for their elegance and history. They are found in large numbers in different monuments situated in different parts of South India. These art pieces have been studied using iconographic details, but this pioneering study introduces a novel method known as iconometry, a quantitative study of the measurements of different parts of icons, to answer important open questions. The main aim of this paper is to compare iconometric measurements of the sculptures with canonical proportions to determine whether the sculptors of the past followed any of the canonical proportions prescribed in the ancient texts. If not, this study recovers the proportions used for carving the sculptures, which are not available to us now. It is also of interest to see how the sculptural proportions of different monuments belonging to different dynasties differ from one another. Methods and Materials: As Indian sculptures are depicted in different postures, one way of making measurements independent of size is to decide on a suitable measurement and convert the other measurements into proportions with respect to the chosen measurement. Since in all canonical texts of Indian art all measurements are given in terms of face length, it is chosen as the reference measurement for standardizing the others. In order to compare these facial measurements with the measurements prescribed in the Indian canons of iconography, the ten facial measurements (face length, morphological face length, nose length, nose-to-chin length, eye length, lip length, face breadth, nose breadth, eye breadth, and lip breadth) were standardized using the face length, which reduced the number of measurements to nine. Each measurement was divided by the corresponding face length, multiplied by twelve, and expressed in the angula unit used in the canonical texts.
The reason for multiplying by twelve is that the face length is given as twelve angulas in the canonical texts for all figures. Clustering techniques were used to determine whether the sculptors of the past followed any of the proportions prescribed in the canonical texts when carving sculptures, and also to compare the proportions of sculptures from different monuments. About one hundred twenty-seven stone sculptures from four monuments belonging to the Pallava, the Chola, the Pandya, and the Vijayanagar dynasties were taken up for this study. These art pieces belong to a period ranging from the eighth to the sixteenth century A.D., and all of them adorn different monuments situated in different parts of Tamil Nadu State, South India. Anthropometric instruments were used for taking measurements, and the author himself measured all the sample pieces of this study. Result: Statistical analysis of sculptures from different centers of art and different dynasties shows a considerable difference in facial proportions, and many of these proportions differ widely from the canonical proportions. The different facial proportions retrieved indicate that the definition of beauty has been changing from period to period and region to region.
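The standardization step described in this abstract (divide each measurement by the face length and multiply by twelve to express it in angulas) amounts to a one-line conversion. The function name, the dictionary keys, and the centimetre values below are hypothetical and serve only to show the arithmetic.

```python
def to_angulas(measurements, reference="face length"):
    """Express each measurement in angulas, taking the face length as 12 angulas."""
    face = measurements[reference]
    return {name: 12 * value / face
            for name, value in measurements.items() if name != reference}


# Hypothetical measurements of one sculpture, in centimetres
sculpture = {"face length": 18.0, "nose length": 4.5,
             "eye length": 3.0, "face breadth": 15.0}

proportions = to_angulas(sculpture)
print(proportions)
```

Because every value is divided by the same face length, the result does not depend on the size of the sculpture, and the face length itself drops out, which is why ten measurements become nine proportions.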

Keywords: iconometry, proportions, sculptures, statistics

Procedia PDF Downloads 154

47 Clustering Locations of Textile and Garment Industries to Compare with the Future Industrial Cluster in Thailand

Authors: Kanogkan Leerojanaprapa

Abstract:

The textile and garment industry used to be a major exporting industry of Thailand. Because the nation has lost price competitiveness, following the end of the EU's GSP (Generalised Scheme of Preferences) and the ‘Nationwide Minimum Wage Policy’ under which Thailand’s employers must pay all employees at least 300 baht (about $10) a day, the supply chains of the Thai textile and garment industry have been affected and need to be reformed. Whether the Thai textile and garment industry will survive is therefore a concern. It is also a challenge for the government to decide which industries should be promoted as the future industries of Thailand. The Thai government recently launched the Cluster-based Special Economic Development Zones Policy to promote business clusters (effective September 16, 2015). The policy defines a cluster as a concentration of interconnected businesses and related institutions that operate within the same geographic area; textiles and garments is one of the target industrial clusters, and 9 provinces are targeted (Bangkok, Kanchanaburi, Nakhon Pathom, Ratchaburi, Samut Sakhon, Chonburi, Chachoengsao, Prachinburi, and Sa Kaeo). The cluster zone is defined to link a west-east corridor connecting manufacturing sources in Cambodia and Myanmar to Bangkok, which is promoted as a design, sourcing, and trading hub. The Thai government will provide tax and non-tax incentives for targeted industries within the clusters and expects these businesses to locate where they can get the most benefit, which will identify the future industrial cluster. This research shows the difference between the current cluster and the future cluster defined by the target provinces for textiles and garments. The current cluster is analysed from secondary data.
The numbers of plants in four categories (Spinning, weaving and finishing of textiles; Manufacture of made-up textile articles, except apparel; Manufacture of knitted and crocheted fabrics; and Manufacture of other textiles, not elsewhere classified) across all 77 provinces are clustered by K-means cluster analysis and hierarchical cluster analysis. In addition, an ANOVA test confirms the clusters and shows which variables contribute the most to the cluster solution. The analysis groups the 22 provinces in which textile or garment plants are located into 3 clusters. Cluster 1, with a large number of plants, contains only Bangkok. Cluster 2, with a moderate number of plants, contains Samut Prakan, Samut Sakhon, and Nakhon Pathom. Cluster 3, with a small number of plants, contains the other 18 provinces. The same methodology can be applied to other industries in future studies.
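The K-means step can be sketched on a handful of provinces described by plant counts in the four categories. The counts, the choice of starting centroids, and the function name are assumptions for illustration; the study's actual data, software, and initialization are not given in the abstract.

```python
def kmeans(points, centroids, iterations=20):
    """Plain K-means: assign each point to its nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda c: sum((a - b) ** 2
                                            for a, b in zip(p, centroids[c])))
            groups[nearest].append(p)
        # Keep the old centroid if a group happens to be empty
        centroids = [tuple(sum(col) / len(g) for col in zip(*g)) if g else c
                     for g, c in zip(groups, centroids)]
    return groups, centroids


# Hypothetical plant counts per province in the four textile categories
provinces = [(400, 300, 250, 200),                # one very large province
             (90, 60, 50, 40), (80, 70, 45, 35),  # moderate provinces
             (5, 3, 2, 1), (8, 2, 4, 0), (3, 1, 0, 2)]  # small provinces

start = [provinces[0], provinces[1], provinces[3]]  # assumed initial centroids
groups, centroids = kmeans(provinces, start)
print([len(g) for g in groups])
```

With these invented counts the result has one large, two moderate, and three small provinces, the same large/moderate/small structure as the three clusters reported above.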

Keywords: ANOVA, hierarchical cluster analysis, industrial clusters, K-means cluster analysis, textile and garment industry


46 Spatial Economic Attributes of O. R. Tambo Airport, South Africa

Authors: Masilonyane Mokhele

Abstract:

Across the world, different planning models of the so-called airport-led developments are becoming bandwagons hailed as key to the future of cities. However, in the existing knowledge, there is a paucity of empirically informed description and explanation of the economic fundamentals driving the forces of attraction of airports. This void is arguably a result of the absence of an appropriate theoretical framework to guide the analyses. Given this paucity, the aim of the paper is to contribute towards a theoretical framework that could be used to describe and explain forces that drive the location and mix of airport-centric developments. Towards achieving this aim, the objectives of the paper are: one, to establish the type of economic activities that are located on and around O.R. Tambo International Airport (ORTIA), and analyse the reasons for locating there; two, to establish changes that have occurred over time in the form of the airport-centric development of ORTIA; three, to identify the propulsive economic qualities of ORTIA; four, to analyse the spatial, economic and structural linkages within the airport-centric development of ORTIA, between the airport-centric development and the airport, as well as the airport-centric development’s linkages with its metropolitan area and other regional, national and international airport-centric developments and locations. To address the objectives above, the study adopted a case study approach, centred on ORTIA in South Africa: Africa’s busiest airport in terms of passengers and airfreight handled. Using a lens of location theory, a survey was adopted as the main research method, wherein telephonic interviews were conducted with a representative number of firms on and around ORTIA. 
Other data collection methods encompassed in-depth qualitative interviews (to augment the information obtained through the survey) and analysis of secondary information, particularly as regards establishing changes that have occurred in the form of ORTIA and surrounds. From the empirical findings, ORTIA was discovered to have propulsive economic qualities that act as significant forces of attraction in the clustering of firms. Together with its airport-centric development, ORTIA was discovered to have growth pole properties because of the linkages that occur within the study area, and the linkages that exist between the airport-centric firms and the airport. It was noted that the transport-oriented firms (typified by couriers and freight carriers) act as anchors for some fellow airport-centric firms, which make use of elements of urbanisation economies, particularly as regards the use of the airport for airfreight services. The empirical findings presented in the paper (in conjunction with results from other airport-centric development case studies) could be used as a contribution towards extending theory that describes and explains forces that drive the location and mix of airport-centric developments.

Keywords: airports, airport-centric development, O. R. Tambo International Airport, South Africa


45 The Dynamics of Planktonic Crustacean Populations in an Open Access Lagoon, Bordered by Heavy Industry, Southwest, Nigeria

Authors: E. O. Clarke, O. J. Aderinola, O. A. Adeboyejo, M. A. Anetekhai

Abstract:

Aims: The study aimed to establish the influence of some physical and chemical parameters on the abundance, distribution pattern and seasonal variations of the planktonic crustacean populations. Place and Duration of Study: A premier investigation into the dynamics of planktonic crustacean populations in Ologe lagoon was carried out from January 2011 to December 2012. Study Design: The study covered identification, temporal abundance, spatial distribution and diversity of the planktonic crustacea. Methodology: Standard techniques were used to collect samples from eleven stations covering five proximal satellite towns (Idoluwo, Oto, Ibiye, Obele, and Gbanko) bordering the lagoon. Data obtained were statistically analyzed using linear regression and hierarchical clustering. Results: Thirteen (13) planktonic crustacean populations were identified. Total percentage abundance was highest for Bosmina species (20%) and lowest for Polyphemus species (0.8%). The Pearson’s correlation coefficients (r values) between the total planktonic crustacean population and some physical and chemical parameters showed positive correlations of low significance with salinity (r = 0.042, sig = 0.184) and with surface water dissolved oxygen (r = 0.299, sig = 0.155). Linear regression plots indicated that the total population of planktonic crustacea was mainly influenced by, and increased only with, surface water temperature (Rsq = 0.791) and conductivity (Rsq = 0.589). The total population of planktonic crustacea had a near-zero correlation with the surface water dissolved oxygen and thus does not change significantly with the level of the surface water dissolved oxygen. The correlations were positive with NO3-N at Ibiye (midstream; Rsq = 0.022) and Gbanko (downstream; Rsq = 0.013), PO4-P at Ibiye (Rsq = 0.258), K at Idoluwo (Rsq = 0.295) and SO4-S at Oto (Rsq = 0.094) and Gbanko (Rsq = 0.457). 
The Berger-Parker Dominance Index (BPDI) showed that the most dominant species was Bosmina species (BPDI = 1.000), followed by Calanus species (BPDI = 1.254). Clusters by squared Euclidean distances using average linkage between groups showed proximities transcending the borders of genera. Conclusion: The results revealed that planktonic crustacean populations in Ologe lagoon undergo seasonal perturbations, are highly influenced by nutrient, metal and organic matter inputs from river Owoh, Agbara industrial estate and surrounding farmlands, and are patchy in spatial distribution.
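
The statistics named in this abstract can be illustrated with a short sketch. It assumes the common definition of the Berger-Parker index (the share of the most abundant taxon, d = N_max / N) and the usual relationship that, for a simple linear regression, Rsq equals the square of Pearson's r; the abstract does not state which variant the authors used. All counts and measurements below are invented for illustration, and the function names are hypothetical.

```python
from math import sqrt


def berger_parker(counts):
    """Return (dominant taxon, d) with d = N_max / N (assumed definition)."""
    total = sum(counts.values())
    taxon, n_max = max(counts.items(), key=lambda item: item[1])
    return taxon, n_max / total


def pearson_r(xs, ys):
    """Pearson's product-moment correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical abundance counts (individuals per sample), not the study's data.
counts = {"Bosmina": 50, "Calanus": 30, "Daphnia": 20}
dominant, d = berger_parker(counts)
print(dominant, d)

# Hypothetical surface water temperature (deg C) versus total abundance.
temperature = [26, 27, 28, 29, 30]
abundance = [10, 14, 15, 20, 22]
r = pearson_r(temperature, abundance)
r_squared = r ** 2  # equals Rsq of the simple linear regression
print(round(r, 4), round(r_squared, 4))
```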

Keywords: diversity, dominance, perturbations, richness, crustacea, lagoon


44 Sustainable Business Model Archetypes – A Systematic Review and Application to the Plastic Industry

Authors: Felix Schumann, Giorgia Carratta, Tobias Dauth, Liv Jaeckel

Abstract:

In the last few decades, the rapid growth of the use and disposal of plastic items has led to their overaccumulation in the environment. As a result, plastic pollution has become a subject of global concern. Today plastics are used as raw materials in almost every industry. While the recognition of the ecological, social, and economic impact of plastics in academic research is on the rise, the potential role of the ‘plastic industry’ in dealing with such issues is still largely underestimated. Therefore, the literature on sustainable plastic management is still nascent and fragmented. Working towards sustainability requires a fundamental shift in the way companies employ plastics in their day-to-day business. For that reason, the applicability of the business model concept has recently gained momentum in environmental research. Business model innovation is increasingly recognized as an important driver to re-conceptualize the purpose of the firm and to readily integrate sustainability in their business. It can serve as a starting point to investigate whether and how sustainability can be realized under industry- and firm-specific circumstances. Yet, there is no comprehensive view in the plastic industry on how firms start refining their business models to embed sustainability in their operations. Our study addresses this gap, looking primarily at the industrial sectors responsible for the production of the largest amount of plastic waste today: plastic packaging, consumer goods, construction, textile, and transport. Relying on the archetypes of sustainable business models and applying them to the aforementioned sectors, we try to identify companies’ current strategies to make their business models more sustainable. Based on the thematic clustering, we can develop an integrative framework for the plastic industry. 
The findings are underpinned and illustrated by a variety of relevant plastic management solutions that the authors have identified through a systematic literature review and analysis of existing, empirically grounded research in this field. Using the archetypes, we can promote options for business model innovations for the most important sectors in which plastics are used. Moreover, by linking the proposed business model archetypes to the plastic industry, our research approach guides firms in exploring sustainable business opportunities. Likewise, researchers and policymakers can utilize our classification to identify best practices. The authors believe that the study advances the current knowledge on sustainable plastic management through its broad empirical industry analyses. Hence, the application of business model archetypes in the plastic industry will be useful for shaping companies’ transformation to create and deliver more sustainability and will provide avenues for future research endeavors.

Keywords: business models, environmental economics, plastic management, plastic pollution, sustainability


43 The Relationship between Violence against Women and Levels of Self-Esteem in Urban Informal Settlements of Mumbai, India: A Cross-Sectional Study

Authors: A. Bentley, A. Prost, N. Daruwalla, D. Osrin

Abstract:

Background: This study aims to investigate the relationship between experiences of violence against women in the family, and levels of self-esteem in women residing in informal settlement (slum) areas of Mumbai, India. The authors hypothesise that violence against women in Indian households extends beyond that of intimate partner violence (IPV), to include other members of the family and that experiences of violence are associated with lower levels of self-esteem. Methods: Experiences of violence were assessed through a cross-sectional survey of 598 women, including questions about specific acts of emotional, economic, physical and sexual violence across different time points, and the main perpetrator of each. Self-esteem was assessed using the Rosenberg self-esteem questionnaire. A global score for self-esteem was calculated and the relationship between violence in the past year and Rosenberg self-esteem score was assessed using multivariable linear regression models, adjusted for years of education completed, and clustering using robust standard errors. Results: 482 (81%) women consented to interview. On average, they were 28.5 years old, had completed 6 years of education and had been married 9.5 years. 88% were Muslim and 46% lived in joint families. 44% of women had experienced at least one act of violence in their lifetime (33% emotional, 22% economic, 24% physical, 12% sexual). Of the women who experienced violence after marriage, 70% cited a perpetrator other than the husband for at least one of the acts. 5% had low self-esteem (Rosenberg score < 15). For women who experienced emotional violence in the past year, the Rosenberg score was 2.6 points lower (p < 0.001). It was 1.2 points lower (p = 0.03) for women who experienced economic violence. For physical or sexual violence in the past year, no statistically significant relationship with Rosenberg score was seen. 
However, for a one-unit increase in the number of different acts of each type of violence experienced in the past year, a decrease in Rosenberg score was seen (-0.62 for emotional, -0.76 for economic, -0.53 for physical and -0.47 for sexual; p < 0.05 for all). Discussion: The high prevalence of violence experiences across the lifetime was likely due to the detailed assessment of violence and the inclusion of perpetrators within the family other than the husband. Experiences of emotional or economic violence in the past year were associated with lower Rosenberg scores and therefore lower self-esteem, but no relationship was seen between experiences of physical or sexual violence and Rosenberg score overall. For all types of violence in the past year, a greater number of different acts were associated with a decrease in Rosenberg score. Emotional violence showed the strongest relationship with self-esteem, but for all types of violence the more complex the pattern of perpetration with different methods used, the lower the levels of self-esteem. Due to the cross-sectional nature of the study causal directionality cannot be attributed. Further work to investigate the relationship between severity of violence and self-esteem and whether self-esteem mediates relationships between violence and poorer mental health would be beneficial.

Keywords: family violence, India, informal settlements, Rosenberg self-esteem scale, self-esteem, violence against women


42 Applying Big Data Analysis to Efficiently Exploit the Vast Unconventional Tight Oil Reserves

Authors: Shengnan Chen, Shuhua Wang

Abstract:

Successful production of hydrocarbon from unconventional tight oil reserves has changed the energy landscape in North America. The oil contained within these reservoirs typically will not flow to the wellbore at economic rates without assistance from advanced horizontal wells and multi-stage hydraulic fracturing. Efficient and economic development of these reserves is a priority of society, government, and industry, especially under the current low oil prices. Meanwhile, society needs technological and process innovations to enhance oil recovery while concurrently reducing environmental impacts. Recently, big data analysis and artificial intelligence have become very popular, developing data-driven insights for better designs and decisions in various engineering disciplines. However, the application of data mining in petroleum engineering is still in its infancy. The objective of this research is to apply intelligent data analysis and data-driven models to exploit unconventional oil reserves both efficiently and economically. More specifically, a comprehensive database including the reservoir geological data, reservoir geophysical data, well completion data and production data for thousands of wells is first established to discover valuable insights and knowledge related to tight oil reserves development. Several data analysis methods are introduced to analyze such a huge dataset. For example, K-means clustering is used to partition all observations into clusters; principal component analysis is applied to emphasize the variation and bring out strong patterns in the dataset, making the big data easy to explore and visualize; exploratory factor analysis (EFA) is used to identify the complex interrelationships between well completion data and well production data. 
Different data mining techniques, such as artificial neural networks, fuzzy logic, and machine learning techniques, are then summarized, and appropriate ones are selected to analyze the database based on the prediction accuracy, model robustness, and reproducibility. Advanced knowledge and patterns are finally recognized and integrated into a modified self-adaptive differential evolution optimization workflow to enhance the oil recovery and maximize the net present value (NPV) of the unconventional oil resources. This research will advance the knowledge in the development of unconventional oil reserves and bridge the gap between the big data and performance optimizations in these formations. The newly developed data-driven optimization workflow is a powerful approach to guide field operation, which leads to better designs, higher oil recovery and economic return of future wells in the unconventional oil reserves.
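
To make the principal component analysis step mentioned in this abstract more concrete, here is a minimal two-variable sketch. It standardises both variables, builds their 2x2 covariance matrix, and uses the closed-form eigen decomposition of a symmetric 2x2 matrix to find how much variance the first component explains. The variable names and values (number of fracturing stages and cumulative oil per well) are hypothetical and chosen only for illustration; a real workflow with many variables would typically rely on a numerical library instead.

```python
from math import sqrt


def zscores(values):
    """Standardise to zero mean and unit sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def cov(a, b):
    """Sample covariance of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


def pca_2d(xs, ys):
    """Eigenvalues (descending) and first principal axis of the 2x2 covariance."""
    a, b, c = cov(xs, xs), cov(xs, ys), cov(ys, ys)
    mid = (a + c) / 2
    half_gap = sqrt(((a - c) / 2) ** 2 + b ** 2)
    lam1, lam2 = mid + half_gap, mid - half_gap
    # An eigenvector for lam1 is (b, lam1 - a); fall back to an axis if b == 0.
    if b != 0:
        vx, vy = b, lam1 - a
    else:
        vx, vy = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = sqrt(vx * vx + vy * vy)
    return (lam1, lam2), (vx / norm, vy / norm)


# Hypothetical well data, invented for illustration only.
stages = [10, 15, 20, 25, 30]          # number of fracturing stages
cum_oil = [120, 150, 210, 240, 300]    # cumulative oil, arbitrary units

(lam1, lam2), axis = pca_2d(zscores(stages), zscores(cum_oil))
explained = lam1 / (lam1 + lam2)  # share of variance on the first component
print(round(explained, 4), axis)
```

When the first component explains most of the variance, as in this toy case, the two variables carry largely overlapping information, which is the sense in which PCA can "bring out strong patterns" before clustering or modelling.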

Keywords: big data, artificial intelligence, enhanced oil recovery, unconventional oil reserves


41 Machine Learning Prediction of Diabetes Prevalence in the U.S. Using Demographic, Physical, and Lifestyle Indicators: A Study Based on NHANES 2009-2018

Authors: Oluwafunmibi Omotayo Fasanya, Augustine Kena Adjei

Abstract:

This study aimed to develop a machine learning model to predict diabetes (DM) prevalence in the U.S. population using demographic characteristics, physical indicators, and lifestyle habits, and to analyze how these factors contribute to the likelihood of diabetes. We analyzed data from 23,546 non-pregnant participants aged 20 and older from the 2009-2018 National Health and Nutrition Examination Survey (NHANES). The dataset included key demographic (age, sex, ethnicity), physical (BMI, leg length, total cholesterol [TCHOL], fasting plasma glucose), and lifestyle indicators (smoking habits). A weighted sample was used to account for NHANES survey design features such as stratification and clustering. A classification machine learning model was trained to predict diabetes status. The target variable was binary (diabetes or non-diabetes) based on fasting plasma glucose measurements. The following models were evaluated: Logistic Regression (baseline), Random Forest Classifier, Gradient Boosting Machine (GBM), and Support Vector Machine (SVM). Model performance was assessed using accuracy, F1-score, AUC-ROC, and precision-recall metrics. Feature importance was analyzed using SHAP values to interpret the contributions of variables such as age, BMI, ethnicity, and smoking status. The Gradient Boosting Machine (GBM) model outperformed other classifiers with an AUC-ROC score of 0.85. Feature importance analysis revealed the following key predictors: Age: The most significant predictor, with diabetes prevalence increasing with age, peaking around the 60s for males and 70s for females. BMI: Higher BMI was strongly associated with a higher risk of diabetes. Ethnicity: Black participants had the highest predicted prevalence of diabetes (14.6%), followed by Mexican-Americans (13.5%) and Whites (10.6%). TCHOL: Diabetics had lower total cholesterol levels, particularly among White participants (mean decline of 23.6 mg/dL). 
Smoking: Smoking showed a slight increase in diabetes risk among Whites (0.2%) but had a limited effect in other ethnic groups. Using machine learning models, we identified key demographic, physical, and lifestyle predictors of diabetes in the U.S. population. The results confirm that diabetes prevalence varies significantly across age, BMI, and ethnic groups, with lifestyle factors such as smoking contributing differently by ethnicity. These findings provide a basis for more targeted public health interventions and resource allocation for diabetes management.
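
The evaluation metrics listed in this abstract (accuracy, F1-score, AUC-ROC) can be computed from predicted scores as in the sketch below. It is an unweighted illustration on a handful of invented labels and scores; the 0.5 decision threshold, the data, and the function names are assumptions of this sketch, and it does not apply NHANES survey weights.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = diabetes)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return accuracy, precision, recall, f1


def auc_roc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical true labels and model scores, invented for illustration.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

accuracy, precision, recall, f1 = classification_metrics(y_true, y_pred)
auc = auc_roc(y_true, scores)
print(accuracy, round(f1, 4), round(auc, 4))
```

Accuracy and F1 depend on the chosen threshold, whereas AUC-ROC summarises ranking quality across all thresholds, which is one reason the two kinds of metric are usually reported together.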

Keywords: diabetes, NHANES, random forest, gradient boosting machine, support vector machine
