Search results for: possibilistic clustering
258 Machine Learning Analysis of Eating Disorders Risk, Physical Activity and Psychological Factors in Adolescents: A Community Sample Study
Authors: Marc Toutain, Pascale Leconte, Antoine Gauthier
Abstract:
Introduction: Eating Disorders (ED), such as anorexia, bulimia, and binge eating, are psychiatric illnesses that mostly affect young people. The main symptoms concern eating (restriction, excessive food intake) and weight control behaviors (laxatives, vomiting). Psychological comorbidities (depression, executive function disorders, etc.) and problematic behaviors toward physical activity (PA) are commonly associated with ED. Acquaintances on ED risk factors are still lacking, and more community sample studies are needed to improve prevention and early detection. To our knowledge, studies are needed to specifically investigate the link between ED risk level, PA, and psychological risk factors in a community sample of adolescents. The aim of this study is to assess the relation between ED risk level, exercise (type, frequency, and motivations for engaging in exercise), and psychological factors based on the Jacobi risk factors model. We suppose that a high risk of ED will be associated with the practice of high caloric cost PA, motivations oriented to weight and shape control, and psychological disturbances. Method: An online survey destined for students has been sent to several middle schools and colleges in northwest France. This survey combined several questionnaires, the Eating Attitude Test-26 assessing ED risk; the Exercise Motivation Inventory–2 assessing motivations toward PA; the Hospital Anxiety and Depression Scale assessing anxiety and depression, the Contour Drawing Rating Scale; and the Body Esteem Scale assessing body dissatisfaction, Rosenberg Self-esteem Scale assessing self-esteem, the Exercise Dependence Scale-Revised assessing PA dependence, the Multidimensional Assessment of Interoceptive Awareness assessing interoceptive awareness and the Frost Multidimensional Perfectionism Scale assessing perfectionism. Machine learning analysis will be performed in order to constitute groups with a tree-based model clustering method, extract risk profile(s) with a bootstrap method comparison, and predict ED risk with a prediction method based on a decision tree-based model. Expected results: 1044 complete records have already been collected, and the survey will be closed at the end of May 2022. Records will be analyzed with a clustering method and a bootstrap method in order to reveal risk profile(s). Furthermore, a predictive tree decision method will be done to extract an accurate predictive model of ED risk. This analysis will confirm typical main risk factors and will give more data on presumed strong risk factors such as exercise motivations and interoceptive deficit. Furthermore, it will enlighten particular risk profiles with a strong level of proof and greatly contribute to improving the early detection of ED and contribute to a better understanding of ED risk factors.Keywords: eating disorders, risk factors, physical activity, machine learning
Procedia PDF Downloads 83257 The Efficacy of Open Educational Resources in Students’ Performance and Engagement
Authors: Huda Al-Shuaily, E. M. Lacap
Abstract:
Higher Education is one of the most essential fundamentals for the advancement and progress of a country. It demands to be as accessible as possible and as comprehensive as it can be reached. In this paper, we succeeded to expand the accessibility and delivery of higher education using an Open Educational Resources (OER), a freely accessible, openly licensed documents, and media for teaching and learning. This study creates a comparative design of student’s academic performance on the course Introduction to Database and student engagement to the virtual learning environment (VLE). The study was done in two successive semesters - one without using the OER and the other is using OER. In the study, we established that there is a significant increase in student’s engagement in VLE in the latter semester compared to the former. By using the latter semester’s data, we manage to show that the student’s engagement has a positive impact on students’ academic performance. Moreso, after clustering their academic performance, the impact is seen higher for students who are low performing. The results show that these engagements can be used to potentially predict the learning styles of the student with a high degree of precision.Keywords: EDM, learning analytics, moodle, OER, student-engagement
Procedia PDF Downloads 339256 Microbial Biogeography of Greek Olive Varieties Assessed by Amplicon-Based Metagenomics Analysis
Authors: Lena Payati, Maria Kazou, Effie Tsakalidou
Abstract:
Table olives are one of the most popular fermented vegetables worldwide, which along with olive oil, have a crucial role in the world economy. They are highly appreciated by the consumers for their characteristic taste and pleasant aromas, while several health and nutritional benefits have been reported as well. Until recently, microbial biogeography, i.e., the study of microbial diversity over time and space, has been mainly associated with wine. However, nowadays, the term 'terroir' has been extended to other crops and food products so as to link the geographical origin and environmental conditions to quality aspects of fermented foods. Taking the above into consideration, the present study focuses on the microbial fingerprinting of the most important olive varieties of Greece with the state-of-the-art amplicon-based metagenomics analysis. Towards this, in 2019, 61 samples from 38 different olive varieties were collected at the final stage of ripening from 13 well spread geographical regions in Greece. For the metagenomics analysis, total DNA was extracted from the olive samples, and the 16S rRNA gene and ITS DNA region were sequenced and analyzed using bioinformatics tools for the identification of bacterial and yeasts/fungal diversity, respectively. Furthermore, principal component analysis (PCA) was also performed for data clustering based on the average microbial composition of all samples from each region of origin. According to the composition, results obtained, when samples were analyzed separately, the majority of both bacteria (such as Pantoea, Enterobacter, Roserbergiella, and Pseudomonas) and yeasts/fungi (such as Aureobasidium, Debaromyces, Candida, and Cladosporium) genera identified were found in all 61 samples. Even though interesting differences were observed at the relative abundance level of the identified genera, the bacterial genus Pantoea and the yeast/fungi genus Aureobasidium were the dominant ones in 35 and 40 samples, respectively. Of note, olive samples collected from the same region had similar fingerprint (genera identified and relative abundance level) regardless of the variety, indicating a potential association between the relative abundance of certain taxa and the geographical region. When samples were grouped by region of origin, distinct bacterial profiles per region were observed, which was also evident from the PCA analysis. This was not the case for the yeast/fungi profiles since 10 out of the 13 regions were grouped together mainly due to the dominance of the genus Aureobasidium. A second cluster was formed for the islands Crete and Rhodes, both of which are located in the Southeast Aegean Sea. These two regions clustered together mainly due to the identification of the genus Toxicocladosporium in relatively high abundances. Finally, the Agrinio region was separated from the others as it showed a completely different microbial fingerprinting. However, due to the limited number of olive samples from some regions, a subsequent PCA analysis with more samples from these regions is expected to yield in a more clear clustering. The present study is part of a bigger project, the first of its kind in Greece, with the ultimate goal to analyze a larger set of olive samples of different varieties and from different regions in Greece in order to have a reliable olives’ microbial biogeography.Keywords: amplicon-based metagenomics analysis, bacteria, microbial biogeography, olive microbiota, yeasts/fungi
Procedia PDF Downloads 114255 Genomic Adaptation to Local Climate Conditions in Native Cattle Using Whole Genome Sequencing Data
Authors: Rugang Tian
Abstract:
In this study, we generated whole-genome sequence (WGS) data from110 native cattle. Together with whole-genome sequences from world-wide cattle populations, we estimated the genetic diversity and population genetic structure of different cattle populations. Our findings revealed clustering of cattle groups in line with their geographic locations. We identified noticeable genetic diversity between indigenous cattle breeds and commercial populations. Among all studied cattle groups, lower genetic diversity measures were found in commercial populations, however, high genetic diversity were detected in some local cattle, particularly in Rashoki and Mongolian breeds. Our search for potential genomic regions under selection in native cattle revealed several candidate genes related with immune response and cold shock protein on multiple chromosomes such as TRPM8, NMUR1, PRKAA2, SMTNL2 and OXR1 that are involved in energy metabolism and metabolic homeostasis.Keywords: cattle, whole-genome, population structure, adaptation
Procedia PDF Downloads 73254 Bioinformatics Analysis of DGAT1 Gene in Domestic Ruminnants
Authors: Sirous Eydivandi
Abstract:
Diacylglycerol-O-acyltransferase (DGAT1) gene encodes diacylglycerol transferase enzyme that plays an important role in glycerol lipid metabolism. DGAT1 is considered to be the key enzyme in controlling the synthesis of triglycerides in adipocytes. This enzyme catalyzes the final step of triglyceride synthesis (transform triacylglycerol (DAG) into triacylglycerol (TAG). A total of 20 DGAT1 gene sequences and corresponding amino acids belonging to 4 species include cattle, goats, sheep and yaks were analyzed, and the differentiation within and among the species was also studied. The length of the DGAT1 gene varies greatly, from 1527 to 1785 bp, due to deletion, insertion, and stop codon mutation resulting in elongation. Observed genetic diversity was higher among species than within species, and Goat had more polymorphisms than any other species. Novel amino acid variation sites were detected within several species which might be used to illustrate the functional variation. Differentiation of the DGAT1 gene was obvious among species, and the clustering result was consistent with the taxonomy in the National Center for Biotechnology Information.Keywords: DGAT1gene, bioinformatic, ruminnants, biotechnology information
Procedia PDF Downloads 491253 Sentiment Classification of Documents
Authors: Swarnadip Ghosh
Abstract:
Sentiment Analysis is the process of detecting the contextual polarity of text. In other words, it determines whether a piece of writing is positive, negative or neutral.Sentiment analysis of documents holds great importance in today's world, when numerous information is stored in databases and in the world wide web. An efficient algorithm to illicit such information, would be beneficial for social, economic as well as medical purposes. In this project, we have developed an algorithm to classify a document into positive or negative. Using our algorithm, we obtained a feature set from the data, and classified the documents based on this feature set. It is important to note that, in the classification, we have not used the independence assumption, which is considered by many procedures like the Naive Bayes. This makes the algorithm more general in scope. Moreover, because of the sparsity and high dimensionality of such data, we did not use empirical distribution for estimation, but developed a method by finding degree of close clustering of the data points. We have applied our algorithm on a movie review data set obtained from IMDb and obtained satisfactory results.Keywords: sentiment, Run's Test, cross validation, higher dimensional pmf estimation
Procedia PDF Downloads 402252 Verification & Validation of Map Reduce Program Model for Parallel K-Mediod Algorithm on Hadoop Cluster
Authors: Trapti Sharma, Devesh Kumar Srivastava
Abstract:
This paper is basically a analysis study of above MapReduce implementation and also to verify and validate the MapReduce solution model for Parallel K-Mediod algorithm on Hadoop Cluster. MapReduce is a programming model which authorize the managing of huge amounts of data in parallel, on a large number of devices. It is specially well suited to constant or moderate changing set of data since the implementation point of a position is usually high. MapReduce has slowly become the framework of choice for “big data”. The MapReduce model authorizes for systematic and instant organizing of large scale data with a cluster of evaluate nodes. One of the primary affect in Hadoop is how to minimize the completion length (i.e. makespan) of a set of MapReduce duty. In this paper, we have verified and validated various MapReduce applications like wordcount, grep, terasort and parallel K-Mediod clustering algorithm. We have found that as the amount of nodes increases the completion time decreases.Keywords: hadoop, mapreduce, k-mediod, validation, verification
Procedia PDF Downloads 369251 A Study of the Performance Parameter for Recommendation Algorithm Evaluation
Authors: C. Rana, S. K. Jain
Abstract:
The enormous amount of Web data has challenged its usage in efficient manner in the past few years. As such, a range of techniques are applied to tackle this problem; prominent among them is personalization and recommender system. In fact, these are the tools that assist user in finding relevant information of web. Most of the e-commerce websites are applying such tools in one way or the other. In the past decade, a large number of recommendation algorithms have been proposed to tackle such problems. However, there have not been much research in the evaluation criteria for these algorithms. As such, the traditional accuracy and classification metrics are still used for the evaluation purpose that provides a static view. This paper studies how the evolution of user preference over a period of time can be mapped in a recommender system using a new evaluation methodology that explicitly using time dimension. We have also presented different types of experimental set up that are generally used for recommender system evaluation. Furthermore, an overview of major accuracy metrics and metrics that go beyond the scope of accuracy as researched in the past few years is also discussed in detail.Keywords: collaborative filtering, data mining, evolutionary, clustering, algorithm, recommender systems
Procedia PDF Downloads 413250 IT-Aided Business Process Enabling Real-Time Analysis of Candidates for Clinical Trials
Authors: Matthieu-P. Schapranow
Abstract:
Recruitment of participants for clinical trials requires the screening of a big number of potential candidates, i.e. the testing for trial-specific inclusion and exclusion criteria, which is a time-consuming and complex task. Today, a significant amount of time is spent on identification of adequate trial participants as their selection may affect the overall study results. We introduce a unique patient eligibility metric, which allows systematic ranking and classification of candidates based on trial-specific filter criteria. Our web application enables real-time analysis of patient data and assessment of candidates using freely definable inclusion and exclusion criteria. As a result, the overall time required for identifying eligible candidates is tremendously reduced whilst additional degrees of freedom for evaluating the relevance of individual candidates are introduced by our contribution.Keywords: in-memory technology, clinical trials, screening, eligibility metric, data analysis, clustering
Procedia PDF Downloads 493249 HPTLC Metabolite Fingerprinting of Artocarpus champeden Stembark from Several Different Locations in Indonesia and Correlation with Antimalarial Activity
Authors: Imam Taufik, Hilkatul Ilmi, Puryani, Mochammad Yuwono, Aty Widyawaruyanti
Abstract:
Artocarpus champeden Spreng stembark (Moraceae) in Indonesia well known as ‘cempedak’ had been traditionally used for malarial remedies. The difference of growth locations could cause the difference of metabolite profiling. As a consequence, there were difference antimalarial activities in spite of the same plants. The aim of this research was to obtain the profile of metabolites that contained in A. champeden stembark from different locations in Indonesia for authentication and quality control purpose of this extract. The profiling had been performed by HPTLC-Densitometry technique and antimalarial activity had been also determined by HRP2-ELISA technique. The correlation between metabolite fingerprinting and antimalarial activity had been analyzed by Principle Component Analysis, Hierarchical Clustering Analysis and Partial Least Square. As a result, there is correlation between the difference metabolite fingerprinting and antimalarial activity from several different growth locations.Keywords: antimalarial, artocarpus champeden spreng, metabolite fingerprinting, multivariate analysis
Procedia PDF Downloads 311248 Performance and Emission Prediction in a Biodiesel Engine Fuelled with Honge Methyl Ester Using RBF Neural Networks
Authors: Shiva Kumar, G. S. Vijay, Srinivas Pai P., Shrinivasa Rao B. R.
Abstract:
In the present study RBF neural networks were used for predicting the performance and emission parameters of a biodiesel engine. Engine experiments were carried out in a 4 stroke diesel engine using blends of diesel and Honge methyl ester as the fuel. Performance parameters like BTE, BSEC, Tech and emissions from the engine were measured. These experimental results were used for ANN modeling. RBF center initialization was done by random selection and by using Clustered techniques. Network was trained by using fixed and varying widths for the RBF units. It was observed that RBF results were having a good agreement with the experimental results. Networks trained by using clustering technique gave better results than using random selection of centers in terms of reduced MRE and increased prediction accuracy. The average MRE for the performance parameters was 3.25% with the prediction accuracy of 98% and for emissions it was 10.4% with a prediction accuracy of 80%.Keywords: radial basis function networks, emissions, performance parameters, fuzzy c means
Procedia PDF Downloads 558247 Detection Method of Federated Learning Backdoor Based on Weighted K-Medoids
Authors: Xun Li, Haojie Wang
Abstract:
Federated learning is a kind of distributed training and centralized training mode, which is of great value in the protection of user privacy. In order to solve the problem that the model is vulnerable to backdoor attacks in federated learning, a backdoor attack detection method based on a weighted k-medoids algorithm is proposed. First of all, this paper collates the update parameters of the client to construct a vector group, then uses the principal components analysis (PCA) algorithm to extract the corresponding feature information from the vector group, and finally uses the improved k-medoids clustering algorithm to identify the normal and backdoor update parameters. In this paper, the backdoor is implanted in the federation learning model through the model replacement attack method in the simulation experiment, and the update parameters from the attacker are effectively detected and removed by the defense method proposed in this paper.Keywords: federated learning, backdoor attack, PCA, k-medoids, backdoor defense
Procedia PDF Downloads 114246 Combining a Continuum of Hidden Regimes and a Heteroskedastic Three-Factor Model in Option Pricing
Authors: Rachid Belhachemi, Pierre Rostan, Alexandra Rostan
Abstract:
This paper develops a discrete-time option pricing model for index options. The model consists of two key ingredients. First, daily stock return innovations are driven by a continuous hidden threshold mixed skew-normal (HTSN) distribution which generates conditional non-normality that is needed to fit daily index return. The most important feature of the HTSN is the inclusion of a latent state variable with a continuum of states, unlike the traditional mixture distributions where the state variable is discrete with little number of states. The HTSN distribution belongs to the class of univariate probability distributions where parameters of the distribution capture the dependence between the variable of interest and the continuous latent state variable (the regime). The distribution has an interpretation in terms of a mixture distribution with time-varying mixing probabilities. It has been shown empirically that this distribution outperforms its main competitor, the mixed normal (MN) distribution, in terms of capturing the stylized facts known for stock returns, namely, volatility clustering, leverage effect, skewness, kurtosis and regime dependence. Second, heteroscedasticity in the model is captured by a threeexogenous-factor GARCH model (GARCHX), where the factors are taken from the principal components analysis of various world indices and presents an application to option pricing. The factors of the GARCHX model are extracted from a matrix of world indices applying principal component analysis (PCA). The empirically determined factors are uncorrelated and represent truly different common components driving the returns. Both factors and the eight parameters inherent to the HTSN distribution aim at capturing the impact of the state of the economy on price levels since distribution parameters have economic interpretations in terms of conditional volatilities and correlations of the returns with the hidden continuous state. The PCA identifies statistically independent factors affecting the random evolution of a given pool of assets -in our paper a pool of international stock indices- and sorting them by order of relative importance. The PCA computes a historical cross asset covariance matrix and identifies principal components representing independent factors. In our paper, factors are used to calibrate the HTSN-GARCHX model and are ultimately responsible for the nature of the distribution of random variables being generated. We benchmark our model to the MN-GARCHX model following the same PCA methodology and the standard Black-Scholes model. We show that our model outperforms the benchmark in terms of RMSE in dollar losses for put and call options, which in turn outperforms the analytical Black-Scholes by capturing the stylized facts known for index returns, namely, volatility clustering, leverage effect, skewness, kurtosis and regime dependence.Keywords: continuous hidden threshold, factor models, GARCHX models, option pricing, risk-premium
Procedia PDF Downloads 297245 Cotton Crops Vegetative Indices Based Assessment Using Multispectral Images
Authors: Muhammad Shahzad Shifa, Amna Shifa, Muhammad Omar, Aamir Shahzad, Rahmat Ali Khan
Abstract:
Many applications of remote sensing to vegetation and crop response depend on spectral properties of individual leaves and plants. Vegetation indices are usually determined to estimate crop biophysical parameters like crop canopies and crop leaf area indices with the help of remote sensing. Cotton crops assessment is performed with the help of vegetative indices. Remotely sensed images from an optical multispectral radiometer MSR5 are used in this study. The interpretation is based on the fact that different materials reflect and absorb light differently at different wavelengths. Non-normalized and normalized forms of these datasets are analyzed using two complementary data mining algorithms; K-means and K-nearest neighbor (KNN). Our analysis shows that the use of normalized reflectance data and vegetative indices are suitable for an automated assessment and decision making.Keywords: cotton, condition assessment, KNN algorithm, clustering, MSR5, vegetation indices
Procedia PDF Downloads 333244 Evaluation of Robust Feature Descriptors for Texture Classification
Authors: Jia-Hong Lee, Mei-Yi Wu, Hsien-Tsung Kuo
Abstract:
Texture is an important characteristic in real and synthetic scenes. Texture analysis plays a critical role in inspecting surfaces and provides important techniques in a variety of applications. Although several descriptors have been presented to extract texture features, the development of object recognition is still a difficult task due to the complex aspects of texture. Recently, many robust and scaling-invariant image features such as SIFT, SURF and ORB have been successfully used in image retrieval and object recognition. In this paper, we have tried to compare the performance for texture classification using these feature descriptors with k-means clustering. Different classifiers including K-NN, Naive Bayes, Back Propagation Neural Network , Decision Tree and Kstar were applied in three texture image sets - UIUCTex, KTH-TIPS and Brodatz, respectively. Experimental results reveal SIFTS as the best average accuracy rate holder in UIUCTex, KTH-TIPS and SURF is advantaged in Brodatz texture set. BP neuro network works best in the test set classification among all used classifiers.Keywords: texture classification, texture descriptor, SIFT, SURF, ORB
Procedia PDF Downloads 369243 Performance Evaluation of Various Segmentation Techniques on MRI of Brain Tissue
Authors: U.V. Suryawanshi, S.S. Chowhan, U.V Kulkarni
Abstract:
Accuracy of segmentation methods is of great importance in brain image analysis. Tissue classification in Magnetic Resonance brain images (MRI) is an important issue in the analysis of several brain dementias. This paper portraits performance of segmentation techniques that are used on Brain MRI. A large variety of algorithms for segmentation of Brain MRI has been developed. The objective of this paper is to perform a segmentation process on MR images of the human brain, using Fuzzy c-means (FCM), Kernel based Fuzzy c-means clustering (KFCM), Spatial Fuzzy c-means (SFCM) and Improved Fuzzy c-means (IFCM). The review covers imaging modalities, MRI and methods for noise reduction and segmentation approaches. All methods are applied on MRI brain images which are degraded by salt-pepper noise demonstrate that the IFCM algorithm performs more robust to noise than the standard FCM algorithm. We conclude with a discussion on the trend of future research in brain segmentation and changing norms in IFCM for better results.Keywords: image segmentation, preprocessing, MRI, FCM, KFCM, SFCM, IFCM
Procedia PDF Downloads 331242 A Memetic Algorithm Approach to Clustering in Mobile Wireless Sensor Networks
Authors: Masood Ahmad, Ataul Aziz Ikram, Ishtiaq Wahid
Abstract:
Wireless sensor network (WSN) is the interconnection of mobile wireless nodes with limited energy and memory. These networks can be deployed formany critical applications like military operations, rescue management, fire detection and so on. In flat routing structure, every node plays an equal role of sensor and router. The topology may change very frequently due to the mobile nature of nodes in WSNs. The topology maintenance may produce more overhead messages. To avoid topology maintenance overhead messages, an optimized cluster based mobile wireless sensor network using memetic algorithm is proposed in this paper. The nodes in this network are first divided into clusters. The cluster leaders then transmit data to that base station. The network is validated through extensive simulation study. The results show that the proposed technique has superior results compared to existing techniques.Keywords: WSN, routing, cluster based, meme, memetic algorithm
Procedia PDF Downloads 481241 Intrusion Detection Using Dual Artificial Techniques
Authors: Rana I. Abdulghani, Amera I. Melhum
Abstract:
With the abnormal growth of the usage of computers over networks and under the consideration or agreement of most of the computer security experts who said that the goal of building a secure system is never achieved effectively, all these points led to the design of the intrusion detection systems(IDS). This research adopts a comparison between two techniques for network intrusion detection, The first one used the (Particles Swarm Optimization) that fall within the field (Swarm Intelligence). In this Act, the algorithm Enhanced for the purpose of obtaining the minimum error rate by amending the cluster centers when better fitness function is found through the training stages. Results show that this modification gives more efficient exploration of the original algorithm. The second algorithm used a (Back propagation NN) algorithm. Finally a comparison between the results of two methods used were based on (NSL_KDD) data sets for the construction and evaluation of intrusion detection systems. This research is only interested in clustering the two categories (Normal and Abnormal) for the given connection records. Practices experiments result in intrude detection rate (99.183818%) for EPSO and intrude detection rate (69.446416%) for BP neural network.Keywords: IDS, SI, BP, NSL_KDD, PSO
Procedia PDF Downloads 382240 Leverage Effect for Volatility with Generalized Laplace Error
Authors: Farrukh Javed, Krzysztof Podgórski
Abstract:
We propose a new model that accounts for the asymmetric response of volatility to positive ('good news') and negative ('bad news') shocks in economic time series the so-called leverage effect. In the past, asymmetric powers of errors in the conditionally heteroskedastic models have been used to capture this effect. Our model is using the gamma difference representation of the generalized Laplace distributions that efficiently models the asymmetry. It has one additional natural parameter, the shape, that is used instead of power in the asymmetric power models to capture the strength of a long-lasting effect of shocks. Some fundamental properties of the model are provided including the formula for covariances and an explicit form for the conditional distribution of 'bad' and 'good' news processes given the past the property that is important for the statistical fitting of the model. Relevant features of volatility models are illustrated using S&P 500 historical data.Keywords: heavy tails, volatility clustering, generalized asymmetric laplace distribution, leverage effect, conditional heteroskedasticity, asymmetric power volatility, GARCH models
Procedia PDF Downloads 385239 Using Emerging Hot Spot Analysis to Analyze Overall Effectiveness of Policing Policy and Strategy in Chicago
Authors: Tyler Gill, Sophia Daniels
Abstract:
The paper examines how accessing the spatial-temporal constrains of data will help inform policymakers and law enforcement officials. The authors utilize Chicago crime data from 2006-2016 to demonstrate how the Emerging Hot Spot Tool is an ideal hot spot clustering approach to analyze crime data. Traditional approaches include density maps or creating a spatial weights matrix to include the spatial-temporal constrains. This new approach utilizes a space-time implementation of the Getis-Ord Gi* statistic to visualize the data more quickly to make better decisions. The research will help complement socio-cultural research to find key patterns to help frame future policies and evaluate the implementation of prior strategies. Through this analysis, homicide trends and patterns are found more effectively and recommendations for use by non-traditional users of GIS are offered for real life implementation.Keywords: crime mapping, emerging hot spot analysis, Getis-Ord Gi*, spatial-temporal analysis
Procedia PDF Downloads 244238 Intelligent Software Architecture and Automatic Re-Architecting Based on Machine Learning
Authors: Gebremeskel Hagos Gebremedhin, Feng Chong, Heyan Huang
Abstract:
Software system is the combination of architecture and organized components to accomplish a specific function or set of functions. A good software architecture facilitates application system development, promotes achievement of functional requirements, and supports system reconfiguration. We describe three studies demonstrating the utility of our architecture in the subdomain of mobile office robots and identify software engineering principles embodied in the architecture. The main aim of this paper is to analyze prove architecture design and automatic re-architecting using machine learning. Intelligence software architecture and automatic re-architecting process is reorganizing in to more suitable one of the software organizational structure system using the user access dataset for creating relationship among the components of the system. The 3-step approach of data mining was used to analyze effective recovery, transformation and implantation with the use of clustering algorithm. Therefore, automatic re-architecting without changing the source code is possible to solve the software complexity problem and system software reuse.Keywords: intelligence, software architecture, re-architecting, software reuse, High level design
Procedia PDF Downloads 119237 A Machine Learning-Based Analysis of Autism Prevalence Rates across US States against Multiple Potential Explanatory Variables
Authors: Ronit Chakraborty, Sugata Banerji
Abstract:
There has been a marked increase in the reported prevalence of Autism Spectrum Disorder (ASD) among children in the US over the past two decades. This research has analyzed the growth in state-level ASD prevalence against 45 different potentially explanatory factors, including socio-economic, demographic, healthcare, public policy, and political factors. The goal was to understand if these factors have adequate predictive power in modeling the differential growth in ASD prevalence across various states and if they do, which factors are the most influential. The key findings of this study include (1) the confirmation that the chosen feature set has considerable power in predicting the growth in ASD prevalence, (2) the identification of the most influential predictive factors, (3) given the nature of the most influential predictive variables, an indication that a considerable portion of the reported ASD prevalence differentials across states could be attributable to over and under diagnosis, and (4) identification of Florida as a key outlier state pointing to a potential under-diagnosis of ASD there.Keywords: autism spectrum disorder, clustering, machine learning, predictive modeling
Procedia PDF Downloads 103236 Spectral Anomaly Detection and Clustering in Radiological Search
Authors: Thomas L. McCullough, John D. Hague, Marylesa M. Howard, Matthew K. Kiser, Michael A. Mazur, Lance K. McLean, Johanna L. Turk
Abstract:
Radiological search and mapping depends on the successful recognition of anomalies in large data sets which contain varied and dynamic backgrounds. We present a new algorithmic approach for real-time anomaly detection which is resistant to common detector imperfections, avoids the limitations of a source template library and provides immediate, and easily interpretable, user feedback. This algorithm is based on a continuous wavelet transform for variance reduction and evaluates the deviation between a foreground measurement and a local background expectation using methods from linear algebra. We also present a technique for recognizing and visualizing spectrally similar clusters of data. This technique uses Laplacian Eigenmap Manifold Learning to perform dimensional reduction which preserves the geometric "closeness" of the data while maintaining sensitivity to outlying data. We illustrate the utility of both techniques on real-world data sets.Keywords: radiological search, radiological mapping, radioactivity, radiation protection
Procedia PDF Downloads 692235 Apricot Insurance Portfolio Risk
Authors: Kasirga Yildirak, Ismail Gur
Abstract:
We propose a model to measure hail risk of an Agricultural Insurance portfolio. Hail is one of the major catastrophic event that causes big amount of loss to an insurer. Moreover, it is very hard to predict due to its strange atmospheric characteristics. We make use of parcel based claims data on apricot damage collected by the Turkish Agricultural Insurance Pool (TARSIM). As our ultimate aim is to compute the loadings assigned to specific parcels, we build a portfolio risk model that makes use of PD and the severity of the exposures. PD is computed by Spherical-Linear and Circular –Linear regression models as the data carries coordinate information and seasonality. Severity is mapped into integer brackets so that Probability Generation Function could be employed. Individual regressions are run on each clusters estimated on different criteria. Loss distribution is constructed by Panjer Recursion technique. We also show that one risk-one crop model can easily be extended to the multi risk–multi crop model by assuming conditional independency.Keywords: hail insurance, spherical regression, circular regression, spherical clustering
Procedia PDF Downloads 251234 Hierarchical Piecewise Linear Representation of Time Series Data
Authors: Vineetha Bettaiah, Heggere S. Ranganath
Abstract:
This paper presents a Hierarchical Piecewise Linear Approximation (HPLA) for the representation of time series data in which the time series is treated as a curve in the time-amplitude image space. The curve is partitioned into segments by choosing perceptually important points as break points. Each segment between adjacent break points is recursively partitioned into two segments at the best point or midpoint until the error between the approximating line and the original curve becomes less than a pre-specified threshold. The HPLA representation achieves dimensionality reduction while preserving prominent local features and general shape of time series. The representation permits course-fine processing at different levels of details, allows flexible definition of similarity based on mathematical measures or general time series shape, and supports time series data mining operations including query by content, clustering and classification based on whole or subsequence similarity.Keywords: data mining, dimensionality reduction, piecewise linear representation, time series representation
Procedia PDF Downloads 275233 Comparative Assessment of ISSR and RAPD Markers among Egyptian Jojoba Shrubs
Authors: Abdelsabour G. A. Khaled, Galal A.R. El-Sherbeny, Ahmed M. Hassanein, Gameel M. G. Aly
Abstract:
Classical methods of identification, based on agronomical characterization, are not always the most accurate way due to the instability of these characteristics under the influence of the different environments. In order to estimate the genetic diversity, molecular markers provided excellent tools. In this study, Genetic variation of nine Egyptian jojoba shrubs was tested using ISSR (inter simple sequences repeats), RAPD (random amplified polymorphic DNA) markers and based on the morphological characterization. The average of the percentage of polymorphism (%P) ranged between 58.17% and 74.07% for ISSR and RAPD markers, respectively. The range of genetic similarity percents among shrubs based on ISSR and RAPD markers were from 82.9 to 97.9% and from 85.5 to 97.8%, respectively. The average of PIC (polymorphism information content) values were 0.19 (ISSR) and 0.24 (RAPD). In the present study, RAPD markers were more efficient than the ISSR markers. Where the RAPD technique exhibited higher marker index (MI) average (1.26) compared to ISSR one (1.11). There was an insignificant correlation between the ISSR and RAPD data (0.076, P > 0.05). The dendrogram constructed by the combined RAPD and ISSR data gave a relatively different clustering pattern.Keywords: correlation, molecular markers, polymorphism, marker index
Procedia PDF Downloads 478232 Wireless Sensor Networks Optimization by Using 2-Stage Algorithm Based on Imperialist Competitive Algorithm
Authors: Hamid R. Lashgarian Azad, Seyed N. Shetab Boushehri
Abstract:
Wireless sensor networks (WSN) have become progressively popular due to their wide range of applications. Wireless Sensor Network is made of numerous tiny sensor nodes that are battery-powered. It is a very significant problem to maximize the lifetime of wireless sensor networks. In this paper, we propose a two-stage protocol based on an imperialist competitive algorithm (2S-ICA) to solve a sensor network optimization problem. The energy of the sensors can be greatly reduced and the lifetime of the network reduced by long communication distances between the sensors and the sink. We can minimize the overall communication distance considerably, thereby extending the lifetime of the network lifetime through connecting sensors into a series of independent clusters using 2SICA. Comparison results of the proposed protocol and LEACH protocol, which is common to solving WSN problems, show that our protocol has a better performance in terms of improving network life and increasing the number of transmitted data.Keywords: wireless sensor network, imperialist competitive algorithm, LEACH protocol, k-means clustering
Procedia PDF Downloads 103231 Order Fulfilment Strategy in E-Commerce Warehouse Based on Simulation: Business Customers Case
Authors: Aurelija Burinskiene
Abstract:
This paper presents the study for an e-commerce warehouse. The study is aiming to improve order fulfillment activity by identifying the strategy presenting the best performance. A simulation model was proposed to reach the target of this research. This model enables various scenario tests in an e-commerce warehouse, allowing them to find out for the best order fulfillment strategy. By using simulation, model authors investigated customers’ orders representing on-line purchases for one month. Experiments were designed to evaluate various order picking methods applicable to the fulfillment of customers’ orders. The research uses cost components analysis and helps to identify the best possible order picking method improving the overall performance of e-commerce warehouse and fulfillment service to the customers. The results presented show that the application of order batching strategy is the most applicable because it brings distance savings of around 6.7 percentage. This result could be improved by taking an assortment clustering action until 8.34 percentage. So, the recommendations were given to apply the method for future e-commerce warehouse operations.Keywords: e-commerce, order, fulfilment, strategy, simulation
Procedia PDF Downloads 150230 A Geospatial Consumer Marketing Campaign Optimization Strategy: Case of Fuzzy Approach in Nigeria Mobile Market
Authors: Adeolu O. Dairo
Abstract:
Getting the consumer marketing strategy right is a crucial and complex task for firms with a large customer base such as mobile operators in a competitive mobile market. While empirical studies have made efforts to identify key constructs, no geospatial model has been developed to comprehensively assess the viability and interdependency of ground realities regarding the customer, competition, channel and the network quality of mobile operators. With this research, a geo-analytic framework is proposed for strategy formulation and allocation for mobile operators. Firstly, a fuzzy analytic network using a self-organizing feature map clustering technique based on inputs from managers and literature, which depicts the interrelationships amongst ground realities is developed. The model is tested with a mobile operator in the Nigeria mobile market. As a result, a customer-centric geospatial and visualization solution is developed. This provides a consolidated and integrated insight that serves as a transparent, logical and practical guide for strategic, tactical and operational decision making.Keywords: geospatial, geo-analytics, self-organizing map, customer-centric
Procedia PDF Downloads 183229 Social Media and Internet Celebrity for Social Commerce Intentional and Behavioral Recommendations
Authors: Shu-Hsien Liao, Yao-Hsuan Yang
Abstract:
Social media is a virtual community and online platform that people use to create, share, and exchange opinions/experiences. Internet celebrities are people who become famous on the Internet, increasing their popularity through their social networking or video websites. Social commerce (s-ecommerce) is the combination of social relations and commercial transaction activities. The combination of social media and Internet celebrities is an emerging model for the development of s-ecommerce. With recent advances in system sciences, recommendation systems are gradually moving to develop intentional and behavioral recommendations. This background leads to the research issues regarding digital and social media in enterprises. Thus, this study implements data mining analytics, including clustering analysis and association rules, to investigate Taiwanese users (n=2,102) to investigate social media and Internet celebrities’ preferences to find knowledge profiles/patterns/rules for s-ecommerce intentional and behavioral recommendations.Keywords: social media, internet celebrity, social commerce (s-ecommerce), data mining analytics, intentional and behavioral recommendations
Procedia PDF Downloads 30