Search results for: ensemble clustering
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 760

Search results for: ensemble clustering

220 Global City Typologies: 300 Cities and Over 100 Datasets

Authors: M. Novak, E. Munoz, A. Jana, M. Nelemans

Abstract:

Cities and local governments the world over are interested to employ circular strategies as a means to bring about food security, create employment and increase resilience. The selection and implementation of circular strategies is facilitated by modeling the effects of strategies locally and understanding the impacts such strategies have had in other (comparable) cities and how that would translate locally. Urban areas are heterogeneous because of their geographic, economic, social characteristics, governance, and culture. In order to better understand the effect of circular strategies on urban systems, we create a dataset for over 300 cities around the world designed to facilitate circular strategy scenario modeling. This new dataset integrates data from over 20 prominent global national and urban data sources, such as the Global Human Settlements layer and International Labour Organisation, as well as incorporating employment data from over 150 cities collected bottom up from local departments and data providers. The dataset is made to be reproducible. Various clustering techniques are explored in the paper. The result is sets of clusters of cities, which can be used for further research, analysis, and support comparative, regional, and national policy making on circular cities.

Keywords: data integration, urban innovation, cluster analysis, circular economy, city profiles, scenario modelling

Procedia PDF Downloads 157
219 Toward an Understanding of the Neurofunctional Dissociation between Animal and Tool Concepts: A Graph Theoretical Analysis

Authors: Skiker Kaoutar, Mounir Maouene

Abstract:

Neuroimaging studies have shown that animal and tool concepts rely on distinct networks of brain areas. Animal concepts depend predominantly on temporal areas while tool concepts rely on fronto-temporo-parietal areas. However, the origin of this neurofunctional distinction for processing animal and tool concepts remains still unclear. Here, we address this question from a network perspective suggesting that the neural distinction between animals and tools might reflect the differences in their structural semantic networks. We build semantic networks for animal and tool concepts derived from Mc Rae and colleagues’s behavioral study conducted on a large number of participants. These two networks are thus analyzed through a large number of graph theoretical measures for small-worldness: centrality, clustering coefficient, average shortest path length, as well as resistance to random and targeted attacks. The results indicate that both animal and tool networks have small-world properties. More importantly, the animal network is more vulnerable to targeted attacks compared to the tool network a result that correlates with brain lesions studies.

Keywords: animals, tools, network, semantics, small-world, resilience to damage

Procedia PDF Downloads 510
218 Modelling Impacts of Global Financial Crises on Stock Volatility of Nigeria Banks

Authors: Maruf Ariyo Raheem, Patrick Oseloka Ezepue

Abstract:

This research aimed at determining most appropriate heteroskedastic model to predicting volatility of 10 major Nigerian banks: Access, United Bank for Africa (UBA), Guaranty Trust, Skye, Diamond, Fidelity, Sterling, Union, ETI and Zenith banks using daily closing stock prices of each of the banks from 2004 to 2014. The models employed include ARCH (1), GARCH (1, 1), EGARCH (1, 1) and TARCH (1, 1). The results show that all the banks returns are highly leptokurtic, significantly skewed and thus non-normal across the four periods except for Fidelity bank during financial crises; findings similar to those of other global markets. There is also strong evidence for the presence of heteroscedasticity, and that volatility persistence during crisis is higher than before the crisis across the 10 banks, with that of UBA taking the lead, about 11 times higher during the crisis. Findings further revealed that Asymmetric GARCH models became dominant especially during financial crises and post crises when the second reforms were introduced into the banking industry by the Central Bank of Nigeria (CBN). Generally, one could say that Nigerian banks returns are volatility persistent during and after the crises, and characterised by leverage effects of negative and positive shocks during these periods

Keywords: global financial crisis, leverage effect, persistence, volatility clustering

Procedia PDF Downloads 501
217 Volatility and Stylized Facts

Authors: Kalai Lamia, Jilani Faouzi

Abstract:

Measuring and controlling risk is one of the most attractive issues in finance. With the persistence of uncontrolled and erratic stocks movements, volatility is perceived as a barometer of daily fluctuations. An objective measure of this variable seems then needed to control risks and cover those that are considered the most important. Non-linear autoregressive modeling is our first evaluation approach. In particular, we test the presence of “persistence” of conditional variance and the presence of a degree of a leverage effect. In order to resolve for the problem of “asymmetry” in volatility, the retained specifications point to the importance of stocks reactions in response to news. Effects of shocks on volatility highlight also the need to study the “long term” behaviour of conditional variance of stocks returns and articulate the presence of long memory and dependence of time series in the long run. We note that the integrated fractional autoregressive model allows for representing time series that show long-term conditional variance thanks to fractional integration parameters. In order to stop at the dynamics that manage time series, a comparative study of the results of the different models will allow for better understanding volatility structure over the Tunisia stock market, with the aim of accurately predicting fluctuation risks.

Keywords: asymmetry volatility, clustering, stylised facts, leverage effect

Procedia PDF Downloads 276
216 Heuristic Classification of Hydrophone Recordings

Authors: Daniel M. Wolff, Patricia Gray, Rafael de la Parra Venegas

Abstract:

An unsupervised machine listening system is constructed and applied to a dataset of 17,195 30-second marine hydrophone recordings. The system is then heuristically supplemented with anecdotal listening, contextual recording information, and supervised learning techniques to reduce the number of false positives. Features for classification are assembled by extracting the following data from each of the audio files: the spectral centroid, root-mean-squared values for each frequency band of a 10-octave filter bank, and mel-frequency cepstral coefficients in 5-second frames. In this way both time- and frequency-domain information are contained in the features to be passed to a clustering algorithm. Classification is performed using the k-means algorithm and then a k-nearest neighbors search. Different values of k are experimented with, in addition to different combinations of the available feature sets. Hypothesized class labels are 'primarily anthrophony' and 'primarily biophony', where the best class result conforming to the former label has 104 members after heuristic pruning. This demonstrates how a large audio dataset has been made more tractable with machine learning techniques, forming the foundation of a framework designed to acoustically monitor and gauge biological and anthropogenic activity in a marine environment.

Keywords: anthrophony, hydrophone, k-means, machine learning

Procedia PDF Downloads 136
215 A General Framework for Knowledge Discovery Using High Performance Machine Learning Algorithms

Authors: S. Nandagopalan, N. Pradeep

Abstract:

The aim of this paper is to propose a general framework for storing, analyzing, and extracting knowledge from two-dimensional echocardiographic images, color Doppler images, non-medical images, and general data sets. A number of high performance data mining algorithms have been used to carry out this task. Our framework encompasses four layers namely physical storage, object identification, knowledge discovery, user level. Techniques such as active contour model to identify the cardiac chambers, pixel classification to segment the color Doppler echo image, universal model for image retrieval, Bayesian method for classification, parallel algorithms for image segmentation, etc., were employed. Using the feature vector database that have been efficiently constructed, one can perform various data mining tasks like clustering, classification, etc. with efficient algorithms along with image mining given a query image. All these facilities are included in the framework that is supported by state-of-the-art user interface (UI). The algorithms were tested with actual patient data and Coral image database and the results show that their performance is better than the results reported already.

Keywords: active contour, bayesian, echocardiographic image, feature vector

Procedia PDF Downloads 391
214 Design and Implementation a Platform for Adaptive Online Learning Based on Fuzzy Logic

Authors: Budoor Al Abid

Abstract:

Educational systems are increasingly provided as open online services, providing guidance and support for individual learners. To adapt the learning systems, a proper evaluation must be made. This paper builds the evaluation model Fuzzy C Means Adaptive System (FCMAS) based on data mining techniques to assess the difficulty of the questions. The following steps are implemented; first using a dataset from an online international learning system called (slepemapy.cz) the dataset contains over 1300000 records with 9 features for students, questions and answers information with feedback evaluation. Next, a normalization process as preprocessing step was applied. Then FCM clustering algorithms are used to adaptive the difficulty of the questions. The result is three cluster labeled data depending on the higher Wight (easy, Intermediate, difficult). The FCM algorithm gives a label to all the questions one by one. Then Random Forest (RF) Classifier model is constructed on the clustered dataset uses 70% of the dataset for training and 30% for testing; the result of the model is a 99.9% accuracy rate. This approach improves the Adaptive E-learning system because it depends on the student behavior and gives accurate results in the evaluation process more than the evaluation system that depends on feedback only.

Keywords: machine learning, adaptive, fuzzy logic, data mining

Procedia PDF Downloads 165
213 Estimating Precipitable Water Vapour Using the Global Positioning System and Radio Occultation over Ethiopian Regions

Authors: Asmamaw Yehun, Tsegaye Gogie, Martin Vermeer, Addisu Hunegnaw

Abstract:

The Global Positioning System (GPS) is a space-based radio positioning system, which is capable of providing continuous position, velocity, and time information to users anywhere on or near the surface of the Earth. The main objective of this work was to estimate the integrated precipitable water vapour (IPWV) using ground GPS and Low Earth Orbit (LEO) Radio Occultation (RO) to study spatial-temporal variability. For LEO-GPS RO, we used Constellation Observing System for Meteorology, Ionosphere, and Climate (COSMIC) datasets. We estimated the daily and monthly mean of IPWV using six selected ground-based GPS stations over a period of range from 2012 to 2016 (i.e. five-years period). The main perspective for selecting the range period from 2012 to 2016 is that, continuous data were available during these periods at all Ethiopian GPS stations. We studied temporal, seasonal, diurnal, and vertical variations of precipitable water vapour using GPS observables extracted from the precise geodetic GAMIT-GLOBK software package. Finally, we determined the cross-correlation of our GPS-derived IPWV values with those of the European Centre for Medium-Range Weather Forecasts (ECMWF) ERA-40 Interim reanalysis and of the second generation National Oceanic and Atmospheric Administration (NOAA) model ensemble Forecast System Reforecast (GEFS/R) for validation and static comparison. There are higher values of the IPWV range from 30 to 37.5 millimetres (mm) in Gambela and Southern Regions of Ethiopia. Some parts of Tigray, Amhara, and Oromia regions had low IPWV ranges from 8.62 to 15.27 mm. The correlation coefficient between GPS-derived IPWV with ECMWF and GEFS/R exceeds 90%. We conclude that there are highly temporal, seasonal, diurnal, and vertical variations of precipitable water vapour in the study area.

Keywords: GNSS, radio occultation, atmosphere, precipitable water vapour

Procedia PDF Downloads 59
212 Molecular Detection and Isolation of Benzimidazole Resistant Haemonchus contortus from Pakistan

Authors: K. Ali, M. F. Qamar, M. A. Zaman, M. Younus, I. Khan, S. Ehtisham-ul-Haque, R. Tamkeen, M. I. Rashid, Q. Ali

Abstract:

This study centers on molecular identification of Haemonchus contortus and isolation of Benz-imidazoles (BZ) resistant strains. Different abattoirs’ of two geographic regions of Punjab (Pakistan) were frequently visited for the collection of worms. Out of 1500 (n=1500) samples that were morphologically confirmed as H. contortus, 30 worms were subjected to molecular procedures for isolation of resistant strains. Resistant worms (n=8) were further subjected to DNA gene sequencing. Bio edit sequence alignment editor software was used to detect the possible mutation, deletion, replacement of nucleotides. Genetic diversity was noticed and genetic variation existing in β-tubulin isotype 1 of the H. contortus population of small ruminants of different regions considered in this study. H. contortus showed three different type of genetic sequences. 75%, 37.5%, 25% and 12.5% of the studied samples showed 100% query cover and identity with isolates and clones of China, UK, Australia and other countries, respectively. Interestingly the neighbor countries such as India and Iran haven’t many similarities with the Pakistani isolates. Thus, it suggests that population density of same genetic makeup H. contortus is scattered worldwide rather than clustering in a single region.

Keywords: Haemonchus contortus, Benzimidazole resistant, β-tubulin-1 gene, abattoirs

Procedia PDF Downloads 152
211 Urbanization Effects on the Food-Water-Energy Nexus within Ecosystem Services: A Case Study of the Beijing-Tianjin-Hebei Urban Agglomeration in China

Authors: Ke Yang, QiHan, Bauke de Veirs

Abstract:

This study addresses the need for coordinated management of natural resources in urban agglomeration. Using ecosystem services theory, The study explore the relationship between land use in the Beijing-Tianjin-Hebei (B-T-H) region and the Food-Water-Energy (F-W-E) nexus from 2000 to 2030. We assess ecosystem services using the InVEST: Habitat Quality (HQ), Water Yield (WY), Carbon Sequestration (CS), Soil Retention (SDR), and Food Production (FP). The study find an annual expansion of construction land alongside a significant decline in cultivated land. Additionally, HQ, CS, and per capita FP decline annually until 2020 and are expected to persist through 2030. In contrast, WY and SDR grow annually but may decline by 2030. Spearman coefficient analysis reveals synergies between HQ and CS, SDR and CS, and SDR and HQ, with trade-offs between CS and WY and HQ and WY. Utilizing the K-means clustering analysis method, we introduce county-based spatial planning for the F-W-E system, offering valuable insights and recommendations for sustainable resource management.

Keywords: food-water-energy (F-W-E), ecosystem services, trade-offs and synergies, ecosystem service bundle, county-based

Procedia PDF Downloads 36
210 Hybrid Algorithm for Non-Negative Matrix Factorization Based on Symmetric Kullback-Leibler Divergence for Signal Dependent Noise: A Case Study

Authors: Ana Serafimovic, Karthik Devarajan

Abstract:

Non-negative matrix factorization approximates a high dimensional non-negative matrix V as the product of two non-negative matrices, W and H, and allows only additive linear combinations of data, enabling it to learn parts with representations in reality. It has been successfully applied in the analysis and interpretation of high dimensional data arising in neuroscience, computational biology, and natural language processing, to name a few. The objective of this paper is to assess a hybrid algorithm for non-negative matrix factorization with multiplicative updates. The method aims to minimize the symmetric version of Kullback-Leibler divergence known as intrinsic information and assumes that the noise is signal-dependent and that it originates from an arbitrary distribution from the exponential family. It is a generalization of currently available algorithms for Gaussian, Poisson, gamma and inverse Gaussian noise. We demonstrate the potential usefulness of the new generalized algorithm by comparing its performance to the baseline methods which also aim to minimize symmetric divergence measures.

Keywords: non-negative matrix factorization, dimension reduction, clustering, intrinsic information, symmetric information divergence, signal-dependent noise, exponential family, generalized Kullback-Leibler divergence, dual divergence

Procedia PDF Downloads 223
209 Ambiguity Resolution for Ground-based Pulse Doppler Radars Using Multiple Medium Pulse Repetition Frequency

Authors: Khue Nguyen Dinh, Loi Nguyen Van, Thanh Nguyen Nhu

Abstract:

In this paper, we propose an adaptive method to resolve ambiguities and a ghost target removal process to extract targets detected by a ground-based pulse-Doppler radar using medium pulse repetition frequency (PRF) waveforms. The ambiguity resolution method is an adaptive implementation of the coincidence algorithm, which is implemented on a two-dimensional (2D) range-velocity matrix to resolve range and velocity ambiguities simultaneously, with a proposed clustering filter to enhance the anti-error ability of the system. Here we consider the scenario of multiple target environments. The ghost target removal process, which is based on the power after Doppler processing, is proposed to mitigate ghosting detections to enhance the performance of ground-based radars using a short PRF schedule in multiple target environments. Simulation results on a ground-based pulsed Doppler radar model will be presented to show the effectiveness of the proposed approach.

Keywords: ambiguity resolution, coincidence algorithm, medium PRF, ghosting removal

Procedia PDF Downloads 116
208 Genetic Diversity of Sorghum bicolor (L.) Moench Genotypes as Revealed by Microsatellite Markers

Authors: Maletsema Alina Mofokeng, Hussein Shimelis, Mark Laing, Pangirayi Tongoona

Abstract:

Sorghum is one of the most important cereal crops grown for food, feed and bioenergy. Knowledge of genetic diversity is important for conservation of genetic resources and improvement of crop plants through breeding. The objective of this study was to assess the level of genetic diversity among sorghum genotypes using microsatellite markers. A total of 103 accessions of sorghum genotypes obtained from the Department of Agriculture, Forestry and Fisheries, the African Centre for Crop Improvement and Agricultural Research Council-Grain Crops Institute collections in South Africa were estimated using 30 microsatellite markers. For all the loci analysed, 306 polymorphic alleles were detected with a mean value of 6.4 per locus. The polymorphic information content had an average value of 0.50 with heterozygosity mean value of 0.55 suggesting an important genetic diversity within the sorghum genotypes used. The unweighted pair group method with arithmetic mean clustering based on Euclidian coefficients revealed two major distinct groups without allocating genotypes based on the source of collection or origin. The genotypes 4154.1.1.1, 2055.1.1.1, 4441.1.1.1, 4442.1.1.1, 4722.1.1.1, and 4606.1.1.1 were the most diverse. The sorghum genotypes with high genetic diversity could serve as important sources of novel alleles for breeding and strategic genetic conservation.

Keywords: Genetic Diversity, Genotypes, Microsatellites, Sorghum

Procedia PDF Downloads 348
207 Study on the Characteristics of Chinese Urban Network Space from the Perspective of Innovative Collaboration

Authors: Wei Wang, Yilun Xu

Abstract:

With the development of knowledge economy era, deepening the mechanism of cooperation and adhering to sharing and win-win cooperation has become new direction of urban development nowadays. In recent years, innovative collaborations between cities are becoming more and more frequent, whose influence on urban network space has aroused many scholars' attention. Taking 46 cities in China as the research object, the paper builds the connectivity of innovative network between cities and the linkages of urban external innovation using patent cooperation data among cities, and explores urban network space in China by the application of GIS, which is a beneficial exploration to the study of social network space in China in the era of information network. The result shows that the urban innovative network space and geographical entity space exist differences, and the linkages of external innovation are not entirely related to the city innovative capacity and the level of economy development. However, urban innovative network space and geographical entity space are similar in hierarchical clustering. They have both formed Beijing-Tianjin-Hebei, Yangtze River Delta, Pearl River Delta three metropolitan areas and Beijing-Shenzhen-Shanghai-Hangzhou four core cities, which lead the development of innovative network space in China.

Keywords: innovative collaboration, urban network space, the connectivity of innovative network, the linkages of external innovation

Procedia PDF Downloads 154
206 Phylogenetic Analysis of the Thunnus Tuna Fish Using Cytochrome C Oxidase Subunit I Gene Sequence

Authors: Yijun Lai, Saber Khederzadeh, Lingshaung Han

Abstract:

Species in Thunnus are organized due to the similarity between them. The closeness between T. maccoyii, T. thynnus, T. Tonggol, T. atlanticus, T. albacares, T. obsesus, T. alalunga, and T. orientails are in different degrees. However, the genetic pattern of differentiation has not been presented based on individuals yet, to the author’s best knowledge. Hence, we aimed to analyze the difference in individuals level of tuna species to identify the factors that contribute to the maternal lineage variety using Cytochrome c oxidase subunit I (COXI) gene sequences. Our analyses provided evidence of sharing lineages in the Thunnus. A phylogenetic analysis revealed that these lineages are basal to the other sequences. We also showed a close connection between the T. tonggol, T. thynnus, and T. albacares populations. Also, the majority of the T. orientalis samples were clustered with the T. alalunga and, then, T. atlanticus populations. Phylogenetic trees and migration modeling revealed high proximity of T. thynnus sequences to a few T. orientalis and suggested possible gene flow with T. tonggol and T. albacares lineages, while all T. obsesus samples indicated unique clustering with each other. Our results support the presence of old maternal lineages in Thunnus, as a legacy of an ancient wave of colonization or migration.

Keywords: Thunnus Tuna, phylogeny, maternal lineage, COXI gene

Procedia PDF Downloads 258
205 Analytical Authentication of Butter Using Fourier Transform Infrared Spectroscopy Coupled with Chemometrics

Authors: M. Bodner, M. Scampicchio

Abstract:

Fourier Transform Infrared (FT-IR) spectroscopy coupled with chemometrics was used to distinguish between butter samples and non-butter samples. Further, quantification of the content of margarine in adulterated butter samples was investigated. Fingerprinting region (1400-800 cm–1) was used to develop unsupervised pattern recognition (Principal Component Analysis, PCA), supervised modeling (Soft Independent Modelling by Class Analogy, SIMCA), classification (Partial Least Squares Discriminant Analysis, PLS-DA) and regression (Partial Least Squares Regression, PLS-R) models. PCA of the fingerprinting region shows a clustering of the two sample types. All samples were classified in their rightful class by SIMCA approach; however, nine adulterated samples (between 1% and 30% w/w of margarine) were classified as belonging both at the butter class and at the non-butter one. In the two-class PLS-DA model’s (R2 = 0.73, RMSEP, Root Mean Square Error of Prediction = 0.26% w/w) sensitivity was 71.4% and Positive Predictive Value (PPV) 100%. Its threshold was calculated at 7% w/w of margarine in adulterated butter samples. Finally, PLS-R model (R2 = 0.84, RMSEP = 16.54%) was developed. PLS-DA was a suitable classification tool and PLS-R a proper quantification approach. Results demonstrate that FT-IR spectroscopy combined with PLS-R can be used as a rapid, simple and safe method to identify pure butter samples from adulterated ones and to determine the grade of adulteration of margarine in butter samples.

Keywords: adulterated butter, margarine, PCA, PLS-DA, PLS-R, SIMCA

Procedia PDF Downloads 116
204 Application of Latent Class Analysis and Self-Organizing Maps for the Prediction of Treatment Outcomes for Chronic Fatigue Syndrome

Authors: Ben Clapperton, Daniel Stahl, Kimberley Goldsmith, Trudie Chalder

Abstract:

Chronic fatigue syndrome (CFS) is a condition characterised by chronic disabling fatigue and other symptoms that currently can't be explained by any underlying medical condition. Although clinical trials support the effectiveness of cognitive behaviour therapy (CBT), the success rate for individual patients is modest. Patients vary in their response and little is known which factors predict or moderate treatment outcomes. The aim of the project is to develop a prediction model from baseline characteristics of patients, such as demographics, clinical and psychological variables, which may predict likely treatment outcome and provide guidance for clinical decision making and help clinicians to recommend the best treatment. The project is aimed at identifying subgroups of patients with similar baseline characteristics that are predictive of treatment effects using modern cluster analyses and data mining machine learning algorithms. The characteristics of these groups will then be used to inform the types of individuals who benefit from a specific treatment. In addition, results will provide a better understanding of for whom the treatment works. The suitability of different clustering methods to identify subgroups and their response to different treatments of CFS patients is compared.

Keywords: chronic fatigue syndrome, latent class analysis, prediction modelling, self-organizing maps

Procedia PDF Downloads 198
203 Principal Component Analysis Combined Machine Learning Techniques on Pharmaceutical Samples by Laser Induced Breakdown Spectroscopy

Authors: Kemal Efe Eseller, Göktuğ Yazici

Abstract:

Laser-induced breakdown spectroscopy (LIBS) is a rapid optical atomic emission spectroscopy which is used for material identification and analysis with the advantages of in-situ analysis, elimination of intensive sample preparation, and micro-destructive properties for the material to be tested. LIBS delivers short pulses of laser beams onto the material in order to create plasma by excitation of the material to a certain threshold. The plasma characteristics, which consist of wavelength value and intensity amplitude, depends on the material and the experiment’s environment. In the present work, medicine samples’ spectrum profiles were obtained via LIBS. Medicine samples’ datasets include two different concentrations for both paracetamol based medicines, namely Aferin and Parafon. The spectrum data of the samples were preprocessed via filling outliers based on quartiles, smoothing spectra to eliminate noise and normalizing both wavelength and intensity axis. Statistical information was obtained and principal component analysis (PCA) was incorporated to both the preprocessed and raw datasets. The machine learning models were set based on two different train-test splits, which were 70% training – 30% test and 80% training – 20% test. Cross-validation was preferred to protect the models against overfitting; thus the sample amount is small. The machine learning results of preprocessed and raw datasets were subjected to comparison for both splits. This is the first time that all supervised machine learning classification algorithms; consisting of Decision Trees, Discriminant, naïve Bayes, Support Vector Machines (SVM), k-NN(k-Nearest Neighbor) Ensemble Learning and Neural Network algorithms; were incorporated to LIBS data of paracetamol based pharmaceutical samples, and their different concentrations on preprocessed and raw dataset in order to observe the effect of preprocessing.

Keywords: machine learning, laser-induced breakdown spectroscopy, medicines, principal component analysis, preprocessing

Procedia PDF Downloads 68
202 Screening of Risk Phenotypes among Metabolic Syndrome Subjects in Adult Pakistani Population

Authors: Muhammad Fiaz, Muhammad Saqlain, Abid Mahmood, S. M. Saqlan Naqvi, Rizwan Aziz Qazi, Ghazala Kaukab Raja

Abstract:

Background: Metabolic Syndrome is a clustering of multiple risk factors including central obesity, hypertension, dyslipidemia and hyperglycemia. These risk phenotypes of metabolic syndrome (MetS) prevalent world-wide, Therefore we aimed to identify the frequency of risk phenotypes among metabolic syndrome subjects in local adult Pakistani population. Methods: Screening of subjects visiting out-patient department of medicine, Shaheed Zulfiqar Ali Bhutto Medical University, Islamabad was performed to assess the occurrence of risk phenotypes among MetS subjects in Pakistani population. The Metabolic Syndrome was defined based on International Diabetes Federation (IDF) criteria. Anthropometric and biochemical assay results were recorded. Data was analyzed using SPSS software (16.0). Results: Our results showed that dyslipidemia (31.50%) and hyperglycemia (30.50%) was most population specific risk phenotypes of MetS. The results showed the order of association of metabolic risk phenotypes to MetS as follows hyperglycemia>dyslipidemia>obesity >hypertension. Conclusion: The hyperglycemia and dyslipidemia were found be the major risk phenotypes among the MetS subjects and have greater chances of deceloping MetS among Pakistani Population.

Keywords: dyslipidemia, hypertention, metabolic syndrome, obesity

Procedia PDF Downloads 191
201 A Mixed Integer Programming Model for Optimizing the Layout of an Emergency Department

Authors: Farhood Rismanchian, Seong Hyeon Park, Young Hoon Lee

Abstract:

During the recent years, demand for healthcare services has dramatically increased. As the demand for healthcare services increases, so does the necessity of constructing new healthcare buildings and redesigning and renovating existing ones. Increasing demands necessitate the use of optimization techniques to improve the overall service efficiency in healthcare settings. However, high complexity of care processes remains the major challenge to accomplish this goal. This study proposes a method based on process mining results to address the high complexity of care processes and to find the optimal layout of the various medical centers in an emergency department. ProM framework is used to discover clinical pathway patterns and relationship between activities. Sequence clustering plug-in is used to remove infrequent events and to derive the process model in the form of Markov chain. The process mining results served as an input for the next phase which consists of the development of the optimization model. Comparison of the current ED design with the one obtained from the proposed method indicated that a carefully designed layout can significantly decrease the distances that patients must travel.

Keywords: Mixed Integer programming, Facility layout problem, Process Mining, Healthcare Operation Management

Procedia PDF Downloads 311
200 Revisiting the Swadesh Wordlist: How Long Should It Be

Authors: Feda Negesse

Abstract:

One of the most important indicators of research quality is a good data - collection instrument that can yield reliable and valid data. The Swadesh wordlist has been used for more than half a century for collecting data in comparative and historical linguistics though arbitrariness is observed in its application and size. This research compare s the classification results of the 100 Swadesh wordlist with those of its subsets to determine if reducing the size of the wordlist impact s its effectiveness. In the comparison, the 100, 50 and 40 wordlists were used to compute lexical distances of 29 Cushitic and Semitic languages spoken in Ethiopia and neighbouring countries. Gabmap, a based application, was employed to compute the lexical distances and to divide the languages into related clusters. The study shows that the subsets are not as effective as the 100 wordlist in clustering languages into smaller subgroups but they are equally effective in di viding languages into bigger groups such as subfamilies. It is noted that the subsets may lead to an erroneous classification whereby unrelated languages by chance form a cluster which is not attested by a comparative study. The chance to get a wrong result is higher when the subsets are used to classify languages which are not closely related. Though a further study is still needed to settle the issues around the size of the Swadesh wordlist, this study indicates that the 50 and 40 wordlists cannot be recommended as reliable substitute s for the 100 wordlist under all circumstances. The choice seems to be determined by the objective of a researcher and the degree of affiliation among the languages to be classified.

Keywords: classification, Cushitic, Swadesh, wordlist

Procedia PDF Downloads 275
199 Network Word Discovery Framework Based on Sentence Semantic Vector Similarity

Authors: Ganfeng Yu, Yuefeng Ma, Shanliang Yang

Abstract:

The word discovery is a key problem in text information retrieval technology. Methods in new word discovery tend to be closely related to words because they generally obtain new word results by analyzing words. With the popularity of social networks, individual netizens and online self-media have generated various network texts for the convenience of online life, including network words that are far from standard Chinese expression. How detect network words is one of the important goals in the field of text information retrieval today. In this paper, we integrate the word embedding model and clustering methods to propose a network word discovery framework based on sentence semantic similarity (S³-NWD) to detect network words effectively from the corpus. This framework constructs sentence semantic vectors through a distributed representation model, uses the similarity of sentence semantic vectors to determine the semantic relationship between sentences, and finally realizes network word discovery by the meaning of semantic replacement between sentences. The experiment verifies that the framework not only completes the rapid discovery of network words but also realizes the standard word meaning of the discovery of network words, which reflects the effectiveness of our work.

Keywords: text information retrieval, natural language processing, new word discovery, information extraction

Procedia PDF Downloads 64
198 Combined Analysis of m⁶A and m⁵C Modulators on the Prognosis of Hepatocellular Carcinoma

Authors: Hongmeng Su, Luyu Zhao, Yanyan Qian, Hong Fan

Abstract:

Aim: Hepatocellular carcinoma (HCC) is one of the most common malignant tumors that endanger human health seriously. RNA methylation, especially N6-methyladenosine (m⁶A) and 5-methylcytosine (m⁵C), a crucial epigenetic transcriptional regulatory mechanism, plays an important role in tumorigenesis, progression and prognosis. This research aims to systematically evaluate the prognostic value of m⁶A and m⁵C modulators in HCC patients. Methods: Twenty-four modulators of m⁶A and m⁵C were candidates to analyze their expression level and their contribution to predict the prognosis of HCC. Consensus clustering analysis was applied to classify HCC patients. Cox and LASSO regression were used to construct the risk model. According to the risk score, HCC patients were divided into high-risk and low/medium-risk groups. The clinical pathology factors of HCC patients were analyzed by univariate and multivariate Cox regression analysis. Results: The HCC patients were classified into 2 clusters with significant differences in overall survival and clinical characteristics. Nine-gene risk model was constructed including METTL3, VIRMA, YTHDF1, YTHDF2, NOP2, NSUN4, NSUN5, DNMT3A and ALYREF. It was indicated that the risk score could serve as an independent prognostic factor for patients with HCC. Conclusion: This study constructed a Nine-gene risk model by modulators of m⁶A and m⁵C and investigated its effect on the clinical prognosis of HCC. This model may provide important consideration for the therapeutic strategy and prognosis evaluation analysis of patients with HCC.

Keywords: hepatocellular carcinoma, m⁶A, m⁵C, prognosis, RNA methylation

Procedia PDF Downloads 38
197 Study on the Layout of 15-Minute Community-Life Circle in the State of “Community Segregation” Based on Poi: Shengwei Community and Other Two Communities in Chongqing

Authors: Siyuan Cai

Abstract:

This paper takes community segregation during major infectious diseases as the background, based on the physiological needs and safety needs of citizens during home segregation, and based on the selection of convenient facilities and medical facilities as the main research objects. Based on the POI data of public facilities in Chongqing, the spatial distribution characteristics of the convenience and medical facilities in the 15-minute living circle centered on three neighborhoods in Shapingba, namely Shengwei Community, Anju Commmunity and Fengtian Garden Community, were explored by means of GIS spatial analysis. The results show that the spatial distribution of convenience and medical facilities in this area has significant clustering characteristics, with a point-like distribution pattern of "dense in the west and sparse in the east", and a grouped and multi-polar spatial structure. The spatial structure is multi-polar and has an obvious tendency to the intersections and residential areas with dense pedestrian flow. This study provides a preliminary exploration of the distribution of medical and convenience facilities within the 15-minute living circle of a segregated community, which makes up for the lack of spatial research in this area.

Keywords: ArcGIS, community segregation, convenient facilities; distribution pattern, medical facilities, POI, 15-minute community life circle

Procedia PDF Downloads 89
196 Small and Medium Enterprises Owner-Managers/Entrepreneurs and Their Risk Perception in Songkhla Province, Thailand

Authors: Patraporn Kaewkhanitarak, Weerawan Marangkun

Abstract:

The objective of this study was to explore the establishment and to investigate the relationship between the gender (male or female) of SME owner-managers/ entrepreneurs and their risk perception in business activity. The study examines the data by interviewing 76 SME owner-managers/entrepreneurs’ responses (37 males, 39 females) in manufacturing, finance, human resources and marketing sector in the economic regions of Songkhla province, Thailand. This study found that four tools which were operation, cash flow, staff, and new market were perceived by the SME owner-managers/entrepreneurs at high level. However, male and female SME owner-managers/entrepreneurs perceived some factors such as the age of SME owner-managers/entrepreneurs, the duration of firm operation, type of firm, and type of business without significant differences. In contrast, the gender affected the risk perception about increasing cost, fierce competition, leapfrog development of firm, substandard staff, namely that male and female perceived these factors with significant differences. According to the research, SME owner-managers/entrepreneurs should develop their risk management competency to deal with the risk efficiently. Secondly, SME firms should gather into groups. Furthermore, it was shown that the five key tools used to manage these risky situations were the use of managerial competencies and clustering.

Keywords: risk perception, owner-managers/entrepreneurs, SME, Songkhla, Thailand

Procedia PDF Downloads 406
195 Detecting Venomous Files in IDS Using an Approach Based on Data Mining Algorithm

Authors: Sukhleen Kaur

Abstract:

In security groundwork, Intrusion Detection System (IDS) has become an important component. The IDS has received increasing attention in recent years. IDS is one of the effective way to detect different kinds of attacks and malicious codes in a network and help us to secure the network. Data mining techniques can be implemented to IDS, which analyses the large amount of data and gives better results. Data mining can contribute to improving intrusion detection by adding a level of focus to anomaly detection. So far the study has been carried out on finding the attacks but this paper detects the malicious files. Some intruders do not attack directly, but they hide some harmful code inside the files or may corrupt those file and attack the system. These files are detected according to some defined parameters which will form two lists of files as normal files and harmful files. After that data mining will be performed. In this paper a hybrid classifier has been used via Naive Bayes and Ripper classification methods. The results show how the uploaded file in the database will be tested against the parameters and then it is characterised as either normal or harmful file and after that the mining is performed. Moreover, when a user tries to mine on harmful file it will generate an exception that mining cannot be made on corrupted or harmful files.

Keywords: data mining, association, classification, clustering, decision tree, intrusion detection system, misuse detection, anomaly detection, naive Bayes, ripper

Procedia PDF Downloads 392
194 Detection and Quantification of Active Pharmaceutical Ingredients as Adulterants in Garcinia cambogia Slimming Preparations Using NIR Spectroscopy Combined with Chemometrics

Authors: Dina Ahmed Selim, Eman Shawky Anwar, Rasha Mohamed Abu El-Khair

Abstract:

A rapid, simple and efficient method with minimal sample treatment was developed for authentication of Garcinia cambogia fruit peel powder, along with determining undeclared active pharmaceutical ingredients (APIs) in its herbal slimming dietary supplements using near infrared spectroscopy combined with chemometrics. Five featured adulterants, including sibutramine, metformin, orlistat, ephedrine, and theophylline are selected as target compounds. The Near infrared spectral data matrix of authentic Garcinia cambogia fruit peel and specimens degraded by intentional contamination with the five selected APIs was subjected to hierarchical clustering analysis to investigate their bundling figure. SIMCA models were established to ensure the genuiness of Garcinia cambogia fruit peel which resulted in perfect classification of all tested specimens. Adulterated samples were utilized for construction of PLSR models based on different APIs contents at minute levels of fraud practices (LOQ < 0.2% w/w).The suggested approach can be applied to enhance and guarantee the safety and quality of Garcinia fruit peel powder as raw material and in dietary supplements.

Keywords: Garcinia cambogia, Quality control, NIR spectroscopy, Chemometrics

Procedia PDF Downloads 58
193 Scientific Linux Cluster for BIG-DATA Analysis (SLBD): A Case of Fayoum University

Authors: Hassan S. Hussein, Rania A. Abul Seoud, Amr M. Refaat

Abstract:

Scientific researchers face in the analysis of very large data sets that is increasing noticeable rate in today’s and tomorrow’s technologies. Hadoop and Spark are types of software that developed frameworks. Hadoop framework is suitable for many Different hardware platforms. In this research, a scientific Linux cluster for Big Data analysis (SLBD) is presented. SLBD runs open source software with large computational capacity and high performance cluster infrastructure. SLBD composed of one cluster contains identical, commodity-grade computers interconnected via a small LAN. SLBD consists of a fast switch and Gigabit-Ethernet card which connect four (nodes). Cloudera Manager is used to configure and manage an Apache Hadoop stack. Hadoop is a framework allows storing and processing big data across the cluster by using MapReduce algorithm. MapReduce algorithm divides the task into smaller tasks which to be assigned to the network nodes. Algorithm then collects the results and form the final result dataset. SLBD clustering system allows fast and efficient processing of large amount of data resulting from different applications. SLBD also provides high performance, high throughput, high availability, expandability and cluster scalability.

Keywords: big data platforms, cloudera manager, Hadoop, MapReduce

Procedia PDF Downloads 332
192 Mass Polarization in Three-Body System with Two Identical Particles

Authors: Igor Filikhin, Vladimir M. Suslov, Roman Ya. Kezerashvili, Branislav Vlahivic

Abstract:

The mass-polarization term of the three-body kinetic energy operator is evaluated for different systems which include two identical particles: A+A+B. The term has to be taken into account for the analysis of AB- and AA-interactions based on experimental data for two- and three-body ground state energies. In this study, we present three-body calculations within the framework of a potential model for the kaonic clusters K−K−p and ppK−, nucleus 3H and hypernucleus 6 ΛΛHe. The systems are well clustering as A+ (A+B) with a ground state energy E2 for the pair A+B. The calculations are performed using the method of the Faddeev equations in configuration space. The phenomenological pair potentials were used. We show a correlation between the mass ratio mA/mB and the value δB of the mass-polarization term. For bosonic-like systems, this value is defined as δB = 2E2 − E3, where E3 is three-body energy when VAA = 0. For the systems including three particles with spin(isospin), the models with average AB-potentials are used. In this case, the Faddeev equations become a scalar one like for the bosonic-like system αΛΛ. We show that the additional energy conected with the mass-polarization term can be decomposite to a sum of the two parts: exchenge related and reduced mass related. The state of the system can be described as the following: the particle A1 is bound within the A + B pair with the energy E2, and the second particle A2 is bound with the pair with the energy E3 − E2. Due to the identity of A particles, the particles A1 and A2 are interchangeable in the pair A + B. We shown that the mass polarization δB correlates with a type of AB potential using the system αΛΛ as an example.

Keywords: three-body systems, mass polarization, Faddeev equations, nuclear interactions

Procedia PDF Downloads 329
191 SCNet: A Vehicle Color Classification Network Based on Spatial Cluster Loss and Channel Attention Mechanism

Authors: Fei Gao, Xinyang Dong, Yisu Ge, Shufang Lu, Libo Weng

Abstract:

Vehicle color recognition plays an important role in traffic accident investigation. However, due to the influence of illumination, weather, and noise, vehicle color recognition still faces challenges. In this paper, a vehicle color classification network based on spatial cluster loss and channel attention mechanism (SCNet) is proposed for vehicle color recognition. A channel attention module is applied to extract the features of vehicle color representative regions and reduce the weight of nonrepresentative color regions in the channel. The proposed loss function, called spatial clustering loss (SC-loss), consists of two channel-specific components, such as a concentration component and a diversity component. The concentration component forces all feature channels belonging to the same class to be concentrated through the channel cluster. The diversity components impose additional constraints on the channels through the mean distance coefficient, making them mutually exclusive in spatial dimensions. In the comparison experiments, the proposed method can achieve state-of-the-art performance on the public datasets, VCD, and VeRi, which are 96.1% and 96.2%, respectively. In addition, the ablation experiment further proves that SC-loss can effectively improve the accuracy of vehicle color recognition.

Keywords: feature extraction, convolutional neural networks, intelligent transportation, vehicle color recognition

Procedia PDF Downloads 147