Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 17250

Search results for: cluster model

17130 Discriminant Analysis of Pacing Behavior on Mass Start Speed Skating

Abstract:

Mass start speed skating (MSSS) was a new event at the 2018 PyeongChang Winter Olympics and will be an official race at the 2022 Beijing Winter Olympics. Because event rankings are based on points gained on laps, it is worthwhile to investigate the pacing behavior on each lap, which directly influences the ranking of the race. The aim of this study was to characterize pacing behavior and performance in MSSS with respect to skaters’ level (SL), competition stage (semi-final/final) (CS) and gender (G). All men's and women's races in the World Cup and World Championships in the 2018-2019 and 2019-2020 seasons were analyzed, giving a total of 601 skaters from 36 games. ANOVA for repeated measures was applied to compare the pacing behavior on each lap, and three-way ANOVA for repeated measures was used to identify the influence of SL, CS, and G on pacing behavior and total time spent. In general, the results showed that the laps, ordered from fast to slow, fell into four clusters: cluster 1 (laps 4, 8, 12, 15, 16), cluster 2 (laps 5, 9, 13, 14), cluster 3 (laps 3, 6, 7, 10, 11), and cluster 4 (laps 1 and 2) (p<0.001). For CS, the total time spent in the final was less than in the semi-final (p<0.001). For SL, top-level skaters spent less total time than middle-level and low-level skaters (p≤0.002), while there was no significant difference between the middle-level and low-level skaters (p=0.214). For G, men spent less total time than women on all laps (p≤0.048). This study could help coaching staff better understand pacing behavior with regard to SL, CS, and G, and provides a reference for improving pacing strategy and decision making before and during the race.

Keywords: performance analysis, pacing strategy, winning strategy, winter Olympics

Procedia PDF Downloads 189

17129 Using Group Concept Mapping to Identify a Pharmacy-Based Trigger Tool to Detect Adverse Drug Events

Authors: Rodchares Hanrinth, Theerapong Srisil, Peeraya Sriphong, Pawich Paktipat

Abstract:

The trigger tool is a low-cost, low-tech method for detecting adverse events through clues called triggers. The Institute for Healthcare Improvement (IHI) has developed the Global Trigger Tool for measuring and preventing adverse events. However, this tool is not specific to adverse drug events (ADEs), so a pharmacy-based trigger tool is needed to detect them. Group concept mapping is an effective method for conceptualizing various ideas from diverse stakeholders, and this technique was used to identify pharmacy-based triggers for detecting ADEs. The aim of this study was to involve pharmacists in conceptualizing, developing, and prioritizing a feasible trigger tool to detect ADEs in a provincial hospital in the northeastern part of Thailand. The study was conducted during the 6-month period between April 1 and September 30, 2017. The participants were 20 pharmacists (17 hospital pharmacists and 3 pharmacy lecturers) who took part in three concept mapping workshops. In these workshops, the concept mapping technique created by Trochim, a highly structured qualitative group technique for generating and sharing ideas, was used to elicit and organize participants' views on which triggers had the potential to detect ADEs. During the workshops, participants (n = 20) were asked to individually rate the feasibility and potential of each trigger and to group the triggers into relevant categories to enable multidimensional scaling and hierarchical cluster analysis. The outputs of the analysis included the trigger list, cluster list, point map, point rating map, cluster map, and cluster rating map. The three workshops together resulted in 21 different triggers, structured in a framework of 5 clusters: drug allergy, drug-induced diseases, dosage adjustment in renal diseases, potassium concerns, and drug overdose.
An example of a trigger in the first cluster, drug allergy, is a doctor’s order for dexamethasone injection combined with chlorpheniramine injection. The diagnosis of drug-induced hepatitis in a patient taking anti-tuberculosis drugs is one trigger in the 'drug-induced diseases' cluster. For the third cluster, a doctor’s order for enalapril combined with ibuprofen in a patient with chronic kidney disease is an example of a trigger. A doctor’s order for digoxin in a patient with hypokalemia is a trigger in the fourth cluster. Finally, a doctor’s order for naloxone in a case of narcotic overdose was classified as a trigger in the fifth cluster. This study generated triggers that are similar to some in the IHI Global Trigger Tool, especially in the medication module, such as drug allergy and drug overdose. However, some aspects are specific to this tool, including drug-induced diseases, dosage adjustment in renal diseases, and potassium concerns, which are not contained in any existing trigger tool. The pharmacy-based trigger tool is suitable for hospital pharmacists to detect potential adverse drug events using triggers as clues.

Keywords: adverse drug events, concept mapping, hospital, pharmacy-based trigger tool

Procedia PDF Downloads 158

17128 Spatio-Temporal Changes of Rainfall in São Paulo, Brazil (1973-2012): A Gamma Distribution and Cluster Analysis

Authors: Guilherme Henrique Gabriel, Lucí Hidalgo Nunes

Abstract:

An important feature of rainfall regimes is the variability, which is subject to the atmosphere’s general and regional dynamics, geographical position and relief. Despite being inherent to the climate system, it can harshly impact virtually all human activities. In turn, global climate change has the ability to significantly affect smaller-scale rainfall regimes by altering their current variability patterns. In this regard, it is useful to know if regional climates are changing over time and whether it is possible to link these variations to climate change trends observed globally. This study is part of an international project (Metropole-FAPESP, Proc. 2012/51876-0 and Proc. 2015/11035-5) and the objective was to identify and evaluate possible changes in rainfall behavior in the state of São Paulo, southeastern Brazil, using rainfall data from 79 rain gauges for the last forty years. Cluster analysis and gamma distribution parameters were used for evaluating spatial and temporal trends, and the outcomes are presented by means of geographic information systems tools. Results show remarkable changes in rainfall distribution patterns in São Paulo over the years: changes in shape and scale parameters of gamma distribution indicate both an increase in the irregularity of rainfall distribution and the probability of occurrence of extreme events. Additionally, the spatial outcome of cluster analysis along with the gamma distribution parameters suggest that changes occurred simultaneously over the whole area, indicating that they could be related to remote causes beyond the local and regional ones, especially in a current global climate change scenario.
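The role of the gamma shape and scale parameters described above can be illustrated with a minimal Python sketch. It assumes a method-of-moments fit (the abstract does not state which estimator the authors used), and the two rainfall series are invented for illustration only.

```python
from statistics import mean, pvariance


def gamma_moments(rain):
    """Method-of-moments estimate of gamma (shape, scale).

    Assumption: this estimator is a stand-in; the study's own
    fitting procedure is not described in the abstract.
    """
    wet = [x for x in rain if x > 0]  # the gamma density covers positive totals only
    m = mean(wet)
    v = pvariance(wet)
    shape = m * m / v   # larger shape -> more regular totals
    scale = v / m       # larger scale -> heavier right tail
    return shape, scale


# Hypothetical monthly totals (mm) for two periods with the same mean.
early = [80.0, 95.0, 110.0, 100.0, 90.0, 105.0]
late = [40.0, 150.0, 60.0, 170.0, 30.0, 130.0]

for name, series in (("early", early), ("late", late)):
    shape, scale = gamma_moments(series)
    print(name, round(shape, 2), round(scale, 2))
```

With equal means, a drop in shape together with a rise in scale corresponds to a larger coefficient of variation (1/sqrt(shape)), which is one way to read "increased irregularity" in the fitted parameters.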

Keywords: climate change, cluster analysis, gamma distribution, rainfall

Procedia PDF Downloads 314

17127 Structure Clustering for Milestoning Applications of Complex Conformational Transitions

Authors: Amani Tahat, Serdal Kirmizialtin

Abstract:

Trajectory fragment methods such as Markov State Models (MSM), Milestoning (MS) and Transition Path Sampling are the prime choice for extending the timescale of all-atom Molecular Dynamics (MD) simulations. In these approaches, a set of structures that covers the accessible phase space has to be chosen a priori using cluster analysis. Structural clustering partitions the conformational space into natural subgroups based on their similarity; it is an essential statistical methodology for analyzing the large sets of empirical data produced by MD simulations. A local transition kernel among these clusters is later used to connect the metastable states, using a Markovian kinetic model in MSM and a non-Markovian model in MS. The choice of clustering approach in constructing such a kernel is crucial, since the high dimensionality of biomolecular structures can easily confuse the identification of clusters when the traditional hierarchical clustering methodology is used. Of particular interest is the case of MS, where the milestones are very close to each other and accurate determination of the milestone identity of the trajectory becomes challenging. In this work we present two cluster analysis methods applied to the cis–trans isomerism of dinucleotide AA. The reason for choosing nucleic acids over the more commonly used proteins to study cluster analysis is twofold: i) the energy landscape is rugged, hence transitions are more complex, enabling a more realistic model for studying conformational transitions; ii) the conformational space of nucleic acids is high dimensional. A diverse set of internal coordinates is necessary to describe the metastable states in nucleic acids, posing a challenge in studying the conformational transitions. We therefore need improved clustering methods that robustly and accurately identify the AA structure in its metastable states over a wide range of confused data conditions.
The single-linkage approach to hierarchical clustering available in the GROMACS MD package is the first clustering methodology applied to our data. The Self Organizing Map (SOM) neural network, also known as a Kohonen network, is the second. The performance of the neural network and the hierarchical clustering method is compared by computing the mean first passage times for the cis-trans conformational rates. Our hope is that this study provides insight into the complexities of, and the need for, determining the appropriate clustering algorithm for kinetic analysis. Our results can improve the effectiveness of decisions based on clustering confused empirical data in studying conformational transitions in biomolecules.
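As a rough illustration of the single-linkage idea mentioned above, the following Python sketch merges clusters whenever any two members are within a cutoff. It is a naive stand-alone version written for this note, not the GROMACS implementation; the points and the cutoff are hypothetical stand-ins for structural coordinates and an RMSD threshold.

```python
def single_linkage(points, cutoff):
    """Agglomerate clusters while the closest pair of members is within cutoff."""
    clusters = [[i] for i in range(len(points))]

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    def gap(c1, c2):
        # Single linkage: cluster distance is the smallest member-to-member distance.
        return min(dist(points[i], points[j]) for i in c1 for j in c2)

    merged = True
    while merged and len(clusters) > 1:
        merged = False
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = gap(clusters[a], clusters[b])
                if d <= cutoff and (best is None or d < best[0]):
                    best = (d, a, b)
        if best is not None:
            _, a, b = best
            clusters[a].extend(clusters[b])
            del clusters[b]
            merged = True
    return [sorted(c) for c in clusters]


# Hypothetical 2D coordinates; points 0 and 2 are 1.0 apart but still end up
# together because point 1 links them (the well-known chaining effect).
points = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (5.0, 5.0), (5.4, 5.0)]
print(single_linkage(points, cutoff=0.6))
```

The chaining behaviour visible in this toy example is one reason single linkage can blur closely spaced states, which is consistent with the concern about nearby milestones raised in the abstract.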

Keywords: milestoning, self organizing map, single linkage, structure clustering

Procedia PDF Downloads 220

17126 The Application of Video Segmentation Methods for the Purpose of Action Detection in Videos

Authors: Nassima Noufail, Sara Bouhali

Abstract:

In this work, we develop a semi-supervised solution for the purpose of action detection in videos and propose an efficient algorithm for video segmentation. The approach is divided into video segmentation, feature extraction, and classification. In the first part, a video is segmented into clips, and we used the K-means algorithm for this segmentation; our goal is to find groups based on similarity in the video. The application of k-means clustering into all the frames is time-consuming; therefore, we started by the identification of transition frames where the scene in the video changes significantly, and then we applied K-means clustering into these transition frames. We used two image filters, the gaussian filter and the Laplacian of Gaussian. Each filter extracts a set of features from the frames. The Gaussian filter blurs the image and omits the higher frequencies, and the Laplacian of gaussian detects regions of rapid intensity changes; we then used this vector of filter responses as an input to our k-means algorithm. The output is a set of cluster centers. Each video frame pixel is then mapped to the nearest cluster center and painted with a corresponding color to form a visual map. The resulting visual map had similar pixels grouped. We then computed a cluster score indicating how clusters are near each other and plotted a signal representing frame number vs. clustering score. Our hypothesis was that the evolution of the signal would not change if semantically related events were happening in the scene. We marked the breakpoints at which the root mean square level of the signal changes significantly, and each breakpoint is an indication of the beginning of a new video segment. In the second part, for each segment from part 1, we randomly selected a 16-frame clip, then we extracted spatiotemporal features using convolutional 3D network C3D for every 16 frames using a pre-trained model. 
The final C3D output is a 512-dimensional feature vector; hence we used principal component analysis (PCA) for dimensionality reduction. The final part is the classification. The C3D feature vectors are used as input to train a multi-class linear support vector machine (SVM), and we used this multi-class classifier to detect the action. We evaluated our approach on the UCF101 dataset, which consists of 101 human action categories, and we achieved an accuracy that outperforms the state of the art by 1.2%.
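The k-means step of the segmentation stage can be sketched in a few lines of Python. This is a generic k-means loop, not the authors' code; the two-component vectors stand in for per-pixel filter responses (Gaussian, Laplacian of Gaussian), and both the values and the initial centers are invented.

```python
def kmeans(vectors, centers, iterations=20):
    """Plain k-means: assign each vector to its nearest center, then re-average."""

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    labels = []
    for _ in range(iterations):
        labels = [
            min(range(len(centers)), key=lambda k: sqdist(v, centers[k]))
            for v in vectors
        ]
        new_centers = []
        for k in range(len(centers)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(centers[k])  # keep an empty cluster's center
        if new_centers == centers:
            break  # assignments are stable
        centers = new_centers
    return labels, centers


# Hypothetical (gaussian_response, log_response) pairs for four pixels.
responses = [(0.1, 0.0), (0.2, 0.1), (0.9, 1.0), (1.0, 0.9)]
labels, centers = kmeans(responses, centers=[(0.0, 0.0), (1.0, 1.0)])
print(labels)
```

Mapping each pixel to its label and coloring by label would give the kind of visual map the abstract describes.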

Keywords: video segmentation, action detection, classification, Kmeans, C3D

Procedia PDF Downloads 71

17125 An Approach for Estimation in Hierarchical Clustered Data Applicable to Rare Diseases

Authors: Daniel C. Bonzo

Abstract:

Practical considerations lead to the use of units of analysis within subjects, e.g., bleeding episodes or treatment-related adverse events, in rare disease settings. This is coupled with data augmentation techniques such as extrapolation to enlarge the subject base. In general, one can think of extrapolation of data as extending information and conclusions from one estimand to another. This approach induces hierarchical clustered data with varying cluster sizes. Extrapolation of clinical trial data is increasingly accepted by regulatory agencies as a means of generating data in diverse situations during the drug development process. Under certain circumstances, data can be extrapolated to a different population, a different but related indication, or a different but similar product. We consider here the problem of estimation (point and interval) using a mixed-models approach under extrapolation. It is proposed that estimators (point and interval) be constructed using weighting schemes for the clusters, e.g., equal weights or weights proportional to cluster size. Simulated data generated under varying scenarios are then used to evaluate the performance of this approach. The evaluation showed that the approach is a useful means of improving statistical inference in rare disease settings and thus aids not only signal detection but also risk-benefit evaluation.
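The difference between the two weighting schemes named above can be seen in a small Python sketch. It computes only simple weighted means of cluster means, not the mixed-model estimators of the paper, and the episode counts are made up.

```python
def cluster_estimates(clusters):
    """Return (equally weighted, size-weighted) point estimates of the mean."""
    means = [sum(c) / len(c) for c in clusters]
    sizes = [len(c) for c in clusters]
    # Every cluster (subject) counts the same, regardless of its size.
    equal = sum(means) / len(means)
    # Weights proportional to cluster size: equivalent to pooling all units.
    size_weighted = sum(m * n for m, n in zip(means, sizes)) / sum(sizes)
    return equal, size_weighted


# Hypothetical outcome values per subject; cluster sizes vary (2, 1, 4).
subjects = [[2.0, 4.0], [10.0], [1.0, 1.0, 1.0, 1.0]]
equal, size_weighted = cluster_estimates(subjects)
print(round(equal, 3), round(size_weighted, 3))
```

With unequal cluster sizes the two estimates can differ noticeably, which is why the choice of weights matters when a few subjects contribute many units of analysis.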

Keywords: clustered data, estimand, extrapolation, mixed model

Procedia PDF Downloads 132

17124 Energy Efficient Clustering with Adaptive Particle Swarm Optimization

Authors: Kumar Shashvat, Arshpreet Kaur, Rajesh Kumar, Raman Chadha

Abstract:

Wireless sensor networks (WSNs) are characterized by restricted energy, with the limitation that the energy of the nodes cannot be replenished. To increase network lifetime in this scenario, the route for data transmission is chosen such that energy use along the selected route is minimal. Such an energy-efficient network needs a sound infrastructure, because the infrastructure affects the network lifespan. Clustering is a technique in which nodes are grouped into disjoint, non-overlapping sets, and data is collected at the cluster head. In this paper, an Adaptive-PSO algorithm is proposed that forms energy-aware clusters by minimizing the cost of locating the cluster head. The main concern is the suitability of the swarms, which is addressed by adjusting the learning parameters of PSO. Particle Swarm Optimization converges quickly at the beginning of the search, but over time it stagnates and may become trapped in local optima. In the suggested network model, swarms are given the intelligence of spiders, which enables them to avoid premature convergence and helps them escape from local optima. Comparison with traditional PSO shows that the new algorithm considerably enhances performance on multi-dimensional functions.
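For orientation, a baseline PSO loop for placing a cluster head can be sketched as follows. This is plain, non-adaptive PSO with fixed learning parameters, not the Adaptive-PSO of the paper; the cost function (sum of squared node-to-head distances), node coordinates, and parameter values are all assumptions made for illustration.

```python
import random


def head_cost(p, nodes):
    """Hypothetical cost: sum of squared distances from nodes to the head at p."""
    return sum((p[0] - x) ** 2 + (p[1] - y) ** 2 for x, y in nodes)


def pso(cost, dim, n_particles=20, iterations=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Standard global-best PSO with fixed inertia w and learning factors c1, c2."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-10, 10) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (
                    w * vel[i][d]
                    + c1 * r1 * (pbest[i][d] - pos[i][d])   # pull toward own best
                    + c2 * r2 * (gbest[d] - pos[i][d])      # pull toward swarm best
                )
                pos[i][d] += vel[i][d]
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost


nodes = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0)]
best, best_cost = pso(lambda p: head_cost(p, nodes), dim=2)
print([round(v, 3) for v in best], round(best_cost, 3))
```

An adaptive variant would change `w`, `c1`, and `c2` during the run rather than keep them fixed; how the paper does this is not specified in the abstract.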

Keywords: Particle Swarm Optimization, adaptive – PSO, comparison between PSO and A-PSO, energy efficient clustering

Procedia PDF Downloads 242

17123 On the Cluster of the Families of Hybrid Polynomial Kernels in Kernel Density Estimation

Authors: Benson Ade Eniola Afere

Abstract:

Over the years, kernel density estimation has been extensively studied within the context of nonparametric density estimation. The fundamental components of kernel density estimation are the kernel function and the bandwidth. While the mathematical exploration of the kernel component has been relatively limited, its selection and development remain crucial. The Mean Integrated Squared Error (MISE), serving as a measure of discrepancy, provides a robust framework for assessing the effectiveness of any kernel function. A kernel function with a lower MISE is generally considered to perform better than one with a higher MISE. Hence, the primary aim of this article is to create kernels that exhibit significantly reduced MISE when compared to existing classical kernels. Consequently, this article introduces a cluster of hybrid polynomial kernel families. The construction of these proposed kernel functions is carried out heuristically by combining two kernels from the classical polynomial kernel family using probability axioms. We delve into the analysis of error propagation within these kernels. To assess their performance, simulation experiments, and real-life datasets are employed. The obtained results demonstrate that the proposed hybrid kernels surpass their classical kernel counterparts in terms of performance.
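One simple way to picture a hybrid of two classical polynomial kernels is a convex mixture, sketched below in Python. The mixing rule and the weight `alpha` are assumptions made for this example; the abstract does not state the actual construction. The Epanechnikov and biweight kernels themselves are standard.

```python
def epanechnikov(u):
    """Classical polynomial kernel K(u) = 3/4 (1 - u^2) on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def biweight(u):
    """Classical polynomial kernel K(u) = 15/16 (1 - u^2)^2 on [-1, 1]."""
    return 0.9375 * (1.0 - u * u) ** 2 if abs(u) <= 1.0 else 0.0


def hybrid(u, alpha=0.5):
    """Assumed hybrid: a convex mixture, which is again a valid density."""
    return alpha * epanechnikov(u) + (1.0 - alpha) * biweight(u)


def kde(x, data, h, kernel=hybrid):
    """Kernel density estimate at x with bandwidth h."""
    return sum(kernel((x - xi) / h) for xi in data) / (len(data) * h)


# Hypothetical sample and bandwidth.
sample = [1.2, 1.9, 2.1, 2.4, 3.3]
print(round(kde(2.0, sample, h=1.0), 4))
```

Because both components are nonnegative and integrate to one, any mixture with `0 <= alpha <= 1` also satisfies the usual kernel conditions, so it can be dropped into the estimator unchanged.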

Keywords: classical polynomial kernels, cluster of families, global error, hybrid Kernels, Kernel density estimation, Monte Carlo simulation

Procedia PDF Downloads 85

17122 Estimating Poverty Levels from Satellite Imagery: A Comparison of Human Readers and an Artificial Intelligence Model

Authors: Ola Hall, Ibrahim Wahab, Thorsteinn Rognvaldsson, Mattias Ohlsson

Abstract:

The subfield of poverty and welfare estimation that applies machine learning tools and methods on satellite imagery is a nascent but rapidly growing one. This is in part driven by the sustainable development goal, whose overarching principle is that no region is left behind. Among other things, this requires that welfare levels can be accurately and rapidly estimated at different spatial scales and resolutions. Conventional tools of household surveys and interviews do not suffice in this regard. While they are useful for gaining a longitudinal understanding of the welfare levels of populations, they do not offer adequate spatial coverage for the accuracy that is needed, nor are their implementation sufficiently swift to gain an accurate insight into people and places. It is this void that satellite imagery fills. Previously, this was near-impossible to implement due to the sheer volume of data that needed processing. Recent advances in machine learning, especially the deep learning subtype, such as deep neural networks, have made this a rapidly growing area of scholarship. Despite their unprecedented levels of performance, such models lack transparency and explainability and thus have seen limited downstream applications as humans generally are apprehensive of techniques that are not inherently interpretable and trustworthy. While several studies have demonstrated the superhuman performance of AI models, none has directly compared the performance of such models and human readers in the domain of poverty studies. In the present study, we directly compare the performance of human readers and a DL model using different resolutions of satellite imagery to estimate the welfare levels of demographic and health survey clusters in Tanzania, using the wealth quintile ratings from the same survey as the ground truth data. The cluster-level imagery covers all 608 cluster locations, of which 428 were classified as rural. 
The imagery for the human readers was sourced from the Google Maps Platform at an ultra-high resolution of 0.6m per pixel at zoom level 18, while that of the machine learning model was sourced from the comparatively lower resolution Sentinel-2 10m per pixel data for the same cluster locations. Rank correlation coefficients of between 0.31 and 0.32 achieved by the human readers were much lower when compared to those attained by the machine learning model – 0.69-0.79. This superhuman performance by the model is even more significant given that it was trained on the relatively lower 10-meter resolution satellite data while the human readers estimated welfare levels from the higher 0.6m spatial resolution data from which key markers of poverty and slums – roofing and road quality – are discernible. It is important to note, however, that the human readers did not receive any training before ratings, and had this been done, their performance might have improved. The stellar performance of the model also comes with the inevitable shortfall relating to limited transparency and explainability. The findings have significant implications for attaining the objective of the current frontier of deep learning models in this domain of scholarship – eXplainable Artificial Intelligence through a collaborative rather than a comparative framework.
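A rank correlation of the kind reported above can be computed as in the following Python sketch. It assumes Spearman's coefficient (the abstract says only "rank correlation"), and the ratings are invented quintile values, not data from the study.

```python
def ranks(values):
    """Average ranks (1-based), with ties sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical wealth quintiles: survey ground truth vs. one rater's estimates.
truth = [1, 2, 3, 4, 5, 2, 4]
rater = [2, 1, 3, 5, 4, 3, 4]
print(round(spearman(truth, rater), 3))
```

Quintile ratings contain many ties, so average ranks are used rather than the shortcut formula based on squared rank differences, which is exact only without ties.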

Keywords: poverty prediction, satellite imagery, human readers, machine learning, Tanzania

Procedia PDF Downloads 99

17121 Institutional Segmentation and Country Clustering: Implications for Multinational Enterprises Over Standardized Management

Authors: Jung-Hoon Han, Jooyoung Kwak

Abstract:

Distances between cultures and institutions are gaining academic attention once again, following the classical debate on the validity of globalization. Despite incessant efforts to define international segments using various concepts, no significant attempts have been made that consider the institutional dimensions. Resource-based theory and institutional theory provide useful insights for assessing the market environment and understanding when and how MNEs lose or gain advantages. This study consists of two parts: identifying institutional clusters and predicting the effect of MNEs’ origin on the applicability of competitive advantages. MNEs in one country cluster are expected to use similar management systems.

Keywords: institutional theory, resource-based theory, institutional environment, cultural dimensions, cluster analysis, standardized management

Procedia PDF Downloads 481

17120 Care: A Cluster Based Approach for Reliable and Efficient Routing Protocol in Wireless Sensor Networks

Authors: K. Prasanth, S. Hafeezullah Khan, B. Haribalakrishnan, D. Arun, S. Jayapriya, S. Dhivya, N. Vijayarangan

Abstract:

The main goal of our approach is to find the optimum positions for the sensor nodes, reinforcing communication at points where a lack of connectivity is found. Routing is the major problem in data transfer between nodes in a sensor network. We provide an efficient routing technique that lets the data signal reach the base station quickly and without interruption. Clustering and routing are the two key factors to be considered in a WSN. To carry out communication from the nodes to their cluster head, we propose a parameterizable protocol so that the developer can indicate whether the routing has to be sensitive to the link quality of the nodes or to their battery levels.

Keywords: clusters, routing, wireless sensor networks, three phases, sensor networks

Procedia PDF Downloads 500

17119 A Spatial Autocorrelation Analysis of Women’s Mental Health and Walkability Index in Mashhad City, Iran, and Recommendations to Improve It

Authors: Mohammad Rahim Rahnama, Lia Shaddel

Abstract:

Today, along with the development of urbanism, its negative consequences for the health of citizens are emerging. Mental disorders are common in big cities, while mental health enables individuals to become active citizens. Women bear a larger share of mental health problems: depression and anxiety disorders have a higher prevalence among women, and these disorders affect the health of future generations, too. Therefore, improving women’s mental health through the potential offered by urban spaces is of paramount importance. The present study aims first to evaluate the spatial autocorrelation of women’s mental health and walkable spaces, and then to present solutions, based on the findings, to improve the walkability index. To determine the spatial distribution of women’s mental health in Mashhad, Moran's I was used and 1000 questionnaires were handed out in various sub-districts of Mashhad. Moran's I was calculated to be 0.18, which indicates a clustered distribution pattern. The walkability index was calculated using four variables: the length of walkable routes, mixed land use, retail floor area ratio, and household density. To determine the spatial autocorrelation of mental health and the walkability index, bivariate Moran’s I was calculated. It was determined to be 0.37, which shows a direct spatial relationship between the variables; 4 clusters in 9 sub-districts of Mashhad were identified. In the High-Low cluster there was a negative spatial relationship; hence, to identify factors affecting walkability in urban spaces, semi-structured interviews were conducted with 21 women in this cluster. The findings revealed that security is the major factor influencing women’s walking behavior in this cluster. In accordance with the findings, some suggestions are offered to improve the presence of women in this sub-district.
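The global (univariate) Moran's I statistic used above follows the standard formula I = (n / S0) * sum_ij w_ij (x_i - m)(x_j - m) / sum_i (x_i - m)^2, where S0 is the sum of all spatial weights. A minimal Python sketch is given below; the four areas, their chain adjacency, and the scores are hypothetical and are not the Mashhad data.

```python
def morans_i(values, weights):
    """Global Moran's I for values observed on n areas with weight matrix w."""
    n = len(values)
    m = sum(values) / n
    dev = [v - m for v in values]
    s0 = sum(sum(row) for row in weights)  # total of all spatial weights
    num = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    den = sum(d * d for d in dev)
    return (n / s0) * num / den


# Four areas in a row; neighbours share a border (binary adjacency).
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = [1.0, 1.0, 5.0, 5.0]     # similar values sit next to each other
alternating = [1.0, 5.0, 1.0, 5.0]   # neighbours are always dissimilar
print(morans_i(clustered, w), morans_i(alternating, w))
```

Positive values, such as the 0.18 reported in the abstract, indicate that similar scores tend to be neighbours; negative values indicate a checkerboard-like pattern.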

Keywords: Mashhad, spatial autocorrelation, women’s mental health, walkability index

Procedia PDF Downloads 129

17118 The Design of a Mixed Matrix Model for Activity Levels Extraction and Sub Processes Classification of a Work Project (Case: Great Tehran Electrical Distribution Company)

Authors: Elham Allahmoradi, Bahman Allahmoradi, Ali Bonyadi Naeini

Abstract:

Complex systems have many aspects, and a variety of methods have been developed to analyze them. The most efficient of these methods should not only be simple but also provide useful and comprehensive information about many aspects of the system. Matrix methods are among the most commonly used methods for analyzing and designing systems. Each matrix method can examine a particular aspect of the system; if these methods are combined, managers gain access to broader and more comprehensive information about the system. This study was conducted in four steps. In the first step, a process model of a real project was extracted using IDEF3. In the second step, activity levels were obtained by writing the process model in the form of a design structure matrix (DSM) and sorting it with the triangulation algorithm (TA). In the third step, sub-processes were obtained by writing the process model in the form of an interface structure matrix (ISM) and clustering it with the cluster identification algorithm (CIA). In the fourth step, a mixed model was developed to provide a unified picture of the project structure through the simultaneous presentation of activities and sub-processes. The paper ends with a conclusion.
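The idea of deriving activity levels from a DSM can be sketched as a level-by-level topological sort in Python. This is a generic stand-in, not necessarily the triangulation algorithm used in the paper; the matrix convention (row i has a 1 in column j when activity i needs input from activity j) and the four-activity example are assumptions.

```python
def activity_levels(dsm):
    """Assign level 1 to activities with no inputs, level 2 to those that
    depend only on level-1 activities, and so on."""
    n = len(dsm)
    level = [0] * n
    remaining = set(range(n))
    current = 1
    while remaining:
        # An activity is ready when none of its inputs is still unscheduled.
        ready = [
            i
            for i in remaining
            if all(not dsm[i][j] or j not in remaining for j in range(n) if j != i)
        ]
        if not ready:
            raise ValueError("cycle among activities: " + str(sorted(remaining)))
        for i in ready:
            level[i] = current
        remaining -= set(ready)
        current += 1
    return level


# Hypothetical project: A has no inputs, B and C need A, D needs B and C.
dsm = [
    [0, 0, 0, 0],  # A
    [1, 0, 0, 0],  # B
    [1, 0, 0, 0],  # C
    [0, 1, 1, 0],  # D
]
print(activity_levels(dsm))
```

Reordering rows and columns by level makes the matrix lower triangular when there are no cycles; coupled activities, which this sketch only reports as an error, are what the clustering step on the ISM is meant to group.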

Keywords: integrated definition for process description capture (IDEF3) method, design structure matrix (DSM), interface structure matrix (ISM), mixed matrix model, activity level, sub-process

Procedia PDF Downloads 490

17117 A Multivariate Statistical Approach for Water Quality Assessment of River Hindon, India

Authors: Nida Rizvi, Deeksha Katyal, Varun Joshi

Abstract:

River Hindon is an important river catering the demand of highly populated rural and industrial cluster of western Uttar Pradesh, India. Water quality of river Hindon is deteriorating at an alarming rate due to various industrial, municipal and agricultural activities. The present study aimed at identifying the pollution sources and quantifying the degree to which these sources are responsible for the deteriorating water quality of the river. Various water quality parameters, like pH, temperature, electrical conductivity, total dissolved solids, total hardness, calcium, chloride, nitrate, sulphate, biological oxygen demand, chemical oxygen demand and total alkalinity were assessed. Water quality data obtained from eight study sites for one year has been subjected to the two multivariate techniques, namely, principal component analysis and cluster analysis. Principal component analysis was applied with the aim to find out spatial variability and to identify the sources responsible for the water quality of the river. Three Varifactors were obtained after varimax rotation of initial principal components using principal component analysis. Cluster analysis was carried out to classify sampling stations of certain similarity, which grouped eight different sites into two clusters. The study reveals that the anthropogenic influence (municipal, industrial, waste water and agricultural runoff) was the major source of river water pollution. Thus, this study illustrates the utility of multivariate statistical techniques for analysis and elucidation of multifaceted data sets, recognition of pollution sources/factors and understanding temporal/spatial variations in water quality for effective river water quality management.

Keywords: cluster analysis, multivariate statistical techniques, river Hindon, water quality

Procedia PDF Downloads 456

17116 Understanding the Qualitative Nature of Product Reviews by Integrating Text Processing Algorithm and Usability Feature Extraction

Authors: Cherry Yieng Siang Ling, Joong Hee Lee, Myung Hwan Yun

Abstract:

Usability has become a basic requirement from the consumer’s perspective, and failing to meet it leads customers to stop using the product. Identifying usability issues by analyzing quantitative and qualitative data collected from usability testing and evaluation activities aids the process of product design, yet the lack of research on analysis methodologies for qualitative text data in the usability field limits the potential of these data for more useful applications. Meanwhile, the rapid development of data analysis fields, such as natural language processing for understanding human language by computer and machine learning for providing predictive models and clustering tools, has opened up the possibility of analyzing qualitative text data. Therefore, this research aims to study the applicability of a text processing algorithm to the analysis of qualitative text data collected from usability activities. This research used datasets collected from an LG neckband headset usability experiment, consisting of headset survey text data, subject data and product physical data. The analysis procedure, which integrates the text-processing algorithm, includes training comments onto a vector space, labeling them with the subject and product physical feature data, and clustering to validate the result of comment vector clustering. The result shows 'volume and music control button' as the usability feature that best matches the clusters of comment vectors: centroid comments of one cluster emphasized button positions, while centroid comments of the other cluster emphasized button interface issues. When the volume and music control buttons are designed separately, participants experienced less confusion, and thus the comments mentioned only the buttons' positions.
When the volume and music control buttons are designed as a single button, participants experienced interface issues such as the operating methods of functions and confusion between the functions' buttons. The relevance of the cluster centroid comments to the extracted feature demonstrates the capability of text processing algorithms in analyzing qualitative text data from usability testing and evaluations.

Keywords: usability, qualitative data, text-processing algorithm, natural language processing

Procedia PDF Downloads 280

17115 Analysis of Expert Information in Linguistic Terms

Authors: O. Poleshchuk, E. Komarov

Abstract:

In this paper, semantic spaces with the properties of completeness and orthogonality (complete orthogonal semantic spaces) were chosen as models of expert evaluations. Theoretical and practical studies have shown that the properties of complete orthogonal semantic spaces correspond to the thinking activity of experts, which is why these semantic spaces were chosen for modeling. Two methods of constructing such spaces were proposed. Models of comparative and fuzzy cluster analysis of expert evaluations were developed. The practical application of the developed methods has demonstrated their viability and validity.

Keywords: expert evaluation, comparative analysis, fuzzy cluster analysis, theoretical and practical studies

Procedia PDF Downloads 526

17114 Investigation of Clusters of MRSA Cases in a Hospital in Western Kenya

Authors: Lillian Musila, Valerie Oundo, Daniel Erwin, Willie Sang

Abstract:

Staphylococcus aureus infections are a major cause of nosocomial infections in Kenya. Methicillin-resistant S. aureus (MRSA) infections are a significant burden to public health and are associated with considerable morbidity and mortality. At a hospital in Western Kenya, two clusters of MRSA cases emerged within short periods of time. In this study, we explored whether these clusters represented a nosocomial outbreak by characterizing the isolates using phenotypic and molecular assays and examining epidemiological data to identify possible transmission patterns. Specimens from the site of infection of the subjects were collected and cultured, and S. aureus isolates were identified phenotypically and confirmed by APIStaph™. MRSA were identified by cefoxitin disk screening per CLSI guidelines. MRSA were further characterized based on their antibiotic susceptibility patterns and spa gene typing. Characteristics of cases with MRSA isolates were compared with those with MSSA isolated around the same time period. Two cases of MRSA infection were identified in the two-week period between 21 April and 4 May 2015. A further two MRSA isolates were identified on the same day, 7 September 2015. The antibiotic resistance patterns of the two MRSA isolates in the first cluster of cases were different, suggesting that these were distinct isolates. One isolate had spa type t2029 and the other had a novel spa type. The two isolates were obtained from urine and an open skin wound. In the second cluster of MRSA isolates, the antibiotic susceptibility patterns were similar, but the isolates had different spa types: one was t037 and the other a novel spa type different from the novel MRSA spa type in the first cluster. Both cases in the second cluster were admitted into the hospital, but one infection was community-acquired and the other hospital-acquired. Only one of the four MRSA cases was classified as an HAI, from an infection acquired post-operatively. When compared to other S. aureus strains isolated within the same time period from the same hospital, only one spa type, t2029, was found in both MRSA and non-MRSA strains. None of the cases infected with MRSA in the two clusters shared any common epidemiological characteristic such as age, sex, or known risk factors for MRSA such as prolonged hospitalization or institutionalization. These data suggest that the observed MRSA clusters were multi-strain clusters and not an outbreak of a single strain. There was no clear relationship between the isolates by spa type, suggesting that no transmission was occurring within the hospital between these cluster cases but rather that the majority of the MRSA strains were circulating in the community. There was high diversity of spa types among the MRSA strains, with none of the isolates sharing spa types. Identification of disease clusters in space and time is critical for immediate infection control action and patient management. Spa gene typing is a rapid way of confirming or ruling out MRSA outbreaks so that costly interventions are applied only when necessary.

Keywords: cluster, Kenya, MRSA, spa typing

Procedia PDF Downloads 324

17113 An Automated Stock Investment System Using Machine Learning Techniques: An Application in Australia

Authors: Carol Anne Hargreaves

Abstract:

A key issue in stock investment is how to select representative features for stock selection. The first objective of this paper is to determine whether an automated stock investment system, using machine learning techniques, may be used to identify a portfolio of growth stocks that are highly likely to provide returns better than the stock market index. The second objective is to identify the technical features that best characterize whether a stock’s price is likely to go up and to identify the most important factors and their contribution to predicting the likelihood of the stock price going up. Unsupervised machine learning techniques, such as cluster analysis, were applied to the stock data to identify a cluster of stocks that was likely to go up in price – portfolio 1. Next, the principal component analysis technique was used to select stocks that were rated high on component one and component two – portfolio 2. Thirdly, a supervised machine learning technique, the logistic regression method, was used to select stocks with a high probability of their price going up – portfolio 3. The predictive models were validated with metrics such as sensitivity (recall), specificity, and overall accuracy. All accuracy measures were above 70%. All portfolios outperformed the market by more than eight times. The top three stocks were selected for each of the three stock portfolios and traded in the market for one month. After one month, the return for each stock portfolio was computed and compared with the stock market index returns. The returns for the three stock portfolios were 23.87% for the principal component analysis portfolio, 11.65% for the logistic regression portfolio, and 8.88% for the K-means cluster portfolio, while the stock market return was 0.38%. This study confirms that an automated stock investment system using machine learning techniques can identify top-performing stock portfolios that outperform the stock market.
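
The validation metrics named in the abstract (sensitivity, specificity, and overall accuracy) can be illustrated with a minimal sketch. The function name `classification_metrics` and the label vectors are hypothetical and chosen only for illustration; they are not data or code from the study.

```python
def classification_metrics(y_true, y_pred):
    """Sensitivity (recall), specificity and accuracy for binary labels.

    Assumption for this sketch: label 1 means "price went up", 0 means it did not.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical observed and predicted labels for eight stocks.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

Reporting sensitivity and specificity alongside accuracy matters when rising and non-rising stocks are unevenly represented, since accuracy alone can hide poor performance on the smaller class.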

Keywords: machine learning, stock market trading, logistic regression, cluster analysis, factor analysis, decision trees, neural networks, automated stock investment system

Procedia PDF Downloads 151

17112 SCNet: A Vehicle Color Classification Network Based on Spatial Cluster Loss and Channel Attention Mechanism

Authors: Fei Gao, Xinyang Dong, Yisu Ge, Shufang Lu, Libo Weng

Abstract:

Vehicle color recognition plays an important role in traffic accident investigation. However, due to the influence of illumination, weather, and noise, vehicle color recognition still faces challenges. In this paper, a vehicle color classification network based on spatial cluster loss and channel attention mechanism (SCNet) is proposed for vehicle color recognition. A channel attention module is applied to extract the features of vehicle color representative regions and reduce the weight of nonrepresentative color regions in the channel. The proposed loss function, called spatial clustering loss (SC-loss), consists of two channel-specific components: a concentration component and a diversity component. The concentration component forces all feature channels belonging to the same class to be concentrated through the channel cluster. The diversity component imposes additional constraints on the channels through the mean distance coefficient, making them mutually exclusive in spatial dimensions. In the comparison experiments, the proposed method achieves state-of-the-art performance on the public datasets VCD and VeRi, reaching 96.1% and 96.2%, respectively. In addition, the ablation experiment further shows that SC-loss can effectively improve the accuracy of vehicle color recognition.

Keywords: feature extraction, convolutional neural networks, intelligent transportation, vehicle color recognition

Procedia PDF Downloads 173

17111 A Literature Review on the Effect of Industrial Clusters and the Absorptive Capacity on Innovation

Authors: Enrique Claver Cortés, Bartolomé Marco Lajara, Eduardo Sánchez García, Pedro Seva Larrosa, Encarnación Manresa Marhuenda, Lorena Ruiz Fernández, Esther Poveda Pareja

Abstract:

In recent decades, the analysis of the effects of clustering as an essential factor for the development of innovations and the competitiveness of enterprises has raised great interest in different areas. Nowadays, companies have access to almost all tangible and intangible resources located and/or developed in any country in the world. However, despite the obvious advantages that this situation entails for companies, their geographical location has shown itself, increasingly clearly, to be a fundamental factor that positively influences their innovative performance and competitiveness. Industrial clusters could represent a unique level of analysis, positioned between the individual company and the industry, which makes them an ideal unit of analysis to determine the effects derived from company membership of a cluster. Also, the absorptive capacity (hereinafter 'AC') can mediate the process of innovation development by companies located in a cluster. The transformation and exploitation of knowledge could have a mediating effect between knowledge acquisition and innovative performance. The main objective of this work is to determine the key factors that affect the degree of generation and use of knowledge from the environment by companies and, consequently, their innovative performance and competitiveness. The elements analyzed are the companies' membership of a cluster and the AC. To this end, 30 most relevant papers published on this subject in the "Web of Science" database have been reviewed. Our findings show that, within a cluster, the knowledge coming from the companies' environment can significantly influence their innovative performance and competitiveness, although in this relationship, the degree of access and exploitation of the companies to this knowledge plays a fundamental role, which depends on a series of elements both internal and external to the company.

Keywords: absorptive capacity, clusters, innovation, knowledge

Procedia PDF Downloads 124

17110 Improved Color-Based K-Mean Algorithm for Clustering of Satellite Image

Authors: Sangeeta Yadav, Mantosh Biswas

Abstract:

In this paper, we propose an improved color-based K-mean algorithm for clustering satellite (SAR) images. Our method comprises two stages. The first is an interactive selection process in which users input the number of colors (ncolor) and the number of clusters and are then prompted to select points in each color cluster. In the second, these points are given as input to the K-mean clustering algorithm, which clusters the image based on color and minimum squared Euclidean distance. The proposed method reduces the mixed-pixel problem to a great extent.
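
The second stage described above, clustering by color with user-selected points as starting centers, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: pixels are taken to be RGB triples, the interactively selected points are passed in directly as `seeds`, and the names `seeded_kmeans` and `squared_distance` are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two color vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def seeded_kmeans(pixels, seeds, max_iter=20):
    """Standard k-means on color vectors, started from user-chosen seed colors.

    Returns (labels, centers). Labels come from the last assignment step.
    """
    centers = [tuple(float(v) for v in s) for s in seeds]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each pixel goes to the nearest center.
        labels = [
            min(range(len(centers)), key=lambda k: squared_distance(p, centers[k]))
            for p in pixels
        ]
        # Update step: each center becomes the mean color of its members.
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(old)  # keep an empty cluster's center as is
        if new_centers == centers:
            break  # converged: no center moved
        centers = new_centers
    return labels, centers


# Hypothetical pixels (two reddish, two bluish) and two user-selected seed colors.
pixels = [(250, 10, 10), (240, 20, 15), (10, 10, 245), (20, 15, 250)]
seeds = [(255, 0, 0), (0, 0, 255)]
labels, centers = seeded_kmeans(pixels, seeds)
```

Starting from user-selected colors rather than random centers makes the result reproducible and ties each cluster to a color the user considers meaningful.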

Keywords: cluster, ncolor method, K-mean method, interactive selection process

Procedia PDF Downloads 292

17109 Markov Switching of Conditional Variance

Authors: Josip Arneric, Blanka Skrabic Peric

Abstract:

Forecasting of volatility, i.e. returns fluctuations, has been a topic of interest to portfolio managers, option traders, and market makers seeking higher profits or less risky positions. Because volatility is time-varying in high-frequency data and periods of high volatility tend to cluster, the most commonly used models are GARCH-type models. As standard GARCH models show high volatility persistence, i.e. integrated behaviour of the conditional variance, it is difficult to predict volatility using them. Due to the practical limitations of these models, different approaches based on Markov switching models have been proposed in the literature. In such situations, models in which the parameters are allowed to change over time are more appropriate because they allow some part of the model to depend on the state of the economy. The empirical analysis demonstrates that the Markov switching GARCH model resolves the problem of excessive persistence and outperforms uni-regime GARCH models in forecasting volatility for selected emerging markets.

Keywords: emerging markets, Markov switching, GARCH model, transition probabilities

Procedia PDF Downloads 450

17108 A Statistical Approach to Classification of Agricultural Regions

Authors: Hasan Vural

Abstract:

Turkey is well suited to producing a great variety of agricultural products because of its varied geographic and climatic conditions, which have been used to divide the country into four main regions and seven subregions. This classification into seven regions has traditionally been used for data collection and publication, especially in relation to agricultural production. Later, nine agricultural regions were considered. Recently, the governmental body responsible for data collection and dissemination (Turkish Institute of Statistics, TIS) has used 12 classes, comprising 11 subregions and Istanbul province. This study aims to evaluate these classification efforts based on the acreage of ten main crops over a ten-year period (1996-2005). The panel data, grouped into 11 subregions, were evaluated by cluster and multivariate statistical methods. It was concluded that, from the agricultural production point of view, it would be more meaningful to consider three main and eight sub-agricultural regions throughout the country.

Keywords: agricultural region, factorial analysis, cluster analysis

Procedia PDF Downloads 411

17107 Cross-Cultural Analysis of the Impact of Project Atmosphere on Project Success and Failure

Authors: Omer Livvarcin, Mary Kay Park, Michael Miles

Abstract:

The current literature includes a few studies that mention the impact of relations between teams, the business environment, and experiences from previous projects. There is, however, limited research that treats the phenomenon of project atmosphere (PA) as a whole. This is especially true of research identifying parameters and sub-parameters that allow project management (PM) teams to build a project culture that ultimately fosters project success. This study’s findings identify a number of key project atmosphere parameters and sub-parameters that affect project management success. One key parameter identified in the study is a cluster related to cultural concurrence, including artifacts such as policies and mores, values, perceptions, and assumptions. A second cluster centers on motivational concurrence, including such elements as project goals and team-member expectations, moods, morale, motivation, and organizational support. A third parameter cluster relates to experiential concurrence, with a focus on project and organizational memory, previous internal PM experience, and external environmental PM history and experience. A final cluster of parameters comprises those falling in the area of relational concurrence, including inter/intragroup relationships, role conflicts, and trust. International and intercultural project management data were collected and analyzed from the following countries: Canada, China, Nigeria, South Korea, and Turkey. The cross-cultural nature of the data set suggests increased confidence that the findings will be generalizable across cultures and thus applicable for future international project management success. The intent of identifying project atmosphere as a critical project management element is that a clear understanding of the effects of its sub-parameters on projects may significantly improve the odds of success of future international and intercultural projects.

Keywords: project management, project atmosphere, cultural concurrence, motivational concurrence, relational concurrence

Procedia PDF Downloads 311

17106 Genetic Divergence and Morphogenic Analysis of Sugarcane Red Rot Pathogen Colletotrichum falcatum under South Gujarat Condition

Authors: Prittesh Patel, Ramar Krishnamurthy

Abstract:

In the present study, nine strains of C. falcatum obtained from different places and cultivars were characterized for sporulation, growth rate, and 18S rRNA gene sequence. All isolates had characteristic fast-growing, sparse, and fleecy aerial mycelia on potato dextrose agar, with sickle-shaped conidia (length x width: varied from 20.0 x 3.89 to 25.52 x 5.34 μm) and blackish to orange acervuli with setae (length x width: varied from 112.37 x 2.78 to 167.66 x 6.73 μm). They could be divided into two groups on the basis of morphology: P1, dense mycelia with concentric growth, and P2, sparse mycelia with uneven growth. Genomic DNA isolation followed by PCR amplification with ITS1 and ITS4 primers produced ~550 bp amplicons for all isolates. The phylogeny generated from the 18S rRNA gene sequence confirmed the variation among isolates, which mainly grouped into two clusters: cluster 1 contained the CoC671 isolates (cfNAV and cfPAR) and the Co86002 isolate (cfTIM), while isolates cfMAD, cfKAM, and cfMAR were grouped into cluster 2. The remaining isolates did not fall into any cluster. Isolate cfGAN, collected from Co86032, was found to be the most diverse of the nine isolates. In a nutshell, we found considerable genetic divergence and morphological variation within C. falcatum accessions collected from different areas of south Gujarat, India, and these can be used in breeding programs.

Keywords: Colletotrichum falcatum, ITS, morphology, red rot, sugarcane

Procedia PDF Downloads 121

17105 Evaluating the Factors Controlling the Hydrochemistry of Gaza Coastal Aquifer Using Hydrochemical and Multivariate Statistical Analysis

Authors: Madhat Abu Al-Naeem, Ismail Yusoff, Ng Tham Fatt, Yatimah Alias

Abstract:

Groundwater in the Gaza Strip is increasingly exposed to anthropogenic and natural factors that have seriously impacted its quality. Physicochemical data of groundwater can offer important information on changes in groundwater quality that can be useful in improving water management tactics. Integrated hydrochemical and statistical techniques (hierarchical cluster analysis (HCA) and factor analysis (FA)) were applied to the existing data on ten physicochemical parameters of 84 samples collected in 2000/2001, using the STATA, AquaChem, and Surfer software packages, to: 1) provide valuable insight into the salinization sources and the hydrochemical processes controlling the chemistry of groundwater; and 2) differentiate the influence of natural processes and man-made activities. The recorded large diversity in water facies, dominated by the Na-Cl type, reveals a highly saline aquifer impacted by multiple complex hydrochemical processes. Based on WHO standards, only 15.5% of the wells were suitable for drinking. HCA yielded three clusters. Cluster 1 is the highest in salinity, mainly due to the impact of Eocene saline water invasion mixed with human inputs. Cluster 2 is the lowest in salinity, also due to Eocene saline water invasion but mixed with recent rainfall recharge and limited carbonate dissolution and nitrate pollution. Cluster 3 is similar in salinity to Cluster 2 but has a high diversity of facies due to the impact of many sources of salinity, such as sea water invasion, carbonate dissolution, and human inputs. Factor analysis yielded two factors accounting for 88% of the total variance. Factor 1 (59%) is a salinization factor demonstrating the mixing contribution of natural saline water with human inputs. Factor 2 measures hardness and pollution and explains 29% of the total variance. The negative relationship between NO3- and pH may reveal a denitrification process in a heavily polluted aquifer recharged by limited oxygenated rainfall. Multivariate statistical analysis combined with hydrochemical analysis indicates that the main factors controlling groundwater chemistry were Eocene saline invasion, seawater invasion, sewage invasion, and rainfall recharge, and that the main hydrochemical processes were base ion and reverse ion exchange processes with clay minerals (water-rock interactions), nitrification, carbonate dissolution, and a limited denitrification process.

Keywords: dendrogram and cluster analysis, water facies, Eocene saline invasion and sea water invasion, nitrification and denitrification

Procedia PDF Downloads 358

17104 Analysis of Ozone Episodes in the Forest and Vegetation Areas with Using HYSPLIT Model: A Case Study of the North-West Side of Biga Peninsula, Turkey

Authors: Deniz Sari, Selahattin İncecik, Nesimi Ozkurt

Abstract:

Surface ozone, named as one of the most critical pollutants of the 21st century, threatens human health, forests, and vegetation. In rural areas specifically, surface ozone has a significant influence on agricultural production and trees. In this study, in order to understand surface ozone levels in rural areas, we focus on the north-western side of the Biga Peninsula, which is covered by mountainous and forested terrain. Ozone concentrations were measured for the first time with passive sampling at 10 sites and two online monitoring stations in this rural area from 2013 to 2015. The AOT40 (Accumulated hourly O3 concentrations Over a Threshold of 40 ppb) cumulative index was calculated using the hourly O3 measurements during daylight hours (08:00–20:00) exceeding the threshold of 40 ppb, over three months (May, June, and July) for agricultural crops and over six months (April to September) for forest trees. AOT40 is defined by EU Directive 2008/50/EC to evaluate whether ozone pollution is a risk for vegetation and is calculated from hourly ozone concentrations from monitoring systems. In the present study, we performed trajectory analysis with the Hybrid Single-Particle Lagrangian Integrated Trajectory (HYSPLIT) model to follow the long-range transport sources contributing to the high ozone levels in the region. The ozone episodes observed between 2013 and 2015 were analysed using the HYSPLIT model developed by the NOAA-ARL. In addition, cluster analysis was used to identify homogeneous groups of air mass transport patterns by grouping similar trajectories in terms of air mass movement. Backward trajectories produced for 3 years by the HYSPLIT model were assigned to different clusters according to their moving speed and direction using a k-means clustering algorithm. According to the cluster analysis results, northerly flows to the study area lead to high ozone levels in the region. The results show that the ozone values in the study area are above the critical levels for forest and vegetation based on EU Directive 2008/50/EC.
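
The AOT40 index described above can be sketched as a sum of hourly exceedances over 40 ppb during daylight hours. This is a minimal illustration, not the study's processing chain. The function name `aot40`, the input layout (pairs of hour of day and concentration), and the reading of 08:00–20:00 as hours 8 through 19, with each stamp marking the start of an hour, are assumptions made for this sketch.

```python
def aot40(hourly_ppb, threshold=40.0, day_start=8, day_end=20):
    """Accumulated ozone exposure over a threshold, in ppb·h.

    hourly_ppb: iterable of (hour_of_day, concentration_ppb) pairs covering
    the accumulation period (e.g. May to July for crops).
    Only daylight hours with a concentration above the threshold contribute,
    and each contributes its excess over the threshold.
    """
    total = 0.0
    for hour, conc in hourly_ppb:
        if day_start <= hour < day_end and conc > threshold:
            total += conc - threshold
    return total


# Hypothetical hourly readings: (hour of day, O3 in ppb).
readings = [(7, 60.0), (8, 45.0), (12, 70.0), (15, 38.0), (19, 41.5), (20, 90.0)]
exposure = aot40(readings)  # 5.0 + 30.0 + 1.5 = 36.5 ppb·h
```

The readings at 07:00 and 20:00 are ignored because they fall outside the assumed daylight window, and the 38 ppb reading is ignored because it does not exceed the threshold.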

Keywords: AOT40, Biga Peninsula, HYSPLIT, surface ozone

Procedia PDF Downloads 249

17103 Estimation of a Finite Population Mean under Random Non Response Using Improved Nadaraya and Watson Kernel Weights

Authors: Nelson Bii, Christopher Ouma, John Odhiambo

Abstract:

Non-response is a potential source of errors in sample surveys. It introduces bias and large variance in the estimation of finite population parameters. Regression models have been recognized as one of the techniques of reducing bias and variance due to random non-response using auxiliary data. In this study, it is assumed that random non-response occurs in the survey variable in the second stage of cluster sampling, assuming full auxiliary information is available throughout. Auxiliary information is used at the estimation stage via a regression model to address the problem of random non-response. In particular, the auxiliary information is used via an improved Nadaraya-Watson kernel regression technique to compensate for random non-response. The asymptotic bias and mean squared error of the estimator proposed are derived. Besides, a simulation study conducted indicates that the proposed estimator has smaller values of the bias and smaller mean squared error values compared to existing estimators of finite population mean. The proposed estimator is also shown to have tighter confidence interval lengths at a 95% coverage rate. The results obtained in this study are useful, for instance, in choosing efficient estimators of the finite population mean in demographic sample surveys.
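
For orientation, the standard Nadaraya-Watson estimator that the abstract builds on can be sketched as a kernel-weighted average of observed responses. The code below is the textbook form with a Gaussian kernel, not the improved weights proposed by the authors. The function names, the bandwidth value, and the simple fill-in of non-respondents in `imputed_mean` are assumptions made for illustration.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nadaraya_watson(x0, xs, ys, bandwidth):
    """Kernel-weighted average of ys, with weights centered at x0."""
    weights = [gaussian_kernel((x0 - x) / bandwidth) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights are zero; try a larger bandwidth")
    return sum(w * y for w, y in zip(weights, ys)) / total


def imputed_mean(aux, survey, bandwidth):
    """Mean of the survey variable after filling non-responses (None).

    Each missing value is replaced by the Nadaraya-Watson prediction at the
    unit's auxiliary value, fitted on the responding units only.
    """
    resp_x = [x for x, y in zip(aux, survey) if y is not None]
    resp_y = [y for y in survey if y is not None]
    filled = [
        y if y is not None else nadaraya_watson(x, resp_x, resp_y, bandwidth)
        for x, y in zip(aux, survey)
    ]
    return sum(filled) / len(filled)


# Hypothetical data: the second unit did not respond.
aux = [1.0, 2.0, 3.0]
survey = [10.0, None, 20.0]
estimate = imputed_mean(aux, survey, bandwidth=1.0)
```

In this toy case the non-respondent sits midway between the two respondents, so both receive equal kernel weight and the filled value is their average.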

Keywords: mean squared error, random non-response, two-stage cluster sampling, confidence interval lengths

Procedia PDF Downloads 132

17102 The Effect of Hypertrophy Strength Training Using Traditional Set vs. Cluster Set on Maximum Strength and Sprinting Speed

Authors: Bjornar Kjellstadli, Shaher A. I. Shalfawi

Abstract:

The aim of this study was to investigate the effect of the cluster set method of strength training, compared with the traditional set method, on 30 m sprinting time and maximum strength in squats and bench press. Thirteen Physical Education students, 7 males and 6 females aged 19-28 years, were recruited. The students were randomly divided into three groups. The traditional set group (TSG) consisted of 2 males and 2 females (mean ± SD: age 22.3 ± 1.5 years, body mass 79.2 ± 15.4 kg, height 177.5 ± 11.3 cm). The cluster set group (CSG) consisted of 3 males and 2 females (age 22.4 ± 3.29 years, body mass 81.0 ± 24.0 kg, height 179.2 ± 11.8 cm), and the control group (CG) consisted of 2 males and 2 females (age 21.5 ± 2.4 years, body mass 82.1 ± 17.4 kg, height 175.5 ± 6.7 cm). The intervention consisted of performing squat and bench press at 70% of 1RM twice a week for 8 weeks, using 4 sets of 10 repetitions. Two types of strength-training methods were used: cluster set (CS), where the participants (CSG) performed 2 reps 5 times with a 10 s recovery in between reps and 50 s recovery between sets, and traditional set (TS), where the participants (TSG) performed 10 reps each set with 90 s recovery in between sets. The pre-tests and post-tests conducted were 1RM in both squats and bench press, and 10 and 30 m sprint time. The 1RM tests were performed with an Eleiko XF barbell (20 kg), Eleiko weight plates, and a rack and bench from Hammerstrength. The speed test was measured with the Brower speed trap II testing system (Brower Timing Systems, Utah, USA). The participants received an individualized training program based on the pre-test of the 1RM. In addition, a mid-term test of 1RM was carried out to adjust training intensity. Each training session was supervised by the researchers. Beast sensors (Milano, Italy) were also used to monitor and quantify the training load for the participants. All groups had a statistically significant improvement in bench press 1RM (TSG 1RM from 56.3 ± 28.9 to 66 ± 28.5 kg; CSG 1RM from 69.8 ± 33.5 to 77.2 ± 34.1 kg; and CG 1RM from 67.8 ± 26.6 to 72.2 ± 29.1 kg), whereas only the TSG (1RM from 84.3 ± 26.8 to 114.3 ± 26.5 kg) and CSG (1RM from 100.4 ± 33.9 to 129 ± 35.1 kg) had a statistically significant improvement in squat 1RM (P < 0.05). However, a between-groups examination reveals that there were no marked differences in 1RM squat performance between TSG and CSG (P > 0.05), and both groups had marked improvements compared to the CG (P < 0.05). On the other hand, no differences between groups were observed in bench press 1RM. The within-group results indicate that none of the groups had any marked improvement over the distances 0-10 m and 10-30 m, except the CSG, which had a notable improvement over 10-30 m (-0.07 s; P < 0.05). Furthermore, no differences in sprinting abilities were observed between groups. The results from this investigation indicate that traditional set strength training at 70% of 1RM gave results close to those of cluster set strength training at the same intensity. However, the results indicate that the cluster set had an effect on flying time (10-30 m), suggesting that the velocity at which those repetitions were performed could be the explanatory factor for this improvement.

Keywords: physical performance, 1RM, pushing velocity, velocity based training

Procedia PDF Downloads 160

17101 Using Genetic Algorithms and Rough Set Based Fuzzy K-Modes to Improve Centroid Model Clustering Performance on Categorical Data

Authors: Rishabh Srivastav, Divyam Sharma

Abstract:

We propose an algorithm for clustering categorical data named ‘Genetic algorithm initialized rough set based fuzzy K-Modes for categorical data’. It amalgamates the simple K-modes algorithm, the rough and fuzzy set based K-modes, and the genetic algorithm to form a new algorithm, which we hypothesise will provide better centroid model clustering results than existing standard algorithms. In the proposed algorithm, the initialization and updating of modes are done by genetic algorithms, while the membership values are calculated using rough sets and fuzzy logic.
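
As background for the building block named in the abstract, the simple K-modes algorithm can be sketched as below. This covers only plain K-modes with Hamming distance and hard assignments. It does not include the genetic algorithm initialization or the rough set and fuzzy memberships that the authors propose, and the function names and toy records are hypothetical.

```python
from collections import Counter


def hamming(a, b):
    """Number of attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def k_modes(records, initial_modes, max_iter=20):
    """Plain K-modes: assign to the nearest mode, then recompute each mode.

    Returns (labels, modes). Initial modes are supplied by the caller.
    """
    modes = [tuple(m) for m in initial_modes]
    labels = []
    for _ in range(max_iter):
        # Assignment step: nearest mode by Hamming distance.
        labels = [
            min(range(len(modes)), key=lambda k: hamming(r, modes[k]))
            for r in records
        ]
        # Update step: most frequent category per attribute within a cluster.
        new_modes = []
        for k, old in enumerate(modes):
            members = [r for r, lab in zip(records, labels) if lab == k]
            if members:
                new_modes.append(
                    tuple(Counter(col).most_common(1)[0][0] for col in zip(*members))
                )
            else:
                new_modes.append(old)  # keep the mode of an empty cluster
        if new_modes == modes:
            break  # converged: no mode changed
        modes = new_modes
    return labels, modes


# Hypothetical categorical records with three attributes each.
records = [
    ("red", "small", "round"),
    ("red", "small", "square"),
    ("red", "large", "round"),
    ("blue", "large", "square"),
    ("blue", "large", "round"),
    ("blue", "small", "square"),
]
labels, modes = k_modes(records, [records[0], records[3]])
```

Because plain K-modes depends heavily on its starting modes, replacing the caller-supplied initial modes with a search procedure is a natural place for the kind of initialization the abstract describes.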

Keywords: categorical data, fuzzy logic, genetic algorithm, K modes clustering, rough sets

Procedia PDF Downloads 241