Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 58

Search results for: Cluster analysis

58 An Automated Stock Investment System Using Machine Learning Techniques: An Application in Australia

Authors: Carol Anne Hargreaves

Abstract:

A key issue in stock investment is how to select representative features for stock selection. The first objective of this paper is to determine whether an automated stock investment system, using machine learning techniques, can identify a portfolio of growth stocks that are highly likely to provide returns better than the stock market index. The second objective is to identify the technical features that best characterize whether a stock’s price is likely to go up, and to identify the most important factors and their contribution to predicting the likelihood of the stock price going up. First, unsupervised machine learning techniques, such as cluster analysis, were applied to the stock data to identify a cluster of stocks that was likely to go up in price – portfolio 1. Next, the principal component analysis technique was used to select stocks that were rated high on component one and component two – portfolio 2. Thirdly, a supervised machine learning technique, the logistic regression method, was used to select stocks with a high probability of their price going up – portfolio 3. The predictive models were validated with metrics such as sensitivity (recall), specificity and overall accuracy. All accuracy measures were above 70%. All portfolios outperformed the market by more than eight times. The top three stocks were selected for each of the three stock portfolios and traded in the market for one month. After one month, the return for each stock portfolio was computed and compared with the stock market index return. The returns were 23.87% for the principal component analysis stock portfolio, 11.65% for the logistic regression portfolio and 8.88% for the K-means cluster portfolio, while the stock market return was 0.38%. This study confirms that an automated stock investment system using machine learning techniques can identify top performing stock portfolios that outperform the stock market.
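The validation metrics named in this abstract (sensitivity, specificity and overall accuracy) can be illustrated with a minimal Python sketch. The function name and the label vectors below are hypothetical and are not taken from the paper; a label of 1 is assumed to mean that the stock price went up.

```python
def classification_metrics(actual, predicted):
    """Return (sensitivity, specificity, accuracy) for binary labels.

    Assumption: 1 means "price went up", 0 means "price did not go up".
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives

    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")  # recall
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy


# Hypothetical labels for ten stocks (illustration only, not study data).
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

sensitivity, specificity, accuracy = classification_metrics(actual, predicted)
print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f} accuracy={accuracy:.2f}")
```

Reporting sensitivity and specificity alongside accuracy matters when the two classes are unbalanced, since accuracy alone can look high for a model that rarely predicts the minority class.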

Keywords: Machine learning, stock market trading, logistic regression, principal component analysis, automated stock investment system.

Downloads: 36
57 Intellectual Capital Disclosure: Profiles of Spanish Public Universities

Authors: Yolanda Ramírez, Ángel Tejada, Agustín Baidez

Abstract:

In the higher education setting, there is a current trend in society toward greater openness and transparency. The economic, social and political changes that have occurred in recent years in public sector universities (particularly the New Public Management, the Bologna Process and the emergence of the “third mission”) call for a wider disclosure of the value created by universities to support fundraising activities, to ensure accountability in the use of public funds and the outcomes of research and teaching, and to foster close relationships with industries and territories. The paper has two purposes: 1) to explore the intellectual capital (IC) disclosure in Spanish universities through their websites, and 2) to identify university profiles. This study applies a content analysis to the institutional websites of Spanish public universities, followed by a cluster analysis. The analysis reveals that Spanish universities’ website content usually relates to human capital, while structural and relational capital are less widely disclosed. Our research identifies three behavioral profiles of Spanish universities with regard to the online disclosure of IC (more proactive universities, less proactive universities, and universities that adopt a middle position in this regard). The results can serve as encouragement to university managers to enhance online IC disclosure to meet the information needs of university stakeholders.

Keywords: Universities, intellectual capital, disclosure, Internet.

Downloads: 82
56 Multivariate Assessment of Mathematics Test Scores of Students in Qatar

Authors: Ali Rashash Alzahrani, Elizabeth Stojanovski

Abstract:

Data on various aspects of education are collected at the institutional and government level regularly. In Australia, for example, students at various levels of schooling undertake examinations in numeracy and literacy as part of NAPLAN testing, enabling longitudinal assessment of such data as well as comparisons between schools and states within Australia. Another source of educational data collected internationally is via the PISA study which collects data from several countries when students are approximately 15 years of age and enables comparisons in the performance of science, mathematics and English between countries as well as ranking of countries based on performance in these standardised tests. As well as student and school outcomes based on the tests taken as part of the PISA study, there is a wealth of other data collected in the study including parental demographics data and data related to teaching strategies used by educators. Overall, an abundance of educational data is available which has the potential to be used to help improve educational attainment and teaching of content in order to improve learning outcomes. A multivariate assessment of such data enables multiple variables to be considered simultaneously and will be used in the present study to help develop profiles of students based on performance in mathematics using data obtained from the PISA study.

Keywords: Cluster analysis, education, mathematics, profiles.

Downloads: 358
55 A Design for Customer Preferences Model by Cluster Analysis of Geometric Features and Customer Preferences

Authors: Yuan-Jye Tseng, Ching-Yen Chen

Abstract:

In the design cycle, a main design task is to determine the external shape of the product. The external shape of a product is one of the key factors that can affect customers’ preferences, which are linked to the motivation to buy the product, especially in the case of a consumer electronic product such as a mobile phone. The relationship between the external shape and the customer preferences needs to be studied to enhance the customer’s purchase desire and action. In this research, a design for customer preferences model is developed for investigating the relationships between the external shape and the customer preferences of a product. In the first stage, the names of the geometric features are collected and evaluated from the data of the specified internet web pages using the developed text miner. A geometric feature is determined to be a key feature if its number of occurrences on the web pages is relatively high. For each key geometric feature, the numerical values are explored using the text miner to collect the internet data from the web pages. In the second stage, a cluster analysis model is developed to evaluate the numerical values of the key geometric features to divide the external shapes into several groups. Several design suggestion cases can be proposed, for example, large model, mid-size model, and mini model, for designing a mobile phone. A customer preference index is developed by evaluating the numerical data of each of the key geometric features of the design suggestion cases. The design suggestion case with the top ranking of the customer preference index can be selected as the final design of the product. In this paper, an example product of a notebook computer is illustrated. It shows that the external shape of a product can be used to drive customer preferences. The presented design for customer preferences model is useful for determining a suitable external shape of the product to increase customer preferences.

Keywords: Cluster analysis, customer preferences, design evaluation, design for customer preferences, product design.

Downloads: 256
54 A Construction Management Tool: Determining a Project Schedule Typical Behaviors Using Cluster Analysis

Authors: Natalia Rudeli, Elisabeth Viles, Adrian Santilli

Abstract:

Delays in the construction industry are a global phenomenon. Many construction projects experience extensive delays exceeding the initially estimated completion time. The main purpose of this study is to identify the typical behaviors of construction projects in order to develop a prognosis and management tool. Knowing a construction project’s schedule tendency enables evidence-based decision-making, so that corrective decisions can be made before delays occur. This study presents an innovative approach that uses the cluster analysis method to support predictions during Earned Value Analyses. A clustering analysis was used to predict the future behavior of scheduling, Earned Value Management (EVM), and Earned Schedule (ES) principal indexes in construction projects. The analysis was made using a database with 90 different construction projects. It was validated with additional data extracted from the literature and with another 15 contrasting projects. For all projects, planned and executed schedules were collected and the EVM and ES principal indexes were calculated. A complete linkage classification method was used. In this method, the distance (or similarity) between two clusters is measured by their most disparate elements, i.e., the distance is given by the maximum span among their components. Finally, through the use of EVM and ES indexes and Tukey and Fisher pairwise comparisons, the statistical dissimilarity was verified and four clusters were obtained. Construction projects show an average delay of 35% of their planned completion time. Furthermore, four typical behaviors were found and, for each of the obtained clusters, the interim milestones and the necessary rhythms of construction were identified.
In general, the detected typical behaviors are: (1) projects that complete 5% of the work in the first two tenths and maintain a constant rhythm until completion (greater than 10% for each remaining tenth), being able to finish within the initially estimated time; (2) projects that start with an adequate construction rate but suffer minor delays, culminating in a total delay of almost 27% of the planned time; (3) projects that start with a performance below the planned rate and end up with an average delay of 64%; and (4) projects that begin with a poor performance, suffer great delays and end up with an average delay of 120% of the planned completion time. The obtained clusters compose a tool to identify the behavior of new construction projects by comparing their current work performance to the validated database, thus allowing the correction of initial estimations towards more accurate completion schedules.
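The complete linkage rule described in this abstract, where the distance between two clusters is given by their most disparate elements, can be sketched as follows. This is a minimal illustration under stated assumptions: Euclidean distance is assumed, the function names are hypothetical, and the sample points are invented rather than taken from the project database.

```python
import math


def complete_linkage_distance(cluster_a, cluster_b):
    """Distance between two clusters = largest pairwise point distance."""
    return max(math.dist(p, q) for p in cluster_a for q in cluster_b)


def agglomerate(points, n_clusters):
    """Naive agglomerative clustering with complete linkage.

    Starts with one cluster per point and repeatedly merges the two
    clusters whose complete linkage distance is smallest.
    """
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None  # (distance, index_i, index_j) of the closest pair
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = complete_linkage_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical two-dimensional observations (illustration only).
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
print(agglomerate(points, n_clusters=2))
```

Because the merge criterion uses the maximum pairwise distance, complete linkage tends to produce compact clusters of similar diameter, which is one reason it may suit the grouping of projects by schedule behavior.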

Keywords: Cluster analysis, construction management, earned value, schedule.

Downloads: 529
53 Cluster Analysis of Customer Churn in Telecom Industry

Authors: Abbas Al-Refaie

Abstract:

The research examines the factors that affect customer churn (CC) in the Jordanian telecom industry. A total of 700 surveys were distributed. Cluster analysis revealed three main clusters. Results showed that CC and customer satisfaction (CS) were the key determinants in forming the three clusters. In two clusters, the center values of CC were high, indicating that the customers were loyal and that switching cost (SC) was expensive and time- and energy-consuming. Still, the mobile service provider (MSP) should enhance its communication (COM) and value added services (VASs), as well as its customer complaint management systems (CCMS). Finally, for the third cluster, the center value of CC indicates a poor level of loyalty, which facilitates customer churn to another MSP. The results of this study provide valuable feedback for MSP decision makers regarding approaches to improving their performance and reducing CC.

Keywords: Cluster analysis, telecom industry, switching cost, customer churn.

Downloads: 1563
52 Cluster Analysis of Retailers’ Benefits from Their Cooperation with Manufacturers: Business Models Perspective

Authors: M. K. Witek-Hajduk, T. M. Napiórkowski

Abstract:

A number of studies have discussed the benefits of retailer-manufacturer cooperation and coopetition. However, only a few publications focus on the benefits of cooperation and coopetition between retailers and their suppliers of durable consumer goods, especially in the context of the business models of the cooperating partners. This paper aims to provide a clustering approach to segment retailers selling consumer durables according to the benefits they obtain from their cooperation with key manufacturers, and to differentiate these retailers in terms of the business models of the cooperating partners. For the purpose of the study, a survey (using the CATI method) collected data on 603 consumer durables retailers present on the Polish market. Retailers are clustered with both hierarchical and non-hierarchical methods. Five distinctive groups of consumer durables retailers are identified (based on the studied benefits) using the two-stage clustering approach. The clusters are then characterized with a set of exogenous variables, key among which are the business models employed by the retailer and its partnering key manufacturer. The paper finds that a combination of a medium-sized retailer classified as an Integrator with chiefly domestic capital and a manufacturer categorized as a Market Player yields the highest benefits. On the other side of the spectrum is a medium-sized Distributor retailer with solely domestic capital – in this case, the business model of the cooperating manufacturer appears to be irrelevant. This paper is one of the first empirical studies using cluster analysis on primary data to define the types of cooperation between consumer durables retailers and manufacturers – their key suppliers. The analysis integrates the perspectives of both retailers’ and manufacturers’ business models and matches them with individual and joint benefits.

Keywords: Business model, cooperation, cluster analysis, retailer-manufacturer relationships.

Downloads: 683
51 Genetic Diversity Based Population Study of Freshwater Mud Eel (Monopterus cuchia) in Bangladesh

Authors: M. F. Miah, K. M. A. Zinnah, M. J. Raihan, H. Ali, M. N. Naser

Abstract:

As genetic diversity is essential for the survival, breeding and production of any fish, this study investigated the genetic diversity of the freshwater mud eel, Monopterus cuchia, at the population level. Three ecological populations were considered: the flooded area of Sylhet (P1), the open water of Moulvibazar (P2) and the open water of Sunamganj (P3) districts of Bangladesh. Four arbitrary RAPD primers (OPB-12, C0-4, B-03 and OPB-08) were screened and RAPD banding patterns were analyzed among the populations, considering 15 individuals of each population. In total, 174, 138 and 149 bands were detected in the populations P1, P2 and P3, respectively; however, each primer revealed a smaller number of bands in each population. 100% polymorphic loci were recorded in P2 and P3, whereas only one monomorphic locus was observed in P1, giving 97.5% polymorphism. Different genetic parameters, such as inter-individual pairwise similarity, genetic distance, Nei genetic similarity, linkage distance, cluster analysis and allelic information, were considered for measuring genetic diversity. The average inter-individual pairwise similarity was 2.98, 1.47 and 1.35 in P1, P2 and P3, respectively. Considering genetic distance analysis, the highest distance, 1, was recorded in P2 and P3, and the lowest genetic distance, 0.444, was found in P2. The average Nei genetic similarity was 0.19, 0.16 and 0.13 in P1, P2 and P3, respectively; however, the average linkage distance was 24.92, 17.14 and 15.28 in P1, P3 and P2, respectively. Based on linkage distance, genetic clusters were generated in the three populations: 6 clades and 7 clusters were found in P1, 3 clades and 5 clusters were observed in P2, and 4 clades and 7 clusters were detected in P3. In addition, allelic information was examined: the frequencies of the p and q alleles were 0.093 and 0.907 in P1, 0.076 and 0.924 in P2, and 0.074 and 0.926 in P3, respectively.
The average gene diversity was highest in P2 (0.132), followed by P3 (0.131) and P1 (0.121).

Keywords: Genetic diversity, Monopterus cuchia, population, RAPD, Bangladesh.

Downloads: 1221
50 Electricity Generation from Renewables and Targets: An Application of Multivariate Statistical Techniques

Authors: Filiz Ersoz, Taner Ersoz, Tugrul Bayraktar

Abstract:

Renewable energy is referred to as "clean energy", and popular support for the use of renewable energy (RE) rests on its ability to provide electricity with zero carbon dioxide emissions. This study provides useful insight into RE in the European Union (EU), especially into electricity generation obtained from renewables and the associated targets. The objective of this study is to identify groups of European countries using multivariate statistical analysis and selected indicators. The hierarchical clustering method is used to decide the number of clusters for EU countries. The hierarchical cluster analysis is based on Ward’s clustering method and squared Euclidean distances. Hierarchical cluster analysis identified eight distinct clusters of European countries. Then, the non-hierarchical clustering (k-means) method was applied. Discriminant analysis was used to determine the validity of the results, with data normalized by Z score transformation. To explore the relationship between the selected indicators, correlation coefficients were computed. The results of the study reveal the current situation of RE in European Union Member States.
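Two building blocks mentioned in this abstract, Z score transformation and squared Euclidean distance, can be illustrated with a short Python sketch. The function names and the indicator values are hypothetical, and the use of the sample standard deviation is an assumption, since the abstract does not state which form was used.

```python
import statistics


def z_scores(values):
    """Standardize one indicator to mean 0 and standard deviation 1.

    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def squared_euclidean(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical indicator values for three countries (illustration only).
share_of_renewables = [10.0, 20.0, 30.0]
co2_emissions = [9.0, 6.0, 3.0]

# Standardize each indicator, then build one vector per country.
columns = [z_scores(share_of_renewables), z_scores(co2_emissions)]
countries = list(zip(*columns))

print(countries)
print(squared_euclidean(countries[0], countries[2]))
```

Standardizing first keeps an indicator measured on a large scale from dominating the distance, which is why normalization is commonly paired with distance-based clustering.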

Keywords: Share of electricity generation, CO2 emission, targets, multivariate methods, hierarchical clustering, K-means clustering, discriminant analysis, correlation, EU member countries.

Downloads: 705
49 Authenticity of Lipid and Soluble Sugar Profiles of Various Oat Cultivars (Avena sativa)

Authors: Marijana M. Ačanski, Kristian A. Pastor, Djura N. Vujić

Abstract:

The identification of lipid and soluble sugar components in flour samples of different cultivars belonging to common oat species (Avena sativa L.) was performed: spring oat, winter oat and hulless oat. Fatty acids were extracted from flour samples with n-hexane, and derivatized into volatile methyl esters, using TMSH (trimethylsulfonium hydroxide in methanol). Soluble sugars were then extracted from defatted and dried samples of oat flour with 96% ethanol, and further derivatized into corresponding TMS-oximes, using hydroxylamine hydrochloride solution and BSTFA (N,O-bis-(trimethylsilyl)-trifluoroacetamide). The hexane and ethanol extracts of each oat cultivar were analyzed using GC-MS system. Lipid and simple sugar compositions are very similar in all samples of investigated cultivars. Chemometric tool was applied to numeric values of automatically integrated surface areas of detected lipid and simple sugar components in their corresponding derivatized forms. Hierarchical cluster analysis shows a very high similarity between the investigated flour samples of oat cultivars, according to the fatty acid content (0.9955). Moderate similarity was observed according to the content of soluble sugars (0.50). These preliminary results support the idea of establishing methods for oat flour authentication, and provide the means for distinguishing oat flour samples, regardless of the variety, from flour samples made of other cereal species, just by lipid and simple sugar profile analysis.

Keywords: Authentication, chemometrics, GC-MS, lipid and soluble sugar composition, oat cultivars.

Downloads: 979
48 The Role of Knowledge Management in Innovation: Spanish Evidence

Authors: María Jesús Luengo-Valderrey, Mónica Moso-Díez

Abstract:

In the knowledge-based economy, innovation is considered essential in order to achieve survival and growth in organizations. On the other hand, knowledge management is currently understood as one of the keys to the innovation process. Both factors are generally accepted as generators of competitive advantage in organizations. Specifically, R&D&I activities and those that generate internal knowledge have a positive influence on innovation results. This paper examines this effect and aims to quantify whether or not it is similar. We focus on the impact that the proportion of knowledge workers, the R&D&I investment, the amounts destined for ICTs and training for innovation have on the variation of tangible and intangible returns for the high and medium technology sector in Spain. To do this, we have performed an empirical analysis on the results of questionnaires about innovation in enterprises in Spain, collected by the National Statistics Institute. First, using cluster methodology, the behavior of these enterprises regarding knowledge management is identified. Then, using SEM methodology, we studied, for each cluster, the cause-effect relationships among constructs defined through variables, establishing their type and magnitude. The cluster analysis results in four groups, in which clusters 1 and 3 present the best performance in innovation, with differentiating nuances between them, while clusters 2 and 4 obtained divergent results from a similar innovative effort. However, the results of the SEM analysis for each cluster show that, in all cases, knowledge workers are those that affect innovation performance most, regardless of the level of investment, and that there is a strong correlation between knowledge workers and investment in knowledge generation.
The main finding is that Spanish high and medium technology companies improve their innovation performance by investing in internal knowledge generation measures, especially in terms of R&D activities, and underinvest in external ones. This, and the strong correlation between knowledge workers and the set of activities that promote knowledge generation, should be taken into account by company managers when making decisions about their investments for innovation, since they are key for improving their opportunities in the global market.

Keywords: High and medium technology sector, innovation, knowledge management, Spanish companies.

Downloads: 1757
47 A Multivariate Statistical Approach for Water Quality Assessment of River Hindon, India

Authors: Nida Rizvi, Deeksha Katyal, Varun Joshi

Abstract:

River Hindon is an important river catering the demand of highly populated rural and industrial cluster of western Uttar Pradesh, India. Water quality of river Hindon is deteriorating at an alarming rate due to various industrial, municipal and agricultural activities. The present study aimed at identifying the pollution sources and quantifying the degree to which these sources are responsible for the deteriorating water quality of the river. Various water quality parameters, like pH, temperature, electrical conductivity, total dissolved solids, total hardness, calcium, chloride, nitrate, sulphate, biological oxygen demand, chemical oxygen demand, and total alkalinity were assessed. Water quality data obtained from eight study sites for one year has been subjected to the two multivariate techniques, namely, principal component analysis and cluster analysis. Principal component analysis was applied with the aim to find out spatial variability and to identify the sources responsible for the water quality of the river. Three Varifactors were obtained after varimax rotation of initial principal components using principal component analysis. Cluster analysis was carried out to classify sampling stations of certain similarity, which grouped eight different sites into two clusters. The study reveals that the anthropogenic influence (municipal, industrial, waste water and agricultural runoff) was the major source of river water pollution. Thus, this study illustrates the utility of multivariate statistical techniques for analysis and elucidation of multifaceted data sets, recognition of pollution sources/factors and understanding temporal/spatial variations in water quality for effective river water quality management.

Keywords: Cluster analysis, multivariate statistical technique, river Hindon, water quality.

Downloads: 2763
46 Comparative Correlation Investigation of Polynuclear Aromatic Hydrocarbons (PAHs) in Soils of Different Land Use: Sources Evaluation Perspective

Authors: O. Onoriode Emoyan, E. Eyitemi Akporhonor, Charles Otobrise

Abstract:

Polycyclic Aromatic Hydrocarbons (PAHs) are formed mainly because of incomplete combustion of organic materials during industrial and domestic activities or natural occurrences. Their toxicity and contamination of terrestrial and aquatic ecosystems have been established. However, with a limited validity index, previous research has focused on PAH isomer pair ratios of variable physicochemical properties in source identification. The objective of this investigation was to determine the empirical validity of the Pearson Correlation Coefficient (PCC) and Cluster Analysis (CA) in PAH source identification along soil samples of different land uses. Therefore, 16 PAHs, grouped as Endocrine Disruption Substances (EDSs), were determined seasonally in top and sub soils at 10 sample stations. PAHs were determined using a Varian 300 gas chromatograph interfaced with a flame ionization detector. The instruments and reagents used were of standard and chromatographic grades, respectively. PCC and CA results showed that the classification of PAHs along pyrolytic and petrogenic organics used in source signature relates to the predominance of PAHs in the environmental matrix. The distribution of PAHs in the studied stations revealed the presence of trace quantities of the vast majority of the sixteen PAHs, which may ultimately inhibit the actual source signature authentication. Therefore, the following are recommended as factors to be considered when evaluating possible sources of PAHs: type and extent of bacterial metabolism, transformation products/substrates, and environmental factors such as salinity, pH, oxygen concentration, nutrients, light intensity, temperature, co-substrates, and environmental medium.

Keywords: Comparative correlation, kinetically, polynuclear aromatic hydrocarbons, thermodynamically-favored PAHs, sources evaluation.

Downloads: 1355
45 Customer Segmentation Model in E-commerce Using Clustering Techniques and LRFM Model: The Case of Online Stores in Morocco

Authors: Rachid Ait daoud, Abdellah Amine, Belaid Bouikhalene, Rachid Lbibb

Abstract:

Given the increase in the number of e-commerce sites, the number of competitors has become very large. This means that companies have to take appropriate decisions in order to meet the expectations of their customers and satisfy their needs. In this paper, we present a case study applying the LRFM (length, recency, frequency and monetary) model and clustering techniques in the electronic commerce sector, with a view to evaluating customers’ values for Moroccan e-commerce websites and then developing effective marketing strategies. To achieve these objectives, we adopt the LRFM model by applying a two-stage clustering method. In the first stage, the self-organizing maps method is used to determine the best number of clusters and the initial centroids. In the second stage, the k-means method is applied to segment 730 customers into nine clusters according to their L, R, F and M values. The results show that cluster 6 is the most important cluster because its average values of L, R, F and M are higher than the overall average values. In addition, this study has considered another variable that describes the mode of payment used by customers, to improve and strengthen the cluster analysis. The cluster analysis demonstrates that the payment method is one of the key indicators of a new index which allows the level of customers’ confidence in the company’s website to be assessed.
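The four LRFM variables can be derived from a customer's purchase history before any clustering is applied. The sketch below shows one common way to compute them; the exact definitions used in the paper are not given in the abstract, so the choices here (L and R in days, M as total spend) and all names and data are assumptions for illustration.

```python
from datetime import date


def lrfm(purchases, today):
    """Compute LRFM values for one customer.

    purchases: list of (purchase_date, amount) tuples.
    Assumed definitions (not confirmed by the abstract):
      L = days between first and last purchase,
      R = days since the last purchase,
      F = number of purchases,
      M = total amount spent.
    """
    dates = [d for d, _ in purchases]
    first, last = min(dates), max(dates)
    return {
        "L": (last - first).days,
        "R": (today - last).days,
        "F": len(purchases),
        "M": sum(amount for _, amount in purchases),
    }


# Hypothetical purchase history (illustration only).
history = [(date(2015, 1, 1), 100.0), (date(2015, 3, 1), 50.0)]
print(lrfm(history, today=date(2015, 4, 1)))
```

In a pipeline like the one described, each customer's four values would typically be scaled to a common range before being passed to the self-organizing map and k-means stages, so that no single variable dominates the distance computation.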

Keywords: Customer value, LRFM model, Cluster analysis, Self-Organizing Maps method (SOM), K-means algorithm, loyalty.

Downloads: 4745
44 Off-Line Detection of “Pannon Wheat” Milling Fractions by Near-Infrared Spectroscopic Methods

Authors: E. Izsó, M. Bartalné-Berceli, Sz. Gergely, A. Salgó

Abstract:

The aim of this investigation is to elaborate near-infrared methods for testing and recognition of chemical components and quality in “Pannon wheat” allied (i.e. true to variety or variety identified) milling fractions, as well as to develop spectroscopic methods for following the milling processes and evaluating the stability of the milling technology across different types of milling products and sampling times. These wheat categories were produced under industrial conditions, and samples were collected according to sampling time and maximum or minimum yields. The changes of the main chemical components (such as starch, protein, lipid) and physical properties of fractions (particle size) were analysed by dispersive spectrophotometers using the visible (VIS) and near-infrared (NIR) regions of the electromagnetic spectrum. Close correlations were obtained between the data of spectroscopic measurement techniques processed by various chemometric methods (e.g. principal component analysis [PCA], cluster analysis [CA]) and the operating conditions of the milling technology. NIR methods are able to detect deviations in the yield parameters and differences between sampling times across a wide variety of fractions. NIR technology can be used in the sensitive monitoring of milling technology.

Keywords: Allied wheat fractions, CA, milling process, near-infrared spectroscopy, PCA.

Downloads: 1358
43 A Study on the Relation among Primary Care Professionals Serving the Disadvantaged Community, Socioeconomic Status, and Adverse Health Outcome

Authors: Chau-Kuang Chen, Juanita Buford, Colette Davis, Raisha Allen, John Hughes, Jr., James Tyus, Dexter Samuels

Abstract:

During the post-Civil War era, the city of Nashville, Tennessee, had the highest mortality rate in the United States. The elevated death and disease rates among former slaves were attributable to lack of quality healthcare. To address the paucity of healthcare services, Meharry Medical College, an institution with the mission of educating minority professionals and serving the underserved population, was established in 1876. Purpose: The social ecological framework and partial least squares (PLS) path modeling were used to quantify the impact of socioeconomic status and adverse health outcome on primary care professionals serving the disadvantaged community. Thus, the study results could demonstrate the accomplishment of the College’s mission of training primary care professionals to serve in underserved areas. Methods: Various statistical methods were used to analyze alumni data from 1975 – 2013. K-means cluster analysis was utilized to identify individual medical and dental graduates in the cluster groups of the practice communities (Disadvantaged or Non-disadvantaged Communities). Discriminant analysis was implemented to verify the classification accuracy of cluster analysis. The independent t-test was performed to detect the significant mean differences of respective clustering and criterion variables. Chi-square test was used to test if the proportions of primary care and non-primary care specialists are consistent with those of medical and dental graduates practicing in the designated community clusters. Finally, the PLS path model was constructed to explore the construct validity of analytic model by providing the magnitude effects of socioeconomic status and adverse health outcome on primary care professionals serving the disadvantaged community. Results: Approximately 83% (3,192/3,864) of Meharry Medical College’s medical and dental graduates from 1975 to 2013 were practicing in disadvantaged communities. 
The independent t-test confirmed the content validity of the cluster analysis model. Also, the PLS path modeling demonstrated that alumni served as primary care professionals in communities with significantly lower socioeconomic status and higher adverse health outcome (p < .001). The PLS path modeling exhibited a meaningful interrelation between the practice communities of primary care professionals and their surrounding environments (socioeconomic status and adverse health outcome), which yielded model reliability, validity, and applicability. Conclusion: This study applied social ecological theory and analytic modeling approaches to assess the attainment of Meharry Medical College’s mission of training primary care professionals to serve in underserved areas, particularly in communities with low socioeconomic status and high rates of adverse health outcomes. In summary, the majority of medical and dental graduates from Meharry Medical College provided primary care services to disadvantaged communities with low socioeconomic status and high adverse health outcome, which demonstrated that Meharry Medical College has fulfilled its mission. The high reliability, validity, and applicability of this model imply that it could be replicated for comparable universities and colleges elsewhere.

Keywords: Disadvantaged Community, K-means Cluster Analysis, PLS Path Modeling, Primary care.

Downloads: 1638
42 A Study of Semantic Analysis of LED Illustrated Traffic Directional Arrow in Different Style

Authors: Chia-Chen Wu, Chih-Fu Wu, Pey-Weng Lien, Kai-Chieh Lin

Abstract:

In the past, the most widely adopted light source was the incandescent light bulb, but with the appearance of LED light sources, traditional light sources have been gradually replaced by LEDs because of their numerous superior characteristics. However, many of the standards do not apply to LEDs, as the two light sources have different characteristics. This also increases the significance of studies on LEDs. As a Kansei design study investigating the visual glare produced by traffic arrows implemented with LEDs, this study conducted a semantic analysis of the styles of traffic arrows used domestically and internationally. The results can help reduce driver misrecognition, which leads to failure to reach the destination or to traffic accidents. This study started with a literature review and a survey of the status quo before conducting experiments that were divided into two parts. The first part involved a screening experiment of arrow samples, where cluster analysis was conducted to choose five representative samples of LED displays. The second part was a semantic experiment on the display of arrows using LEDs, in which the five representative samples and ten selected adjectives were incorporated. Analyzing the results with Quantification Theory Type I, it was found that among the components of the arrows, fletching was the most significant factor influencing the adjectives. A “no fletching” design was more abstract and vague. It lacked the ability to convey the intended message and might carry negative psychological connotations including “dangerous,” “forbidden,” and “unreliable.” The arrow design consisting of “> shaped fletching” was found to be more concrete and definite, showing positive connotations including “safe,” “cautious,” and “reliable.” When a stimulus was placed at a farther distance, the glare could be significantly reduced; moreover, the visual evaluation scores would be higher. 
Conversely, if the fletching and the shaft had similar proportions, the stimuli received higher evaluations at a closer distance. The above results can be applied to the design of traffic arrows so that they convey information definitely and rapidly. In addition, drivers’ safety could be enhanced by understanding the cause of glare and improving visual recognizability.

Keywords: LED, arrow, Kansei research, preferred imagery.

Downloads: 1640
41 A Comprehensive Review on Different Mixed Data Clustering Ensemble Methods

Authors: S. Sarumathi, N. Shanthi, S. Vidhya, M. Sharmila

Abstract:

An extensive amount of work has been done in data clustering research, an unsupervised learning technique in data mining, during the past two decades. Moreover, several approaches and methods have emerged that focus on clustering diverse data types, the features of cluster models, and the similarity rates of clusters. However, no single clustering algorithm performs best at extracting efficient clusters. Consequently, to rectify this issue, a new technique called the Cluster Ensemble method has emerged. This approach serves as an alternative method for the cluster analysis problem. The main objective of the Cluster Ensemble is to aggregate diverse clustering solutions in such a way as to attain accuracy and to improve the quality of the individual clustering algorithms. Due to the massive and rapid development of new methods in the field of data mining, a critical analysis of existing techniques and future directions is needed. This paper presents a comparative analysis of different cluster ensemble methods along with their methodologies and salient features. This analysis should be useful for the community of clustering experts and should help in deciding on the most appropriate method for the problem at hand.
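
As an illustration of how a cluster ensemble can aggregate several clustering solutions, the following minimal Python sketch builds a co-association matrix and derives a consensus partition from it. It is not taken from the paper: the function names, the 0.5 threshold, the connected-components consensus step, and the sample partitions are all assumptions chosen for brevity.

```python
def co_association(partitions):
    """Return the fraction of base partitions in which each pair of objects
    falls in the same cluster (a symmetric n x n matrix)."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(partitions) for c in row] for row in counts]


def consensus_labels(matrix, threshold=0.5):
    """One simple consensus function (an assumption, not the only option):
    link objects whose co-association exceeds the threshold and label the
    connected components of the resulting graph."""
    n = len(matrix)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and matrix[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Three hypothetical base clusterings of five objects.
base_partitions = [
    [0, 0, 1, 1, 2],
    [0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0],
]
matrix = co_association(base_partitions)
print(consensus_labels(matrix))
```

Label values differ between the base partitions, which is why the ensemble works with pairwise co-occurrence rather than comparing labels directly.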

Keywords: Clustering, Cluster Ensemble Methods, Coassociation matrix, Consensus Function, Median Partition.

Downloads: 1689
40 RAPD Analysis of Genetic Diversity of Castor Bean

Authors: M. Vivodík, Ž. Balážová, Z. Gálová

Abstract:

The aim of this work was to detect genetic variability among a set of 40 castor genotypes using 8 RAPD markers. Amplification of genomic DNA of the 40 genotypes using RAPD analysis yielded 66 fragments, with an average of 8.25 polymorphic fragments per primer. The number of amplified fragments ranged from 3 to 13, with amplicon sizes ranging from 100 to 1200 bp. The polymorphic information content (PIC) ranged from 0.556 to 0.895 with an average of 0.784, and the diversity index (DI) ranged from 0.621 to 0.896 with an average of 0.798. A dendrogram based on hierarchical cluster analysis using the UPGMA algorithm was prepared; the analyzed genotypes were grouped into two main clusters, and only two genotypes could not be distinguished. Knowledge of the genetic diversity of castor can be used in future breeding programs for increased oil production for industrial uses.

Keywords: Dendrogram, polymorphism, RAPD technique, Ricinus communis L.

Downloads: 2300
39 RAPD Analysis of the Genetic Polymorphism in the Collection of Rye Cultivars

Authors: L. Petrovičová, Ž. Balážová, Z. Gálová, M. Wójcik-Jagła, M. Rapacz

Abstract:

In the present study, RAPD-PCR was used to assess the genetic diversity of rye, including landraces and new rye cultivars from Central Europe and the Union of Soviet Socialist Republics (SUN). Five arbitrary random primers were used to determine RAPD polymorphism in a set of 38 rye genotypes. These primers amplified altogether 43 different DNA fragments, with an average of 8.6 fragments per primer. The number of fragments ranged from 7 (RLZ 8, RLZ 9 and RLZ 10) to 12 (RLZ 6). The DI and PIC values of all RAPD markers were higher than 0.8, which generally indicates a high level of polymorphism among the rye genotypes. A dendrogram based on hierarchical cluster analysis using the UPGMA algorithm was prepared. The cultivars were grouped into two main clusters. In this experiment, RAPD proved to be a rapid, reliable and practicable method for revealing polymorphism in rye cultivars.

Keywords: Genetic diversity, polymorphism, RAPD markers, Secale cereale L.

Downloads: 2283
38 Various Advanced Statistical Analyses of Index Values Extracted from Outdoor Agricultural Workers Motion Data

Authors: Shinji Kawakura, Ryosuke Shibasaki

Abstract:

We have been grouping and developing various kinds of practical, promising sensing systems for agricultural advancement and the passing on of technical skills (guidance). These include advanced devices to capture real-time data on worker motion, which we analyze using various advanced statistical and human dynamics methods (e.g. principal component analysis, cluster analysis based on Ward's method, and mapping). In addition, we have been considering workers' daily health and safety issues. The targeted fields are mainly common farms, meadows, and gardens. We then observed and discussed the changing, time-line style data and made some suggestions. The overall plan makes it possible to improve both the aforementioned systems and the farms.

Keywords: Advanced statistical analysis, wearable sensing system, tradition of skill, supporting for workers, detecting crisis.

Downloads: 1021
37 The Relevance of Intellectual Capital: An Analysis of Spanish Universities

Authors: Yolanda Ramírez, Ángel Tejada, Agustín Baidez

Abstract:

In recent years, intellectual capital reporting in higher education institutions has been gaining importance worldwide. Intellectual capital approaches become critical at universities, mainly because knowledge is the main output as well as the main input in these institutions. Universities produce knowledge, either through scientific and technical research (the results of investigation, publications, etc.) or through teaching (students trained and productive relationships with their stakeholders). The purpose of the present paper is to identify the intangible elements about which university stakeholders demand the most information. The results of a study done at Spanish universities are used to see which groups of universities have stakeholders who are more proactive about the disclosure of intellectual capital.

Keywords: Intellectual capital, universities, Spain, cluster analysis.

Downloads: 1682
36 Physical-Chemical Parameters of Latvian Apple Juices and Their Suitability for Cider Production

Authors: Rita Riekstina-Dolge, Zanda Kruma, Daina Karklina, Fredijs Dimins

Abstract:

Apple juice is the main raw material for cider production. In this study, apple juices obtained from 14 dessert and crab apple varieties grown in Latvia were investigated. For all samples, soluble solids, titratable acidity, pH and sugar content were determined. Crab apples have higher dry matter, total sugar and acid content than dessert apples, although this depends on the apple variety. The total sugar content of crab apple juices was 1.3 to 1.8 times higher than that of dessert apple juices. The titratable acidity of dessert apple juices ranged from 4.1 g L⁻¹ to 10.83 g L⁻¹, and that of crab apple juices from 7.87 g L⁻¹ to 19.6 g L⁻¹. Fructose was detected as the main sugar, whereas the glucose level varied depending on the variety. The highest titratable acidity and sugar content were detected in ‘Cornelia’ apple juice.

Keywords: Apple juice, hierarchical cluster analysis, sugars, titratable acidity.

Downloads: 3053
35 Diversity Analysis of a Quinoa (Chenopodium quinoa Willd.) Germplasm during Two Seasons

Authors: M. Mhada, E. N. Jellen, S. E. Jacobsen, O. Benlhabib

Abstract:

The present work was carried out to evaluate the diversity of a collection of 78 quinoa accessions developed through recurrent selection from Andean germplasm introduced to Morocco in the winter of 2000. Twenty-three quantitative and qualitative characters were used to evaluate the genetic diversity and the relationships between the accessions, and also to establish a core collection in Morocco. Important variation was found among the accessions in terms of plant morphology and growth behavior. Data analysis showed a positive correlation of plant height, plant fresh weight and plant dry weight with grain yield, while days to flowering was negatively correlated with grain yield. The first four PCs contributed 74.76% of the variability; the first PC accounted for 42.86% of the total variation, PC2 for 15.37%, PC3 for 9.05% and PC4 for 7.49%. Plant size, days to grain filling and days to maturity are correlated with PC1, and seed size, inflorescence density and mildew resistance are correlated with PC2. Hierarchical cluster analysis arranged the 78 quinoa accessions into four main groups and ten sub-clusters. Clustering was found to be associated with days to maturity and also with plant size and seed-size traits.

Keywords: Character association, Chenopodium quinoa, Diversity analysis, Morphotypic cluster, Multivariate analysis.

Downloads: 2088
34 How Virtualization, Decentralization and Network Building Change the Manufacturing Landscape: An Industry 4.0 Perspective

Authors: Malte Brettel, Niklas Friederichsen, Michael Keller, Marius Rosenberg

Abstract:

The German manufacturing industry has to withstand increasing global competition on product quality and production costs. As labor costs are high, several industries have suffered severely from the relocation of production facilities to aspiring countries, which have managed to close the productivity and quality gap substantially. Established manufacturing companies have recognized that customers are not willing to pay large price premiums for incremental quality improvements. As a consequence, many companies in the German manufacturing industry adjust their production, focusing on customized products and fast time to market. Leveraging the advantages of novel production strategies such as Agile Manufacturing and Mass Customization, manufacturing companies transform into integrated networks in which companies unite their core competencies. Here, virtualization of the process- and supply-chain ensures smooth inter-company operations, providing real-time access to relevant product and production information for all participating entities. Boundaries of companies deteriorate as autonomous systems exchange data gained by embedded systems throughout the entire value chain. By including Cyber-Physical-Systems, advanced communication between machines is tantamount to their dialogue with humans. The increasing utilization of information and communication technology allows digital engineering of products and production processes alike. Modular simulation and modeling techniques allow decentralized units to flexibly alter products and thereby enable rapid product innovation. The present article describes the developments of Industry 4.0 within the literature and reviews the associated research streams. To this end, we analyze eight scientific journals with regard to the following research fields: individualized production, end-to-end engineering in a virtual process chain, and production networks. 
We employ cluster analysis to assign sub-topics to the respective research fields. To assess the practical implications, we conducted face-to-face interviews with managers from industry as well as from the consulting business using a structured interview guideline. The results reveal reasons for the adoption and refusal of Industry 4.0 practices from a managerial point of view. Our findings contribute to the emerging research stream of Industry 4.0 and support decision-makers in assessing their need for transformation towards Industry 4.0 practices. 

Keywords: Industry 4.0, Mass Customization, Production networks, Virtual Process-Chain.

Downloads: 26118
33 Text Mining Analysis of the Reconstruction Plans after the Great East Japan Earthquake

Authors: Minami Ito, Akihiro Iijima

Abstract:

On March 11, 2011, the Great East Japan Earthquake occurred off the coast of Sanriku, Japan. It is important to build a sustainable society through the reconstruction process rather than simply restoring the infrastructure. To compare the goals of the reconstruction plans of quake-stricken municipalities, Japanese language morphological analysis was performed using text mining techniques. Frequently used nouns were sorted into four main categories: “life”, “disaster prevention”, “economy”, and “harmony with environment”. Because Soma City was affected by the nuclear accident, sentences tagged with “harmony with environment” tended to be more frequent than in the other municipalities. Results from cluster analysis and principal component analysis clearly indicated that the local government prioritizes efforts to reduce risks from radiation exposure.

Keywords: Eco-friendly reconstruction, harmony with environment, decontamination, nuclear disaster.

Downloads: 1643
32 A Review: Comparative Analysis of Different Categorical Data Clustering Ensemble Methods

Authors: S. Sarumathi, N. Shanthi, M. Sharmila

Abstract:

Over recent years a great deal of work has been done in data clustering research, an unsupervised learning technique in data mining. Furthermore, several algorithms and methods have been proposed that focus on clustering different data types, the representation of cluster models, and the accuracy rates of the clusters. However, no single clustering algorithm proves to be the most efficient in providing the best results. Accordingly, to address this issue, a new technique called the cluster ensemble method has emerged. The cluster ensemble is a good alternative approach to the cluster analysis problem. The main aim of the cluster ensemble is to merge different clustering solutions in such a way as to achieve accuracy and to improve the quality of individual data clusterings. The substantial and unremitting development of new methods in the field of data mining, together with the continuing interest in inventing new algorithms, makes a critical analysis of the existing techniques and future directions necessary. This paper presents a comparative study of different cluster ensemble methods along with their features, their systematic working processes, and the average accuracy and error rates of each ensemble method. This comprehensive analysis should be useful for the community of clustering practitioners and should help in deciding on the most suitable method for the problem at hand.

Keywords: Clustering, Cluster Ensemble methods, Co-association matrix, Consensus function, Median partition.

Downloads: 2061
31 Analysis of Diverse Clustering Tools in Data Mining

Authors: S. Sarumathi, N. Shanthi, M. Sharmila

Abstract:

Clustering in data mining is an unsupervised learning technique that aggregates data objects into meaningful groups such that the intra-cluster similarity of objects is maximized and the inter-cluster similarity of objects is minimized. Over the past decades several clustering tools have emerged in which clustering algorithms are built in, making it easier to use them and to extract the expected results. Data mining mainly deals with huge databases, which imposes additional rigorous computational constraints on cluster analysis. These challenges pave the way for the emergence of powerful, expansive data mining clustering software. In this survey, a variety of clustering tools used in data mining are elucidated along with the pros and cons of each.
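
To make the grouping objective described above concrete, here is a minimal Python sketch of k-means (Lloyd's algorithm), which groups points so that each lies close to its own cluster centroid. The abstract does not name a specific algorithm or tool, so the choice of k-means, the function names, and the sample points are assumptions for illustration only.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, initial_centroids, max_iterations=100):
    """Alternate between assigning each point to its nearest centroid and
    moving each centroid to the mean of its members, until nothing changes."""
    centroids = list(initial_centroids)
    labels = []
    for _ in range(max_iterations):
        labels = [
            min(range(len(centroids)),
                key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            # Keep the old centroid if a cluster ends up empty.
            new_centroids.append(mean_point(members) if members else centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical 2-D data with two visually separate groups.
sample_points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0),
                 (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centroids = kmeans(sample_points, [(1.0, 1.0), (8.0, 8.0)])
print(labels)
```

The initial centroids are passed in explicitly so the run is deterministic; practical tools usually choose them randomly or with a seeding heuristic.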

Keywords: Cluster Analysis, Clustering Algorithms, Clustering Techniques, Association, Visualization.

Downloads: 1742
30 Assessment of EU Competitiveness Factors by Multivariate Methods

Authors: L. Melecký

Abstract:

Measurement of competitiveness between countries or regions is an important topic of many economic analyses and scientific papers. In the European Union (EU), there is no mainstream approach to evaluating and measuring competitiveness. There are many opinions and methods for measuring and evaluating competitiveness between states or regions at the national and European levels. The methods differ in the structure of the competitiveness indicators used and in the ways they are processed. The aim of the paper is to analyze the main sources of competitive potential of the EU Member States with the help of Factor analysis (FA) and to classify the EU Member States into homogeneous units (clusters) according to the similarity of selected indicators of competitiveness factors by Cluster analysis (CA) in the reference years 2000 and 2011. The theoretical part of the paper is devoted to the fundamental bases of competitiveness and the methodology of the FA and CA methods. The empirical part of the paper deals with the evaluation of competitiveness factors in the EU Member States and the comparison of the evaluated countries by cluster analysis. 

Keywords: Competitiveness, cluster analysis, EU, factor analysis, multivariate methods.

Downloads: 1599
29 DEA Method for Evaluation of EU Performance

Authors: M. Staníčková

Abstract:

The paper deals with an application of quantitative analysis, the Data Envelopment Analysis (DEA) method, to performance evaluation of the European Union Member States in the reference years 2000 and 2011. The main aim of the paper is to measure efficiency changes over the reference years, to analyze the level of productivity in individual countries based on the DEA method, and to classify the EU Member States into homogeneous units (clusters) according to the efficiency results. The theoretical part is devoted to the fundamental basis of performance theory and the methodology of DEA. The empirical part is aimed at measuring the degree of productivity and the level of efficiency changes of the evaluated countries using the basic DEA model (the CCR CRS model) and a specialized DEA approach (the Malmquist Index), which measures the change in technical efficiency and the movement of the production possibility frontier. Here, the DEA method is a suitable tool for determining the competitive or uncompetitive position of each country because not just one factor is evaluated, but a set of different factors that determine the degree of economic development.
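
For intuition about the CCR model under constant returns to scale, the following Python sketch covers only the simplest special case of one input and one output, where each unit's efficiency reduces to its output/input ratio divided by the best ratio in the sample. The paper evaluates a set of factors, which requires solving a linear program per country; that is not shown here. The function name and the figures are hypothetical.

```python
def ccr_efficiency_single_ratio(inputs, outputs):
    """CCR (constant returns to scale) efficiency for the special case of a
    single input and a single output: productivity relative to the best unit.
    With several inputs or outputs a linear program is needed instead."""
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]


# Hypothetical units: one input and one output each (not data from the paper).
unit_inputs = [2.0, 4.0, 5.0]
unit_outputs = [4.0, 4.0, 10.0]
print(ccr_efficiency_single_ratio(unit_inputs, unit_outputs))
```

Units scoring 1.0 lie on the efficient frontier; lower scores give the proportion by which a unit's input could shrink while keeping its output, under the constant-returns assumption.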

Keywords: CCR CRS model, cluster analysis, DEA method, efficiency, EU, Malmquist index, performance.

Downloads: 2117