Search results for: Individual patient data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8292

7662 Study of Efficiency and Capability LZW++ Technique in Data Compression

Authors: Yusof. Mohd Kamir, Mat Deris. Mohd Sufian, Abidin. Ahmad Faisal Amri

Abstract:

The purpose of this paper is to show the efficiency and capability of LZWµ in data compression. The LZWµ technique is an enhancement of the existing LZW technique. Whereas LZW reads one character at a time, LZWµ reads three characters at a time. This paper focuses on data compression and tests the efficiency and capability of LZWµ on different data formats, such as doc, pdf and text files. Several experiments were carried out on these data formats. The results show that the LZWµ technique outperforms the existing LZW technique in terms of file size.
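For orientation, the baseline that the abstract compares against can be sketched as follows. This is a minimal sketch of standard LZW, which reads one character at a time; it is not the authors' three-character variant, whose details the abstract does not give. The function names and the 256-entry initial dictionary are assumptions made for illustration.

```python
def lzw_compress(text: str) -> list[int]:
    """Standard LZW (baseline only): emit dictionary codes, one input character at a time."""
    # Assumption: the dictionary starts with the 256 single-character entries.
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    current = ""
    codes = []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            # Keep extending the current phrase while it is known.
            current = candidate
        else:
            # Emit the longest known phrase and register the new one.
            codes.append(dictionary[current])
            dictionary[candidate] = next_code
            next_code += 1
            current = ch
    if current:
        codes.append(dictionary[current])
    return codes


def lzw_decompress(codes: list[int]) -> str:
    """Inverse of lzw_compress; rebuilds the same dictionary on the fly."""
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            # Special case: the code refers to the phrase being defined right now.
            entry = previous + previous[0]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        out.append(entry)
        dictionary[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(out)
```

Repetitive input compresses well because repeated phrases collapse into single codes; comparing `len(lzw_compress(text))` with `len(text)` gives a rough sense of the saving before any bit-level packing.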

Keywords: Data Compression, Huffman Encoding, LZW, LZWµ, RLL, Size.

Downloads: 2083
7661 Impact of Stack Caches: Locality Awareness and Cost Effectiveness

Authors: Abdulrahman K. Alshegaifi, Chun-Hsi Huang

Abstract:

Treating data based on its location in memory has received much attention in recent years because data in different locations have different properties, which matter for cache utilization. Stack data and non-stack data may interfere with each other’s locality in the data cache. One important property of stack data is its high spatial and temporal locality. In this work, we simulate a non-unified cache design that splits the data cache into stack and non-stack caches in order to keep stack data and non-stack data in separate caches. We observe that the overall hit rate of the non-unified cache design is sensitive to the size of the non-stack cache. We then investigate the appropriate size and associativity for the stack cache to achieve a high hit ratio, especially when over 99% of accesses are directed to the stack cache. The results show that, on average, more than 99% stack cache accuracy is achieved with 2KB of capacity and 1-way associativity. Further, we analyze the improvement in hit rate when a small, fixed-size stack cache is added at level 1 to a unified cache architecture. The results show that adding a 1KB stack cache improves the overall hit rate of the unified cache design by approximately 3.9% on average for the Rijndael benchmark. The stack cache is simulated using the SimpleScalar toolset.
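The effect of locality on a small 1-way (direct-mapped) cache can be illustrated with a toy model. The sketch below is not the SimpleScalar setup used in the paper; the 32-byte line size, the function name and the synthetic address traces are assumptions chosen only for illustration.

```python
def direct_mapped_hit_rate(addresses, capacity_bytes=2048, line_bytes=32):
    """Hit rate of a 1-way (direct-mapped) cache over a list of byte addresses.

    Assumptions: 32-byte lines and an initially empty cache.
    """
    num_lines = capacity_bytes // line_bytes
    tags = [None] * num_lines          # one tag per cache line
    hits = 0
    for addr in addresses:
        block = addr // line_bytes     # memory block holding this address
        index = block % num_lines      # the single line the block may occupy
        tag = block // num_lines
        if tags[index] == tag:
            hits += 1
        else:
            tags[index] = tag          # miss: replace whatever was there
    return hits / len(addresses) if addresses else 0.0


# Synthetic stack-like trace: the same 256-byte frame is touched repeatedly.
stack_trace = list(range(0, 256, 4)) * 100
stack_rate = direct_mapped_hit_rate(stack_trace)

# Two addresses that map to the same line evict each other on every access.
conflict_rate = direct_mapped_hit_rate([0, 2048] * 10)
```

In this toy trace only the first touch of each line misses, which is why a tightly reused stack frame reaches a very high hit rate even in a small direct-mapped cache, while the conflicting pair never hits.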

Keywords: Hit rate, Locality of program, Stack cache, and Stack data.

Downloads: 1502
7660 Family Carers' Experiences in Striving for Medical Care and Finding Their Solutions for Family Members with Mental Illnesses

Authors: Yu-Yu Wang, Shih-Hua Hsieh, Ru-Shian Hsieh

Abstract:

Having one’s wishes and choices respected, and the right to be supported rather than coerced, have been internationally recognized as human rights of persons with mental illness. In Taiwan, ‘coerced hospitalization’ has become difficult since the revision of the mental health legislation in 2007. Despite the trend towards human rights, the real problem families face when their family members are in a mental health crisis is the lack of alternative services. This study aims to explore: 1) When is hospitalization seen as the only solution by family members? 2) What are the barriers to arranging hospitalization, and how are they managed? 3) What have family carers learned from their experiences of caring for family members with mental illness? To answer these questions, a qualitative approach was adopted, and focus group interviews were conducted to collect data. The study included 24 family carers. The main findings are as follows. First, hospital is the last resort for carers who feel helpless. Family carers tend to do everything they can to provide care at home for their family members with mental illness. Carers seek hospitalization only when a patient’s behavior is too violent, weird, and/or abnormal, and beyond their ability to manage. Hospitalization, nevertheless, is never an easy choice. Obstacles emanate from the attitudes of medical doctors, the restricted service areas of ambulances, and insufficient information on the carers’ part. On the other hand, with some professionals’ proactive assistance, access to medical care during a crisis becomes possible. Some family carers obtained help from medical doctors, nurses, therapists and social workers. Some received good help from policemen, taxi drivers, and security guards at the hospital. The difficulty in accessing medical care prompts carers to work harder at helping their family members with mental illness to remain stable.
Carers found different ways of helping the ‘person’ to get along with the ‘illness’ and have better quality of life. Taking back ‘the right to control’ in utilizing medication, from passiveness to negotiating with medical doctors and seeking alternative therapies, are seen in many carers’ efforts. Besides, trying to maintain regular activities in daily life and play normal family roles are also experienced as important. Furthermore, talking with the patient as a person is also important. The authors conclude that in order to protect the human rights of persons with mental illness, it is crucial to make the medical care system more flexible and to make the services more humane: sufficient information should be provided and communicated, and efforts should be made to maintain the person’s social roles and to support the family.

Keywords: Family carers, coercive treatment, independent living, mental health crisis, persons with mental illness.

Downloads: 1006
7659 Cross Project Software Fault Prediction at Design Phase

Authors: Pradeep Singh, Shrish Verma

Abstract:

Software fault prediction models are created using the source code, processed metrics from the same or a previous version of the code, and related fault data. Some companies do not store and keep track of all the artifacts required for software fault prediction. To construct a fault prediction model for such a company, training data from other projects is one potential solution. The earlier a fault is predicted, the less it costs to correct. The training data consists of metrics data and related fault data at the function/module level. This paper investigates fault prediction at an early stage using cross-project data, focusing on design metrics. In this study, an empirical analysis is carried out to validate design metrics for cross-project fault prediction. The machine learning technique used for evaluation is Naïve Bayes. The design phase metrics of other projects can be used as an initial guideline for projects where no previous fault data is available. We analyze seven datasets from the NASA Metrics Data Program, which offer design as well as code metrics. Overall, the results of cross-project learning are comparable to those of within-company learning.

Keywords: Software Metrics, Fault prediction, Cross project, Within project.

Downloads: 2539
7658 Cooperative Learning: A Case Study on Teamwork through Community Service Project

Authors: Priyadharshini Ahrumugam

Abstract:

Much research has recognized that cooperative groups produce remarkable achievements compared with solitary or individualistic efforts. Based on Johnson and Johnson’s model of cooperative learning, the five key components of cooperation are positive interdependence, face-to-face promotive interaction, individual accountability, social skills, and group processing. In 2011, the Malaysian Ministry of Higher Education (MOHE) introduced the Holistic Student Development policy with the aim of developing morally sound individuals equipped with lifelong learning skills. The Community Service project was included in the improvement initiative. The purpose of this study is to assess the role of team-based learning in facilitating, in particular, students’ positive interdependence and face-to-face promotive interaction. The research methods involve in-depth interviews with the team leaders and selected team members, and a content analysis of the undergraduate students’ reflective journals. A significant positive relationship was found between students’ progressive outlook towards teamwork and the two highlighted components. The key findings show that students have gained in their individual learning and work results through teamwork and interaction with other students. The inclusion of Community Service as a MOHE subject resonates with cooperative learning methods that enhance supportive relationships and develop students’ social skills together with their professional skills.

Keywords: Community service, cooperative learning, positive interdependence, teamwork.

Downloads: 2196
7657 An Experimental Design Approach to Determine Effects of The Operating Parameters on The Rate of Ru promoted Ir Carbonylation of Methanol

Authors: Vahid Hosseinpour, Mohammad Kazemini, Alireza Mohammadrezaee

Abstract:

Carbonylation of methanol in the homogeneous phase is one of the major routes for the production of acetic acid. Amongst the group VIII metal catalysts used in this process, iridium has displayed the best capabilities. To investigate the effect of operating parameters such as temperature, pressure, and methyl iodide, methyl acetate, iridium, ruthenium, and water concentrations on the reaction rate, an experimental design for this system based upon central composite design (CCD) was utilized. The statistical rate equation developed by this method contained the individual, interaction and curvature effects of the parameters on the reaction rate. The model, with a p-value less than 0.0001 and R2 values greater than 0.9, confirmed a satisfactory fit between the experimental and theoretical studies. In other words, the developed model and the experimental data obtained passed all diagnostic tests, establishing this model as statistically significant.

Keywords: Acetic Acid, Carbonylation of Methanol, Central Composite Design, Experimental Design, Iridium/Ruthenium

Downloads: 3650
7656 Identification of Arousal and Relaxation by using SVM-Based Fusion of PPG Features

Authors: Chi Jung Kim, Mincheol Whang, Eui Chul Lee

Abstract:

In this paper, we propose a new method to distinguish between arousal and relaxation states by using multiple features acquired from a photoplethysmogram (PPG) and a support vector machine (SVM). To induce arousal and relaxation states in subjects, two kinds of sound stimuli are used, and the corresponding biosignals are obtained using the PPG sensor. Two features, pulse-to-pulse interval (PPI) and pulse amplitude (PA), are extracted from the acquired PPG data, and a nonlinear classification between arousal and relaxation is performed using the SVM. This methodology has several advantages over previous similar studies. Firstly, we extracted two separate features from the PPG, i.e., PPI and PA. Secondly, in order to improve the classification accuracy, SVM-based nonlinear classification was performed. Thirdly, to solve classification problems caused by features generalized over all subjects, we defined each threshold according to individual features. Experimental results showed that the average classification accuracy was 74.67%. Also, the proposed method showed better identification performance than the single-feature-based methods. From this result, we confirmed that arousal and relaxation can be classified using SVM and PPG features.

Keywords: Support Vector Machine, PPG, Emotion Recognition, Arousal, Relaxation

Downloads: 2476
7655 Extreme Temperature Forecast in Mbonge, Cameroon through Return Level Analysis of the Generalized Extreme Value (GEV) Distribution

Authors: Nkongho Ayuketang Arreyndip, Ebobenow Joseph

Abstract:

In this paper, temperature extremes are forecast by applying the block maxima method of the Generalized Extreme Value (GEV) distribution to temperature data from the Cameroon Development Corporation (C.D.C). Considering two sets of data (raw data and simulated data) and two models of the GEV distribution (stationary and non-stationary), a return level analysis is carried out. In the stationary model, the return values are constant over time for the raw data, while for the simulated data the return values show an increasing trend with an upper bound. In the non-stationary model, the return levels of both the raw data and the simulated data show an increasing trend with an upper bound. This clearly shows that although temperatures in the tropics show signs of increasing in the future, there is a maximum temperature that is not exceeded. The results of this paper are vital for agricultural and environmental research.
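The bounded growth of return levels described above follows from the standard GEV return-level formula when the shape parameter is negative. The sketch below uses the textbook stationary formula; the parameter values (location 30, scale 2, shape -0.2) are illustrative assumptions, not estimates from the C.D.C data, and the function names are hypothetical.

```python
import math


def gev_return_level(return_period, mu, sigma, xi):
    """Level exceeded on average once every `return_period` blocks (stationary GEV)."""
    if return_period <= 1:
        raise ValueError("return_period must be greater than 1")
    p = 1.0 / return_period
    y = -math.log1p(-p)                      # y = -log(1 - p)
    if abs(xi) < 1e-12:
        return mu - sigma * math.log(y)      # Gumbel limit (xi -> 0)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))


def gev_upper_bound(mu, sigma, xi):
    """Finite upper end point when xi < 0; unbounded otherwise."""
    return mu - sigma / xi if xi < 0 else math.inf


# Illustrative parameters only (assumed, not fitted to any data set).
mu, sigma, xi = 30.0, 2.0, -0.2
levels = [gev_return_level(t, mu, sigma, xi) for t in (2, 10, 100, 1000)]
bound = gev_upper_bound(mu, sigma, xi)
```

With a negative shape parameter the return levels keep rising with the return period but never pass `mu - sigma / xi`, which matches the abstract's description of an increasing trend with an upper bound.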

Keywords: Return level, Generalized extreme value (GEV), Meteorology, Forecasting.

Downloads: 2099
7654 Architecture Integrating Wireless Body Area Networks with Web Services for Ubiquitous Healthcare Service Provisioning

Authors: Ogunduyile O. Oluwgbenga

Abstract:

Recent advancements in sensor technologies and Wireless Body Area Networks (WBANs) have led to the development of cost-effective healthcare devices which can be used to monitor and analyse a person's physiological parameters from remote locations. These advancements provide a unique opportunity to overcome current healthcare challenges experienced globally: low-quality service provisioning, lack of easy access to a variety of services, high costs of services, and a growing elderly population. This paper reports on a prototype implementation of an architecture that seamlessly integrates a Wireless Body Area Network (WBAN) with Web services (WS) to proactively collect physiological data from remote patients and recommend diagnostic services. Technologies based upon WBAN and WS can provide ubiquitous accessibility to a variety of services by allowing distributed healthcare resources to be massively reused to provide cost-effective services without individuals physically moving to the locations of those resources. In addition, these technologies can reduce the costs of healthcare services by allowing individuals to access services that support their healthcare. The prototype uses WBAN body sensors implemented on Arduino Fio platforms, worn by the patient, and an Android smart phone as a personal server. The physiological data are collected and uploaded through GPRS/internet to the Medical Health Server (MHS) to be analysed. The prototype monitors the activities, location and physiological parameters, such as SpO2 and heart rate, of the elderly and of patients in rehabilitation. Medical practitioners would have real-time access to the uploaded information through a web application.

Keywords: Android Smart phone, Arduino Fio, Web application server, Wireless Body Area Networks.

Downloads: 2538
7653 Mining Multicity Urban Data for Sustainable Population Relocation

Authors: Xu Du, Aparna S. Varde

Abstract:

In this research, we propose to conduct diagnostic and predictive analysis about the key factors and consequences of urban population relocation. To achieve this goal, urban simulation models extract the urban development trends as land use change patterns from a variety of data sources. The results are treated as part of urban big data with other information such as population change and economic conditions. Multiple data mining methods are deployed on this data to analyze nonlinear relationships between parameters. The result determines the driving force of population relocation with respect to urban sprawl and urban sustainability and their related parameters. This work sets the stage for developing a comprehensive urban simulation model for catering to specific questions by targeted users. It contributes towards achieving sustainability as a whole.

Keywords: Data Mining, Environmental Modeling, Sustainability, Urban Planning.

Downloads: 1779
7652 An Ant-based Clustering System for Knowledge Discovery in DNA Chip Analysis Data

Authors: Minsoo Lee, Yun-mi Kim, Yearn Jeong Kim, Yoon-kyung Lee, Hyejung Yoon

Abstract:

Biological data has several characteristics that strongly differentiate it from typical business data. It is much more complex, usually large in size, and continuously changing. Until recently, business data has been the main target for discovering trends, patterns or future expectations. However, with the recent rise in biotechnology, the powerful technology that was used for analyzing business data is now being applied to biological data. With this advanced technology at hand, the main trend in biological research is rapidly changing from structural DNA analysis to understanding the cellular functions of DNA sequences. DNA chips are now being used to perform experiments, and DNA analysis processes are being used by researchers. Clustering is one of the important processes used for grouping together similar entities. There are many clustering algorithms, such as hierarchical clustering, self-organizing maps, K-means clustering and so on. In this paper, we propose a clustering algorithm that imitates the ecosystem, taking into account the features of biological data. We implemented the system using an Ant-Colony clustering algorithm. The system decides the number of clusters automatically. It processes the input biological data, runs the Ant-Colony algorithm, draws the Topic Map, assigns clusters to the genes and displays the output. We tested the algorithm with test data of 100 to 1000 genes and 24 samples, and the results are promising for applying this algorithm to clustering DNA chip data.

Keywords: Ant colony system, biological data, clustering, DNA chip.

Downloads: 1968
7651 Development of a Catchment Water Quality Model for Continuous Simulations of Pollutants Build-up and Wash-off

Authors: Iqbal Hossain, Dr. Monzur Imteaz, Dr. Shirley Gato-Trinidad, Prof. Abdallah Shanableh

Abstract:

Estimation of runoff water quality parameters is required to determine appropriate water quality management options. Various models are used to estimate runoff water quality parameters. However, most models provide event-based estimates of water quality parameters for specific sites. The work presented in this paper describes the development of a model that continuously simulates the accumulation and wash-off of water quality pollutants in a catchment. The model allows estimation of pollutants build-up during dry periods and pollutants wash-off during storm events. The model was developed by integrating two individual models; rainfall-runoff model, and catchment water quality model. The rainfall-runoff model is based on the time-area runoff estimation method. The model allows users to estimate the time of concentration using a range of established methods. The model also allows estimation of the continuing runoff losses using any of the available estimation methods (i.e., constant, linearly varying or exponentially varying). Pollutants build-up in a catchment was represented by one of three pre-defined functions; power, exponential, or saturation. Similarly, pollutants wash-off was represented by one of three different functions; power, rating-curve, or exponential. The developed runoff water quality model was set-up to simulate the build-up and wash-off of total suspended solids (TSS), total phosphorus (TP) and total nitrogen (TN). The application of the model was demonstrated using available runoff and TSS field data from road and roof surfaces in the Gold Coast, Australia. The model provided excellent representation of the field data demonstrating the simplicity yet effectiveness of the proposed model.
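The alternation of dry-period build-up and storm-event wash-off described above can be sketched with the exponential option for both processes. This is a minimal sketch under assumed functional forms and made-up coefficients (`max_load`, `k_build`, `k_wash` are hypothetical); the paper's actual equations, time step and calibration are not given in the abstract.

```python
import math


def build_up(load, dry_days, max_load, k_build):
    """Exponential build-up: the surface load approaches max_load during dry weather."""
    return max_load - (max_load - load) * math.exp(-k_build * dry_days)


def wash_off(load, rainfall_mm, k_wash):
    """Exponential wash-off: returns (washed mass, remaining mass) for one wet step."""
    washed = load * (1.0 - math.exp(-k_wash * rainfall_mm))
    return washed, load - washed


def simulate(daily_rain_mm, max_load=10.0, k_build=0.4, k_wash=0.2):
    """Continuous daily simulation; all coefficients are illustrative assumptions."""
    load = 0.0
    history = []                       # (load at end of day, mass washed that day)
    for rain in daily_rain_mm:
        if rain > 0.0:
            washed, load = wash_off(load, rain, k_wash)
        else:
            washed = 0.0
            load = build_up(load, 1.0, max_load, k_build)
        history.append((load, washed))
    return history


# Three dry days, one 5 mm storm, then another dry day.
history = simulate([0.0, 0.0, 0.0, 5.0, 0.0])
```

Because the load carried over from one day is the starting point for the next, the same loop handles any sequence of dry and wet days, which is the sense in which the simulation is continuous rather than event-based.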

Keywords: Catchment, continuous pollutants build-up, pollutants wash-off, runoff, runoff water quality model.

Downloads: 3117
7650 XML Data Management in Compressed Relational Database

Authors: Hongzhi Wang, Jianzhong Li, Hong Gao

Abstract:

XML is an important standard for data exchange and representation. Because relational databases are mature systems, using a relational database to support XML data may bring some advantages. However, storing XML in a relational database introduces obvious redundancy that wastes disk space, bandwidth and disk I/O when querying XML data. For efficient storage and querying of XML, it is necessary to use compressed XML data in the relational database. In this paper, a compressed relational database technology supporting XML data is presented. The original relational storage structure is adapted to XPath query processing, and the compression method keeps this feature. Besides traditional relational database techniques, additional query processing technologies for compressed relations and for the special structure of XML are presented. Technologies for XQuery processing in a compressed relational database are also presented.

Keywords: XML, compression, query processing

Downloads: 1797
7649 A System for Analyzing and Eliciting Public Grievances Using Cache Enabled Big Data

Authors: P. Kaladevi, N. Giridharan

Abstract:

The main purpose of the system for analyzing and eliciting public grievances is to receive and process all sorts of complaints from the public and respond to users. Because of the large number of complaints, the complaint data becomes big data, which is difficult to store and process. The proposed system uses HDFS to store the big data and MapReduce to process it. The concept of a cache is applied in the system to provide immediate response and timely action using big data analytics. Cache-enabled big data improves the response time of the system. The unstructured data provided by the users is efficiently handled through the MapReduce algorithm. Complaints are processed in the order of the hierarchy of the authority. The drawbacks of the traditional database system used in the existing system are addressed by our system through the cache-enabled Hadoop Distributed File System. MapReduce framework code has the potential to leak sensitive data through the computation process. We propose a system that adds noise to the output of the reduce phase to avoid signaling the presence of sensitive data. If a complaint is not processed within the allotted time, it is automatically forwarded to the higher authority, which ensures that it is processed. A copy of the filed complaint is sent as a digitally signed PDF document to the user's mail id, which serves as proof. The system report serves as essential data when making important decisions based on legislation.

Keywords: Big Data, Hadoop, HDFS, Caching, MapReduce, web personalization, e-governance.

Downloads: 1587
7648 Comparison of Statins Dose Intensity on HbA1c Control in Outpatients with Type 2 Diabetes: A Prospective Cohort Study

Authors: Mohamed A. Hammad, Dzul Azri Mohamed Noor, Syed Azhar Syed Sulaiman, Ahmed A. Khamis, Abeer Kharshid, Nor Azizah Aziz

Abstract:

The effect of statins dose intensity (SDI) on glycemic control in patients with existing diabetes is unclear. Many contradictory findings have been reported in the literature, which limits the possibility of drawing conclusions. This project was designed to compare the effect of SDI on glycated hemoglobin (HbA1c%) control in outpatients with Type 2 diabetes in the endocrine clinic at Hospital Pulau Pinang, Malaysia, between July 2015 and August 2016. A prospective cohort study was conducted, in which records of 345 patients with Type 2 diabetes (moderate-SDI group, 289 patients; high-SDI group, 56 patients) were reviewed to identify demographics and laboratory tests. The target of glycemic control (HbA1c < 7% for patients < 65 years, and < 8% for patients ≥ 65 years) was estimated, and the results were presented as descriptive statistics. Of the 289 patients in the moderate-SDI cohort, with a mean age of 57.3 ± 12.4 years, only 86 (29.8%) had controlled glycemia, while 203 (70.2%) had uncontrolled glycemia, with a confidence interval (CI) of 95% (6.2–10.8). In the high-SDI group of 56 patients with Type 2 diabetes, with a mean age of 57.7 ± 12.4 years, 11 (19.6%) patients had controlled diabetes and 45 (80.4%) had uncontrolled glycemia, CI: 95% (7.1–11.9). The study demonstrated that the relative risk (RR) of uncontrolled glycemia in patients with Type 2 diabetes who used high-SDI is 1.15, and the excess relative risk (ERR) is 15%. The absolute risk (AR) is 10.2%, and the number needed to harm (NNH) is 10. Outpatients with Type 2 diabetes who use high-SDI statins have a higher risk of uncontrolled glycemia than outpatients treated with moderate-SDI.
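The risk figures in this abstract follow from the two reported proportions of uncontrolled glycemia, 80.4% (high-SDI) and 70.2% (moderate-SDI). The sketch below reproduces the arithmetic using the textbook definitions; the function name is hypothetical, and rounding the number needed to harm up to the next whole patient is an assumed convention.

```python
import math


def risk_measures(risk_exposed, risk_reference):
    """Relative risk, excess relative risk, absolute risk difference, number needed to harm."""
    rr = risk_exposed / risk_reference       # RR = risk in exposed / risk in reference
    err = rr - 1.0                           # ERR = RR - 1
    ar = risk_exposed - risk_reference       # AR = difference between the two risks
    nnh = math.ceil(1.0 / ar)                # NNH = 1 / AR, rounded up (assumed convention)
    return rr, err, ar, nnh


# Proportions with uncontrolled glycemia as reported (rounded percentages).
rr, err, ar, nnh = risk_measures(0.804, 0.702)
print(f"RR={rr:.2f}  ERR={err:.0%}  AR={ar:.1%}  NNH={nnh}")
```

Using the rounded percentages gives RR 1.15, ERR 15%, AR 10.2% and NNH 10, as in the abstract; working from the raw counts (45/56 and 203/289) gives an RR of about 1.14, so the reported value appears to derive from the rounded proportions.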

Keywords: Cohort study, diabetes control, dose intensity, HbA1c, Malaysia, statin, Type 2 diabetes mellitus, uncontrolled glycemia.

Downloads: 1454
7647 DEA Method for Evaluation of EU Performance

Authors: M. Staníčková

Abstract:

The paper deals with an application of quantitative analysis – the Data Envelopment Analysis (DEA) method to performance evaluation of the European Union Member States, in the reference years 2000 and 2011. The main aim of the paper is to measure efficiency changes over the reference years and to analyze a level of productivity in individual countries based on DEA method and to classify the EU Member States to homogeneous units (clusters) according to efficiency results. The theoretical part is devoted to the fundamental basis of performance theory and the methodology of DEA. The empirical part is aimed at measuring degree of productivity and level of efficiency changes of evaluated countries by basic DEA model – CCR CRS model, and specialized DEA approach – the Malmquist Index measuring the change of technical efficiency and the movement of production possibility frontier. Here, DEA method becomes a suitable tool for setting a competitive/uncompetitive position of each country because there is not only one factor evaluated, but a set of different factors that determine the degree of economic development.

Keywords: CCR CRS model, cluster analysis, DEA method, efficiency, EU, Malmquist index, performance.

Downloads: 2612
7646 Spatial Distribution of Socio-Economic Factors in Kogi State, Nigeria: Development Issues and Implication(s)

Authors: Yahya A. Sadiq, Grace F. Balogun, Olufemi J. Anjorin

Abstract:

This study analyzed the spatial distribution of socio-economic factors in Kogi state with a view to examining its implications for the development of the state. Questionnaires were administered both to selected individual respondents (784) in the state and to administrative offices (21 local council offices) to solicit relevant information on the spatial distribution of socio-economic factors in their areas. The collected data were tabulated and analyzed using percentages. The study revealed commerce/trade, education, and health care, among others, as the major socio-economic factors in the state, but with marked variation and imbalance in their spatial distribution across the study area. The rural-based local government areas have far fewer of such important facilities. In conclusion, it was recommended that the living conditions of people in the study area be transformed socio-economically, especially by positively redistributing local political power, so that the resources that abound in the state are felt by everybody, including the commoners.

Keywords: Development, local government areas, socio-economic factors, spatial distribution.

Downloads: 1789
7645 Computational Study of Blood Flow Analysis for Coronary Artery Disease

Authors: Radhe Tado, Ashish B. Deoghare, K. M. Pandey

Abstract:

The aim of this study is to estimate the effect of blood flow through the coronary artery in the human heart so as to assess coronary artery disease. Velocity, wall shear stress (WSS), strain rate and wall pressure distribution are some of the important hemodynamic parameters that can be assessed non-invasively with computational fluid dynamics (CFD). These parameters are used to identify the mechanical factors responsible for plaque progression and/or rupture in left coronary arteries (LCA). The initial step for the CFD simulations was the construction of a geometrical model of the LCA. A patient-specific artery model is constructed from computed tomography (CT) scan data with the help of MIMICS Research 19.0. ANSYS FLUENT-14.5 is used for the CFD analysis. Hemodynamic parameters were quantified and flow patterns were visualized both in the absence and in the presence of coronary plaques. The wall pressure continuously decreased towards distal segments and showed pressure drops in stenotic segments. Areas of high WSS and high flow velocity were found adjacent to plaque deposits.

Keywords: Computational fluid dynamics, hemodynamics, velocity, strain rate, wall pressure, wall shear stress.

Downloads: 1470
7644 Improved K-Modes for Categorical Clustering Using Weighted Dissimilarity Measure

Authors: S.Aranganayagi, K.Thangavel

Abstract:

K-Modes is an extension of the K-Means clustering algorithm, developed to cluster categorical data, in which the mean is replaced by the mode. The similarity measure proposed by Huang is the simple matching or mismatching measure. The weights of attribute values contribute substantially to clustering; in this paper we therefore propose a new weighted dissimilarity measure for K-Modes, based on the ratio of the frequency of attribute values in the cluster to their frequency in the data set. The new weighted measure is tested on data sets obtained from the UCI data repository. The results are compared with K-Modes and K-representative and show that the new measure generates clusters with high purity.
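The two building blocks of plain K-Modes named above, the simple matching dissimilarity and the per-attribute mode, can be sketched as follows. This shows only the unweighted baseline; the paper's weighted measure is not specified precisely enough in the abstract to reproduce, and the function names and toy records are assumptions.

```python
from collections import Counter


def simple_matching(a, b):
    """Simple matching dissimilarity: the number of attributes on which two records differ."""
    return sum(x != y for x, y in zip(a, b))


def cluster_mode(records):
    """Mode of a cluster: the most frequent value of each attribute, taken column by column."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*records))


def assign(records, modes):
    """Index of the nearest mode for each record under simple matching."""
    return [
        min(range(len(modes)), key=lambda k: simple_matching(record, modes[k]))
        for record in records
    ]


# Toy categorical records (illustrative only).
records = [
    ("red", "small", "round"),
    ("red", "small", "square"),
    ("red", "large", "round"),
    ("blue", "large", "square"),
    ("blue", "large", "round"),
    ("blue", "large", "square"),
]
modes = [cluster_mode(records[:3]), cluster_mode(records[3:])]
labels = assign(records, modes)
```

A full K-Modes run alternates `assign` and `cluster_mode` until the labels stop changing; a weighted variant would replace `simple_matching` with a measure that scales each mismatch by attribute-value frequencies.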

Keywords: Clustering, categorical data, K-Modes, weighted dissimilarity measure

Downloads: 3684
7643 Mobile Phone as a Tool for Data Collection in Field Research

Authors: Sandro Mourão, Karla Okada

Abstract:

The need for accurate and timely field data is shared among organizations engaged in fundamentally different activities, whether public services or commercial operations. There are three major components in the process of qualitative research: data collection, interpretation and organization of data, and the analytic process. Significant technological advances have been made in mobile devices (mobile phones, PDAs, tablets, laptops, etc.). These resources can potentially be applied to data collection in field research in order to improve this process. This paper presents and discusses the main features of a mobile phone based solution for field data collection, composed of three modules: a survey editor, a server web application and a client mobile application. The data gathering process begins with the survey creation module, which enables the production of tailored questionnaires. The field workforce receives the questionnaire(s) on their mobile phones, collects the interview responses and sends them back to a server for immediate analysis.

Keywords: Data Gathering, Field Research, Mobile Phone, Survey.

Downloads 2049
7642 Ordinal Regression with Fenton-Wilkinson Order Statistics: A Case Study of an Orienteering Race

Authors: Joonas Pääkkönen

Abstract:

In sports, individuals and teams are typically interested in final rankings. Final results, such as times or distances, dictate these rankings, also known as places. Places can be further associated with ordered random variables, commonly referred to as order statistics. In this work, we introduce a simple, yet accurate order statistical ordinal regression function that predicts relay race places with changeover-times. We call this function the Fenton-Wilkinson Order Statistics model. This model is built on the following educated assumption: individual leg-times follow log-normal distributions. Moreover, our key idea is to utilize Fenton-Wilkinson approximations of changeover-times alongside an estimator for the total number of teams as in the notorious German tank problem. This original place regression function is sigmoidal and thus correctly predicts the existence of a small number of elite teams that significantly outperform the rest of the teams. Our model also describes how place increases linearly with changeover-time at the inflection point of the log-normal distribution function. With real-world data from Jukola 2019, a massive orienteering relay race, the model is shown to be highly accurate even when the size of the training set is only 5% of the whole data set. Numerical results also show that our model exhibits smaller place prediction root-mean-square-errors than linear regression, mord regression and Gaussian process regression.
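The two building blocks named in this abstract can be sketched as follows. This is a minimal illustration of the standard Fenton-Wilkinson moment-matching approximation and the classical German tank estimator, not the authors' place regression function, which the abstract does not spell out; the function names are hypothetical.

```python
import math

def fenton_wilkinson(params):
    """Approximate a sum of independent log-normals by one log-normal.

    params: iterable of (mu, sigma) pairs of the underlying normal variables.
    Returns (mu_s, sigma_s), chosen so that the approximating log-normal
    has the same mean and variance as the sum.
    """
    params = list(params)
    mean = sum(math.exp(mu + sigma**2 / 2) for mu, sigma in params)
    var = sum(
        (math.exp(sigma**2) - 1) * math.exp(2 * mu + sigma**2)
        for mu, sigma in params
    )
    sigma_s2 = math.log(1 + var / mean**2)
    mu_s = math.log(mean) - sigma_s2 / 2
    return mu_s, math.sqrt(sigma_s2)

def german_tank_estimate(max_seen, sample_size):
    """Estimate a population size from the largest of `sample_size`
    distinct serial numbers drawn without replacement."""
    return max_seen * (1 + 1 / sample_size) - 1
```

Under the abstract's assumption of log-normal leg-times, a changeover-time is a sum of leg-times, so a call such as `fenton_wilkinson([(3.5, 0.2), (3.6, 0.25)])` (with made-up parameters) would give an approximate log-normal for the time after two legs.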

Keywords: Fenton-Wilkinson approximation, German tank problem, log-normal distribution, order statistics, ordinal regression, orienteering, sports analytics, sports modeling.

Downloads 822
7641 Multivariate Assessment of Mathematics Test Scores of Students in Qatar

Authors: Ali Rashash Alzahrani, Elizabeth Stojanovski

Abstract:

Data on various aspects of education are collected at the institutional and government level regularly. In Australia, for example, students at various levels of schooling undertake examinations in numeracy and literacy as part of NAPLAN testing, enabling longitudinal assessment of such data as well as comparisons between schools and states within Australia. Another source of educational data collected internationally is via the PISA study which collects data from several countries when students are approximately 15 years of age and enables comparisons in the performance of science, mathematics and English between countries as well as ranking of countries based on performance in these standardised tests. As well as student and school outcomes based on the tests taken as part of the PISA study, there is a wealth of other data collected in the study including parental demographics data and data related to teaching strategies used by educators. Overall, an abundance of educational data is available which has the potential to be used to help improve educational attainment and teaching of content in order to improve learning outcomes. A multivariate assessment of such data enables multiple variables to be considered simultaneously and will be used in the present study to help develop profiles of students based on performance in mathematics using data obtained from the PISA study.

Keywords: Cluster analysis, education, mathematics, profiles.

Downloads 887
7640 DIVAD: A Dynamic and Interactive Visual Analytical Dashboard for Exploring and Analyzing Transport Data

Authors: Tin Seong Kam, Ketan Barshikar, Shaun Tan

Abstract:

The advances in location-based data collection technologies such as GPS and RFID, and the rapid reduction of their costs, provide us with a huge and continuously increasing amount of data about the movement of vehicles, people and goods in an urban area. This explosive growth of geospatially-referenced data has far outpaced the planner's ability to utilize and transform the data into insightful information, thus creating an adverse impact on the return on the investment made to collect and manage this data. Addressing this pressing need, we designed and developed DIVAD, a dynamic and interactive visual analytics dashboard that allows city planners to explore and analyze a city's transportation data to gain valuable insights about the city's traffic flow and transportation requirements. We demonstrate the potential of DIVAD through the use of interactive choropleth and hexagon binning maps to explore and analyze large taxi-transportation data of Singapore for different geographic and time zones.

Keywords: Geographic Information System (GIS), Movement Data, GeoVisual Analytics, Urban Planning.

Downloads 2384
7639 Gene Expression Data Classification Using Discriminatively Regularized Sparse Subspace Learning

Authors: Chunming Xu

Abstract:

Sparse representation, which can represent high-dimensional data effectively, has been successfully used in computer vision and pattern recognition problems. However, it does not consider the label information of data samples. To overcome this limitation, we develop a novel dimensionality reduction algorithm, namely discriminatively regularized sparse subspace learning (DR-SSL), in this paper. The proposed DR-SSL algorithm can not only make use of the sparse representation to model the data, but can also effectively employ the label information to guide the procedure of dimensionality reduction. In addition, the presented algorithm can effectively deal with the out-of-sample problem. The experiments on gene-expression data sets show that the proposed algorithm is an effective tool for dimensionality reduction and gene-expression data classification.

Keywords: sparse representation, dimensionality reduction, label information, sparse subspace learning, gene-expression data classification.

Downloads 1441
7638 The Modification of the Mixed Flow Pump with Respect to Stability of the Head Curve

Authors: Roman Klas, František Pochylý, Pavel Rudolf

Abstract:

This paper focuses on the CFD simulation of a radiaxial pump (i.e. mixed flow pump) with the aim of detecting the reasons for Y-Q characteristic instability. The main reasons for pressure pulsations were detected by analyzing the velocity and pressure fields within the pump, combined with a theoretical approach. Consequently, modifications of the spiral case and the pump suction area were made based on knowledge of the flow conditions and the shape of the dissipation function. The primary design of the pump geometry was created as the base model serving for the comparison of the influences of individual modifications. Basic experimental data are available for this geometry. This approach replaced the calculation for compressible liquid flow, which is more complicated and, with respect to the convergence of all computational tasks, more difficult. The modification of the primary pump consisted of inserting three types of fins. Subsequently, the evaluation of pressure pulsations, specific energy curves and the visualization of velocity fields were chosen as the criteria for a successful design.

Keywords: CFD, radiaxial pump, spiral case, stability

Downloads 1565
7637 Determining Cluster Boundaries Using Particle Swarm Optimization

Authors: Anurag Sharma, Christian W. Omlin

Abstract:

The self-organizing map (SOM) is a well-known data reduction technique used in data mining. Data visualization can reveal structure in data sets that is otherwise hard to detect from raw data alone. However, interpretation through visual inspection is prone to errors and can be very tedious. There are several techniques for the automatic detection of clusters of code vectors found by SOMs, but they generally do not take into account the distribution of code vectors; this may lead to unsatisfactory clustering and poor definition of cluster boundaries, particularly where the density of data points is low. In this paper, we propose the use of a generic particle swarm optimization (PSO) algorithm for finding cluster boundaries directly from the code vectors obtained from SOMs. The application of our method to unlabeled call data for a mobile phone operator demonstrates its feasibility. The PSO algorithm uses the U-matrix of SOMs to determine cluster boundaries; the results of this novel automatic method correspond well to boundary detection through visual inspection of code vectors and through the k-means algorithm.

Keywords: Particle swarm optimization, self-organizing maps, clustering, data mining.

Downloads 1710
7636 Predictive Analysis for Big Data: Extension of Classification and Regression Trees Algorithm

Authors: Ameur Abdelkader, Abed Bouarfa Hafida

Abstract:

Since its inception, predictive analysis has revolutionized the IT industry through its robustness and decision-making facilities. It involves the application of a set of data processing techniques and algorithms in order to create predictive models. Its principle is based on finding relationships between explanatory variables and the predicted variables. Past occurrences are exploited to predict and to derive the unknown outcome. With the advent of big data, many studies have suggested the use of predictive analytics in order to process and analyze big data. Nevertheless, they have been curbed by the limits of classical methods of predictive analysis when the amount of data is large. In fact, because of the volume, nature (semi-structured or unstructured) and variety of big data, it is impossible to analyze it efficiently via classical methods of predictive analysis. The authors attribute this weakness to the fact that predictive analysis algorithms do not allow the parallelization and distribution of calculation. In this paper, we propose to extend the predictive analysis algorithm Classification And Regression Trees (CART) in order to adapt it for big data analysis. The major changes to this algorithm are presented, and then a version of the extended algorithm is defined in order to make it applicable to a huge quantity of data.
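For orientation, the core step of classical CART for classification is the search for the split that minimizes weighted Gini impurity. The sketch below shows that standard step for a single numeric feature; it is not the extended algorithm proposed in the paper, and the function names are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best split `x <= threshold`.

    Every distinct value except the largest is tried as a threshold.
    """
    n = len(values)
    best = (None, float("inf"))
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best
```

Each candidate threshold is scored independently of the others, which is one reason split search is a natural place to look when distributing the computation; whether the paper's extension parallelizes at this level is not stated in the abstract.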

Keywords: Predictive analysis, big data, predictive analysis algorithms, CART algorithm.

Downloads 1067
7635 The Advantages of Integration for Social Systems – Evidence from the Automobile Industry

Authors: Waldemiro Francisco Sorte Junior

Abstract:

The Japanese integrative approach to social systems can be observed in supply chain management as well as in the relationship between public and private sectors. Both the Lean Production System and the Developmental State Model are characterized by efforts towards the achievement of mutual goals, resulting in initiatives for capacity building which emphasize the system level. In Brazil, although organizations undertake efforts to build capabilities at the individual and organizational levels, the system level is being neglected. Fieldwork data confirmed the findings of other studies regarding the lack of integration in supply chain management in the Brazilian automobile industry. Moreover, due to the absence of an active role of the Brazilian state in its relationship with the private sector, automakers are not fully exploiting the opportunities in the domestic and regional markets. To promote a higher level of economic growth and to increase the degree of spill-over of technologies and techniques, a more integrative approach is needed.

Keywords: Integration, Lean Production System, Developmental State Model, Brazilian automobile industry.

Downloads 1521
7634 Constructing an Attitude Scale: Attitudes toward Violence on Televisions

Authors: Göksu Gözen Citak

Abstract:

The process of constructing a scale measuring the attitudes of youth toward violence on televisions is reported. A 30-item draft attitude scale was applied to a working group of 232 students attending the Faculty of Educational Sciences at Ankara University between 2005 and 2006. To establish the construct validity and dimensionality of the scale, exploratory and confirmatory factor analyses were applied to the data. Results of the exploratory factor analysis showed that the scale had three factors that accounted for 58.44% (22.46% for the first, 22.15% for the second and 13.83% for the third factor) of the common variance. It was determined that the first factor concerned issues related to the individual effects of violence on televisions, the second factor concerned issues related to the social effects of violence on televisions, and the third factor concerned issues related to violence on television programs. Results of the confirmatory factor analysis showed that all the items under each factor fit the structure of their respective factor. An alpha reliability of 0.90 was estimated for the whole scale. It is concluded that the scale is valid and reliable.
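Assuming the reported alpha reliability is Cronbach's alpha (the abstract does not say so explicitly), it can be computed from item scores as in the sketch below. The function name and the input layout (one list of item scores per respondent) are hypothetical.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents' item scores.

    scores: list of rows, one row per respondent, one column per item.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(scores[0])
    item_variances = [variance(column) for column in zip(*scores)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```

When all items rise and fall together across respondents, the total-score variance is large relative to the summed item variances and alpha approaches 1; values around 0.90 are usually read as high internal consistency.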

Keywords: Attitudes toward violence, confirmatory factor analysis, constructing attitude scale, exploratory factor analysis, violence on televisions.

Downloads 1951
7633 A Business-to-Business Collaboration System That Promotes Data Utilization While Encrypting Information on the Blockchain

Authors: Hiroaki Nasu, Ryota Miyamoto, Yuta Kodera, Yasuyuki Nogami

Abstract:

To promote initiatives such as Industry 4.0 and Society 5.0, it is important to connect and share data in a way that every member can trust. Blockchain (BC) technology is currently attracting attention as the most advanced tool for this purpose and has been used in fields such as finance. However, data collaboration using BC has not progressed sufficiently among companies in the manufacturing supply chain, which handle sensitive data such as product quality and manufacturing conditions. There are two main reasons why data utilization is not sufficiently advanced in the industrial supply chain. The first is that manufacturing information is top secret and a source of profit for companies, so it is difficult to disclose data even between companies that trade with each other in the supply chain. Blockchain mechanisms such as Bitcoin that use Public Key Infrastructure (PKI) require plaintext to be shared between companies in order to verify the identity of the company that sent the data. The second is that the merits (scenarios) of data collaboration between companies have not been concretely specified for the industrial supply chain. To address these problems, this paper proposes a Business to Business (B2B) collaboration system using homomorphic encryption and BC techniques. Using the proposed system, each company in the supply chain can exchange confidential information as encrypted data and utilize the data for its own business. In addition, this paper considers a scenario focusing on quality data, on which collaboration has been difficult because the data are top secret. In this scenario, we show an implementation scheme and a concrete benefit of data collaboration by proposing a comparison protocol that can detect changes in quality while hiding the numerical values of the quality data.

Keywords: Business to business data collaboration, industrial supply chain, blockchain, homomorphic encryption.

Downloads 802