Search results for: microarray data analysis

13063 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Organizations, including governments, generate (big) data that are high in volume, velocity, and veracity and come from a variety of sources. Public administrations are using (big) data, implementing base registries, and enforcing data sharing across the entire government to deliver (big) data related integrated services, provide insights to users, and support good governance. Government (big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services such as data storage and hosting to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of the government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationships in the government (big) data ecosystem. We also discuss our research findings. We did not find many published research articles about the government (big) data ecosystem, including its definition and the classification of actors and their roles. Therefore, we borrowed ideas for the government (big) data ecosystem from numerous areas in the literature, including scientific research data, humanitarian data, open government data, and industry data.

Keywords: Big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review.

Downloads 1991

13062 Classifying and Predicting Efficiencies Using Interval DEA Grid Setting

Authors: Yiannis G. Smirlis

Abstract:

The classification and the prediction of efficiencies in Data Envelopment Analysis (DEA) is an important issue, especially in large-scale problems or when new units frequently enter the set under assessment. In this paper, we contribute to the subject by proposing a grid structure based on interval segmentations of the range of values for the inputs and outputs. Such intervals, combined, define hyper-rectangles that partition the space of the problem. This structure, exploited by Interval DEA models and a dominance relation, acts as a DEA pre-processor, enabling the classification and prediction of efficiency scores without applying any DEA models.
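The grid idea in the abstract above can be illustrated with a minimal sketch. It assumes a hypothetical segmentation (`cuts`) and a deliberately conservative cell-level dominance rule; neither is taken from the paper, whose actual Interval DEA models and dominance relation may differ.

```python
from bisect import bisect_right


def grid_cell(unit, cut_points):
    """Map a unit's values to the tuple of interval indices (its grid cell).

    unit       -- sequence of values, inputs first, then outputs
    cut_points -- one ascending list of interior cut points per dimension
                  (hypothetical segmentation; not taken from the paper)
    """
    return tuple(bisect_right(cuts, value) for value, cuts in zip(unit, cut_points))


def cell_dominates(cell_a, cell_b, n_inputs):
    """Conservative cell-level dominance (an illustrative assumption).

    Returns True only when every input interval of cell_a lies strictly below
    that of cell_b and every output interval lies strictly above, so any unit
    in cell_a uses less of each input and produces more of each output than
    any unit in cell_b.
    """
    inputs_ok = all(a < b for a, b in zip(cell_a[:n_inputs], cell_b[:n_inputs]))
    outputs_ok = all(a > b for a, b in zip(cell_a[n_inputs:], cell_b[n_inputs:]))
    return inputs_ok and outputs_ok


# One input and one output, each split into three intervals (made-up numbers).
cuts = [[10.0, 20.0], [5.0, 15.0]]
cell_a = grid_cell((8.0, 17.0), cuts)   # low input, high output
cell_b = grid_cell((25.0, 3.0), cuts)   # high input, low output
print(cell_a, cell_b, cell_dominates(cell_a, cell_b, n_inputs=1))
```

Under this rule, a unit falling in a dominated cell could be flagged as inefficient without solving a DEA model, which is the kind of pre-processing the abstract describes.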

Keywords: Data envelopment analysis, interval DEA, efficiency classification, efficiency prediction.

Downloads 891

13061 Multivariate Assessment of Mathematics Test Scores of Students in Qatar

Authors: Ali Rashash Alzahrani, Elizabeth Stojanovski

Abstract:

Data on various aspects of education are collected at the institutional and government level regularly. In Australia, for example, students at various levels of schooling undertake examinations in numeracy and literacy as part of NAPLAN testing, enabling longitudinal assessment of such data as well as comparisons between schools and states within Australia. Another source of educational data collected internationally is the PISA study, which collects data from several countries when students are approximately 15 years of age and enables comparisons of performance in science, mathematics and English between countries as well as ranking of countries based on performance in these standardised tests. As well as student and school outcomes based on the tests taken as part of the PISA study, there is a wealth of other data collected in the study, including parental demographics data and data related to teaching strategies used by educators. Overall, an abundance of educational data is available which has the potential to be used to help improve educational attainment and teaching of content in order to improve learning outcomes. A multivariate assessment of such data enables multiple variables to be considered simultaneously and will be used in the present study to help develop profiles of students based on performance in mathematics using data obtained from the PISA study.

Keywords: Cluster analysis, education, mathematics, profiles.

Downloads 833

13060 A Comparison of Image Data Representations for Local Stereo Matching

Authors: André Smith, Amr Abdel-Dayem

Abstract:

The stereo matching problem, while having been present for several decades, continues to be an active area of research. The goal of this research is to find correspondences between elements found in a set of stereoscopic images. With these pairings, it is possible to infer the distance of objects within a scene, relative to the observer. Advancements in this field have led to experimentation with various techniques, from graph-cut energy minimization to artificial neural networks. At the basis of these techniques is a cost function, which is used to evaluate the likelihood of a particular match between points in each image. While the cost is, at its core, based on comparing the image pixel data, there is a general lack of consistency as to what image data representation to use. This paper presents an experimental analysis to compare the effectiveness of the more common image data representations. The goal is to determine how effectively these data representations reduce the cost for the correct correspondence relative to other possible matches.

Keywords: Colour data, local stereo matching, stereo correspondence, disparity map.

Downloads 861

13059 Studies of Rule Induction by STRIM from the Decision Table with Contaminated Attribute Values from Missing Data and Noise — In the Case of Critical Dataset Size —

Authors: Tetsuro Saeki, Yuichi Kato, Shoutarou Mizuno

Abstract:

STRIM (Statistical Test Rule Induction Method) has been proposed as a method to effectively induce if-then rules from a decision table, which is considered as a sample set obtained from the population of interest. Its usefulness has been confirmed by simulation experiments specifying rules in advance, and by comparison with conventional methods. However, scope for future development remains before STRIM can be applied to the analysis of real-world data sets. The first requirement is to determine the size of the dataset needed for inducing true rules, since finding statistically significant rules is the core of the method. The second is to examine the capacity of rule induction from datasets with contaminated attribute values created by missing data and noise, since real-world datasets usually contain such contaminated data. This paper examines the first problem theoretically, in connection with the rule length. The second problem is then examined in a simulation experiment, utilizing the critical size of dataset derived from the first step. The experimental results show that STRIM is highly robust in the analysis of datasets with contaminated attribute values, and hence is applicable to real-world data.

Keywords: Rule induction, decision table, missing data, noise.

Downloads 1416

13058 Eye Tracking: Biometric Evaluations of Instructional Materials for Improved Learning

Authors: Janet Holland

Abstract:

Eye tracking is a great way to triangulate multiple data sources for deeper, more complete knowledge of how instructional materials are really being used and what emotional connections are made. Using sensor-based biometrics provides a detailed local analysis in real time, expanding our ability to collect science-based data for a more comprehensive level of understanding, not previously possible, of teaching and learning. The knowledge gained will be used to make future improvements to instructional materials, tools, and interactions. The literature has been examined and a preliminary pilot test was implemented to develop a methodology for research in Instructional Design and Technology. Eye tracking now offers objective metrics that, combined with other biometric data collection and analysis, provide a fresh perspective.

Keywords: Area of interest, eye tracking, biometrics, fixation, fixation count, fixation sequence, fixation time, gaze points, heat map, saccades, time to first fixation.

Downloads 831

13057 Energy Efficient In-Network Data Processing in Sensor Networks

Authors: Prakash G L, Thejaswini M, S H Manjula, K R Venugopal, L M Patnaik

Abstract:

The Sensor Network consists of densely deployed sensor nodes. Energy optimization is one of the most important aspects of sensor application design. Data acquisition and aggregation techniques for processing data in-network should be energy efficient. Due to the cross-layer design and the resource-limited and noisy nature of Wireless Sensor Networks (WSNs), it is challenging to study the performance of these systems in a realistic setting. In this paper, we propose optimizing queries through data aggregation and data redundancy to reduce energy consumption without requiring all sensed data, and using the directed diffusion communication paradigm to achieve power savings, robust communication, and in-network data processing. To estimate the per-node power consumption, the POWERTossim mica2 energy model is used, which provides scalable and accurate results. The performance analysis shows that the proposed methods outperform the existing methods in terms of energy consumption in wireless sensor networks.

Keywords: Data Aggregation, Directed Diffusion, Partial Aggregation, Packet Merging, Query Plan.

Downloads 1790

13056 Preliminary Analysis of Energy Efficiency in Data Center: Case Study

Authors: Xiaoshu Lu, Tao Lu, Matias Remes, Martti Viljanen

Abstract:

As the data-driven economy grows faster than ever and spurs the demand for energy, we face unprecedented challenges in improving energy efficiency in data centers. Effectively maximising energy efficiency or minimising the cooling energy demand is becoming a pervasive concern for data centers. This paper investigates the overall energy consumption and the energy efficiency of the cooling system of a data center in Finland as a case study. The power, cooling and energy consumption characteristics and operating conditions of the facilities are examined and analysed. Potential energy and cooling saving opportunities are identified, and further suggestions for improving the performance of the cooling system are put forward. Results are presented as a comprehensive evaluation of both the energy performance and good practices of energy-efficient cooling operations for the data center. Utilization of an energy recovery concept for the cooling system is proposed. The conclusion we can draw is that even though the analysed data center demonstrated relatively high energy efficiency, based on its power usage effectiveness value, there is still significant potential for energy saving in its cooling systems.
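The power usage effectiveness (PUE) value mentioned in the abstract above is conventionally defined as total facility energy divided by IT equipment energy. The short sketch below shows that arithmetic; the function name and the energy figures are illustrative assumptions, not data from the case study.

```python
def power_usage_effectiveness(total_facility_kwh, it_equipment_kwh):
    """Return PUE = total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


# Made-up figures for illustration only: 1500 kWh drawn by the whole
# facility, of which 1000 kWh reaches the IT equipment.
pue = power_usage_effectiveness(1500.0, 1000.0)

# Share of facility energy spent on non-IT overhead (cooling, power
# distribution, lighting), which equals 1 - 1/PUE.
overhead_share = 1.0 - 1.0 / pue
print(pue, round(overhead_share, 3))
```

A PUE close to 1 means almost all energy reaches the IT equipment, which is why a data center can score well on PUE and still have room to reduce its cooling energy.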

Keywords: Data center, case study, cooling system, energy efficiency.

Downloads 1496

13055 An Automation of Check Focusing on CRUD for Requirements Analysis Model in UML

Authors: Shinpei Ogata, Yoshitaka Aoki, Hirotaka Okuda, Saeko Matsuura

Abstract:

A key to the success of high-quality software development is defining a valid and feasible requirements specification. We have proposed a method of model-driven requirements analysis using Unified Modeling Language (UML). The main feature of our method is to automatically generate a Web user interface mock-up from the UML requirements analysis model so that we can confirm the validity of input/output data for each page and of page transitions in the system by directly operating the mock-up. This paper proposes a support method to check the validity of a data life cycle by using the model checking tool "UPPAAL", focusing on CRUD (Create, Read, Update and Delete). Exhaustive checking improves the quality of requirements analysis models that are validated by the customers through the automatically generated mock-up. The effectiveness of our method is discussed through a case study of requirements modeling for two small projects: a library management system and a supportive sales system for textbooks in a university.

Keywords: CRUD, Model Checking, Model Driven Development, Requirements Analysis, Unified Modeling Language, UPPAAL.

Downloads 1628

13054 Empirical and Indian Automotive Equity Portfolio Decision Support

Authors: P. Sankar, P. James Daniel Paul, Siddhant Sahu

Abstract:

A brief review of the empirical studies on stock market decision support methodology indicates that they are at the threshold of validating the accuracy of traditional models, fuzzy models, artificial neural networks and decision trees. Many researchers have attempted to compare these models using various data sets worldwide. However, the research community has yet to reach conclusive confidence in the models that have emerged. This paper uses automotive sector stock prices from the National Stock Exchange (NSE), India, and analyzes them for intra-sectorial support for stock market decisions. The study identifies the significant variables and their lags which affect the price of the stocks using OLS analysis and decision tree classifiers.

Keywords: Indian Automotive Sector, Stock Market Decisions, Equity Portfolio Analysis, Decision Tree Classifiers, Statistical Data Analysis.

Downloads 1990

13053 Content Analysis and Attitude of Thai Students towards Thai Series “Hormones: Season 2”

Authors: Siriporn Meenanan

Abstract:

The objective of this study is to investigate the attitude of Thai students towards the Thai series "Hormones the Series Season 2". The study was conducted as quantitative research, and questionnaires were used to collect data from a sample group of 400 people. Descriptive statistics were used in data analysis. The findings reveal that most participants have positive comments regarding the series. They strongly agreed that the series reflects the way of life and problems of teenagers in Thailand. Hence, the participants believe that if adults have a chance to watch the series, they will have a better understanding of teenagers. In addition, the participants also agreed that the contents of the play are appropriate and satisfactory, as the contents of "Hormones the Series Season 2" will raise awareness among teens, who can use it as a guide to prevent problems that might happen during their teenage life.

Keywords: Content analysis, attitude, Thai series, Hormones the series.

Downloads 909

13052 Spatial Data Science for Data Driven Urban Planning: The Youth Economic Discomfort Index for Rome

Authors: Iacopo Testi, Diego Pajarito, Nicoletta Roberto, Carmen Greco

Abstract:

Today, a consistent segment of the world's population lives in urban areas, and this proportion will vastly increase in the next decades. Therefore, understanding the key trends in urbanization likely to unfold over the coming years is crucial to the implementation of sustainable urban strategies. In parallel, the daily amount of digital data produced will expand at an exponential rate during the following years. The analysis of various types of data sets and their derived applications have incredible potential across different crucial sectors such as healthcare, housing, transportation, energy, and education. Nevertheless, in city development, architects and urban planners appear to rely mostly on traditional and analog techniques of data collection. This paper investigates the prospects of the data science field, which appears to be a formidable resource to assist city managers in identifying strategies to enhance the social, economic, and environmental sustainability of our urban areas. The collection of different new layers of information would definitely enhance planners' capabilities to comprehend in more depth urban phenomena such as gentrification, land use definition, mobility, or critical infrastructural issues. Specifically, the research correlates economic, commercial, demographic, and housing data with the purpose of defining the youth economic discomfort index. The statistical composite index provides insights regarding the economic disadvantage of citizens aged between 18 and 29 years, and the results clearly show that central urban zones are more disadvantaged than peripheral ones. The experimental setup selected the city of Rome as the testing ground for the whole investigation. The methodology aims at applying statistical and spatial analysis to construct a composite index supporting informed data-driven decisions for urban planning.

Keywords: Data science, spatial analysis, composite index, Rome, urban planning, youth economic discomfort index.

Downloads 814

13051 Lineup Optimization Model of Basketball Players Based on the Prediction of Recursive Neural Networks

Authors: Wang Yichen, Haruka Yamashita

Abstract:

In recent years, in the field of sports, decision making based on the analysis of accumulated sports data, such as selecting the members who play in a game and choosing the game strategy, has been widely attempted. In fact, in the NBA basketball league, where the world's highest-level players gather, teams analyze the data using various statistical techniques to win games. However, it is difficult to analyze the game data for each play, such as ball tracking or the motion of the players, because the situation of the game changes rapidly and the structure of the data is complicated. Therefore, an analysis method for real-time game play data is needed. In this research, we propose an analytical model for "determining the optimal lineup composition" using real-time play data, a task considered difficult for all coaches. Because replacing the entire lineup is too complicated, the actual questions for the replacement of players are "whether or not the lineup should be changed" and "whether or not a Small Ball lineup is adopted". Therefore, we propose an analytical model for the optimal player selection problem based on Small Ball lineups. In basketball, we can accumulate scoring data for each play, which indicates a player's contribution to the game, and the scoring data can be considered as time series data. In order to compare the importance of players in different situations and lineups, we combine an RNN (Recurrent Neural Network) model, which can analyze time series data, and an NN (Neural Network) model, which can analyze the situation on the field, to build a score prediction model. This model is capable of identifying the current optimal lineup for different situations. In this research, we collected all the accumulated NBA data from 2019-2020. We then apply the method to the actual basketball play data to verify the reliability of the proposed model.

Keywords: Recurrent Neural Network, players lineup, basketball data, decision making model.

Downloads 744

13050 Measurement of Operational and Environmental Performance of the Coal-Fired Power Plants in India by Using Data Envelopment Analysis

Authors: Vijay Kumar Bajpai, Sudhir Kumar Singh

Abstract:

In this study, performance analyses of twenty-five Coal-Fired Power Plants (CFPPs) used for electricity generation are carried out through various Data Envelopment Analysis (DEA) models. Three efficiency indices are defined and pursued. In the calculation of operational performance, energy and non-energy variables are used as inputs, and net electricity produced is used as the desired output (Model-1). CO2 emitted to the environment is used as the undesired output (Model-2) in the computation of pure environmental performance, while in Model-3 CO2 emissions are considered as a detrimental input in the calculation of operational and environmental performance. Empirical results show that most of the plants are operating in the increasing returns to scale region and that the Mettur plant is an efficient one with regard to energy use and environment. The results also indicate that the undesirable output effect is insignificant in the research sample. The present study will provide clues to plant operators towards raising the operational and environmental performance of CFPPs.

Keywords: Coal fired power plants, environmental performance, data envelopment analysis, operational performance.

Downloads 2318

13049 Retail Strategy to Reduce Waste Keeping High Profit Utilizing Taylor's Law in Point-of-Sales Data

Authors: Gen Sakoda, Hideki Takayasu, Misako Takayasu

Abstract:

Waste reduction is a fundamental problem for sustainability. Methods for waste reduction with point-of-sales (POS) data are proposed, utilizing the knowledge of a recent econophysics study on a statistical property of POS data. Concretely, a non-stationary time series analysis method based on the Particle Filter is developed, which considers abnormal fluctuation scaling known as Taylor's law. This method is extended to handle sales data that are incomplete because of stock-outs by introducing maximum likelihood estimation for censored data. A method for determining the optimal stock level while pricing the cost of waste reduction is also proposed. This study focuses on the examination of the methods for large sales numbers where Taylor's law is obvious. Numerical analysis using aggregated POS data shows the effectiveness of the methods in reducing food waste while maintaining a high profit for large sales numbers. Moreover, the way of pricing the cost of waste reduction reveals that a small profit loss realizes substantial waste reduction, especially in the case that the proportionality constant of Taylor's law is small. Specifically, around 1% profit loss realizes half disposal at =0.12, which is the actual value of processed food items used in this research. The methods provide practical and effective solutions for waste reduction while keeping a high profit, especially with large sales numbers.
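Taylor's law, as referenced in the abstract above, is commonly stated as a power-law relation between the fluctuation of a quantity and its mean. The sketch below estimates such a relation by ordinary least squares in log-log space. The form std = a * mean^alpha, the function name, and the synthetic numbers are assumptions for illustration; the abstract does not specify the paper's exact parametrization or estimation procedure, which relies on a Particle Filter.

```python
import math


def fit_power_law(means, stds):
    """Least-squares fit of log(std) = log(a) + alpha * log(mean).

    Returns (a, alpha). All inputs must be positive.
    """
    xs = [math.log(m) for m in means]
    ys = [math.log(s) for s in stds]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    alpha = sxy / sxx                      # slope: scaling exponent
    a = math.exp(y_bar - alpha * x_bar)    # intercept: proportionality constant
    return a, alpha


# Synthetic, made-up values that follow std = 0.5 * mean exactly.
a, alpha = fit_power_law([1.0, 10.0, 100.0], [0.5, 5.0, 50.0])
print(round(a, 6), round(alpha, 6))
```

With real POS data, the means and standard deviations would be computed per item over a time window before fitting.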

Keywords: Food waste reduction, particle filter, point of sales, sustainable development goals, Taylor's Law, time series analysis.

Downloads 804

13048 Road Safety in Great Britain: An Exploratory Data Analysis

Authors: Jatin Kumar Choudhary, Naren Rayala, Abbas Eslami Kiasari, Fahimeh Jafari

Abstract:

Great Britain has one of the safest road networks in the world. However, the consequences of any death or serious injury are devastating for loved ones, as well as for those who help the severely injured. This paper aims to analyse Great Britain's road safety situation and show the response measures for areas where the total damage caused by accidents can be significantly and quickly reduced. Although the UK has had a good record in reducing fatalities over the past 30 years, there is still a considerable number of road deaths. The government continues to scale back road deaths, empowering responsible road users by identifying and prosecuting the parameters that make the roads less safe. This study presents an exploratory analysis that could provide policy makers with invaluable insights into how accidents happen and how they can be mitigated. We use STATS19 data published by the UK government. Since we need more information about locations than is provided in STATS19, we first expand the features of the dataset using OpenStreetMap and Visual Crossing. This paper also provides a discussion regarding new road safety methods.

Keywords: Road safety, data analysis, OpenStreetMap, feature expanding.

Downloads 240

13047 Analysis and Classification of HIV-1 Sub-Type Viruses by AR Model through Artificial Neural Networks

Authors: O. Yavuz, L. Ozyilmaz

Abstract:

The HIV-1 genome is highly heterogeneous. Due to this variation, the features of the HIV-1 genome span a wide range. For this reason, the ability of the virus to infect changes depending on different chemokine receptors. From this point of view, R5 HIV viruses use the CCR5 coreceptor, X4 viruses use CXCR4, and R5X4 viruses can utilize both coreceptors. Recently, in bioinformatics, the classification of R5X4 viruses has been studied using experiments on the HIV-1 genome. In this study, R5X4-type HIV viruses were classified using an Auto Regressive (AR) model through Artificial Neural Networks (ANNs). The statistical data of R5X4, R5 and X4 viruses were analyzed using signal processing methods and ANNs. Accessible residues of these virus sequences were obtained and modeled by an AR model, since the dimensions of the residues are large and differ from each other. Finally, the pre-processed data were used to evolve various ANN structures for determining R5X4 viruses. Furthermore, ROC analysis was applied to the ANNs to show their real performance. The results indicate that R5X4 viruses were successfully classified with high sensitivity and specificity values in training and testing ROC analysis for RBF, which gives the best performance among the ANN structures.

Keywords: Auto-Regressive Model, HIV, Neural Networks, ROC Analysis.

Downloads 1140

13046 On Methodologies for Analysing Sickness Absence Data: An Insight into a New Method

Authors: Xiaoshu Lu, Päivi Leino-Arjas, Kustaa Piha, Akseli Aittomäki, Peppiina Saastamoinen, Ossi Rahkonen, Eero Lahelma

Abstract:

Sickness absence represents a major economic and social issue. Analysis of sick leave data is a recurrent challenge to analysts because of the complexity of the data structure, which is often time dependent, highly skewed and clumped at zero. Ignoring these features when making statistical inferences is likely to be inefficient and misguided. Traditional approaches do not address these problems. In this study, we discuss model methodologies in terms of statistical techniques for addressing the difficulties with sick leave data. We also introduce and demonstrate a new method by performing a longitudinal assessment of long-term absenteeism using, as a working example, a large registration dataset available from the Helsinki Health Study for municipal employees in Finland during the period 1990-1999. We present a comparative study on model selection and a critical analysis of the temporal trends and of the occurrence and degree of long-term sickness absences among municipal employees. The strengths of this working example include the large sample size over a long follow-up period, providing strong evidence in support of the new model. Our main goal is to propose a way to select an appropriate model and to introduce a new methodology for analysing sickness absence data, as well as to demonstrate model applicability to complicated longitudinal data.

Keywords: Sickness absence, longitudinal data, methodologies, mix-distribution model.

Downloads 2227

13045 Self Organizing Mixture Network in Mixture Discriminant Analysis: An Experimental Study

Authors: Nazif Çalış, Murat Erişoğlu, Hamza Erol, Tayfun Servi

Abstract:

In recent works related to mixture discriminant analysis (MDA), the expectation-maximization (EM) algorithm is used to estimate the parameters of Gaussian mixtures. However, the initial values of the EM algorithm affect the final parameter estimates. Also, when the EM algorithm is applied twice to the same data set, it can give different results for the parameter estimates, and this affects the classification accuracy of MDA. To overcome this problem, we use the Self Organizing Mixture Network (SOMN) algorithm to estimate the parameters of Gaussian mixtures in MDA, because SOMN is more robust when random initial values of the parameters are used [5]. We show the effectiveness of this method on popular simulated waveform datasets and a real glass data set.

Keywords: Self Organizing Mixture Network, Mixture Discriminant Analysis, Waveform Datasets, Glass Identification, Mixture of Multivariate Normal Distributions

Downloads 1477

13044 A Forecast Model for Projecting the Amount of Hazardous Waste

Authors: J. Vilgerts, L. Timma, D. Blumberga

Abstract:

The objective of the paper is to develop a forecast model for hazardous waste (HW) flows. The methodology of the research included six modules: historical data, assumptions, choice of indicators, data processing, data analysis with STATGRAPHICS, and forecast models. The proposed methodology was validated in a case study for Latvia. Hypotheses on the changes in HW for the period 2010-2020 have been developed and mathematically described with confidence levels of 95.0% and 50.0%. Sensitivity analysis for the analyzed scenarios was done. The results show that the growth of GDP affects the total amount of HW in the country. The total amount of HW is projected to be within a corridor from -27.7% in the optimistic scenario up to +87.8% in the pessimistic scenario, with a confidence level of 50.0%, for the period 2010-2020. The optimistic scenario has been shown to be the least flexible to changes in GDP growth.

Keywords: Forecast models, hazardous waste management, sustainable development, waste management indicators.

Downloads 1811

13043 Application of Mutual Information based Least dependent Component Analysis (MILCA) for Removal of Ocular Artifacts from Electroencephalogram

Authors: V Krishnaveni, S Jayaraman, K Ramadoss

Abstract:

The electrical potentials generated during eye movements and blinks are one of the main sources of artifacts in Electroencephalogram (EEG) recordings and can propagate far across the scalp, masking and distorting brain signals. In recent times, signal separation algorithms have been used widely for removing artifacts from the observed EEG data. In this paper, a recently introduced signal separation algorithm, Mutual Information based Least dependent Component Analysis (MILCA), is employed to separate ocular artifacts from EEG. The aim of MILCA is to minimize the Mutual Information (MI) between the independent components (estimated sources) under a pure rotation. The performance of this algorithm is compared with eleven popular algorithms (Infomax, Extended Infomax, Fast ICA, SOBI, TDSEP, JADE, OGWE, MS-ICA, SHIBBS, Kernel-ICA, and RADICAL) for the actual independence and uniqueness of the estimated source components obtained for different sets of EEG data with ocular artifacts by using a reliable MI Estimator. Results show that MILCA is best at separating the ocular artifacts from the EEG and is recommended for further analysis.

Keywords: Electroencephalogram, Ocular Artifacts (OA), Independent Component Analysis (ICA), Mutual Information (MI), Mutual Information based Least dependent Component Analysis (MILCA)

Downloads 2156

13042 Performance Analysis of the Subgroup Method for Collective I/O

Authors: Kwangho Cha, Hyeyoung Cho, Sungho Kim

Abstract:

As many scientific applications require large-scale data processing, the importance of parallel I/O has been increasingly recognized. Collective I/O is one of the notable features of parallel I/O and enables application programmers to handle large data volumes easily. In this paper, we measured and analyzed the performance of the original collective I/O and of the subgroup method, a way of using MPI collective I/O effectively. From the experimental results, we found that the subgroup method showed good performance with small data sizes.

Keywords: Collective I/O, MPI, parallel file system.

Downloads 1528

13041 Novelist Calls Out Poemist: A Psycholinguistic and Contrastive Analysis of the Errors in Turkish EFL Learners' Interlanguage

Authors: Mehmet Ozcan

Abstract:

This study is designed to investigate errors that emerged in written texts produced by 30 Turkish EFL learners from an explanatory, and thus qualitative, perspective. Erroneous language elements were first identified by the researcher, and their grammaticality and intelligibility were then checked by five native speakers of English. The analysis of the data showed that it is difficult to claim that an error stems from a single factor, since different features of an error are triggered by different factors. Our findings revealed two different types of errors: those which stem from the interference of L1 with L2 and those which are developmental. The former type contains more global errors, whereas the errors of the latter type are more intelligible.

Keywords: Contrastive analysis, Error analysis, Language acquisition, Language transfer, Turkish

Downloads 2055

13040 Automatic Real-Patient Medical Data De-Identification for Research Purposes

Authors: Petr Vcelak, Jana Kleckova

Abstract:

Our medicine-oriented research is based on a medical data set of real patients. Sharing patients' private data with people other than clinicians or hospital staff is a security problem, so we have to remove personal identification information from the medical data. After a de-identification process, the medical data without private data are available for any research purpose. In this paper, we introduce a universal, automatic, rule-based de-identification application that performs this process on heterogeneous medical data. A patient's private identification is replaced by a unique identification number, even in burned-in annotations in pixel data. The same identification number is used for all of a patient's medical data, so relationships within the data are preserved. The hospital can benefit from research feedback based on the results.
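
The consistent replacement described in the abstract above, where one patient always receives the same surrogate number across all records, can be sketched as follows. This is an illustration under assumptions, not the authors' application: the class name `Pseudonymizer` and the field names `patient_id`, `name`, and `birth_date` are hypothetical, and real DICOM or HL7 handling (including OCR of burned-in annotations) is out of scope.

```python
import itertools


class Pseudonymizer:
    """Maps each real patient identifier to a stable surrogate number."""

    def __init__(self, start=1):
        self._counter = itertools.count(start)
        self._mapping = {}

    def surrogate(self, patient_id):
        # Reuse the number already assigned so links between records survive.
        if patient_id not in self._mapping:
            self._mapping[patient_id] = next(self._counter)
        return self._mapping[patient_id]

    def deidentify(self, record, id_field="patient_id",
                   private_fields=("name", "birth_date")):
        """Return a copy of `record` without private fields and with the
        identifier replaced by its surrogate number.

        The field names are hypothetical placeholders for this sketch.
        """
        clean = {k: v for k, v in record.items() if k not in private_fields}
        clean[id_field] = self.surrogate(record[id_field])
        return clean
```

Keeping the mapping in one object is what preserves the relationships between a patient's records; in practice the mapping itself would need to be stored as securely as the original identifiers.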

Keywords: DASTA, De-identification, DICOM, Health Level Seven, Medical data, OCR, Personal data

Downloads 1592

13039 Blood Glucose Measurement and Analysis: Methodology

Authors: I. M. Abd Rahim, H. Abdul Rahim, R. Ghazali

Abstract:

Researchers have developed numerous non-invasive blood glucose measurement techniques, and near infrared (NIR) is currently a promising one. However, there is some disagreement on the optimal wavelength range to use as the reference for glucose in the blood. This paper focuses on the experimental data collection technique and on the method used to analyze the data obtained from the experiment. Selecting a suitable linear or non-linear model structure is essential in a prediction system, as the system developed needs to be reasonably accurate.

Keywords: Invasive, linear, near-infrared (NIR), non-invasive, non-linear, prediction system.

Downloads 809

13038 Numerical Analysis of the SIR-SI Differential Equations with Application to Dengue Disease Mapping in Kuala Lumpur, Malaysia

Authors: N. A. Samat, D. F. Percy

Abstract:

The main aim of this study is to describe and introduce a method of numerical analysis for obtaining approximate solutions of the SIR-SI differential equations (susceptible-infective-recovered for human populations; susceptible-infective for vector populations) that represent a model for dengue disease transmission. Firstly, we describe the ordinary differential equations for the SIR-SI disease transmission models. Then, we introduce the numerical analysis of solutions of this continuous time, discrete space SIR-SI model by simplifying the continuous time scale to a densely populated, discrete time scale. This is followed by the application of this numerical analysis to the estimation of relative risk using continuous time, discrete space dengue data for Kuala Lumpur, Malaysia. Finally, we present the results of the analysis, comparing and displaying them in graphs, tables and maps. The results of the numerical analysis that we implemented offer a useful and potentially superior model for estimating relative risks based on continuous time, discrete space data for vector-borne infectious diseases, specifically dengue disease.
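
Replacing the continuous time scale with a fine discrete one, as the abstract above describes, can be illustrated with a forward Euler scheme. The abstract does not give the equations, so the sketch below assumes a commonly used textbook form of the SIR-SI model with constant population sizes; the parameter names (`beta_h`, `beta_v`, `gamma`, `mu_h`, `mu_v`) and the function names are assumptions for illustration, not the authors' notation or method.

```python
def sir_si_step(state, params, dt):
    """Advance (S_h, I_h, R_h, S_v, I_v) by one forward Euler step of size dt.

    Assumed model: humans follow SIR, vectors follow SI, births balance
    deaths in both populations, and transmission is frequency dependent.
    """
    s_h, i_h, r_h, s_v, i_v = state
    n_h = s_h + i_h + r_h
    n_v = s_v + i_v

    human_infections = params["beta_h"] * s_h * i_v / n_h
    vector_infections = params["beta_v"] * s_v * i_h / n_h

    ds_h = params["mu_h"] * n_h - human_infections - params["mu_h"] * s_h
    di_h = human_infections - (params["gamma"] + params["mu_h"]) * i_h
    dr_h = params["gamma"] * i_h - params["mu_h"] * r_h
    ds_v = params["mu_v"] * n_v - vector_infections - params["mu_v"] * s_v
    di_v = vector_infections - params["mu_v"] * i_v

    return (
        s_h + dt * ds_h,
        i_h + dt * di_h,
        r_h + dt * dr_h,
        s_v + dt * ds_v,
        i_v + dt * di_v,
    )


def simulate(state, params, dt, steps):
    """Return the list of states visited, starting with the initial state."""
    trajectory = [state]
    for _ in range(steps):
        state = sir_si_step(state, params, dt)
        trajectory.append(state)
    return trajectory


# Hypothetical parameter values, chosen only to make the sketch runnable.
example_params = {
    "beta_h": 0.3, "beta_v": 0.3, "gamma": 0.1, "mu_h": 0.00004, "mu_v": 0.07,
}
```

A smaller `dt` corresponds to a more densely populated time grid; forward Euler is the simplest such scheme and is used here only because it makes the discretization explicit.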

Keywords: Dengue disease, disease mapping, numerical analysis, SIR-SI differential equations.

Downloads 2637

13037 Automatic Camera Calibration for Images of Soccer Match

Authors: Qihe Li, Yupin Luo

Abstract:

Camera calibration plays an important role in the analysis of sports video. In soccer video, the cross-points that can be used for calibration at the center of the soccer field are in most cases not sufficient, so this paper introduces a new automatic camera calibration algorithm that focuses on solving this problem by using the properties of images of the center circle, the halfway line and a touch line. After the theoretical analysis, a practicable automatic algorithm is proposed. Although very little information is used, results of experiments with both synthetic data and real data show that the algorithm is applicable.

Keywords: Absolute conic, camera calibration, circular points, line at infinity.

Downloads 2318

13036 Computer Software Applicable in Rehabilitation, Cardiology and Molecular Biology

Authors: P. Kowalska, P. Gabka, K. Kamieniarz, M. Kamieniarz, W. Stryla, P. Guzik, T. Krauze

Abstract:

We have developed a computer program consisting of 6 subtests that assess children's hand dexterity, applicable in rehabilitation medicine. We have carried out a normative study on a representative sample of 285 children aged 7 to 15 (mean age 11.3) and have proposed clinical standards for three age groups (7-9, 9-11, 12-15 years). We have shown the statistical significance of differences among the corresponding mean values of task completion time. We have also found a strong correlation between task completion time and the age of the subjects, and we have performed test-retest reliability checks in a sample of 84 children, which gave high values of the Pearson coefficients for the dominant and non-dominant hand, in the ranges 0.74-0.97 and 0.62-0.93, respectively. A new MATLAB-based programming tool for the analysis of cardiologic RR intervals and blood pressure descriptors has also been developed. For each data set, ten different parameters are extracted: 2 in the time domain, 4 in the frequency domain and 4 in Poincaré plot analysis. In addition, twelve different parameters of baroreflex sensitivity are calculated. All these data sets can be visualized in the time domain together with their power spectra and Poincaré plots. If available, the respiratory oscillation curves can also be plotted for comparison. Another application processes biological data obtained from BLAST analysis.
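
The test-retest reliability check mentioned in the abstract above rests on the Pearson correlation coefficient between two sessions of the same subtest. A minimal sketch of that calculation is shown below; the function name `pearson_r` is illustrative, and nothing here reproduces the authors' software or data.

```python
import math


def pearson_r(first, second):
    """Pearson correlation coefficient between two equally long sequences,
    for example task completion times from a test and a retest session."""
    n = len(first)
    if n != len(second) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")

    mean_a = sum(first) / n
    mean_b = sum(second) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)

    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_a * var_b)
```

Values close to 1 indicate that children who were fast in the first session were also fast in the second, which is the sense in which the reported coefficients indicate reliability.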

Keywords: Biomedical data base processing, Computer software, Hand dexterity, Heart rate and blood pressure variability.

Downloads 1426

13035 Analyzing Methods of the Relation between Concepts based on a Concept Hierarchy

Authors: Ke Lu, Tetsuya Furukawa

Abstract:

Data objects are usually organized hierarchically, and the relations between them are analyzed based on a corresponding concept hierarchy. The relation between data objects, for example how similar they are, is usually analyzed based on the conceptual distance in the hierarchy. If a node is an ancestor of another node, it is enough to analyze how close they are by calculating the distance vertically. However, if there is no such relation between two nodes, the vertical distance cannot express their relation explicitly. This paper tries to fill this gap by improving the analysis method for data objects based on a hierarchy. The contributions of this paper include: (1) proposing an improved method to evaluate the vertical distance between concepts; (2) defining the horizontal distance between concepts and a method to calculate it; and (3) discussing methods to confine a range by the horizontal distance and the vertical distance, and evaluating the relation between concepts.
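
The gap described in the abstract above can be made concrete with a small sketch of the baseline it starts from: a vertical distance that is only defined when one concept is an ancestor of the other. This is not the paper's improved method or its horizontal distance, whose definitions the abstract does not give. The hierarchy is assumed to be a tree stored as a child-to-parent dictionary, and all function names are hypothetical.

```python
def ancestors(parent, node):
    """Return the chain [node, parent, grandparent, ..., root]."""
    chain = [node]
    while node in parent:
        node = parent[node]
        chain.append(node)
    return chain


def vertical_distance(parent, a, b):
    """Number of edges between a and b if one is an ancestor of the other.

    Returns None when neither is an ancestor of the other, which is the
    case the plain vertical distance cannot express.
    """
    chain_a = ancestors(parent, a)
    chain_b = ancestors(parent, b)
    if b in chain_a:
        return chain_a.index(b)
    if a in chain_b:
        return chain_b.index(a)
    return None


def lowest_common_ancestor(parent, a, b):
    """Deepest concept that is an ancestor of (or equal to) both a and b."""
    seen = set(ancestors(parent, a))
    for node in ancestors(parent, b):
        if node in seen:
            return node
    return None
```

For sibling concepts, `vertical_distance` returns `None` while `lowest_common_ancestor` still finds where their branches meet; a horizontal measure would have to build on that kind of information.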

Keywords: Concept Hierarchy, Horizontal Distance, Relation Analysis, Vertical Distance

Downloads 1189

13034 Risk Factors’ Analysis on Shanghai Carbon Trading

Authors: Zhaojun Wang, Zongdi Sun, Zhiyuan Liu

Abstract:

First, the carbon trading price and trading volume in Shanghai are transformed by Fourier transform, and the frequency response diagram is obtained. Then, the frequency response diagram is analyzed and a Blackman filter is designed. The Blackman filter is applied to the data, and the filtered carbon trading time-domain and frequency response diagrams are obtained. After wavelet analysis, the carbon trading data are processed to obtain average values over 5, 10, 20, 30, and 60 days. Finally, these data are used as input to the Back Propagation Neural Network model for prediction.
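
Two of the preprocessing steps named in the abstract above can be sketched briefly: the Blackman window, using its standard coefficients, and the averages over several day counts. The abstract does not say how the authors build their filter from the window or whether the averages are moving or block averages, so the code assumes plain trailing moving averages. The function names are illustrative.

```python
import math


def blackman_window(length):
    """Standard Blackman window coefficients of the given length."""
    if length < 1:
        raise ValueError("length must be positive")
    if length == 1:
        return [1.0]
    return [
        0.42
        - 0.5 * math.cos(2 * math.pi * n / (length - 1))
        + 0.08 * math.cos(4 * math.pi * n / (length - 1))
        for n in range(length)
    ]


def moving_average(values, window):
    """Average of every run of `window` consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def multi_scale_averages(values, windows=(5, 10, 20, 30, 60)):
    """Moving averages for each window that fits the series (assumed form
    of the 5-, 10-, 20-, 30- and 60-day averages)."""
    return {w: moving_average(values, w) for w in windows if w <= len(values)}
```

Each averaged series could then serve as one input feature for a prediction model; how the features are combined is left open here, since the abstract does not specify it.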

Keywords: Shanghai carbon trading, carbon trading price, carbon trading volume, wavelet analysis, BP neural network model.

Downloads 920