Search results for: Statistical Data Analysis.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 13810

13360 Research on the Problems of Housing Prices in Qingdao from a Macro Perspective

Authors: Liu Zhiyuan, Sun Zongdi

Abstract:

Qingdao is a seaside city. Taking its characteristics into account, this article establishes a multiple linear regression model to analyze the impact of macroeconomic factors on housing prices. We used stepwise regression to fit the model and examined the F-test and t-test values at each step, optimizing the model iteratively according to the results. Finally, the article obtains the multiple linear regression equation and the influencing factors, and the reliability of the model is verified by the F test and the t test.
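
As a rough illustration of the kind of analysis this abstract describes, the sketch below fits a multiple linear regression by ordinary least squares and computes the overall F statistic and the per-coefficient t statistics. The data and the function names (`solve`, `ols`) are hypothetical and are not taken from the paper. A stepwise procedure would repeat such a fit while adding or dropping predictors and compare the statistics with tabulated critical values.

```python
# Illustrative sketch only: OLS with F and t statistics on synthetic data.

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(xs, y):
    """xs: rows of predictors (no intercept column); y: responses."""
    n, k = len(xs), len(xs[0])
    X = [[1.0] + list(row) for row in xs]          # add intercept column
    p = k + 1
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)]
           for a in range(p)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(p)]
    beta = solve(xtx, xty)                          # normal equations
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
    y_bar = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ssr = sum((fi - y_bar) ** 2 for fi in fitted)
    sigma2 = sse / (n - p)                          # residual variance
    # Diagonal of (X'X)^-1, obtained by solving against unit vectors.
    inv_diag = [solve(xtx, [1.0 if i == j else 0.0 for i in range(p)])[j]
                for j in range(p)]
    t_stats = [b / (sigma2 * d) ** 0.5 for b, d in zip(beta, inv_diag)]
    f_stat = (ssr / k) / sigma2
    return beta, t_stats, f_stat


# Synthetic example: y = 2 + 3*x1 - x2 plus a small deterministic disturbance.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [2, 1, 4, 3, 6, 5, 8, 7]
noise = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.05, -0.05]
xs = list(zip(x1, x2))
y = [2 + 3 * a - b + e for a, b, e in zip(x1, x2, noise)]
beta, t_stats, f_stat = ols(xs, y)
print(beta, t_stats, f_stat)
```

Large absolute t values suggest that a coefficient differs from zero, and a large F value suggests that the predictors jointly explain the response; the thresholds depend on the degrees of freedom.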

Keywords: Housing prices, multiple linear regression model, macroeconomic factors, Qingdao City.

Downloads: 1169
13359 Students' uses of Wiki in Teacher Education: A Statistical Analysis

Authors: Said Hadjerrouit

Abstract:

Wikis are considered to be part of Web 2.0 technologies that potentially support collaborative learning and writing. Wikis provide opportunities for multiple users to work on the same document simultaneously. Most wikis also have a page for written group discussion. Nevertheless, wikis may be used in different ways depending on the pedagogy being used and the constraints imposed by the course design. This work explores students' uses of wiki in teacher education. The analysis is based on a taxonomy for classifying students' activities and actions carried out on the wiki. The article also discusses the implications for using wikis as collaborative writing tools in teacher education.

Keywords: Behaviorism, collaborative writing, socioconstructivism, taxonomy, web 2.0 technology, wiki

Downloads: 1919
13358 Using Combination of Optimized Recurrent Neural Network with Design of Experiments and Regression for Control Chart Forecasting

Authors: R. Behmanesh, I. Rahimi

Abstract:

A recurrent neural network (RNN) is an efficient tool for modeling production control processes as well as services. In this paper, an RNN was combined with a regression model to check whether the data obtained from the combined model, compared with actual data, are valid for a variable process control chart. A maintenance process in a workshop of Esfahan Oil Refining Co. (EORC) was taken to illustrate the models. First, a regression model was built to predict the response time of the process from determined factors; then the error between actual and predicted response time was used as the output, and the same factors as the input, of the RNN. Finally, the data predicted by the combined model were examined against test values in statistical process control to determine whether the forecasting efficiency is acceptable. In addition, design of experiments was applied during RNN training to optimize the RNN.

Keywords: RNN, DOE, regression, control chart.

Downloads: 1654
13357 Data Collection in Hospital Emergencies: A Questionnaire Survey

Authors: Nouha Mhimdi, Wahiba Ben Abdessalem Karaa, Henda Ben Ghezala

Abstract:

Many methods are used to collect data, such as questionnaires, surveys, and focus group interviews. However, poor-quality data, resulting for example from poorly designed questionnaires, the absence of good translators or interpreters, or the incorrect recording of data, can lead to conclusions that are not supported by the data or that focus only on the average effect of the program or policy. There are several ways to avoid or minimize the most frequent errors, including obtaining expert advice on the design or adaptation of data collection instruments, or using technologies that allow better "anonymity" in the responses. In this context, and to overcome the aforementioned problems, we suggest in this paper an approach for collecting relevant data by carrying out a large-scale questionnaire-based survey. We have been able to collect good-quality, consistent and practical data on hospital emergencies to improve emergency services in hospitals, especially in the case of epidemics or pandemics.

Keywords: Data collection, survey, database, data analysis, hospital emergencies.

Downloads: 640
13356 Analysis of Palm Perspiration Effect with SVM for Diabetes in People

Authors: Hamdi Melih Saraoğlu, Muhlis Yıldırım, Abdurrahman Özbeyaz, Feyzullah Temurtas

Abstract:

In this research, an attempt was made to identify people's diabetes conditions (healthy, prediabetic and diabetic) from noninvasive palm perspiration measurements. Data clusters gathered from 200 subjects were used (1. Individual Attributes Cluster and 2. Palm Perspiration Attributes Cluster). To decrease the dimensions of these data clusters, the Principal Component Analysis method was used. The data clusters prepared in this way were classified with Support Vector Machines. The highest classification success rates were 82% for glucose parameters and 84% for HbA1c parameters.

Keywords: Palm perspiration, Diabetes, Support Vector Machine, Classification.

Downloads: 1934
13355 Infrastructure Change Monitoring Using Multitemporal Multispectral Satellite Images

Authors: U. Datta

Abstract:

The main objective of this study is to find a suitable approach to monitor land infrastructure growth over a period of time using multispectral satellite images. The bi-temporal change detection method is unable to indicate continuous change occurring over a long period of time. To achieve this objective, the approach used here estimates a statistical model from a series of multispectral image data over a long period of time, assuming there is no considerable change during that period, and then compares it with the multispectral image data obtained at a later time. The change is estimated pixel-wise. A statistical composite hypothesis technique is used for pixel-based change detection in a defined region. The generalized likelihood ratio test (GLRT) is used to detect a changed pixel from the probabilistic estimated model of the corresponding pixel. The changed pixel is detected assuming that the images have been co-registered prior to estimation. To minimize error due to co-registration, the 8-neighborhood pixels around the pixel under test are also considered. Multispectral images from Sentinel-2 and Landsat-8 from 2015 to 2018 are used for this purpose. There are several challenges in this method. The first and foremost challenge is to obtain a sufficiently large number of datasets for multivariate distribution modelling. A large number of images are always discarded due to cloud coverage. Imperfect modelling also leads to a high probability of false alarm. The overall conclusion that can be drawn from this work is that the probabilistic method described in this paper has given some promising results, which need to be pursued further.
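
The pixel-wise test described in this abstract can be sketched in a simplified form. The code below is an illustration under strong assumptions that are not stated in the abstract: a single spectral band, an independent Gaussian model per pixel, and a variance estimated from the image history and then treated as known. Under those assumptions the GLRT statistic for an unknown changed mean reduces to the squared standardized deviation. The neighbourhood rule (taking the smallest statistic over the models of the pixel and its 8 neighbours) is one possible reading of how misregistration could be tolerated, not necessarily the author's. All function names and the threshold are hypothetical.

```python
# Illustrative sketch only: per-pixel Gaussian model and a GLRT-style test.
from statistics import mean, pvariance


def fit_pixel_models(history):
    """history: list of 2-D grids (lists of rows) of one band over time."""
    rows, cols = len(history[0]), len(history[0][0])
    models = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            series = [img[r][c] for img in history]
            # Floor the variance so constant pixels do not divide by zero.
            models[r][c] = (mean(series), max(pvariance(series), 1e-6))
    return models


def glrt_stat(value, model):
    """2 * log likelihood ratio for a mean shift with known variance."""
    mu, var = model
    return (value - mu) ** 2 / var


def detect_changes(models, new_image, threshold=6.63):
    """Flag a pixel only if it is unlikely under every nearby pixel model.

    The default threshold is roughly the 99% quantile of a chi-square
    distribution with 1 degree of freedom (an assumed operating point).
    """
    rows, cols = len(new_image), len(new_image[0])
    changed = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            stats = []
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        stats.append(glrt_stat(new_image[r][c], models[rr][cc]))
            changed[r][c] = min(stats) > threshold
    return changed


# Tiny synthetic example: a stable 3x3 scene, then a change at the centre.
history = [[[v] * 3 for _ in range(3)] for v in (10.0, 10.5, 9.5, 10.0)]
new_image = [[10.0] * 3 for _ in range(3)]
new_image[1][1] = 50.0
changed = detect_changes(fit_pixel_models(history), new_image)
print(changed)
```

With several bands, the squared standardized deviation would be replaced by a Mahalanobis distance using a per-pixel covariance matrix, which is where the need for many cloud-free images arises.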

Keywords: Co-registration, GLRT, infrastructure growth, multispectral, multitemporal, pixel-based change detection.

Downloads: 710
13354 Dimension Reduction of Microarray Data Based on Local Principal Component

Authors: Ali Anaissi, Paul J. Kennedy, Madhu Goyal

Abstract:

Analysis and visualization of microarray data is very helpful for biologists and clinicians in the diagnosis and treatment of patients. It allows clinicians to better understand the structure of microarray data and facilitates understanding of gene expression in cells. However, a microarray dataset is complex, with thousands of features and a very small number of observations. Such very high-dimensional data often contains noise, non-useful information and only a small number of features relevant to a disease or genotype. This paper proposes a non-linear dimensionality reduction algorithm, Local Principal Component (LPC), which aims to map high-dimensional data to a lower-dimensional space. The reduced data represents the most important variables underlying the original data. Experimental results and comparisons are presented to show the quality of the proposed algorithm. Moreover, experiments also show how this algorithm reduces high-dimensional data whilst preserving the neighbourhoods of the points in the low-dimensional space as in the high-dimensional space.

Keywords: Linear Dimension Reduction, Non-Linear Dimension Reduction, Principal Component Analysis, Biologists.

Downloads: 1566
13353 Dimensional Modeling of HIV Data Using Open Source

Authors: Charles D. Otine, Samuel B. Kucel, Lena Trojer

Abstract:

Selecting the data modeling technique for an information system is determined by the objective of the resultant data model. Dimensional modeling is the preferred modeling technique for data destined for data warehouses and data mining, presenting data models that ease analysis and queries, in contrast with entity relationship modeling. The establishment of data warehouses as components of information system landscapes in many organizations has subsequently led to the development of dimensional modeling. This has been significantly more developed and reported for commercial database management systems than for open source ones, thereby making it less affordable for those in resource constrained settings. This paper presents dimensional modeling of HIV patient information using open source modeling tools. It aims to take advantage of the fact that the regions most affected by HIV are also heavily resource constrained (sub-Saharan Africa) while having large quantities of HIV data. Two HIV data source systems were studied to identify appropriate dimensions and facts; these were then modeled using two open source dimensional modeling tools. Use of open source would reduce the software costs for dimensional modeling and in turn make data warehousing and data mining more feasible even for those in resource constrained settings but with data available.

Keywords: About Database, Data Mining, Data warehouse, Dimensional Modeling, Open Source.

Downloads: 1949
13352 IMDC: An Image-Mapped Data Clustering Technique for Large Datasets

Authors: Faruq A. Al-Omari, Nabeel I. Al-Fayoumi

Abstract:

In this paper, we present a new algorithm for clustering data in large datasets using image processing approaches. First, the dataset is mapped into a binary image plane. The synthesized image is then processed using efficient image processing techniques to cluster the data in the dataset. Hence, the algorithm avoids an exhaustive search to identify clusters. The algorithm considers only a small set of the data that contains critical boundary information sufficient to identify the contained clusters. Compared to available data clustering techniques, the proposed algorithm produces results of similar quality and outperforms them in execution time and storage requirements.
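
The general idea of mapping data onto a binary image and clustering in image space can be sketched as follows. This is not the IMDC algorithm itself, whose boundary-based processing is not detailed in the abstract; it is a generic stand-in that rasterizes 2-D points onto a grid and labels 8-connected occupied cells with a breadth-first flood fill. The function name `image_map_cluster` and the `cell` parameter are hypothetical.

```python
# Illustrative sketch only: grid ("binary image") mapping plus
# connected-component labelling as a simple clustering stand-in.
from collections import deque


def image_map_cluster(points, cell=1.0):
    """Return one cluster label per 2-D point."""
    # Step 1: map each point to a grid cell; occupied cells are "on" pixels.
    cells = {}
    for i, (x, y) in enumerate(points):
        cells.setdefault((int(x // cell), int(y // cell)), []).append(i)

    # Step 2: label 8-connected groups of occupied cells.
    labels = [-1] * len(points)
    seen = set()
    cluster = 0
    for start in cells:
        if start in seen:
            continue
        queue = deque([start])
        seen.add(start)
        while queue:
            cx, cy = queue.popleft()
            for i in cells[(cx, cy)]:
                labels[i] = cluster
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    nb = (cx + dx, cy + dy)
                    if nb in cells and nb not in seen:
                        seen.add(nb)
                        queue.append(nb)
        cluster += 1
    return labels


points = [(0.2, 0.3), (1.1, 0.9), (0.5, 1.5), (10.0, 10.0), (10.8, 11.2)]
labels = image_map_cluster(points, cell=1.0)
print(labels)
```

The work is proportional to the number of occupied cells rather than to all pairs of points, which is the kind of saving the abstract attributes to avoiding an exhaustive search.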

Keywords: Data clustering, Data mining, Image-mapping, Pattern discovery, Predictive analysis.

Downloads: 1491
13351 Performance Analysis of Routing Protocol for WSN Using Data Centric Approach

Authors: A. H. Azni, Madihah Mohd Saudi, Azreen Azman, Ariff Syah Johari

Abstract:

Sensor networks are emerging as a new tool for important applications in diverse fields such as military surveillance, habitat monitoring, weather, home electrical appliances and others. Technically, sensor network nodes are limited with respect to energy supply, computational capacity and communication bandwidth. In order to prolong the lifetime of the sensor nodes, designing an efficient routing protocol is critical. In this paper, we illustrate the existing routing protocols for wireless sensor networks using the data centric approach and present a performance analysis of these protocols. The paper focuses on the performance analysis of two specific protocols, namely Directed Diffusion and SPIN. This analysis reveals that energy usage is an important feature which needs to be taken into consideration when designing routing protocols for wireless sensor networks.

Keywords: Data Centric Approach, Directed Diffusion, SPIN, WSN Routing Protocol.

Downloads: 2528
13350 The Effect of Entrepreneurship on Foreign Direct Investment

Authors: Wissam B. Fahed

Abstract:

Entrepreneurship has become an important and extensively researched concept in business studies. Research on foreign direct investment (FDI) has become widespread due to the growth of FDI and its importance in globalization. Most entrepreneurship studies examined the importance and influence of entrepreneurial orientation in a micro-level context. On the other hand, studies and research concerning FDI used statistical techniques to analyze the effect, determinants, and motives of FDI on a macroeconomic level, ignoring empirical studies on other noneconomic determinants. In order to bridge the gap between the theory and empirical evidence on FDI and the theory and research on entrepreneurship, this study examines the impact of entrepreneurship on inward foreign direct investment. The relationship between entrepreneurship and foreign direct investment is investigated through regression analysis of pooled time-series and cross-sectional data. The results suggest that entrepreneurship has a significant effect on FDI.

Keywords: Entrepreneurship, foreign direct investment, globalization, economic freedom.

Downloads: 3877
13349 Territorial Availability of Social and Economic Infrastructure in Kazakhstan: Comparative Analysis of Urban and Rural Households

Authors: Nazym Shedenova, Aigul Beimisheva

Abstract:

The market transformation in Kazakhstan during the last two decades has substantially widened the gap between the development of urban and rural areas. The implementation of market institutions, the transition from public financing to paid social services, and changes in the forms of financing of social and economic infrastructure have increased economic inequality between social groups, including growing stratification between the city and the village. A sociological survey of urban and rural households in Almaty city and villages of the Almaty region was carried out within the international research project "Livelihoods Strategies of Private Households in Central Asia: A Rural–Urban Comparison in Kazakhstan and Kyrgyzstan" (Germany, Kazakhstan, Kyrgyzstan). The analysis of statistical data and the results of sociological research on urban and rural households allow us to reveal issues of territorial development, to investigate the availability of medical, educational and other services in the city and the village, to reveal how urban and rural dwellers evaluate their living conditions, and to compare the economic strategies of households in the city and the village.

Keywords: Urban and rural households, social and economic infrastructure, territorial availability.

Downloads: 2159
13348 Urban Big Data: An Experimental Approach to Building-Value Estimation Using Web-Based Data

Authors: Sun-Young Jang, Sung-Ah Kim, Dongyoun Shin

Abstract:

Current real-estate value estimation, difficult for laymen, is usually performed by specialists. This paper presents an automated estimation process based on big data and machine-learning technology that calculates the influence of building conditions on real-estate price measurement. The present study analyzed actual building sales sample data for Nonhyeon-dong, Gangnam-gu, Seoul, Korea, measuring the major influencing factors among the various building conditions. Further to that analysis, a prediction model was established and applied using RapidMiner Studio, a graphical user interface (GUI)-based tool for derivation of machine-learning prototypes. The prediction model is formulated by reference to previous examples. When new examples are applied, it analyses and predicts accordingly. The analysis process discerns the crucial factors affecting price increases by calculation of weighted values. The model was verified, and its accuracy determined, by comparing its predicted values with actual price increases.

Keywords: Big data, building-value analysis, machine learning, price prediction.

Downloads: 1154
13347 A Development of the Multiple Intelligences Measurement of Elementary Students

Authors: Chaiwat Waree

Abstract:

This research aims at the development of the Multiple Intelligences Measurement of Elementary Students. The structural accuracy test and normality establishment are based on the Multiple Intelligences Theory of Gardner. This theory consists of eight aspects, namely linguistics, logic and mathematics, visual-spatial relations, body and movement, music, human relations, self-realization/self-understanding and nature. The sample used in this research consists of elementary school students (aged 5-11 years). The size of the sample group was determined by the Yamane Table. The group has 2,504 students. Multistage sampling was used. Basic statistical analysis and construct validity testing were done using confirmatory factor analysis. The research can be summarized as follows: 1). The Multiple Intelligences Measurement consisting of 120 items is content-accurate. Internal consistency reliability of the whole Multiple Intelligences Measurement according to the Kuder-Richardson method equals .91. The difficulty of the measurement test is between .39-.83. Discrimination is between .21-.85. 2). The Multiple Intelligences Measurement has construct validity in a good range, that is, 8 components and all 120 test items have statistical significance at the .01 level. Chi-square value equals 4357.7; p=.00 at the degree of freedom of 244 and Goodness of Fit Index equals 1.00. Adjusted Goodness of Fit Index equals .92. Comparative Fit Index (CFI) equals .68. Root Mean Squared Residual (RMR) equals 0.064 and Root Mean Square Error of Approximation equals 0.82. 3). The normality of the Multiple Intelligences Measurement is categorized into 3 levels. Those with high intelligence are those with percentiles of more than 78. Those with moderate/medium intelligence are those with percentiles between 24 and 77.9. Those with low intelligence are those with percentiles from 23.9 downwards.
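
For readers unfamiliar with the reliability and difficulty indices quoted in this abstract, the sketch below computes item difficulty (the proportion of correct answers per item) and the Kuder-Richardson formula 20 coefficient, KR-20 = k/(k-1) * (1 - sum(p*q) / var(total)), for dichotomously scored items. The abstract does not say which Kuder-Richardson formula was used, so KR-20 is an assumption, as are the toy response matrix and the function names.

```python
# Illustrative sketch only: item difficulty and KR-20 for 0/1 scored items.

def item_difficulty(responses):
    """Proportion of respondents answering each item correctly."""
    n = len(responses)
    k = len(responses[0])
    return [sum(row[j] for row in responses) / n for j in range(k)]


def kr20(responses):
    """Kuder-Richardson formula 20, using the population variance of totals."""
    n = len(responses)
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    pq_sum = sum(p * (1 - p) for p in item_difficulty(responses))
    return k / (k - 1) * (1 - pq_sum / var_total)


# Hypothetical data: 5 respondents (rows) by 4 items (columns).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
difficulty = item_difficulty(responses)
reliability = kr20(responses)
print(difficulty, reliability)
```

Values of the coefficient closer to 1 indicate that the items rank respondents more consistently.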

Keywords: Multiple Intelligences, Measurement, Elementary Students.

Downloads: 2951
13346 Model-Driven and Data-Driven Approaches for Crop Yield Prediction: Analysis and Comparison

Authors: Xiangtuo Chen, Paul-Henry Cournéde

Abstract:

Crop yield prediction is a paramount issue in agriculture. The main idea of this paper is to find an efficient way to predict the yield of corn based on meteorological records. The prediction models used in this paper can be classified into model-driven approaches and data-driven approaches, according to the different modeling methodologies. The model-driven approaches are based on crop mechanistic modeling. They describe crop growth in interaction with the environment as dynamical systems. But the calibration of the dynamic system is difficult, because it turns out to be a multidimensional non-convex optimization problem. An original contribution of this paper is to propose a statistical methodology, Multi-Scenarios Parameters Estimation (MSPE), for the parametrization of potentially complex mechanistic models from a new type of dataset (climatic data, final yield in many situations). It is tested with CORNFLO, a crop model for maize growth. On the other hand, the data-driven approach for yield prediction is free of the complex biophysical process. But it has some strict requirements about the dataset. A second contribution of the paper is the comparison of these model-driven methods with classical data-driven methods. For this purpose, we consider two classes of regression methods: methods derived from linear regression (Ridge and Lasso Regression, Principal Components Regression or Partial Least Squares Regression) and machine learning methods (Random Forest, k-Nearest Neighbor, Artificial Neural Network and SVM regression). The dataset consists of 720 records of corn yield at county scale provided by the United States Department of Agriculture (USDA) and the associated climatic data. A 5-fold cross-validation process and two accuracy metrics, root mean square error of prediction (RMSEP) and mean absolute error of prediction (MAEP), were used to evaluate the crop prediction capacity.
The results show that among the data-driven approaches, Random Forest is the most robust and generally achieves the best prediction error (MAEP 4.27%). It also outperforms our model-driven approach (MAEP 6.11%). However, the method to calibrate the mechanistic model from easily accessible datasets offers several side-perspectives. The mechanistic model can potentially help to underline the stresses suffered by the crop or to identify the biological parameters of interest for breeding purposes. For this reason, an interesting perspective is to combine these two types of approaches.
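
The evaluation protocol mentioned in this abstract (k-fold cross-validation scored with RMSEP and MAEP) can be sketched as below. The one-variable linear model is only a placeholder for the methods compared in the paper, and the data are synthetic. The abstract reports MAEP as a percentage, so the paper presumably normalizes the error; that normalization is not specified, and this sketch keeps both metrics in the units of the response. All function names are hypothetical.

```python
# Illustrative sketch only: k-fold cross-validation with RMSEP and MAEP.
from math import sqrt


def k_fold_indices(n, k=5):
    """Split indices 0..n-1 into k interleaved folds."""
    return [list(range(i, n, k)) for i in range(k)]


def rmsep(actual, predicted):
    """Root mean square error of prediction."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def maep(actual, predicted):
    """Mean absolute error of prediction (not normalized to a percentage)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def fit_line(xs, ys):
    """Placeholder model: simple least-squares line, returned as a predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def cross_validate(xs, ys, fit, k=5):
    """Predict every record from a model trained on the other folds."""
    n = len(xs)
    predictions = [None] * n
    for fold in k_fold_indices(n, k):
        held_out = set(fold)
        train = [i for i in range(n) if i not in held_out]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            predictions[i] = model(xs[i])
    return rmsep(ys, predictions), maep(ys, predictions)


# Synthetic, noise-free example, so both errors should be close to zero.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
rmse_cv, mae_cv = cross_validate(xs, ys, fit_line, k=5)
print(rmse_cv, mae_cv)
```

Because every record is predicted by a model that never saw it, the two scores estimate out-of-sample error rather than goodness of fit.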

Keywords: Crop yield prediction, crop model, sensitivity analysis, parameter estimation, particle swarm optimization, random forest.

Downloads: 1169
13345 Director Compensation, CEO Duality, State Ownership, and Firm Performance in China: Proof from Panel Data of Publicly Listed Enterprises from 1999 to 2020

Authors: Wanda Luen-Wun Siu, Xiaowen Zhang

Abstract:

This paper offered the first systematic evidence on how director remuneration related to enterprise earnings in listed firms in China, given that most existing evidence focuses on cross-sectional data or data over a short span of time. Using full economic and business panel data on China’s publicly listed enterprises from 1999 to 2020 in the China Stock Market & Accounting Research database, we found statistically significant positive associations between director pay and firm performance in privately owned firms over this period, supporting the agency theory. In contrast, among the state-owned enterprises, there was a reverse relation between director compensation and firm financial performance, contributing to the existing literature. But the results also revealed that state-owned enterprises performed financially as well as private enterprises. Such findings suggested that state ownership might align officials’ career incentives with the party’s primary concerns rather than with pecuniary incentives. Also, CEO duality enhanced firm performance. As such, allegiance to the party and possible advancement to an upper-level political position would motivate company directors in state-owned enterprises. On the other hand, directors in privately owned enterprises might be motivated by monetary incentives. In addition, a statistical regression model was proposed and tested to obtain the results on the performance of state-owned enterprises. Finally, some suggestions were made about how to improve the institutional management of government-owned corporations in China.

Keywords: China’s listed firms, director compensation, CEO duality, firm performance, panel analysis.

Downloads: 473
13344 Welding Process Selection for Storage Tank by Integrated Data Envelopment Analysis and Fuzzy Credibility Constrained Programming Approach

Authors: Rahmad Wisnu Wardana, Eakachai Warinsiriruk, Sutep Joy-A-Ka

Abstract:

Selecting the most suitable welding process usually depends on experience or common practice in similar companies. However, this approach generally ignores many criteria that can affect the selection of a suitable welding process. Therefore, knowledge automation through knowledge-based systems can significantly improve the decision-making process. This research proposes an integrated data envelopment analysis (DEA) and fuzzy credibility constrained programming approach for identifying the best welding process for stainless steel storage tanks in the food and beverage industry. The proposed approach uses fuzzy concepts and a credibility measure to deal with uncertain data from experts' judgment. Furthermore, 12 parameters are used to determine the most appropriate welding process among six competing welding processes.

Keywords: Welding process selection, data envelopment analysis, fuzzy credibility constrained programming, storage tank.

Downloads: 794
13343 Modern Trends in Foreign Direct Investments in Georgia

Authors: Rusudan Kinkladze, Guguli Kurashvili, Ketevan Chitaladze

Abstract:

Foreign direct investment is a driving force in the development of interdependent national economies, and the study and analysis of investments is an urgent problem. It is particularly important for transitional economies such as Georgia. Consequently, the goal of the research is the study and analysis of foreign direct investments in Georgia and the identification and forecasting of modern trends; it covers the period 2006-2015. The study uses the methods of statistical observation, grouping and analysis, and the methods of analytical indicators of time series; trends are identified and predicted values are calculated. Various literary and Internet sources relevant to the research are also used. The findings showed that modern investment policy in Georgia is favorable for domestic as well as foreign investors. Georgia is still a net importer of investments. In 2015, the top 10 investing countries were led by Azerbaijan, the United Kingdom and the Netherlands, and the largest share of FDI was allocated to the transport and communication sector; the financial sector was second, followed by the health and social work sector, and the same trend will continue in the future.

Keywords: Foreign Direct Investments, methods, statistics, analysis.

Downloads: 907
13342 Stakeholder Analysis of Agricultural Drone Policy: A Case Study of the Agricultural Drone Ecosystem of Thailand

Authors: Thanomsin Chakreeves, Atichat Preittigun, Ajchara Phu-ang

Abstract:

This paper presents a stakeholder analysis of agricultural drone policies that meet the government's goal of building an agricultural drone ecosystem in Thailand. Firstly, case studies from other countries are reviewed. The stakeholder analysis method and qualitative data from the interviews are then presented, including data from the Institute of Innovation and Management, the Office of National Higher Education Science Research and Innovation Policy Council, agricultural entrepreneurs and farmers. Study and interview data are then employed to describe the current ecosystem and to guide the implementation of agricultural drone policies that are suitable for the ecosystem of Thailand. Finally, policy recommendations are made for the Thai government to adopt in the future.

Keywords: Drone public policy, drone ecosystem, policy development, agricultural drone.

Downloads: 788
13341 Seismic Vulnerability Assessment of Buildings in Algiers Area

Authors: F. Lazzali, M. Farsi

Abstract:

Several models of vulnerability assessment have been proposed. The selection of one of these models depends on the objectives of the study. The classical methodologies for seismic vulnerability analysis, as a part of seismic risk analysis, have been formulated with statistical criteria based on rapid observation. The information relating to the buildings' performance is statistically elaborated. In this paper, we use the European Macroseismic Scale EMS-98 to define the relationship between damage and macroseismic intensity in order to assess seismic vulnerability. For the Algiers area, the first step is to identify building typologies and to assign vulnerability classes. In the second step, damage is investigated according to EMS-98.

Keywords: Damage, EMS-98, inventory building, vulnerability classes

Downloads: 1806
13340 Q-Map: Clinical Concept Mining from Clinical Documents

Authors: Sheikh Shams Azam, Manoj Raju, Venkatesh Pagidimarri, Vamsi Kasivajjala

Abstract:

Over the past decade, there has been a steep rise in data-driven analysis in major areas of medicine, such as clinical decision support systems, survival analysis, patient similarity analysis and image analytics. Most of the data in the field are well-structured and available in numerical or categorical formats which can be used for experiments directly. But on the opposite end of the spectrum, there exists a wide expanse of data that is intractable for direct analysis owing to its unstructured nature. Such data can be found in the form of discharge summaries, clinical notes and procedural notes, which are in human-written narrative format and have neither a relational model nor a standard grammatical structure. An important step in the utilization of these texts for such studies is to transform and process the data to retrieve structured information from the haystack of irrelevant data using information retrieval and data mining techniques. To address this problem, the authors present Q-Map in this paper, a simple yet robust system that can sift through massive datasets with unregulated formats to retrieve structured information aggressively and efficiently. It is backed by an effective mining technique based on a string matching algorithm that is indexed on curated knowledge sources and is both fast and configurable. The authors also briefly examine its performance in comparison with MetaMap, one of the most reputed tools for medical concept retrieval, and present the advantages the former displays over the latter.

Keywords: Information retrieval (IR), unified medical language system (UMLS), Syntax Based Analysis, natural language processing (NLP), medical informatics.

Downloads: 769
13339 Parallelization of Ensemble Kalman Filter (EnKF) for Oil Reservoirs with Time-lapse Seismic Data

Authors: Md Khairullah, Hai-Xiang Lin, Remus G. Hanea, Arnold W. Heemink

Abstract:

In this paper we describe the design and implementation of a parallel algorithm for data assimilation with the ensemble Kalman filter (EnKF) for the oil reservoir history matching problem. The use of a large number of observations from time-lapse seismic data leads to a large turnaround time for the analysis step, in addition to the time-consuming simulations of the realizations. For efficient parallelization, it is important to consider parallel computation at the analysis step. Our experiments show that parallelization of the analysis step in addition to the forecast step has good scalability, exploiting the same set of resources with some additional effort.

Keywords: EnKF, Data assimilation, Parallel computing, Parallel efficiency.

Downloads 2272
13338 Analysis of Users’ Behavior on Book Loan Log Based On Association Rule Mining

Authors: Kanyarat Bussaban, Kunyanuth Kularbphettong

Abstract:

This research aims to create a model for analyzing students' use of library resources, based on data mining techniques, with Suan Sunandha Rajabhat University as a case study. The model was built with association rules using the Apriori algorithm. The analysis yielded 14 rules, which were tested on a test data set; the ability to classify the data was 79.24 percent and the MSE was 22.91. The results show that a user behavior model built with the association rule technique can be used to manage library resources.
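The two quantities behind association rules, support and confidence, can be sketched as follows. This is a simplified illustration restricted to rules between pairs of items, not the full level-wise Apriori search and not the authors' model; the `loans` data, the thresholds, and the function names are hypothetical.

```python
from itertools import combinations

# Hypothetical loan log: each set holds the subject categories one user borrowed.
loans = [
    {"math", "physics"},
    {"math", "physics", "computing"},
    {"math", "computing"},
    {"physics", "computing"},
    {"math", "physics"},
]


def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in loans) / len(loans)


def pair_rules(min_support=0.4, min_confidence=0.7):
    """Return (lhs, rhs, support, confidence) for single-item rules lhs -> rhs."""
    items = sorted(set().union(*loans))
    out = []
    for a, b in combinations(items, 2):
        s = support({a, b})
        if s < min_support:
            continue  # infrequent pair: no rule can be built from it
        for lhs, rhs in ((a, b), (b, a)):
            # confidence(lhs -> rhs) = support(lhs and rhs) / support(lhs)
            conf = s / support({lhs})
            if conf >= min_confidence:
                out.append((lhs, rhs, s, conf))
    return out


for lhs, rhs, s, conf in pair_rules():
    print(f"{lhs} -> {rhs}: support={s:.2f}, confidence={conf:.2f}")
```

With this toy data only the math/physics pair clears both thresholds, in both directions. Apriori proper extends the same pruning idea to larger itemsets by building candidates only from itemsets already found to be frequent.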

Keywords: Behavior, data mining technique, Apriori algorithm.

Downloads 2298
13337 Predicting DHF Incidence in Northern Thailand using Time Series Analysis Technique

Authors: S. Wongkoon, M. Pollar, M. Jaroensutasinee, K. Jaroensutasinee

Abstract:

This study aimed to develop a forecasting model for Dengue Haemorrhagic Fever (DHF) incidence in Northern Thailand using time series analysis. We developed Seasonal Autoregressive Integrated Moving Average (SARIMA) models on data collected between 2003 and 2006 and then validated the models using data collected between January and September 2007. The results showed that the regressive forecast curves were consistent with the pattern of actual values. The most suitable model was the SARIMA(2,0,1)(0,2,0)12 model, with an Akaike Information Criterion (AIC) of 12.2931 and a Mean Absolute Percent Error (MAPE) of 8.91713. The fit of the SARIMA(2,0,1)(0,2,0)12 model was adequate for the data, with the Portmanteau statistic Q20 = 8.98644 (χ²_0.95 = 27.5871, P > 0.05). This indicated that there was no significant autocorrelation between residuals at different lag times in the SARIMA(2,0,1)(0,2,0)12 model.
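The two diagnostics quoted above can be computed as in the following sketch. It assumes the Portmanteau statistic is the Ljung-Box form, Q = n(n+2) Σ r_k²/(n−k) summed over lags k = 1..m; the abstract does not say which variant was used, and the function names are chosen only for this example.

```python
def mape(actual, forecast):
    """Mean absolute percent error, in percent; actual values must be nonzero."""
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag (assumed Portmanteau variant)."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    denom = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        # Sample autocorrelation of the residuals at lag k.
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        total += r_k * r_k / (n - k)
    return n * (n + 2) * total


print(mape([100, 200], [110, 180]))
print(ljung_box_q([1, -1, 1, -1], max_lag=1))
```

A Q value below the chi-squared critical value, as reported in the abstract, means the hypothesis of uncorrelated residuals is not rejected. The degrees of freedom are usually taken as the number of lags minus the number of fitted ARMA parameters.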

Keywords: Dengue, SARIMA, Time Series Analysis, Northern Thailand.

Downloads 1984
13336 Geospatial Network Analysis Using Particle Swarm Optimization

Authors: Varun Singh, Mainak Bandyopadhyay, Maharana Pratap Singh

Abstract:

The shortest path (SP) problem concerns finding the shortest path from a specified origin to a specified destination in a given network while minimizing the total cost associated with the path. This problem has widespread applications. Important applications of the SP problem include vehicle routing in transportation systems, particularly in-vehicle Route Guidance Systems (RGS), and the traffic assignment problem in transportation planning. Evolutionary methods such as Genetic Algorithms (GA), Ant Colony Optimization, and Particle Swarm Optimization (PSO) have been applied to complex optimization problems to overcome the shortcomings of existing shortest path analysis methods. Various researchers have reported that PSO performs better than other evolutionary optimization algorithms in terms of success rate and solution quality. Further, Geographic Information Systems (GIS) have emerged as key information systems for geospatial data analysis and visualization. This research paper focuses on the application of PSO to solving the shortest path problem between multiple points of interest (POI), based on spatial data of Allahabad City and traffic speed data collected using GPS. Geovisualization of the analysis results is carried out in GIS.

Keywords: GIS, Outliers, PSO, Traffic Data.

Downloads 2886
13335 Numerical Optimization Design of PEM Fuel Cell Performance Applying the Taguchi Method

Authors: Shan-Jen Cheng, Jr-Ming Miao, Sheng-Ju Wu

Abstract:

The purpose of this paper is to apply the Taguchi method to the optimization of PEMFC performance, with a representative Computational Fluid Dynamics (CFD) model used for the statistical analysis. The factors studied in this paper are the pressure of the fuel cell, the operating temperature, the relative humidity of the anode and cathode, the porosity of the gas diffusion electrode (GDE), and the conductivity of the GDE. The optimal combination for maximum power density is obtained by using a three-level statistical method. The results confirmed the robust optimum design parameters influencing fuel cell performance: pressure of the fuel cell, 3 atm; operating temperature, 353 K; relative humidity of the anode, 50%; and conductivity of the GDE, 1000 S/m. The relative humidity of the cathode and the porosity of the GDE are pooled as error because of their small sums of squares. The present simulation results give designers guidance and ratify the effectiveness of the proposed robust design methodology for fuel cell performance.

Keywords: PEMFC, numerical simulation, optimization, Taguchi method.

Downloads 2540
13334 Establishing a Probabilistic Model of Extrapolated Wind Speed Data for Wind Energy Prediction

Authors: Mussa I. Mgwatu, Reuben R. M. Kainkwa

Abstract:

Wind is among the potential energy resources that can be harnessed to generate wind energy for conversion into electrical power. Because wind speed varies with time and height, it is difficult to predict the generated wind energy accurately. In this paper, an attempt is made to establish a probabilistic model fitting the wind speed data recorded at the Makambako site in Tanzania. Wind speed and direction were measured using an anemometer (type AN1) and a wind vane (type WD1), respectively, both supplied by Delta-T Devices, at a measurement height of 2 m. Wind speeds were then extrapolated to a height of 10 m using the power law equation with an exponent of 0.47. Data were analysed using MINITAB statistical software to show the variability of wind speeds with time and height, and to determine the underlying probability model of the extrapolated wind speed data. The results show that wind speeds at the Makambako site vary cyclically over time and conform to the Weibull probability distribution. From these results, the Weibull probability density function can be used to predict the wind energy.
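The power law extrapolation and the Weibull density mentioned above can be written out as a short sketch. The heights (2 m and 10 m) and the exponent 0.47 come from the abstract; the Weibull `shape` and `scale` values in the usage lines are hypothetical placeholders, since the abstract does not report the fitted parameters.

```python
import math


def extrapolate_speed(v_ref, h_ref=2.0, h_target=10.0, alpha=0.47):
    """Power law: v_target = v_ref * (h_target / h_ref) ** alpha."""
    return v_ref * (h_target / h_ref) ** alpha


def weibull_pdf(v, shape, scale):
    """Two-parameter Weibull probability density at wind speed v."""
    if v <= 0:
        # Speeds at or below zero are given density 0 here for simplicity.
        return 0.0
    z = v / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-(z ** shape))


# A 3.0 m/s reading at 2 m, extrapolated to 10 m.
v10 = extrapolate_speed(3.0)
print(round(v10, 2))

# Density at that speed under hypothetical Weibull parameters.
print(weibull_pdf(v10, shape=2.0, scale=7.0))
```

With these heights and this exponent, every 2 m reading is scaled by the same factor, 5 ** 0.47, which is roughly 2.13. The extrapolated series therefore keeps the shape of the measured distribution and only changes its scale.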

Keywords: Probabilistic models, wind speed, wind energy

Downloads 2340
13333 The Effect of Smartphones on Human Health Relative to User’s Addiction: A Study on a Wide Range of Audiences in Jordan

Authors: T. Qasim, M. Obeidat, S. Al-Sharairi

Abstract:

The objective of this study is to investigate the effect of the excessive use of smartphones. Smartphones have enormous effects on the human body, in that some musculoskeletal disorders (MSDs) and health problems might develop. Smartphones are now widely used among all age groups, so their effects on human behavior and health, especially among young and elderly people, have become a crucial issue. This study was conducted in Jordan on smartphone users of different genders and ages, using a survey to collect data on the symptoms and MSDs that result from excessive smartphone use. A total of 357 responses were used in the analysis. The main related symptoms were numbness, finger pain, and arm pain, all linked to age and gender for comparison. A statistical analysis was performed to find the effects on the human body of using a smartphone extensively for long periods of time. Results show that the significant variables were vision problems and the time spent using the smartphone that causes vision problems. Other variables, including the age of the user and ear problems due to the use of headsets, were found to be borderline significant.

Keywords: Smartphone, age group, musculoskeletal disorders (MSDs), health problems.

Downloads 2038
13332 Automatic Adjustment of Thresholds via Closed-Loop Feedback Mechanism for Solder Paste Inspection

Authors: Chia-Chen Wei, Pack Hsieh, Jeffrey Chen

Abstract:

Surface Mount Technology (SMT) is widely used in electronic assembly, in which electronic components are mounted on the surface of the printed circuit board (PCB). Most defects in the SMT process are related to the quality of solder paste printing. These defects lead to considerable manufacturing costs in the electronics assembly industry. Therefore, the solder paste inspection (SPI) machine, which controls and monitors the amount of solder paste printed, has become an important part of the production process. So far, SPI threshold settings have been determined from statistical analysis and experts' experience. Because the production data are not normally distributed and the production processes are subject to various sources of variation, defects related to solder paste printing still occur. To solve this problem, this paper proposes an online machine learning algorithm, called the automatic threshold adjustment (ATA) algorithm, and a closed-loop architecture in the SMT process to determine the best threshold settings. Simulation experiments show that the proposed threshold settings improve the accuracy from 99.85% to 100%.

Keywords: Big data analytics, Industry 4.0, SPI threshold setting, surface mount technology.

Downloads 805
13331 Exploring the Activity Fabric of an Intelligent Environment with Hierarchical Hidden Markov Theory

Authors: Chiung-Hui Chen

Abstract:

The Internet of Things (IoT) was designed for widespread convenience. With smart tags and the sensing network, a large quantity of dynamic information is immediately presented in the IoT. Through internal communication and interaction, meaningful objects provide real-time services for users. Therefore, service with appropriate decision-making has become an essential issue. Based on the science of human behavior, this study employed the environment model to record the time sequences and locations of different behaviors and adopted the probability module of the hierarchical Hidden Markov Model for inference. Statistical analysis was conducted to achieve the following objectives: First, define user behaviors and predict user behavior routes with the environment model in order to analyze user purposes. Second, construct the hierarchical Hidden Markov Model according to the logic framework, and establish the sequential intensity among behaviors in order to understand the use and activity fabric of the intelligent environment. Third, establish the intensity of the relation between the objects and the probability of their being used. This indicator can describe the possible limitations of the mechanism. As the process is recorded in the information of the system created in this study, these data can be reused to adjust the procedure of intelligent design services.

Keywords: Behavior, big data, hierarchical Hidden Markov Model, intelligent object.

Downloads 754