Search results for: Statistical Data Analysis.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 13713

13293 Advances in Artificial Intelligence Using Speech Recognition

Authors: Khaled M. Alhawiti

Abstract:

This research study presents a retrospective review of speech recognition systems and artificial intelligence. Speech recognition has become one of the most widely used technologies, as it offers a great opportunity to interact and communicate with automated machines. It helps users perform their daily routine tasks in a more convenient and effective manner. This research illustrates recent technological advancements associated with artificial intelligence. Recent research has revealed that speech recognition itself is the foremost issue affecting the decoding of speech. To overcome these issues, researchers have developed different statistical models. Some of the most prominent statistical models include the acoustic model (AM), language model (LM), lexicon model, and hidden Markov models (HMM). The research will help in understanding all of these statistical models of speech recognition. Researchers have also formulated different decoding methods, which are being utilized for realistic decoding tasks and constrained artificial languages. These decoding methods include pattern recognition, acoustic phonetic, and artificial intelligence approaches. Artificial intelligence has been recognized as the most efficient and reliable of the methods used in speech recognition.

Keywords: Speech recognition, acoustic phonetic, artificial intelligence, Hidden Markov Models (HMM), statistical models of speech recognition, human machine performance.

Downloads: 7936
13292 Using Artificial Neural Network to Predict Collisions on Horizontal Tangents of 3D Two-Lane Highways

Authors: Omer F. Cansiz, Said M. Easa

Abstract:

The main purpose of this study is to predict collision frequency on horizontal tangents combined with vertical curves using artificial neural network (ANN) methods. The proposed ANN models are compared with existing regression models. First, the variables that affect collision frequency were investigated. It was found that only the annual average daily traffic, section length, access density, the rate of vertical curvature, and the smaller curve radius before and after the tangent were statistically significant according to related combinations. Second, three statistical models (negative binomial, zero inflated Poisson and zero inflated negative binomial) were developed using the significant variables for three alignment combinations. Third, ANN models were developed by applying the same variables for each combination. The results clearly show that the ANN models have lower mean square error values than the statistical models. Similarly, the AIC values of the ANN models are smaller than those of the regression models for all the combinations. Consequently, the ANN models perform better statistically than the statistical models for estimating collision frequency. The ANN models presented in this paper are recommended for evaluating the safety impacts of 3D alignment elements on horizontal tangents.

Keywords: Collision frequency, horizontal tangent, 3D two-lane highway, negative binomial, zero inflated Poisson, artificial neural network.

Downloads: 1612
13291 A Study of Panel Logit Model and Adaptive Neuro-Fuzzy Inference System in the Prediction of Financial Distress Periods

Authors: Ε. Giovanis

Abstract:

The purpose of this paper is to present two different approaches to financial distress pre-warning models appropriate for risk supervisors, investors and policy makers. We examine a sample of the financial institutions and electronic companies of the Taiwan Security Exchange (TSE) market from 2002 through 2008. We present a binary logistic regression with panel data analysis. With the pooled binary logistic regression we build a model including more variables in the regression than with random effects, while the in-sample and out-of-sample forecasting performance is higher in random effects estimation than in pooled regression. On the other hand, we estimate an Adaptive Neuro-Fuzzy Inference System (ANFIS) with Gaussian and Generalized Bell (Gbell) functions and we find that ANFIS significantly outperforms the Logit regressions in both in-sample and out-of-sample periods, indicating that ANFIS is a more appropriate tool for financial risk managers and for the economic policy makers in central banks and national statistical services.

Keywords: ANFIS, Binary logistic regression, Financial distress, Panel data

Downloads: 2318
13290 Statistical Characteristics of Distribution of Radiation-Induced Defects under Random Generation

Authors: Pavlo Selyshchev

Abstract:

We consider fluctuations of defect density, taking into account the interaction between defects. A stochastic field of displacement generation rate gives a random defect distribution. We determine statistical characteristics (mean and dispersion) of the random field of point defect distribution as a function of defect generation parameters, temperature and properties of the irradiated crystal.

Keywords: Irradiation, Primary Defects, Interaction, Fluctuations.

Downloads: 1806
13289 Model of Optimal Centroids Approach for Multivariate Data Classification

Authors: Pham Van Nha, Le Cam Binh

Abstract:

Particle swarm optimization (PSO) is a population-based stochastic optimization algorithm. PSO was inspired by the natural behavior of birds and fish in migration and foraging for food. PSO is considered a multidisciplinary optimization model that can be applied to various optimization problems. PSO's ideas are simple and easy to understand, but PSO has only been applied to simple model problems. We think that in order to expand the applicability of PSO to complex problems, PSO should be described more explicitly in the form of a mathematical model. In this paper, we represent PSO as a mathematical model and apply it to multivariate data classification. First, PSO's general mathematical model (MPSO) is analyzed as a universal optimization model. Then, the Model of Optimal Centroids (MOC) is proposed for multivariate data classification. Experiments were conducted on some benchmark data sets to prove the effectiveness of MOC compared with several proposed schemes.
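
The population-based update that the abstract refers to can be illustrated with a minimal sketch of textbook PSO. This is not the authors' MPSO or MOC model: the objective (`sphere`), the function name `pso_minimize`, and the parameter values (inertia 0.7, acceleration coefficients 1.5) are assumptions chosen only for illustration.

```python
import random


def sphere(position):
    """Toy objective (an assumption): sum of squares, minimum at the origin."""
    return sum(x * x for x in position)


def pso_minimize(objective, dim, bounds, n_particles=20, n_iters=200,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Textbook global-best PSO; all parameter values are illustrative."""
    rng = random.Random(seed)
    low, high = bounds
    positions = [[rng.uniform(low, high) for _ in range(dim)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    best_positions = [p[:] for p in positions]
    best_values = [objective(p) for p in positions]
    g = min(range(n_particles), key=lambda i: best_values[i])
    global_best, global_value = best_positions[g][:], best_values[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum, pull toward the particle's own
                # best position, and pull toward the swarm's best position.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c_personal * r1 * (best_positions[i][d] - positions[i][d])
                    + c_global * r2 * (global_best[d] - positions[i][d])
                )
                # Move the particle and clamp it to the search bounds.
                positions[i][d] = min(high, max(low, positions[i][d] + velocities[i][d]))
            value = objective(positions[i])
            if value < best_values[i]:
                best_values[i] = value
                best_positions[i] = positions[i][:]
                if value < global_value:
                    global_value = value
                    global_best = positions[i][:]
    return global_best, global_value


best_position, best_value = pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
```

For a classification use such as the one the abstract describes, the objective would presumably score candidate centroids against labelled data; that step is not specified in the abstract and is left out here.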

Keywords: Analysis of optimization, artificial intelligence-based optimization, optimization for learning and data analysis, global optimization.

Downloads: 857
13288 Global Kinetics of Direct Dimethyl Ether Synthesis Process from Syngas in Slurry Reactor over a Novel Cu-Zn-Al-Zr Slurry Catalyst

Authors: Zhen Chen, Haitao Zhang, Weiyong Ying, Dingye Fang

Abstract:

The direct synthesis process of dimethyl ether (DME) from syngas in slurry reactors is considered to be promising because of its advantages in heat transfer. In this paper, the influences of operating conditions (temperature, pressure and weight hourly space velocity) on the conversion of CO and the selectivity of DME and methanol were studied in a stirred autoclave over a Cu-Zn-Al-Zr slurry catalyst, which is far more suitable for the liquid phase dimethyl ether synthesis process than commercial bifunctional catalysts. A Langmuir-Hinshelwood mechanism type global kinetics model for liquid phase DME direct synthesis, based on methanol synthesis models and a methanol dehydration model, has been investigated by fitting our experimental data. The model parameters were estimated with a MATLAB program based on general Genetic Algorithms and the Levenberg-Marquardt method. The model fits the experimental data well, and its reliability was verified by statistical tests and residual error analysis.

Keywords: alcohol/ether fuel, Cu-Zn-Al-Zr slurry catalyst, global kinetics, slurry reactor

Downloads: 5475
13287 Investigation of Anti-diabetic and Hypocholesterolemic Potential of Psyllium Husk Fiber (Plantago psyllium) in Diabetic and Hypercholesterolemic Albino Rats

Authors: Ishtiaq Ahmed, Muhammad Naeem, Abdul Shakoor, Zaheer Ahmed, Hafiz Muhammad Nasir Iqbal

Abstract:

The present study was conducted to observe the effect of Plantago psyllium on blood glucose and cholesterol levels in normal and alloxan induced diabetic rats. To investigate the effect of Plantago psyllium, 40 rats were included in this study and divided into four groups of ten rats each. Group A was normal, group B was diabetic, group C was non-diabetic and hypercholesterolemic, and group D was diabetic and hypercholesterolemic. Groups B and D were made diabetic by intraperitoneal injection of alloxan dissolved in 1 mL distilled water at a dose of 125 mg/kg of body weight. Groups C and D were made hypercholesterolemic by oral administration of powdered cholesterol (1 g/kg of body weight). Blood samples from all the rats were collected from the coccygeal vein on the 1st day, then on the 21st and 42nd days. All the samples were analyzed for blood glucose and cholesterol level by using enzymatic kits. The blood glucose and cholesterol levels of the treated groups of rats showed significant reduction after 7 weeks of treatment with Plantago psyllium. Statistical analysis of the results showed that Plantago psyllium has anti-diabetic and hypocholesterolemic activity in diabetic and hypercholesterolemic albino rats.

Keywords: Albino rats, alloxan, Plantago psyllium, statistical analysis

Downloads: 2134
13286 A Mixture Model of Two Different Distributions Approach to the Analysis of Heterogeneous Survival Data

Authors: Ülkü Erişoğlu, Murat Erişoğlu, Hamza Erol

Abstract:

In this paper we propose a mixture of two different distributions, such as Exponential-Gamma, Exponential-Weibull and Gamma-Weibull, to model heterogeneous survival data. Various properties of the proposed mixture of two different distributions are discussed. Maximum likelihood estimates of the parameters are obtained by using the EM algorithm. An illustrative example based on real data is also given.

Keywords: Exponential-Gamma, Exponential-Weibull, Gamma-Weibull, EM Algorithm, Survival Analysis.

Downloads: 4034
13285 Detection of Bias in GPS Satellites' Measurements for Enhanced Measurement Integrity

Authors: Mamoun F. Abdel-Hafez

Abstract:

In this paper, the detection of a fault in the Global Positioning System (GPS) measurement is addressed. The class of faults considered is a bias in the GPS pseudorange measurements. This bias is modeled as an unknown constant. The fault could be the result of a receiver fault or a signal fault such as multipath error. A bias bank is constructed based on a set of possible fault hypotheses. Initially, there is equal probability of occurrence for any of the biases in the bank. Subsequently, as the measurements are processed, the probability of occurrence for each of the biases is sequentially updated. The fault with a probability approaching unity will be declared as the current fault in the GPS measurement. The residual formed from the GPS and Inertial Measurement Unit (IMU) measurements is used to update the probability of each fault. Results will be presented to show the performance of the presented algorithm.
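
The sequential update over a bias bank described above can be sketched as a Bayesian multiple-hypothesis test. The sketch below is only an illustration under stated assumptions: the residual is treated as a scalar equal to the bias plus zero-mean Gaussian noise with known standard deviation `sigma`, and the bank values, residuals, and the 0.99 declaration threshold are hypothetical. The paper's actual residual model (formed from GPS and IMU measurements) is not given in the abstract.

```python
import math


def gaussian_pdf(x, mean, sigma):
    """Density of a normal distribution N(mean, sigma^2) at x."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def update_bias_probabilities(probs, biases, residual, sigma):
    """One Bayes step: weight each hypothesis by its likelihood, then normalize."""
    weighted = [p * gaussian_pdf(residual, b, sigma) for p, b in zip(probs, biases)]
    total = sum(weighted)
    return [w / total for w in weighted]


def detect_bias(residuals, biases, sigma, threshold=0.99):
    """Start from equal probabilities and declare the first bias whose
    posterior probability reaches the (assumed) threshold."""
    probs = [1.0 / len(biases)] * len(biases)
    for residual in residuals:
        probs = update_bias_probabilities(probs, biases, residual, sigma)
        best = max(range(len(biases)), key=lambda i: probs[i])
        if probs[best] >= threshold:
            return biases[best], probs
    return None, probs


# Hypothetical bias bank (metres) and residuals scattered around a 10 m bias.
bias_bank = [0.0, 5.0, 10.0, 15.0]
residuals = [9.6, 10.3, 10.1, 9.8, 10.4]
declared_bias, final_probs = detect_bias(residuals, bias_bank, sigma=2.0)
```

In this toy run the probability mass concentrates on the 10 m hypothesis within a few residuals, mirroring the "probability approaching unity" criterion in the abstract.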

Keywords: Estimation and filtering, Statistical data analysis, Fault detection and identification.

Downloads: 1936
13284 Parametric and Nonparametric Analysis of Breast Cancer Treatments

Authors: Chunling Cong, Chris P. Tsokos

Abstract:

The objective of the present research manuscript is to perform parametric, nonparametric, and decision tree analysis to evaluate two treatments that are being used for breast cancer patients. Our study is based on real data that was initially used in "Tamoxifen with or without breast irradiation in women of 50 years of age or older with early breast cancer" [1], and the data was supplied to us by N.A. Ibrahim, "Decision tree for competing risks survival probability in breast cancer study" [2]. Certain aspects of our findings agree with the published results. However, in this manuscript we focus on the relapse time of breast cancer patients instead of survival time, and parametric analysis is applied instead of semi-parametric decision tree analysis to provide more precise recommendations on the effectiveness of the two treatments with respect to recurrence of breast cancer.

Keywords: decision tree, breast cancer treatments, parametric analysis, non-parametric analysis

Downloads: 2017
13283 Effect of Social Media on the Study Habits of Students of Alvan Ikoku Federal College of Education, Owerri

Authors: Perpetua O. Ezeji, Kelechi E. Ezeji

Abstract:

There has been considerable anxiety in society that social media distracts from education and reduces the social skills of young people. Following this, educators have sought ways to mitigate its negative effects on educational attainment while incorporating its positive aspects into the learning process. This study sought to examine the impact of social media on the study habits of students of Alvan Ikoku Federal College of Education, Owerri. The research design involved survey technique where questionnaires were used to collect data from a sample of the student population. Statistical package for social sciences (SPSS) was used to analyse the data. Spearman’s Rho was the specific tool used for analysis. It was presented in frequency tables and bar charts. Findings from variables investigated showed that at p<0.5, social media usage had a significant impact on the study habits of students of Alvan Ikoku Federal College of Education, Owerri. This indicated the need for stakeholders in the community to employ counselling and other proactive measures to ensure that students maintained proper focus on their primary assignment for schooling.

Keywords: Education, social media, study habits, technology.

Downloads: 8741
13282 Application of Scanning Electron Microscopy and X-Ray Evaluation of the Main Digestion Methods for Determination of Macroelements in Plant Tissue

Authors: Krasimir I. Ivanov, Penka S. Zapryanova, Stefan V. Krustev, Violina R. Angelova

Abstract:

Three commonly used digestion methods (dry ashing, acid digestion, and microwave digestion) in different variants were compared for digestion of tobacco leaves. Three main macroelements (K, Ca and Mg) were analysed using an AAS Spectrometer Spectra AA 220, Varian, Australia. The accuracy and precision of the measurements were evaluated by using Polish reference material CTR-VTL-2 (Virginia tobacco leaves). To elucidate the problems with elemental recovery, X-ray and SEM-EDS analyses of all residues after digestion were performed. The X-ray investigation showed the formation of KClO4 when HClO4 was used as part of the acid mixture. The use of HF in Ca and Mg determination led to the formation of CaF2 and MgF2. The results were confirmed by energy dispersive X-ray microanalysis. The SPSS program for Windows was used for statistical data processing.

Keywords: Digestion methods, determination of macroelements, plant tissue.

Downloads: 905
13281 Analysis of Precipitation and Temperature Trends in Sefid-Roud Basin

Authors: Amir Gandomkar, Tahereh Soltani Gord faramarzi, Parisa Safaripour Chafi, Abdol-Reza Amani

Abstract:

Temperature, humidity and precipitation are parameters that influence the climate of an area, and they must be understood in order to determine that climate. Climate changes are of primary importance in climatology and, in recent years, have been of great concern to researchers and even politicians and organizations, for they can play an important role in social, political and economic activities. Even though the real cause of climate changes or their stability is not yet fully recognized, their importance has prompted countries to investigate climate changes at different levels, especially at the regional, national and continental levels. This issue has been little investigated in our country. However, in recent years, there have been some studies and conferences on climate changes. This study is in line with such research and investigates and analyzes the trends of climate changes (temperature and precipitation) in the Sefid-roud (the name of a river) basin. Three parameters, namely mean annual precipitation, temperature, and maximum and minimum temperatures, recorded at 36 synoptic and climatology stations in the Sefid-roud basin over a statistical period of 49 years (1956-2005), were analyzed by the Mann-Kendall test. The results obtained by data analysis show that climate changes are short term and have a trend. The analysis of mean temperature revealed a significantly rising trend, while precipitation has a significantly falling trend.
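
For readers unfamiliar with the Mann-Kendall test named above, a minimal sketch follows. It assumes a series without tied values (so the variance needs no tie correction), uses the usual continuity correction and a normal approximation for the two-sided p-value, and runs on made-up temperature values rather than the Sefid-roud station data.

```python
import math


def mann_kendall(series):
    """Return (S, Z, two-sided p-value) for a series assumed to have no ties."""
    n = len(series)
    s = 0
    # S counts later-minus-earlier pairs that increase, minus those that decrease.
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    var_s = n * (n - 1) * (2 * n + 5) / 18.0  # variance of S without ties
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # normal approximation
    return s, z, p_value


# Hypothetical annual mean temperatures (degrees C), for illustration only.
annual_mean_temperature = [14.1, 14.3, 14.0, 14.6, 14.8, 14.7, 15.0, 15.2, 15.1, 15.5]
s_stat, z_stat, p_value = mann_kendall(annual_mean_temperature)
rising_trend = z_stat > 0 and p_value < 0.05
```

A positive Z with a small p-value indicates a rising trend and a negative Z a falling one, which is how the abstract's temperature and precipitation findings would be read.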

Keywords: Trend, Climate changes, Sefid-roud, Mann-Kendall

Downloads: 1719
13280 Analysis of Palm Perspiration Effect with SVM for Diabetes in People

Authors: Hamdi Melih Saraoğlu, Muhlis Yıldırım, Abdurrahman Özbeyaz, Feyzullah Temurtas

Abstract:

In this research, an attempt was made to identify the diabetes condition of people (healthy, prediabetic and diabetic) with noninvasive palm perspiration measurements. Data clusters gathered from 200 subjects were used (1. Individual Attributes Cluster and 2. Palm Perspiration Attributes Cluster). To decrease the dimensions of these data clusters, the Principal Component Analysis method was used. Data clusters prepared in that way were classified with Support Vector Machines. The highest classification success rates were 82% for glucose parameters and 84% for HbA1c parameters.

Keywords: Palm perspiration, Diabetes, Support Vector Machine, Classification.

Downloads: 1890
13279 An Assessment of Ozone Levels in Typical Urban Areas in the Malaysian Peninsular

Authors: Negar Banan, Mohd Talib Latif, Liew Juneng

Abstract:

Air quality studies were carried out in the towns of Putrajaya, Petaling Jaya and Nilai in the Malaysian Peninsular. In this study, the variations of Ozone (O3) concentrations over a four year period (2008-2011) were investigated using data obtained from the Malaysian Department of the Environment (DOE). This study aims to identify and describe the daily and monthly variations of O3 concentrations at the monitoring sites mentioned. The SPSS program (Statistical Package for the Social Sciences) was used to analyze this data in order to obtain the variations of O3 and also to clarify the relationship between the stations. The findings of the study revealed that the highest concentration of O3 occurred during the midday and afternoon (between 13:00-15:00 hrs). The comparison between stations also showed that the highest O3 concentrations were recorded in Putrajaya. The comparisons of average and maximum concentrations of O3 for the three stations showed that the strongest significant correlation was recorded at the Petaling Jaya station, with the value R2 = 0.667. Results from this study indicate that in the urban areas of Peninsular Malaysia, the concentration of O3 depends on the concentration of NOx. Furthermore, HYSPLIT back trajectories (-72h) indicated that air-mass transport patterns can also influence the O3 concentration in the areas studied.

Keywords: Ozone, Precursors, Urban, HYSPLIT trajectory analysis.

Downloads: 1702
13278 Data Collection in Hospital Emergencies: A Questionnaire Survey

Authors: Nouha Mhimdi, Wahiba Ben Abdessalem Karaa, Henda Ben Ghezala

Abstract:

Many methods are used to collect data, such as questionnaires, surveys and focus group interviews. However, the collection of poor-quality data resulting, for example, from poorly designed questionnaires, the absence of good translators or interpreters, and the incorrect recording of data can lead to conclusions that are not supported by the data or that focus only on the average effect of the program or policy. There are several solutions to avoid or minimize the most frequent errors, including obtaining expert advice on the design or adaptation of data collection instruments, or using technologies that allow better "anonymity" in the responses. In this context, and to overcome the aforementioned problems, we suggest in this paper an approach to collecting relevant data by carrying out a large-scale questionnaire-based survey. We have been able to collect good quality, consistent and practical data on hospital emergencies to improve emergency services in hospitals, especially in the case of epidemics or pandemics.

Keywords: Data collection, survey, database, data analysis, hospital emergencies.

Downloads: 592
13277 Effects of Knowledge of Results on Specified Skill Acquisition among Fresh Cricket Players

Authors: Rasheed O. Oloyede, Joseph O. Adelusi, Peter O. Akinbile

Abstract:

This study was conducted to investigate the extent to which knowledge of results influences the performance of cricket players. A sample of 160 fresh students in the Department of Physical and Health Education who were novices in the game were randomly assigned to two groups. The first group of eighty (80) subjects was classified as the experimental group, while the second group of eighty (80) subjects was the control group. Subjects in both groups were asked to bowl and bat ten times each for a period of six weeks. After the first round, the subjects in the experimental group were given feedback on their performance in the first trial, while those in the control group were denied feedback. Two null hypotheses generated for the study were tested using percentages and chi-square statistical analysis at the 0.05 level of significance. Analysis of the data showed that knowledge of results influenced the performance of cricket players. It was concluded that knowledge of results is pertinent for effective skill acquisition and could enhance performance among unskilled cricket players. Hence, it is suggested that immediate feedback on the level of skill acquisition would inspire prospective and unskilled cricket players toward better performance in cricket tournaments.
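
A chi-square test at the 0.05 level, as used above, can be sketched for a 2×2 table of group (experimental or control) against outcome (improved or not improved). The counts below are invented for illustration and are not the study's data; the sketch omits Yates' continuity correction and relies on the fact that, with one degree of freedom, the p-value equals erfc(sqrt(statistic / 2)).

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table (1 d.o.f.)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under the null hypothesis of independence.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    # Exact upper-tail probability of a chi-square variable with 1 d.o.f.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts: rows are experimental and control groups (80 each),
# columns are "improved" and "not improved".
counts = [[52, 28], [35, 45]]
statistic, p_value = chi_square_2x2(counts)
reject_null = p_value < 0.05
```

With these made-up counts the null hypothesis of no association between feedback and improvement would be rejected at the 0.05 level.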

Keywords: Batting, Bowling, Knowledge of Results, Performance, Skill Acquisition.

Downloads: 2862
13276 Quantitative Estimation of Periodicities in Lyari River Flow Routing

Authors: Rana Khalid Naeem, Asif Mansoor

Abstract:

Hydrologic time series data display a periodic structure, and the periodic autoregressive (PAR) process receives considerable attention in the modeling of such series. In this communication, a long term record of the monthly waste flow of the Lyari river is used to quantify periodicity with the PAR modeling technique. The parameters of the model are estimated by using the Franses & Paap methodology. This study shows that a periodic autoregressive model of order 2 is the most parsimonious model for assessing periodicity in the waste flow of the river. A careful statistical analysis of the residuals of the PAR(2) model is used for establishing goodness of fit. The forecast obtained with the proposed model confirms the significance and effectiveness of the model.
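
The idea of a periodic autoregressive model, where the autoregressive coefficient depends on the season, can be shown with a deliberately simplified sketch. It fits a zero-mean PAR(1) by per-season least squares, whereas the abstract selects a PAR(2) model for monthly data; the function names, the period of 2 and the synthetic series are assumptions made to keep the example short.

```python
def fit_par1(series, period):
    """Least-squares PAR(1) fit: one coefficient per season, no intercept."""
    numerators = [0.0] * period
    denominators = [0.0] * period
    for t in range(1, len(series)):
        season = t % period
        numerators[season] += series[t] * series[t - 1]
        denominators[season] += series[t - 1] ** 2
    return [n / d if d > 0 else 0.0 for n, d in zip(numerators, denominators)]


def forecast_next(series, coefficients):
    """One-step forecast using the coefficient of the upcoming season."""
    season = len(series) % len(coefficients)
    return coefficients[season] * series[-1]


# Synthetic noise-free series with two seasons (monthly data would use period=12).
true_coefficients = [0.5, 2.0]
flows = [1.0]
for t in range(1, 12):
    flows.append(true_coefficients[t % 2] * flows[-1])

estimated = fit_par1(flows, period=2)
next_flow = forecast_next(flows, estimated)
```

Extending this to order 2 would mean regressing each season's values on the two preceding observations, followed by residual diagnostics of the kind the abstract mentions.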

Keywords: Diagnostic checks, Lyari river, Model selection, Monthly waste flow, Periodicity, Periodic autoregressive model.

Downloads: 1623
13275 Values as a Predictor of Cyber-bullying Among Secondary School Students

Authors: Bülent Dilmaç, Didem Aydoğan

Abstract:

The use of new technologies such as the internet (e-mail, chat rooms) and cell phones has steeply increased in recent years. Especially among children and young people, the use of technological tools and equipment is widespread. Although many teachers and administrators now recognize the problem of school bullying, few are aware that students are being harassed through electronic communication. Referred to as electronic bullying, cyber bullying, or online social cruelty, this phenomenon includes bullying through email, instant messaging, in a chat room, on a website, or through digital messages or images sent to a cell phone. Cyber bullying is defined as causing deliberate/intentional harm to others using the internet or other digital technologies. This study has a quantitative research design and uses relational survey as its method. The participants consisted of 300 secondary school students in the city of Konya, Turkey. 195 (64.8%) participants were female and 105 (35.2%) were male. 39 (13%) students were at grade 1, 187 (62.1%) were at grade 2 and 74 (24.6%) were at grade 3. The "Cyber Bullying Question List" developed by Arıcak (2009) was given to students. Following questions about demographics, a functional definition of cyber bullying was provided. In order to specify students' human values, the "Human Values Scale (HVS)" developed by Dilmaç (2007) for secondary school students was administered. The scale consists of 42 items in six dimensions. Data analysis was conducted by the primary investigator of the study using SPSS 14.00 statistical analysis software. Descriptive statistics were calculated for the analysis of students' cyber bullying behaviour, and simple regression analysis was conducted in order to test whether each value in the scale could explain cyber bullying behaviour.

Keywords: Cyber bullying, Values, Secondary School Students

Downloads: 3790
13274 Dimension Reduction of Microarray Data Based on Local Principal Component

Authors: Ali Anaissi, Paul J. Kennedy, Madhu Goyal

Abstract:

Analysis and visualization of microarray data is very helpful for biologists and clinicians in the diagnosis and treatment of patients. It allows clinicians to better understand the structure of microarray data and facilitates understanding of gene expression in cells. However, a microarray dataset is complex, with thousands of features and a very small number of observations. This very high dimensional data set often contains some noise, non-useful information and only a small number of features relevant to a disease or genotype. This paper proposes a non-linear dimensionality reduction algorithm, Local Principal Component (LPC), which aims to map high dimensional data to a lower dimensional space. The reduced data represents the most important variables underlying the original data. Experimental results and comparisons are presented to show the quality of the proposed algorithm. Moreover, experiments also show how this algorithm reduces high dimensional data whilst preserving in the low dimensional space the neighbourhoods that the points have in the high dimensional space.

Keywords: Linear Dimension Reduction, Non-Linear Dimension Reduction, Principal Component Analysis, Biologists.

Downloads: 1552
13273 Performance Analysis of Routing Protocol for WSN Using Data Centric Approach

Authors: A. H. Azni, Madihah Mohd Saudi, Azreen Azman, Ariff Syah Johari

Abstract:

Sensor networks are emerging as a new tool for important applications in diverse fields such as military surveillance, habitat monitoring, weather, home electrical appliances and others. Technically, sensor network nodes are limited with respect to energy supply, computational capacity and communication bandwidth. In order to prolong the lifetime of the sensor nodes, designing an efficient routing protocol is critical. In this paper, we illustrate the existing routing protocols for wireless sensor networks using the data centric approach and present a performance analysis of these protocols. The paper focuses on the performance analysis of two specific protocols, namely Directed Diffusion and SPIN. This analysis reveals that energy usage is an important feature which needs to be taken into consideration while designing a routing protocol for wireless sensor networks.

Keywords: Data Centric Approach, Directed Diffusion, SPIN, WSN Routing Protocol.

Downloads: 2508
13272 Spatial Disparity in Education and Medical Facilities: A Case Study of Barddhaman District, West Bengal, India

Authors: Amit Bhattacharyya

Abstract:

The economic scenario of a region does not by itself give the real picture of overall development. Therefore, economic development must be considered together with social development when assessing the level of development. The spatial variation with respect to social development has been discussed taking into account the quality of functioning of a social system in a specific area. In this paper, an attempt has been made to study the spatial distribution of social infrastructural facilities and analyze the magnitude of regional disparities at the inter-block level in Barddhaman district. It starts with a detailed account of the selection process of social infrastructure indicators and describes the methodology employed in the empirical analysis. Analyzing the block level data, this paper tries to identify the disparity among the blocks in the levels of social development. The results have been subsequently explained using both statistical analysis and geospatial techniques. The paper reveals that social development is not proceeding at the same rate in every part of the district. Health facilities and educational facilities are concentrated at a few selected points. As a result, overall development activities are concentrated in a few centres and disparity is seen across the blocks.

Keywords: Disparity, inter-block, social development, spatial variation.

Downloads: 611
13271 Dimensional Modeling of HIV Data Using Open Source

Authors: Charles D. Otine, Samuel B. Kucel, Lena Trojer

Abstract:

The choice of data modeling technique for an information system is determined by the objective of the resultant data model. Dimensional modeling is the preferred modeling technique for data destined for data warehouses and data mining, presenting data models that ease analysis and queries, in contrast with entity relationship modeling. The establishment of data warehouses as components of information system landscapes in many organizations has subsequently led to the development of dimensional modeling. This has been significantly more developed and reported for commercial database management systems than for open source ones, thereby making it less affordable for those in resource constrained settings. This paper presents dimensional modeling of HIV patient information using open source modeling tools. It aims to take advantage of the fact that the regions most affected by the HIV virus (sub-Saharan Africa) are also heavily resource constrained while having large quantities of HIV data. Two HIV data source systems were studied to identify appropriate dimensions and facts; these were then modeled using two open source dimensional modeling tools. Use of open source tools would reduce the software costs for dimensional modeling and in turn make data warehousing and data mining more feasible even for those in resource constrained settings but with data available.

Keywords: About Database, Data Mining, Data warehouse, Dimensional Modeling, Open Source.

Downloads 1926
13270 IMDC: An Image-Mapped Data Clustering Technique for Large Datasets

Authors: Faruq A. Al-Omari, Nabeel I. Al-Fayoumi

Abstract:

In this paper, we present a new algorithm for clustering data in large datasets using image processing approaches. First, the dataset is mapped into a binary image plane. The synthesized image is then processed using efficient image processing techniques to cluster the data in the dataset. Hence, the algorithm avoids an exhaustive search to identify clusters. The algorithm considers only a small subset of the data that contains the critical boundary information sufficient to identify the contained clusters. Compared to available data clustering techniques, the proposed algorithm produces results of similar quality and outperforms them in execution time and storage requirements.
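
The general idea of mapping points onto a binary image and clustering the occupied pixels can be sketched as follows. This is a minimal illustration only, not the authors' IMDC algorithm: it assumes 2-D points, a hypothetical `cell_size` parameter for the pixel width, and plain 8-connected component labelling in place of the boundary-based processing the abstract describes.

```python
from collections import deque


def cluster_by_image(points, cell_size):
    """Label 2-D points by the connected group of occupied grid cells they fall in.

    Assumption: a cell of the binary image is "on" if at least one point maps
    to it, and two cells belong to the same cluster if they are 8-connected.
    """
    # Map every point to its integer cell (pixel) coordinate.
    cells = {}
    for idx, (x, y) in enumerate(points):
        key = (int(x // cell_size), int(y // cell_size))
        cells.setdefault(key, []).append(idx)

    labels = [-1] * len(points)
    seen = set()
    next_label = 0
    for start in cells:
        if start in seen:
            continue
        # Breadth-first flood fill over occupied neighbouring cells.
        seen.add(start)
        queue = deque([start])
        while queue:
            cx, cy = queue.popleft()
            for i in cells[(cx, cy)]:
                labels[i] = next_label
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    nb = (cx + dx, cy + dy)
                    if nb in cells and nb not in seen:
                        seen.add(nb)
                        queue.append(nb)
        next_label += 1
    return labels


if __name__ == "__main__":
    pts = [(0.1, 0.2), (0.9, 0.4), (1.5, 0.5), (10.0, 10.0), (10.5, 10.2)]
    print(cluster_by_image(pts, cell_size=1.0))
```

The work is proportional to the number of occupied cells rather than to the number of point pairs, which is the kind of saving over exhaustive search that the abstract points to.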

Keywords: Data clustering, Data mining, Image-mapping, Pattern discovery, Predictive analysis.

Downloads 1469
13269 Spatial Variability of Some Soil Properties in Mountain Rangelands of Northern Iran

Authors: Zeinab Jafarian Jeloudar, Hossien Kavianpoor, Abazar Esmali Ouri, Ataollah Kavian

Abstract:

In this paper, the spatial variability of some chemical and physical soil properties was investigated in the mountain rangelands of Nesho, Mazandaran province, Iran. A total of 110 soil samples from 0-30 cm depth were taken by systematic sampling on a 30 × 30 m grid in regions with different vegetation cover and transported to the laboratory. Soil chemical and physical parameters were then measured in the laboratory, including acidity (pH), electrical conductivity, CaCO3, bulk density, particle density, total phosphorus, total nitrogen, available potassium, organic matter, saturation moisture, soil texture (percentage of sand, silt and clay), sodium, calcium and magnesium. After data normalization, statistical analysis was performed to describe the soil properties and geostatistical analysis to indicate the spatial correlation between these properties, and maps of the spatial distribution of soil properties were prepared using the Kriging method. Results indicated that in the study area saturation moisture and percentage of sand had the highest and lowest spatial correlation, respectively.

Keywords: Chemical and physical soil properties, Iran, Spatial variability, Nesho Rangeland

Downloads 1995
13268 Urban Big Data: An Experimental Approach to Building-Value Estimation Using Web-Based Data

Authors: Sun-Young Jang, Sung-Ah Kim, Dongyoun Shin

Abstract:

Real-estate value estimation is difficult for laymen and is currently usually performed by specialists. This paper presents an automated estimation process based on big data and machine-learning technology that calculates the influence of building conditions on real-estate price measurement. The present study analyzed actual building sales sample data for Nonhyeon-dong, Gangnam-gu, Seoul, Korea, measuring the major influencing factors among the various building conditions. Following that analysis, a prediction model was established and applied using RapidMiner Studio, a graphical user interface (GUI)-based tool for deriving machine-learning prototypes. The prediction model is formulated by reference to previous examples. When new examples are applied, it analyzes and predicts accordingly. The analysis process discerns the crucial factors affecting price increases by calculating weighted values. The model was verified, and its accuracy determined, by comparing its predicted values with actual price increases.

Keywords: Big data, building-value analysis, machine learning, price prediction.

Downloads 1134
13267 Welding Process Selection for Storage Tank by Integrated Data Envelopment Analysis and Fuzzy Credibility Constrained Programming Approach

Authors: Rahmad Wisnu Wardana, Eakachai Warinsiriruk, Sutep Joy-A-Ka

Abstract:

Selecting the most suitable welding process usually depends on experience or on common practice in similar companies. However, this approach generally ignores many criteria that can affect the selection of a suitable welding process. Therefore, knowledge automation through knowledge-based systems can significantly improve the decision-making process. This research proposes an integrated data envelopment analysis (DEA) and fuzzy credibility constrained programming approach for identifying the best welding process for stainless steel storage tanks in the food and beverage industry. The proposed approach uses fuzzy concepts and a credibility measure to deal with uncertain data from experts' judgment. Furthermore, 12 parameters are used to determine the most appropriate welding process among six competing welding processes.

Keywords: Welding process selection, data envelopment analysis, fuzzy credibility constrained programming, storage tank.

Downloads 765
13266 Wheat Yield Prediction through Agro Meteorological Indices for Ardebil District

Authors: Fariba Esfandiary, Ghafoor Aghaie, Ali Dolati Mehr

Abstract:

Wheat yield prediction was carried out using different meteorological variables together with agro meteorological indices in Ardebil district for the years 2004–2005 and 2005–2006. On the basis of correlation coefficients, standard error of estimate and relative deviation of predicted yield from actual yield using different statistical models, the best subset of agro meteorological indices was selected, including daily minimum temperature (Tmin), accumulated difference of maximum and minimum temperatures (TD), growing degree days (GDD), accumulated water vapor pressure deficit (VPD), sunshine hours (SH) and potential evapotranspiration (PET). Yield prediction was done two months before harvesting time, which coincided with the commencement of the reproductive stage of wheat (5th of June). In the final statistical models, 83% of wheat yield variability was accounted for by variation in the above agro meteorological indices.
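
Two of the listed indices, GDD and TD, are simple accumulations over daily temperature records and can be sketched as below. This is a rough illustration under assumptions the abstract does not state: GDD is taken in its common form, the daily mean temperature minus a base temperature with negative days counted as zero, and the base temperature `t_base` is a hypothetical parameter.

```python
def growing_degree_days(daily_temps, t_base):
    """Accumulate growing degree days from (t_max, t_min) pairs in deg C.

    Assumption: GDD per day = max(0, (t_max + t_min) / 2 - t_base).
    """
    total = 0.0
    for t_max, t_min in daily_temps:
        total += max(0.0, (t_max + t_min) / 2.0 - t_base)
    return total


def accumulated_temperature_difference(daily_temps):
    """Accumulate the daily difference of maximum and minimum temperature (TD)."""
    return sum(t_max - t_min for t_max, t_min in daily_temps)


if __name__ == "__main__":
    # Hypothetical three-day record, not data from the study.
    record = [(20.0, 10.0), (10.0, 0.0), (30.0, 20.0)]
    print(growing_degree_days(record, t_base=5.0))
    print(accumulated_temperature_difference(record))
```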

Keywords: Wheat yields prediction, agro meteorological indices, statistical models

Downloads 2108
13265 Air Quality Forecast Based on Principal Component Analysis-Genetic Algorithm and Back Propagation Model

Authors: Bin Mu, Site Li, Shijin Yuan

Abstract:

Amid environmental deterioration, people are increasingly concerned about the quality of the environment, especially air quality. Accurate and timely forecasts of the AQI (air quality index) are therefore of great value. To simplify the factors influencing a city’s air quality and forecast the city’s AQI for the following day, this study used MATLAB software and constructed a mathematical model, PCA-GABP. Specifically, the study first performed principal component analysis (PCA) on the factors influencing the next day’s AQI, including weather, industrial waste gas and the current day’s IAQI data. Then a back propagation (BP) neural network model, optimized by a genetic algorithm (GA), was used to forecast the next day’s AQI. To verify the validity and accuracy of the PCA-GABP model’s forecasts, the study uses two statistical indices to evaluate the AQI forecast results: normalized mean square error and fractional bias. Finally, the study reduces the mean square error by optimizing the individual gene structure in the genetic algorithm and adjusting the parameters of the back propagation model. In conclusion, the model’s AQI forecasting performance is comparatively convincing, and the model is expected to contribute positively to future AQI forecasting.
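
The two evaluation indices named in the abstract can be sketched as follows. The abstract does not give the formulas, so this assumes the definitions commonly used in air quality model evaluation: NMSE as the mean squared error divided by the product of the observed and predicted means, and fractional bias as twice the difference of the means divided by their sum. The sign convention for fractional bias (observed minus predicted) is also an assumption; some sources use the opposite sign.

```python
from statistics import fmean


def nmse(observed, predicted):
    """Normalized mean square error: mean((o - p)^2) / (mean(o) * mean(p))."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    squared_error = fmean([(o - p) ** 2 for o, p in zip(observed, predicted)])
    return squared_error / (fmean(observed) * fmean(predicted))


def fractional_bias(observed, predicted):
    """Fractional bias: 2 * (mean(o) - mean(p)) / (mean(o) + mean(p))."""
    mean_obs = fmean(observed)
    mean_pred = fmean(predicted)
    return 2.0 * (mean_obs - mean_pred) / (mean_obs + mean_pred)


if __name__ == "__main__":
    # Hypothetical AQI values, not results from the study.
    obs = [100.0, 100.0]
    pred = [110.0, 90.0]
    print(nmse(obs, pred), fractional_bias(obs, pred))
```

Both indices are zero for a perfect forecast. NMSE grows with scatter, while fractional bias measures systematic over- or under-prediction and can be zero even when individual forecasts are wrong, as in the sample values above.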

Keywords: AQI forecast, principal component analysis, genetic algorithm, back propagation neural network model.

Downloads 985
13264 Structural Parsing of Natural Language Text in Tamil Using Phrase Structure Hybrid Language Model

Authors: Selvam M, Natarajan. A M, Thangarajan R

Abstract:

Parsing is important in linguistics and natural language processing for understanding the syntax and semantics of a natural language grammar. Parsing natural language text is challenging because of problems such as ambiguity and inefficiency. The interpretation of natural language text also depends on context-based techniques. A probabilistic component is essential to resolve ambiguity in both syntax and semantics, thereby increasing the accuracy and efficiency of the parser. The Tamil language has some inherent features that make parsing more challenging. To address them, a lexicalized and statistical approach is applied to parsing with the aid of a language model. Statistical models mainly focus on the semantics of the language and are suitable for large vocabulary tasks, whereas structural methods focus on syntax and model small vocabulary tasks. A trigram-based statistical language model for Tamil with a medium vocabulary of 5000 words has been built. Though statistical parsing gives better performance through trigram probabilities and large vocabulary size, it has some disadvantages, such as its focus on semantics rather than syntax and its lack of support for free word order and long-term relationships. To overcome these disadvantages, a structural component is incorporated into statistical language models, which leads to hybrid language models. This paper attempts to build a phrase-structured hybrid language model that resolves the above-mentioned disadvantages. In developing the hybrid language model, a new part-of-speech tag set for Tamil with more than 500 tags has been developed, giving wider coverage. A phrase-structured Treebank has been developed with 326 Tamil sentences covering more than 5000 words. A hybrid language model has been trained on the phrase-structured Treebank using the immediate head parsing technique. A lexicalized and statistical parser that employs this hybrid language model and the immediate head parsing technique gives better results than pure grammar and trigram-based models.
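
The trigram component mentioned in the abstract can be illustrated with a minimal maximum-likelihood estimate. This is a sketch under stated assumptions, not the authors' model: it uses no smoothing, pads sentences with hypothetical `<s>` and `</s>` boundary markers, and the sample sentences are English placeholders rather than Tamil corpus data.

```python
from collections import Counter


def train_trigram(sentences):
    """Count trigrams and their two-word histories over tokenized sentences."""
    trigram_counts = Counter()
    history_counts = Counter()
    for tokens in sentences:
        padded = ["<s>", "<s>"] + list(tokens) + ["</s>"]
        for i in range(2, len(padded)):
            history = (padded[i - 2], padded[i - 1])
            history_counts[history] += 1
            trigram_counts[history + (padded[i],)] += 1
    return trigram_counts, history_counts


def trigram_prob(trigram_counts, history_counts, w1, w2, w3):
    """Unsmoothed estimate P(w3 | w1, w2) = count(w1 w2 w3) / count(w1 w2)."""
    seen = history_counts[(w1, w2)]
    if seen == 0:
        return 0.0
    return trigram_counts[(w1, w2, w3)] / seen


if __name__ == "__main__":
    corpus = [["a", "b", "c"], ["a", "b", "d"]]
    tri, hist = train_trigram(corpus)
    print(trigram_prob(tri, hist, "a", "b", "c"))
```

Because each probability conditions only on the two preceding words, a model of this kind cannot capture free word order or long-range dependencies, which is the limitation the abstract gives as the motivation for adding a structural component.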

Keywords: Hybrid Language Model, Immediate Head Parsing, Lexicalized and Statistical Parsing, Natural Language Processing, Parts of Speech, Probabilistic Context Free Grammar, Tamil Language, Tree Bank.

Downloads 3614