Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 13

Search results for: Multinomial%20dirichlet%20classification%20model

13 Multinomial Dirichlet Gaussian Process Model for Classification of Multidimensional Data

Authors: Wanhyun Cho, Soonja Kang, Sangkyoon Kim, Soonyoung Park

Abstract:

We present probabilistic multinomial Dirichlet classification model for multidimensional data and Gaussian process priors. Here, we have considered efficient computational method that can be used to obtain the approximate posteriors for latent variables and parameters needed to define the multiclass Gaussian process classification model. We first investigated the process of inducing a posterior distribution for various parameters and latent function by using the variational Bayesian approximations and important sampling method, and next we derived a predictive distribution of latent function needed to classify new samples. The proposed model is applied to classify the synthetic multivariate dataset in order to verify the performance of our model. Experiment result shows that our model is more accurate than the other approximation methods.

Keywords: Multinomial dirichlet classification model, Gaussian process priors, variational Bayesian approximation, Importance sampling, approximate posterior distribution, Marginal likelihood evidence.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1606

12 A Study of Classification Models to Predict Drill-Bit Breakage Using Degradation Signals

Authors: Bharatendra Rai

Abstract:

Cutting tools are widely used in manufacturing processes and drilling is the most commonly used machining process. Although drill-bits used in drilling may not be expensive, their breakage can cause damage to expensive work piece being drilled and at the same time has major impact on productivity. Predicting drill-bit breakage, therefore, is important in reducing cost and improving productivity. This study uses twenty features extracted from two degradation signals viz., thrust force and torque. The methodology used involves developing and comparing decision tree, random forest, and multinomial logistic regression models for classifying and predicting drill-bit breakage using degradation signals.

Keywords: Degradation signal, drill-bit breakage, random forest, multinomial logistic regression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2237

11 Modeling a Multinomial Logit Model of Intercity Travel Mode Choice Behavior for All Trips in Libya

Authors: Manssour A. Abdulsalam Bin Miskeen, Ahmed Mohamed Alhodairi, Riza Atiq Abdullah Bin O. K. Rahmat

Abstract:

In the planning point of view, it is essential to have mode choice, due to the massive amount of incurred in transportation systems. The intercity travellers in Libya have distinct features, as against travellers from other countries, which includes cultural and socioeconomic factors. Consequently, the goal of this study is to recognize the behavior of intercity travel using disaggregate models, for projecting the demand of nation-level intercity travel in Libya. Multinomial Logit Model for all the intercity trips has been formulated to examine the national-level intercity transportation in Libya. The Multinomial logit model was calibrated using nationwide revealed preferences (RP) and stated preferences (SP) survey. The model was developed for deference purpose of intercity trips (work, social and recreational). The variables of the model have been predicted based on maximum likelihood method. The data needed for model development were obtained from all major intercity corridors in Libya. The final sample size consisted of 1300 interviews. About two-thirds of these data were used for model calibration, and the remaining parts were used for model validation. This study, which is the first of its kind in Libya, investigates the intercity traveler’s mode-choice behavior. The intercity travel mode-choice model was successfully calibrated and validated. The outcomes indicate that, the overall model is effective and yields higher precision of estimation. The proposed model is beneficial, due to the fact that, it is receptive to a lot of variables, and can be employed to determine the impact of modifications in the numerous characteristics on the need for various travel modes. Estimations of the model might also be of valuable to planners, who can estimate possibilities for various modes and determine the impact of unique policy modifications on the need for intercity travel.

Keywords: Multinomial logit model, improved intercity transport, intercity mode-choice behavior, disaggregate analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7856

10 On the Performance of Information Criteria in Latent Segment Models

Authors: Jaime R. S. Fonseca

Abstract:

Nevertheless the widespread application of finite mixture models in segmentation, finite mixture model selection is still an important issue. In fact, the selection of an adequate number of segments is a key issue in deriving latent segments structures and it is desirable that the selection criteria used for this end are effective. In order to select among several information criteria, which may support the selection of the correct number of segments we conduct a simulation study. In particular, this study is intended to determine which information criteria are more appropriate for mixture model selection when considering data sets with only categorical segmentation base variables. The generation of mixtures of multinomial data supports the proposed analysis. As a result, we establish a relationship between the level of measurement of segmentation variables and some (eleven) information criteria-s performance. The criterion AIC3 shows better performance (it indicates the correct number of the simulated segments- structure more often) when referring to mixtures of multinomial segmentation base variables.

Keywords: Quantitative Methods, Multivariate Data Analysis, Clustering, Finite Mixture Models, Information Theoretical Criteria, Simulation experiments.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1513

9 Early Warning System of Financial Distress Based On Credit Cycle Index

Authors: Bi-Huei Tsai

Abstract:

Previous studies on financial distress prediction choose the conventional failing and non-failing dichotomy; however, the distressed extent differs substantially among different financial distress events. To solve the problem, “non-distressed”, “slightlydistressed” and “reorganization and bankruptcy” are used in our article to approximate the continuum of corporate financial health. This paper explains different financial distress events using the two-stage method. First, this investigation adopts firm-specific financial ratios, corporate governance and market factors to measure the probability of various financial distress events based on multinomial logit models. Specifically, the bootstrapping simulation is performed to examine the difference of estimated misclassifying cost (EMC). Second, this work further applies macroeconomic factors to establish the credit cycle index and determines the distressed cut-off indicator of the two-stage models using such index. Two different models, one-stage and two-stage prediction models are developed to forecast financial distress, and the results acquired from different models are compared with each other, and with the collected data. The findings show that the one-stage model has the lower misclassification error rate than the two-stage model. The one-stage model is more accurate than the two-stage model.

Keywords: Multinomial logit model, corporate governance, company failure, reorganization, bankruptcy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2674

8 Modeling the Influence of Socioeconomic and Land-Use Factors on Mode Choice: A Comparison of Riyadh, Saudi Arabia, and Melbourne, Australia

Authors: M. Alqhatani, S. Bajwa, S. Setunge

Abstract:

Metropolitan areas have suffered from traffic problems, which have steadily increased in many monocentric cities. Urban expansion, population growth, and road network development have resulted in a structural shift toward urban sprawl, increasing commuters’ dependence on private modes of transport. This paper aims to model the influence of socioeconomic and land-use factors on mode choice using a multinomial and nested logit model. Land-use patterns—such as residential, commercial, retail, educational and employment related—affect the choice of mode and destination in the short and medium term. Socioeconomic factors—such as age, gender, income, household size, and house type—also affect choice, while residential location is affected in the long term. Riyadh in Saudi Arabia and Melbourne in Australia were chosen as case studies. Riyadh is a car-dependent city with limited public transport, whereas Melbourne has good public transport but an increase in car dependence. Aggregate level land-use data and disaggregate level individual, household, and journey-to-work data are used to determine the effects of land use and socioeconomic factors on mode choice. The model results determined that urban sprawl is the main factor that affects mode choice, income, and house type.

Keywords: Socioeconomic, land use, mode choice, multinomial logit and nested logit.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2436

7 A Note on Penalized Power-Divergence Test Statistics

Authors: Aylin Alin

Abstract:

In this paper, penalized power-divergence test statistics have been defined and their exact size properties to test a nested sequence of log-linear models have been compared with ordinary power-divergence test statistics for various penalization, λ and main effect values. Since the ordinary and penalized power-divergence test statistics have the same asymptotic distribution, comparisons have been only made for small and moderate samples. Three-way contingency tables distributed according to a multinomial distribution have been considered. Simulation results reveal that penalized power-divergence test statistics perform much better than their ordinary counterparts.

Keywords: Contingency table, Log-linear models, Penalization, Power-divergence measure, Penalized power-divergence measure.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1306

6 Road Accidents Bigdata Mining and Visualization Using Support Vector Machines

Authors: Usha Lokala, Srinivas Nowduri, Prabhakar K. Sharma

Abstract:

Useful information has been extracted from the road accident data in United Kingdom (UK), using data analytics method, for avoiding possible accidents in rural and urban areas. This analysis make use of several methodologies such as data integration, support vector machines (SVM), correlation machines and multinomial goodness. The entire datasets have been imported from the traffic department of UK with due permission. The information extracted from these huge datasets forms a basis for several predictions, which in turn avoid unnecessary memory lapses. Since data is expected to grow continuously over a period of time, this work primarily proposes a new framework model which can be trained and adapt itself to new data and make accurate predictions. This work also throws some light on use of SVM’s methodology for text classifiers from the obtained traffic data. Finally, it emphasizes the uniqueness and adaptability of SVMs methodology appropriate for this kind of research work.

Keywords: Road accident, machine learning, support vector machines.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1119

5 Data Mining Applied to the Predictive Model of Triage System in Emergency Department

Authors: Wen-Tsann Lin, Yung-Tsan Jou, Yih-Chuan Wu, Yuan-Du Hsiao

Abstract:

The Emergency Department of a medical center in Taiwan cooperated to conduct the research. A predictive model of triage system is contracted from the contract procedure, selection of parameters to sample screening. 2,000 pieces of data needed for the patients is chosen randomly by the computer. After three categorizations of data mining (Multi-group Discriminant Analysis, Multinomial Logistic Regression, Back-propagation Neural Networks), it is found that Back-propagation Neural Networks can best distinguish the patients- extent of emergency, and the accuracy rate can reach to as high as 95.1%. The Back-propagation Neural Networks that has the highest accuracy rate is simulated into the triage acuity expert system in this research. Data mining applied to the predictive model of the triage acuity expert system can be updated regularly for both the improvement of the system and for education training, and will not be affected by subjective factors.

Keywords: Back-propagation Neural Networks, Data Mining, Emergency Department, Triage System.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2299

4 A Study of Mode Choice Model Improvement Considering Age Grouping

Authors: Young-Hyun Seo, Hyunwoo Park, Dong-Kyu Kim, Seung-Young Kho

Abstract:

The purpose of this study is providing an improved mode choice model considering parameters including age grouping of prime-aged and old age. In this study, 2010 Household Travel Survey data were used and improper samples were removed through the analysis. Chosen alternative, date of birth, mode, origin code, destination code, departure time, and arrival time are considered from Household Travel Survey. By preprocessing data, travel time, travel cost, mode, and ratio of people aged 45 to 55 years, 55 to 65 years and over 65 years were calculated. After the manipulation, the mode choice model was constructed using LIMDEP by maximum likelihood estimation. A significance test was conducted for nine parameters, three age groups for three modes. Then the test was conducted again for the mode choice model with significant parameters, travel cost variable and travel time variable. As a result of the model estimation, as the age increases, the preference for the car decreases and the preference for the bus increases. This study is meaningful in that the individual and households characteristics are applied to the aggregate model.

Keywords: Age grouping, aging, mode choice model, multinomial logit model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1609

3 Probabilistic Crash Prediction and Prevention of Vehicle Crash

Authors: Lavanya Annadi, Fahimeh Jafari

Abstract:

Transportation brings immense benefits to society, but it also has its costs. Costs include the cost of infrastructure, personnel, and equipment, but also the loss of life and property in traffic accidents on the road, delays in travel due to traffic congestion, and various indirect costs in terms of air transport. This research aims to predict the probabilistic crash prediction of vehicles using Machine Learning due to natural and structural reasons by excluding spontaneous reasons, like overspeeding, etc., in the United States. These factors range from meteorological elements such as weather conditions, precipitation, visibility, wind speed, wind direction, temperature, pressure, and humidity, to human-made structures, like road structure components such as Bumps, Roundabouts, No Exit, Turning Loops, Give Away, etc. The probabilities are categorized into ten distinct classes. All the predictions are based on multiclass classification techniques, which are supervised learning. This study considers all crashes in all states collected by the US government. The probability of the crash was determined by employing Multinomial Expected Value, and a classification label was assigned accordingly. We applied three classification models, including multiclass Logistic Regression, Random Forest and XGBoost. The numerical results show that XGBoost achieved a 75.2% accuracy rate which indicates the part that is being played by natural and structural reasons for the crash. The paper has provided in-depth insights through exploratory data analysis.

Keywords: Road safety, crash prediction, exploratory analysis, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 48

2 The Association between Food Security Status and Depression in Two Iranian Ethnic Groups Living in Northwest of Iran

Authors: A. Rezazadeh, N. Omidvar, H. Eini-Zinab

Abstract:

Food insecurity (FI) influences may result in poor physical and mental health outcomes. Minor ethnic group may experience higher level of FI, and this situation may be related with higher depression prevalence. The aim of this study was to determine the association of depression with food security status in major (Azeri) and minor (Kurdish) ethnicity living in Urmia, West Azerbaijan, north of Iran. In this cross-sectional study, 723 participants (427 women and 296 men) aged 20–64 years old, from two ethnic groups (445 Azeri and 278 Kurdish), were selected through a multi stage cluster systematic sampling. Depression rate was assessed by “Beck” short form questionnaire (validated in Iranians) through interviews. Household FI status (HFIS) was measured using adapted HFI access scale through face-to-face interviews at homes. Multinomial logistic regression was used to estimate odds ratios (OR) of depression across HFIS. Higher percent of Kurds had moderate and severe depression in comparison with Azeri group (73 [17.3%] vs. 86 [27.9%]). There were not any significant differences between the two ethnicities in mild depression. Also, of all the subjects, moderate-to-sever FI was more prevalent in Kurds (28.5%), compared to Azeri group (17.3%) [P < 0.01]. Kurdish ethnic group living in food security or mild FI households had lower chance to have symptom of severe depression in comparison to those with sever FI (OR=0.097; 95% CI: 0.02-0.47). However, there was no significant association between depression and HFI in Azeri group. Findings revealed that the severity of HFI was related with severity depression in minor studied ethnic groups. However, in Azeri ethnicity as a major group, other confounders may have influence on the relation with depression and FI, that were not studied in the present study.

Keywords: Depression, ethnicity, food security status, Iran.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 989

1 Sentiment Analysis of Fake Health News Using Naive Bayes Classification Models

Authors: Danielle Shackley, Yetunde Folajimi

Abstract:

As more people turn to the internet seeking health related information, there is more risk of finding false, inaccurate, or dangerous information. Sentiment analysis is a natural language processing technique that assigns polarity scores of text, ranging from positive, neutral and negative. In this research, we evaluate the weight of a sentiment analysis feature added to fake health news classification models. The dataset consists of existing reliably labeled health article headlines that were supplemented with health information collected about COVID-19 from social media sources. We started with data preprocessing, tested out various vectorization methods such as Count and TFIDF vectorization. We implemented 3 Naive Bayes classifier models, including Bernoulli, Multinomial and Complement. To test the weight of the sentiment analysis feature on the dataset, we created benchmark Naive Bayes classification models without sentiment analysis, and those same models were reproduced and the feature was added. We evaluated using the precision and accuracy scores. The Bernoulli initial model performed with 90% precision and 75.2% accuracy, while the model supplemented with sentiment labels performed with 90.4% precision and stayed constant at 75.2% accuracy. Our results show that the addition of sentiment analysis did not improve model precision by a wide margin; while there was no evidence of improvement in accuracy, we had a 1.9% improvement margin of the precision score with the Complement model. Future expansion of this work could include replicating the experiment process, and substituting the Naive Bayes for a deep learning neural network model.

Keywords: Sentiment analysis, Naive Bayes model, natural language processing, topic analysis, fake health news classification model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 459