Search results for: supervised maximum likelihood classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 6660

Search results for: supervised maximum likelihood classification

6360 Conformance to Spatial Planning between the Kampala Physical Development Plan of 2012 and the Existing Land Use in 2021

Authors: Brendah Nagula, Omolo Fredrick Okalebo, Ronald Ssengendo, Ivan Bamweyana

Abstract:

The Kampala Physical Development Plan (KPDP) was developed in 2012 and projected both long term and short term developments within the City .The purpose of the plan was to not only shape the city into a spatially planned area but also to control the urban sprawl trends that had expanded with pronounced instances of informal settlements. This plan was approved by the National Physical Planning Board and a signature was appended by the Minister in 2013. Much as the KPDP plan has been implemented using different approaches such as detailed planning, development control, subdivision planning, carrying out construction inspections, greening and beautification, there is still limited knowledge on the level of conformance towards this plan. Therefore, it is yet to be determined whether it has been effective in shaping the City into an ideal spatially planned area. Attaining a clear picture of the level of conformance towards the KPDP 2012 through evaluation between the planned and the existing land use in Kampala City was performed. Methods such as Supervised Classification and Post Classification Change Detection were adopted to perform this evaluation. Scrutiny of findings revealed Central Division registered the lowest level of conformance to the planning standards specified in the KPDP 2012 followed by Nakawa, Rubaga, Kawempe, and Makindye. Furthermore, mixed-use development was identified as the land use with the highest level of non-conformity of 25.11% and institutional land use registered the highest level of conformance of 84.45 %. The results show that the aspect of location was not carefully considered while allocating uses in the KPDP whereby areas located near the Central Business District have higher land rents and hence require uses that ensure profit maximization. Also, the prominence of development towards mixed-use denotes an increased demand for land towards compact development that was not catered for in the plan. Therefore in order to transform Kampala city into a spatially planned area, there is need to carefully develop detailed plans especially for all the Central Division planning precincts indicating considerations for land use densification.

Keywords: spatial plan, post classification change detection, Kampala city, landuse

Procedia PDF Downloads 78
6359 Automatic Classification Using Dynamic Fuzzy C Means Algorithm and Mathematical Morphology: Application in 3D MRI Image

Authors: Abdelkhalek Bakkari

Abstract:

Image segmentation is a critical step in image processing and pattern recognition. In this paper, we proposed a new robust automatic image classification based on a dynamic fuzzy c-means algorithm and mathematical morphology. The proposed segmentation algorithm (DFCM_MM) has been applied to MR perfusion images. The obtained results show the validity and robustness of the proposed approach.

Keywords: segmentation, classification, dynamic, fuzzy c-means, MR image

Procedia PDF Downloads 459
6358 A Stochastic Diffusion Process Based on the Two-Parameters Weibull Density Function

Authors: Meriem Bahij, Ahmed Nafidi, Boujemâa Achchab, Sílvio M. A. Gama, José A. O. Matos

Abstract:

Stochastic modeling concerns the use of probability to model real-world situations in which uncertainty is present. Therefore, the purpose of stochastic modeling is to estimate the probability of outcomes within a forecast, i.e. to be able to predict what conditions or decisions might happen under different situations. In the present study, we present a model of a stochastic diffusion process based on the bi-Weibull distribution function (its trend is proportional to the bi-Weibull probability density function). In general, the Weibull distribution has the ability to assume the characteristics of many different types of distributions. This has made it very popular among engineers and quality practitioners, who have considered it the most commonly used distribution for studying problems such as modeling reliability data, accelerated life testing, and maintainability modeling and analysis. In this work, we start by obtaining the probabilistic characteristics of this model, as the explicit expression of the process, its trends, and its distribution by transforming the diffusion process in a Wiener process as shown in the Ricciaardi theorem. Then, we develop the statistical inference of this model using the maximum likelihood methodology. Finally, we analyse with simulated data the computational problems associated with the parameters, an issue of great importance in its application to real data with the use of the convergence analysis methods. Overall, the use of a stochastic model reflects only a pragmatic decision on the part of the modeler. According to the data that is available and the universe of models known to the modeler, this model represents the best currently available description of the phenomenon under consideration.

Keywords: diffusion process, discrete sampling, likelihood estimation method, simulation, stochastic diffusion process, trends functions, bi-parameters weibull density function

Procedia PDF Downloads 296
6357 Classification of Construction Projects

Authors: M. Safa, A. Sabet, S. MacGillivray, M. Davidson, K. Kaczmarczyk, C. T. Haas, G. E. Gibson, D. Rayside

Abstract:

To address construction project requirements and specifications, scholars and practitioners need to establish a taxonomy according to a scheme that best fits their need. While existing characterization methods are continuously being improved, new ones are devised to cover project properties which have not been previously addressed. One such method, the Project Definition Rating Index (PDRI), has received limited consideration strictly as a classification scheme. Developed by the Construction Industry Institute (CII) in 1996, the PDRI has been refined over the last two decades as a method for evaluating a project's scope definition completeness during front-end planning (FEP). The main contribution of this study is a review of practical project classification methods, and a discussion of how PDRI can be used to classify projects based on their readiness in the FEP phase. The proposed model has been applied to 59 construction projects in Ontario, and the results are discussed.

Keywords: project classification, project definition rating index (PDRI), risk, project goals alignment

Procedia PDF Downloads 664
6356 New Approach to Construct Phylogenetic Tree

Authors: Ouafae Baida, Najma Hamzaoui, Maha Akbib, Abdelfettah Sedqui, Abdelouahid Lyhyaoui

Abstract:

Numerous scientific works present various methods to analyze the data for several domains, specially the comparison of classifications. In our recent work, we presented a new approach to help the user choose the best classification method from the results obtained by every method, by basing itself on the distances between the trees of classification. The result of our approach was in the form of a dendrogram contains methods as a succession of connections. This approach is much needed in phylogeny analysis. This discipline is intended to analyze the sequences of biological macro molecules for information on the evolutionary history of living beings, including their relationship. The product of phylogeny analysis is a phylogenetic tree. In this paper, we recommend the use of a new method of construction the phylogenetic tree based on comparison of different classifications obtained by different molecular genes.

Keywords: hierarchical classification, classification methods, structure of tree, genes, phylogenetic analysis

Procedia PDF Downloads 492
6355 The Normal-Generalized Hyperbolic Secant Distribution: Properties and Applications

Authors: Hazem M. Al-Mofleh

Abstract:

In this paper, a new four-parameter univariate continuous distribution called the Normal-Generalized Hyperbolic Secant Distribution (NGHS) is defined and studied. Some general and structural distributional properties are investigated and discussed, including: central and non-central n-th moments and incomplete moments, quantile and generating functions, hazard function, Rényi and Shannon entropies, shapes: skewed right, skewed left, and symmetric, modality regions: unimodal and bimodal, maximum likelihood (MLE) estimators for the parameters. Finally, two real data sets are used to demonstrate empirically its flexibility and prove the strength of the new distribution.

Keywords: bimodality, estimation, hazard function, moments, Shannon’s entropy

Procedia PDF Downloads 327
6354 Optimization Analysis of Controlled Cooling Process for H-Shape Steam Beams

Authors: Jiin-Yuh Jang, Yu-Feng Gan

Abstract:

In order to improve the comprehensive mechanical properties of the steel, the cooling rate, and the temperature distribution must be controlled in the cooling process. A three-dimensional numerical model for the prediction of the heat transfer coefficient distribution of H-beam in the controlled cooling process was performed in order to obtain the uniform temperature distribution and minimize the maximum stress and the maximum deformation after the controlled cooling. An algorithm developed with a simplified conjugated-gradient method was used as an optimizer to optimize the heat transfer coefficient distribution. The numerical results showed that, for the case of air cooling 5 seconds followed by water cooling 6 seconds with uniform the heat transfer coefficient, the cooling rate is 15.5 (℃/s), the maximum temperature difference is 85℃, the maximum the stress is 125 MPa, and the maximum deformation is 1.280 mm. After optimize the heat transfer coefficient distribution in control cooling process with the same cooling time, the cooling rate is increased to 20.5 (℃/s), the maximum temperature difference is decreased to 52℃, the maximum stress is decreased to 82MPa and the maximum deformation is decreased to 1.167mm.

Keywords: controlled cooling, H-Beam, optimization, thermal stress

Procedia PDF Downloads 355
6353 Sterilization Incident Analysis by the Association of Litigation and Risk Management Method

Authors: Souhir Chelly, Asma Ben Cheikh, Hela Ghali, Salwa Khefacha, Lamine Dhidah, Mohamed Ben Rejeb, Houyem Said Latiri

Abstract:

The hospital risk management department is firstly involved in the methodological analysis of grade zero sterilization incidents. The system is based on a subsequent analysis process in compliance with the ongoing requirements of the Haute Autorité de santé (HAS) for a reactive approach to risk, allowing to identify failures and start the appropriate preventive and corrective measures. The use of the association of litigation and risk management (ALARM) method makes easier the grade zero analysis and brings to light the team or institutional, organizational, temporal, individual factors representative of undesirable effects. Two main factors come out again from this analysis, pre-disinfection step of the emergency block unsupervised instrumentalist intern was poorly done since she did not remove the battery from micro air motor. At the sterilization unit, the worker who was not supervised by the nurse did the conditioning of the motor without having checked it if it still contained the battery. The main cause is that the management of human resources was inadequate at both levels, the instrumental trainee in the block who was not supervised by his supervisor and the worker of the sterilization unit who was not supervised by the responsible nurse. There is a lack of research help, advice, and collaboration. The difficulties encountered during this type of analysis are multiple. The first is based on its necessary acceptance by the various actors of care involved, which should not perceive it as a tool leading to individual punishment, but rather as a means to improve their practices.

Keywords: ALARM (Association of Litigation and Risk Management Method), incident, risk management, sterilization

Procedia PDF Downloads 206
6352 Bridging the Data Gap for Sexism Detection in Twitter: A Semi-Supervised Approach

Authors: Adeep Hande, Shubham Agarwal

Abstract:

This paper presents a study on identifying sexism in online texts using various state-of-the-art deep learning models based on BERT. We experimented with different feature sets and model architectures and evaluated their performance using precision, recall, F1 score, and accuracy metrics. We also explored the use of pseudolabeling technique to improve model performance. Our experiments show that the best-performing models were based on BERT, and their multilingual model achieved an F1 score of 0.83. Furthermore, the use of pseudolabeling significantly improved the performance of the BERT-based models, with the best results achieved using the pseudolabeling technique. Our findings suggest that BERT-based models with pseudolabeling hold great promise for identifying sexism in online texts with high accuracy.

Keywords: large language models, semi-supervised learning, sexism detection, data sparsity

Procedia PDF Downloads 57
6351 Early Diagnosis of Myocardial Ischemia Based on Support Vector Machine and Gaussian Mixture Model by Using Features of ECG Recordings

Authors: Merve Begum Terzi, Orhan Arikan, Adnan Abaci, Mustafa Candemir

Abstract:

Acute myocardial infarction is a major cause of death in the world. Therefore, its fast and reliable diagnosis is a major clinical need. ECG is the most important diagnostic methodology which is used to make decisions about the management of the cardiovascular diseases. In patients with acute myocardial ischemia, temporary chest pains together with changes in ST segment and T wave of ECG occur shortly before the start of myocardial infarction. In this study, a technique which detects changes in ST/T sections of ECG is developed for the early diagnosis of acute myocardial ischemia. For this purpose, a database of real ECG recordings that contains a set of records from 75 patients presenting symptoms of chest pain who underwent elective percutaneous coronary intervention (PCI) is constituted. 12-lead ECG’s of the patients were recorded before and during the PCI procedure. Two ECG epochs, which are the pre-inflation ECG which is acquired before any catheter insertion and the occlusion ECG which is acquired during balloon inflation, are analyzed for each patient. By using pre-inflation and occlusion recordings, ECG features that are critical in the detection of acute myocardial ischemia are identified and the most discriminative features for the detection of acute myocardial ischemia are extracted. A classification technique based on support vector machine (SVM) approach operating with linear and radial basis function (RBF) kernels to detect ischemic events by using ST-T derived joint features from non-ischemic and ischemic states of the patients is developed. The dataset is randomly divided into training and testing sets and the training set is used to optimize SVM hyperparameters by using grid-search method and 10fold cross-validation. SVMs are designed specifically for each patient by tuning the kernel parameters in order to obtain the optimal classification performance results. As a result of implementing the developed classification technique to real ECG recordings, it is shown that the proposed technique provides highly reliable detections of the anomalies in ECG signals. Furthermore, to develop a detection technique that can be used in the absence of ECG recording obtained during healthy stage, the detection of acute myocardial ischemia based on ECG recordings of the patients obtained during ischemia is also investigated. For this purpose, a Gaussian mixture model (GMM) is used to represent the joint pdf of the most discriminating ECG features of myocardial ischemia. Then, a Neyman-Pearson type of approach is developed to provide detection of outliers that would correspond to acute myocardial ischemia. Neyman – Pearson decision strategy is used by computing the average log likelihood values of ECG segments and comparing them with a range of different threshold values. For different discrimination threshold values and number of ECG segments, probability of detection and probability of false alarm values are computed, and the corresponding ROC curves are obtained. The results indicate that increasing number of ECG segments provide higher performance for GMM based classification. Moreover, the comparison between the performances of SVM and GMM based classification showed that SVM provides higher classification performance results over ECG recordings of considerable number of patients.

Keywords: ECG classification, Gaussian mixture model, Neyman–Pearson approach, support vector machine

Procedia PDF Downloads 149
6350 Development of Fake News Model Using Machine Learning through Natural Language Processing

Authors: Sajjad Ahmed, Knut Hinkelmann, Flavio Corradini

Abstract:

Fake news detection research is still in the early stage as this is a relatively new phenomenon in the interest raised by society. Machine learning helps to solve complex problems and to build AI systems nowadays and especially in those cases where we have tacit knowledge or the knowledge that is not known. We used machine learning algorithms and for identification of fake news; we applied three classifiers; Passive Aggressive, Naïve Bayes, and Support Vector Machine. Simple classification is not completely correct in fake news detection because classification methods are not specialized for fake news. With the integration of machine learning and text-based processing, we can detect fake news and build classifiers that can classify the news data. Text classification mainly focuses on extracting various features of text and after that incorporating those features into classification. The big challenge in this area is the lack of an efficient way to differentiate between fake and non-fake due to the unavailability of corpora. We applied three different machine learning classifiers on two publicly available datasets. Experimental analysis based on the existing dataset indicates a very encouraging and improved performance.

Keywords: fake news detection, natural language processing, machine learning, classification techniques.

Procedia PDF Downloads 151
6349 Classifying and Predicting Efficiencies Using Interval DEA Grid Setting

Authors: Yiannis G. Smirlis

Abstract:

The classification and the prediction of efficiencies in Data Envelopment Analysis (DEA) is an important issue, especially in large scale problems or when new units frequently enter the under-assessment set. In this paper, we contribute to the subject by proposing a grid structure based on interval segmentations of the range of values for the inputs and outputs. Such intervals combined, define hyper-rectangles that partition the space of the problem. This structure, exploited by Interval DEA models and a dominance relation, acts as a DEA pre-processor, enabling the classification and prediction of efficiency scores, without applying any DEA models.

Keywords: data envelopment analysis, interval DEA, efficiency classification, efficiency prediction

Procedia PDF Downloads 158
6348 Exploring the Role of Data Mining in Crime Classification: A Systematic Literature Review

Authors: Faisal Muhibuddin, Ani Dijah Rahajoe

Abstract:

This in-depth exploration, through a systematic literature review, scrutinizes the nuanced role of data mining in the classification of criminal activities. The research focuses on investigating various methodological aspects and recent developments in leveraging data mining techniques to enhance the effectiveness and precision of crime categorization. Commencing with an exposition of the foundational concepts of crime classification and its evolutionary dynamics, this study details the paradigm shift from conventional methods towards approaches supported by data mining, addressing the challenges and complexities inherent in the modern crime landscape. Specifically, the research delves into various data mining techniques, including K-means clustering, Naïve Bayes, K-nearest neighbour, and clustering methods. A comprehensive review of the strengths and limitations of each technique provides insights into their respective contributions to improving crime classification models. The integration of diverse data sources takes centre stage in this research. A detailed analysis explores how the amalgamation of structured data (such as criminal records) and unstructured data (such as social media) can offer a holistic understanding of crime, enriching classification models with more profound insights. Furthermore, the study explores the temporal implications in crime classification, emphasizing the significance of considering temporal factors to comprehend long-term trends and seasonality. The availability of real-time data is also elucidated as a crucial element in enhancing responsiveness and accuracy in crime classification.

Keywords: data mining, classification algorithm, naïve bayes, k-means clustering, k-nearest neigbhor, crime, data analysis, sistematic literature review

Procedia PDF Downloads 50
6347 Feature Weighting Comparison Based on Clustering Centers in the Detection of Diabetic Retinopathy

Authors: Kemal Polat

Abstract:

In this paper, three feature weighting methods have been used to improve the classification performance of diabetic retinopathy (DR). To classify the diabetic retinopathy, features extracted from the output of several retinal image processing algorithms, such as image-level, lesion-specific and anatomical components, have been used and fed them into the classifier algorithms. The dataset used in this study has been taken from University of California, Irvine (UCI) machine learning repository. Feature weighting methods including the fuzzy c-means clustering based feature weighting, subtractive clustering based feature weighting, and Gaussian mixture clustering based feature weighting, have been used and compered with each other in the classification of DR. After feature weighting, five different classifier algorithms comprising multi-layer perceptron (MLP), k- nearest neighbor (k-NN), decision tree, support vector machine (SVM), and Naïve Bayes have been used. The hybrid method based on combination of subtractive clustering based feature weighting and decision tree classifier has been obtained the classification accuracy of 100% in the screening of DR. These results have demonstrated that the proposed hybrid scheme is very promising in the medical data set classification.

Keywords: machine learning, data weighting, classification, data mining

Procedia PDF Downloads 316
6346 Radar on Bike: Coarse Classification based on Multi-Level Clustering for Cyclist Safety Enhancement

Authors: Asma Omri, Noureddine Benothman, Sofiane Sayahi, Fethi Tlili, Hichem Besbes

Abstract:

Cycling, a popular mode of transportation, can also be perilous due to cyclists' vulnerability to collisions with vehicles and obstacles. This paper presents an innovative cyclist safety system based on radar technology designed to offer real-time collision risk warnings to cyclists. The system incorporates a low-power radar sensor affixed to the bicycle and connected to a microcontroller. It leverages radar point cloud detections, a clustering algorithm, and a supervised classifier. These algorithms are optimized for efficiency to run on the TI’s AWR 1843 BOOST radar, utilizing a coarse classification approach distinguishing between cars, trucks, two-wheeled vehicles, and other objects. To enhance the performance of clustering techniques, we propose a 2-Level clustering approach. This approach builds on the state-of-the-art Density-based spatial clustering of applications with noise (DBSCAN). The objective is to first cluster objects based on their velocity, then refine the analysis by clustering based on position. The initial level identifies groups of objects with similar velocities and movement patterns. The subsequent level refines the analysis by considering the spatial distribution of these objects. The clusters obtained from the first level serve as input for the second level of clustering. Our proposed technique surpasses the classical DBSCAN algorithm in terms of geometrical metrics, including homogeneity, completeness, and V-score. Relevant cluster features are extracted and utilized to classify objects using an SVM classifier. Potential obstacles are identified based on their velocity and proximity to the cyclist. To optimize the system, we used the View of Delft dataset for hyperparameter selection and SVM classifier training. The system's performance was assessed using our collected dataset of radar point clouds synchronized with a camera on an Nvidia Jetson Nano board. The radar-based cyclist safety system is a practical solution that can be easily installed on any bicycle and connected to smartphones or other devices, offering real-time feedback and navigation assistance to cyclists. We conducted experiments to validate the system's feasibility, achieving an impressive 85% accuracy in the classification task. This system has the potential to significantly reduce the number of accidents involving cyclists and enhance their safety on the road.

Keywords: 2-level clustering, coarse classification, cyclist safety, warning system based on radar technology

Procedia PDF Downloads 67
6345 Feature Extraction and Classification Based on the Bayes Test for Minimum Error

Authors: Nasar Aldian Ambark Shashoa

Abstract:

Classification with a dimension reduction based on Bayesian approach is proposed in this paper . The first step is to generate a sample (parameter) of fault-free mode class and faulty mode class. The second, in order to obtain good classification performance, a selection of important features is done with the discrete karhunen-loeve expansion. Next, the Bayes test for minimum error is used to classify the classes. Finally, the results for simulated data demonstrate the capabilities of the proposed procedure.

Keywords: analytical redundancy, fault detection, feature extraction, Bayesian approach

Procedia PDF Downloads 514
6344 Network Traffic Classification Scheme for Internet Network Based on Application Categorization for Ipv6

Authors: Yaser Miaji, Mohammed Aloryani

Abstract:

The rise of recent applications in everyday implementation like videoconferencing, online recreation and voice speech communication leads to pressing the need for novel mechanism and policy to serve this steep improvement within the application itself and users‟ wants. This diversity in web traffics needs some classification and prioritization of the traffics since some traffics merit abundant attention with less delay and loss, than others. This research is intended to reinforce the mechanism by analysing the performance in application according to the proposed mechanism implemented. The mechanism used is quite direct and analytical. The mechanism is implemented by modifying the queue limit in the algorithm.

Keywords: traffic classification, IPv6, internet, application categorization

Procedia PDF Downloads 548
6343 Modern Machine Learning Conniptions for Automatic Speech Recognition

Authors: S. Jagadeesh Kumar

Abstract:

This expose presents a luculent of recent machine learning practices as employed in the modern and as pertinent to prospective automatic speech recognition schemes. The aspiration is to promote additional traverse ablution among the machine learning and automatic speech recognition factions that have transpired in the precedent. The manuscript is structured according to the chief machine learning archetypes that are furthermore trendy by now or have latency for building momentous hand-outs to automatic speech recognition expertise. The standards offered and convoluted in this article embraces adaptive and multi-task learning, active learning, Bayesian learning, discriminative learning, generative learning, supervised and unsupervised learning. These learning archetypes are aggravated and conferred in the perspective of automatic speech recognition tools and functions. This manuscript bequeaths and surveys topical advances of deep learning and learning with sparse depictions; further limelight is on their incessant significance in the evolution of automatic speech recognition.

Keywords: automatic speech recognition, deep learning methods, machine learning archetypes, Bayesian learning, supervised and unsupervised learning

Procedia PDF Downloads 430
6342 Time of Week Intensity Estimation from Interval Censored Data with Application to Police Patrol Planning

Authors: Jiahao Tian, Michael D. Porter

Abstract:

Law enforcement agencies are tasked with crime prevention and crime reduction under limited resources. Having an accurate temporal estimate of the crime rate would be valuable to achieve such a goal. However, estimation is usually complicated by the interval-censored nature of crime data. We cast the problem of intensity estimation as a Poisson regression using an EM algorithm to estimate the parameters. Two special penalties are added that provide smoothness over the time of day and day of the week. This approach presented here provides accurate intensity estimates and can also uncover day-of-week clusters that share the same intensity patterns. Anticipating where and when crimes might occur is a key element to successful policing strategies. However, this task is complicated by the presence of interval-censored data. The censored data refers to the type of data that the event time is only known to lie within an interval instead of being observed exactly. This type of data is prevailing in the field of criminology because of the absence of victims for certain types of crime. Despite its importance, the research in temporal analysis of crime has lagged behind the spatial component. Inspired by the success of solving crime-related problems with a statistical approach, we propose a statistical model for the temporal intensity estimation of crime with censored data. The model is built on Poisson regression and has special penalty terms added to the likelihood. An EM algorithm was derived to obtain maximum likelihood estimates, and the resulting model shows superior performance to the competing model. Our research is in line with the smart policing initiative (SPI) proposed by the Bureau Justice of Assistance (BJA) as an effort to support law enforcement agencies in building evidence-based, data-driven law enforcement tactics. The goal is to identify strategic approaches that are effective in crime prevention and reduction. In our case, we allow agencies to deploy their resources for a relatively short period of time to achieve the maximum level of crime reduction. By analyzing a particular area within cities where data are available, our proposed approach could not only provide an accurate estimate of intensities for the time unit considered but a time-variation crime incidence pattern. Both will be helpful in the allocation of limited resources by either improving the existing patrol plan with the understanding of the discovery of the day of week cluster or supporting extra resources available.

Keywords: cluster detection, EM algorithm, interval censoring, intensity estimation

Procedia PDF Downloads 58
6341 Emotion Detection in Twitter Messages Using Combination of Long Short-Term Memory and Convolutional Deep Neural Networks

Authors: Bahareh Golchin, Nooshin Riahi

Abstract:

One of the most significant issues as attended a lot in recent years is that of recognizing the sentiments and emotions in social media texts. The analysis of sentiments and emotions is intended to recognize the conceptual information such as the opinions, feelings, attitudes and emotions of people towards the products, services, organizations, people, topics, events and features in the written text. These indicate the greatness of the problem space. In the real world, businesses and organizations are always looking for tools to gather ideas, emotions, and directions of people about their products, services, or events related to their own. This article uses the Twitter social network, one of the most popular social networks with about 420 million active users, to extract data. Using this social network, users can share their information and opinions about personal issues, policies, products, events, etc. It can be used with appropriate classification of emotional states due to the availability of its data. In this study, supervised learning and deep neural network algorithms are used to classify the emotional states of Twitter users. The use of deep learning methods to increase the learning capacity of the model is an advantage due to the large amount of available data. Tweets collected on various topics are classified into four classes using a combination of two Bidirectional Long Short Term Memory network and a Convolutional network. The results obtained from this study with an average accuracy of 93%, show good results extracted from the proposed framework and improved accuracy compared to previous work.

Keywords: emotion classification, sentiment analysis, social networks, deep neural networks

Procedia PDF Downloads 128
6340 Predictors of the Self-Reported Likelihood of Seeking Social Worker Help among People with Physical Disabilities

Authors: Maya Kagan, Michal Itzick, Patricia Tal-Katz

Abstract:

Social workers hold a variety of roles and practices, and one of these involves the care, treatment, and rehabilitation of disabled people. The current study assesses the association between demographic factors, attitudes towards social workers, the stigma attached to seeking social worker help, perceived social support, and psychological distress - and the self-reported likelihood of seeking social worker help, among people with physical disabilities (PWPD) in Israel. Data collection utilized structured questionnaires, administered to a sample of 435 PWPD. Statistical analyses were done using SPSS software. The findings suggest that women, older respondents, people with more positive attitudes towards social workers, with higher levels of psychological distress and of social support, and with a lower level of stigma, reported a greater likelihood of seeking social worker help. The study's conclusion is that there are certain avoidance factors among PWPD that might discourage them from seeking professional social worker help. Therefore, it is important that social workers identify these factors and develop interventions aimed at encouraging PWPD to seek professional social worker help in case of need, and also develop practices adjusted to PWPD's unique needs.

Keywords: attitudes towards social workers, people with physical disabilities, perceived social support, psychological distress, seeking help, stigma

Procedia PDF Downloads 326
6339 Modelling Structural Breaks in Stock Price Time Series Using Stochastic Differential Equations

Authors: Daniil Karzanov

Abstract:

This paper studies the effect of quarterly earnings reports on the stock price. The profitability of the stock is modeled by geometric Brownian diffusion and the Constant Elasticity of Variance model. We fit several variations of stochastic differential equations to the pre-and after-report period using the Maximum Likelihood Estimation and Grid Search of parameters method. By examining the change in the model parameters after reports’ publication, the study reveals that the reports have enough evidence to be a structural breakpoint, meaning that all the forecast models exploited are not applicable for forecasting and should be refitted shortly.

Keywords: stock market, earnings reports, financial time series, structural breaks, stochastic differential equations

Procedia PDF Downloads 190
6338 Syndromic Surveillance Framework Using Tweets Data Analytics

Authors: David Ming Liu, Benjamin Hirsch, Bashir Aden

Abstract:

Syndromic surveillance is to detect or predict disease outbreaks through the analysis of medical sources of data. Using social media data like tweets to do syndromic surveillance becomes more and more popular with the aid of open platform to collect data and the advantage of microblogging text and mobile geographic location features. In this paper, a Syndromic Surveillance Framework is presented with machine learning kernel using tweets data analytics. Influenza and the three cities Abu Dhabi, Al Ain and Dubai of United Arabic Emirates are used as the test disease and trial areas. Hospital cases data provided by the Health Authority of Abu Dhabi (HAAD) are used for the correlation purpose. In our model, Latent Dirichlet allocation (LDA) engine is adapted to do supervised learning classification and N-Fold cross validation confusion matrix are given as the simulation results with overall system recall 85.595% performance achieved.

Keywords: Syndromic surveillance, Tweets, Machine Learning, data mining, Latent Dirichlet allocation (LDA), Influenza

Procedia PDF Downloads 102
6337 Effect of Nitrogen and Gibberellic Acid at Different Level and their Interaction on Calendula

Authors: Pragnyashree Mishra, Shradhanjali Mohapatra

Abstract:

The present investigation is carried out to know the effect of foliar feeding of nitrogen and gibberellic acid on vegetative growth, flowering behaviour and yield of calendula variety ‘Golden Emporer’. The experiment was laid out in RBD in rabi season of 2013-14. There are 16 treatments are taken at different level such as nitrogen (at 0%,1%,2%,3%) and GA3 (at 50 ppm,100ppm,150 ppm). Among them maximum height at bud initiation stage was obtained at 3% nitrogen (27.00 cm) and at 150 ppm GA3 (26.5 cm), fist flowering was obtained at 3% nitrogen(60.00 days) and at 150 ppm GA3 (63.75 days), maximum flower stalk length was obtained at 3% nitrogen(3.50 cm) and at 150 ppm GA3 (5.42 cm),maximum duration of flowering was obtained at 3% nitrogen(46.00 days) and at 150 ppm GA3 (46.50days), maximum number of flower was obtained at 3% nitrogen (89.00per plant) and at 150 ppm GA3 (83.50 per plant), maximum flower weight was obtained at 3% nitrogen(1.25 gm per flower) and at 150 ppm GA3 (1.50 gm per flower), maximum yield was was obtained at 3% nitrogen (110.00 gm per plant) and at 150 ppm GA3 (105.00gm per plant) and minimum of all character was obtained when 0% nitrogen0 ppm GA3. All interaction between nitrogen and GA3 was found in significant except the yield .

Keywords: calendula, golden emporer, GA3, nitrogen and gibberellic acid

Procedia PDF Downloads 451
6336 Comparison of the Classification of Cystic Renal Lesions Using the Bosniak Classification System with Contrast Enhanced Ultrasound and Magnetic Resonance Imaging to Computed Tomography: A Prospective Study

Authors: Dechen Tshering Vogel, Johannes T. Heverhagen, Bernard Kiss, Spyridon Arampatzis

Abstract:

In addition to computed tomography (CT), contrast enhanced ultrasound (CEUS), and magnetic resonance imaging (MRI) are being increasingly used for imaging of renal lesions. The aim of this prospective study was to compare the classification of complex cystic renal lesions using the Bosniak classification with CEUS and MRI to CT. Forty-eight patients with 65 cystic renal lesions were included in this study. All participants signed written informed consent. The agreement between the Bosniak classifications of complex renal lesions ( ≥ BII-F) on CEUS and MRI were compared to that of CT and were tested using Cohen’s Kappa. Sensitivity, specificity, positive and negative predictive values (PPV/NPV) and the accuracy of CEUS and MRI compared to CT in the detection of complex renal lesions were calculated. Twenty-nine (45%) out of 65 cystic renal lesions were classified as complex using CT. The agreement between CEUS and CT in the classification of complex cysts was fair (agreement 50.8%, Kappa 0.31), and was excellent between MRI and CT (agreement 93.9%, Kappa 0.88). Compared to CT, MRI had a sensitivity of 96.6%, specificity of 91.7%, a PPV of 54.7%, and an NPV of 54.7% with an accuracy of 63.1%. The corresponding values for CEUS were sensitivity 100.0%, specificity 33.3%, PPV 90.3%, and NPV 97.1% with an accuracy 93.8%. The classification of complex renal cysts based on MRI and CT scans correlated well, and MRI can be used instead of CT for this purpose. CEUS can exclude complex lesions, but due to higher sensitivity, cystic lesions tend to be upgraded. However, it is useful for initial imaging, for follow up of lesions and in those patients with contraindications to CT and MRI.

Keywords: Bosniak classification, computed tomography, contrast enhanced ultrasound, cystic renal lesions, magnetic resonance imaging

Procedia PDF Downloads 132
6335 Financial Fraud Prediction for Russian Non-Public Firms Using Relational Data

Authors: Natalia Feruleva

Abstract:

The goal of this paper is to develop the fraud risk assessment model basing on both relational and financial data and test the impact of the relationships between Russian non-public companies on the likelihood of financial fraud commitment. Relationships mean various linkages between companies such as parent-subsidiary relationship and person-related relationships. These linkages may provide additional opportunities for committing fraud. Person-related relationships appear when firms share a director, or the director owns another firm. The number of companies belongs to CEO and managed by CEO, the number of subsidiaries was calculated to measure the relationships. Moreover, the dummy variable describing the existence of parent company was also included in model. Control variables such as financial leverage and return on assets were also implemented because they describe the motivating factors of fraud. To check the hypotheses about the influence of the chosen parameters on the likelihood of financial fraud, information about person-related relationships between companies, existence of parent company and subsidiaries, profitability and the level of debt was collected. The resulting sample consists of 160 Russian non-public firms. The sample includes 80 fraudsters and 80 non-fraudsters operating in 2006-2017. The dependent variable is dichotomous, and it takes the value 1 if the firm is engaged in financial crime, otherwise 0. Employing probit model, it was revealed that the number of companies which belong to CEO of the firm or managed by CEO has significant impact on the likelihood of financial fraud. The results obtained indicate that the more companies are affiliated with the CEO, the higher the likelihood that the company will be involved in financial crime. The forecast accuracy of the model is about is 80%. Thus, the model basing on both relational and financial data gives high level of forecast accuracy.

Keywords: financial fraud, fraud prediction, non-public companies, regression analysis, relational data

Procedia PDF Downloads 107
6334 Enhancement Method of Network Traffic Anomaly Detection Model Based on Adversarial Training With Category Tags

Authors: Zhang Shuqi, Liu Dan

Abstract:

For the problems in intelligent network anomaly traffic detection models, such as low detection accuracy caused by the lack of training samples, poor effect with small sample attack detection, a classification model enhancement method, F-ACGAN(Flow Auxiliary Classifier Generative Adversarial Network) which introduces generative adversarial network and adversarial training, is proposed to solve these problems. Generating adversarial data with category labels could enhance the training effect and improve classification accuracy and model robustness. FACGAN consists of three steps: feature preprocess, which includes data type conversion, dimensionality reduction and normalization, etc.; A generative adversarial network model with feature learning ability is designed, and the sample generation effect of the model is improved through adversarial iterations between generator and discriminator. The adversarial disturbance factor of the gradient direction of the classification model is added to improve the diversity and antagonism of generated data and to promote the model to learn from adversarial classification features. The experiment of constructing a classification model with the UNSW-NB15 dataset shows that with the enhancement of FACGAN on the basic model, the classification accuracy has improved by 8.09%, and the score of F1 has improved by 6.94%.

Keywords: data imbalance, GAN, ACGAN, anomaly detection, adversarial training, data augmentation

Procedia PDF Downloads 89
6333 International Classification of Primary Care as a Reference for Coding the Demand for Care in Primary Health Care

Authors: Souhir Chelly, Chahida Harizi, Aicha Hechaichi, Sihem Aissaoui, Leila Ben Ayed, Maha Bergaoui, Mohamed Kouni Chahed

Abstract:

Introduction: The International Classification of Primary Care (ICPC) is part of the morbidity classification system. It had 17 chapters, and each is coded by an alphanumeric code: the letter corresponds to the chapter, the number to a paragraph in the chapter. The objective of this study is to show the utility of this classification in the coding of the reasons for demand for care in Primary health care (PHC), its advantages and limits. Methods: This is a cross-sectional descriptive study conducted in 4 PHC in Ariana district. Data on the demand for care during 2 days in the same week were collected. The coding of the information was done according to the CISP. The data was entered and analyzed by the EPI Info 7 software. Results: A total of 523 demands for care were investigated. The patients who came for the consultation are predominantly female (62.72%). Most of the consultants are young with an average age of 35 ± 26 years. In the ICPC, there are 7 rubrics: 'infections' is the most common reason with 49.9%, 'other diagnoses' with 40.2%, 'symptoms and complaints' with 5.5%, 'trauma' with 2.1%, 'procedures' with 2.1% and 'neoplasm' with 0.3%. The main advantage of the ICPC is the fact of being a standardized tool. It is very suitable for classification of the reasons for demand for care in PHC according to their specificity, capacity to be used in a computerized medical file of the PHC. Its current limitations are related to the difficulty of classification of some reasons for demand for care. Conclusion: The ICPC has been developed to provide healthcare with a coding reference that takes into account their specificity. The CIM is in its 10th revision; it would gain from revision to revision to be more efficient to be generalized and used by the teams of PHC.

Keywords: international classification of primary care, medical file, primary health care, Tunisia

Procedia PDF Downloads 251
6332 Parametric Modeling for Survival Data with Competing Risks Using the Generalized Gompertz Distribution

Authors: Noora Al-Shanfari, M. Mazharul Islam

Abstract:

The cumulative incidence function (CIF) is a fundamental approach for analyzing survival data in the presence of competing risks, which estimates the marginal probability for each competing event. Parametric modeling of CIF has the advantage of fitting various shapes of CIF and estimates the impact of covariates with maximum efficiency. To calculate the total CIF's covariate influence using a parametric model., it is essential to parametrize the baseline of the CIF. As the CIF is an improper function by nature, it is necessary to utilize an improper distribution when applying parametric models. The Gompertz distribution, which is an improper distribution, is limited in its applicability as it only accounts for monotone hazard shapes. The generalized Gompertz distribution, however, can adapt to a wider range of hazard shapes, including unimodal, bathtub, and monotonic increasing or decreasing hazard shapes. In this paper, the generalized Gompertz distribution is used to parametrize the baseline of the CIF, and the parameters of the proposed model are estimated using the maximum likelihood approach. The proposed model is compared with the existing Gompertz model using the Akaike information criterion. Appropriate statistical test procedures and model-fitting criteria will be used to test the adequacy of the model. Both models are applied to the ‘colon’ dataset, which is available in the “biostat3” package in R.

Keywords: competing risks, cumulative incidence function, improper distribution, parametric modeling, survival analysis

Procedia PDF Downloads 77
6331 A Quantitative Evaluation of Text Feature Selection Methods

Authors: B. S. Harish, M. B. Revanasiddappa

Abstract:

Due to rapid growth of text documents in digital form, automated text classification has become an important research in the last two decades. The major challenge of text document representations are high dimension, sparsity, volume and semantics. Since the terms are only features that can be found in documents, selection of good terms (features) plays an very important role. In text classification, feature selection is a strategy that can be used to improve classification effectiveness, computational efficiency and accuracy. In this paper, we present a quantitative analysis of most widely used feature selection (FS) methods, viz. Term Frequency-Inverse Document Frequency (tfidf ), Mutual Information (MI), Information Gain (IG), CHISquare (x2), Term Frequency-Relevance Frequency (tfrf ), Term Strength (TS), Ambiguity Measure (AM) and Symbolic Feature Selection (SFS) to classify text documents. We evaluated all the feature selection methods on standard datasets like 20 Newsgroups, 4 University dataset and Reuters-21578.

Keywords: classifiers, feature selection, text classification

Procedia PDF Downloads 447