Search results for: euclidean classifier
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 406

Search results for: euclidean classifier

346 Random Forest Classification for Population Segmentation

Authors: Regina Chua

Abstract:

To reduce the costs of re-fielding a large survey, a Random Forest classifier was applied to measure the accuracy of classifying individuals into their assigned segments with the fewest possible questions. Given a long survey, one needed to determine the most predictive ten or fewer questions that would accurately assign new individuals to custom segments. Furthermore, the solution needed to be quick in its classification and usable in non-Python environments. In this paper, a supervised Random Forest classifier was modeled on a dataset with 7,000 individuals, 60 questions, and 254 features. The Random Forest consisted of an iterative collection of individual decision trees that result in a predicted segment with robust precision and recall scores compared to a single tree. A random 70-30 stratified sampling for training the algorithm was used, and accuracy trade-offs at different depths for each segment were identified. Ultimately, the Random Forest classifier performed at 87% accuracy at a depth of 10 with 20 instead of 254 features and 10 instead of 60 questions. With an acceptable accuracy in prioritizing feature selection, new tools were developed for non-Python environments: a worksheet with a formulaic version of the algorithm and an embedded function to predict the segment of an individual in real-time. Random Forest was determined to be an optimal classification model by its feature selection, performance, processing speed, and flexible application in other environments.

Keywords: machine learning, supervised learning, data science, random forest, classification, prediction, predictive modeling

Procedia PDF Downloads 61
345 Segmentation of Liver Using Random Forest Classifier

Authors: Gajendra Kumar Mourya, Dinesh Bhatia, Akash Handique, Sunita Warjri, Syed Achaab Amir

Abstract:

Nowadays, Medical imaging has become an integral part of modern healthcare. Abdominal CT images are an invaluable mean for abdominal organ investigation and have been widely studied in the recent years. Diagnosis of liver pathologies is one of the major areas of current interests in the field of medical image processing and is still an open problem. To deeply study and diagnose the liver, segmentation of liver is done to identify which part of the liver is mostly affected. Manual segmentation of the liver in CT images is time-consuming and suffers from inter- and intra-observer differences. However, automatic or semi-automatic computer aided segmentation of the Liver is a challenging task due to inter-patient Liver shape and size variability. In this paper, we present a technique for automatic segmenting the liver from CT images using Random Forest Classifier. Random forests or random decision forests are an ensemble learning method for classification that operate by constructing a multitude of decision trees at training time and outputting the class that is the mode of the classes of the individual trees. After comparing with various other techniques, it was found that Random Forest Classifier provide a better segmentation results with respect to accuracy and speed. We have done the validation of our results using various techniques and it shows above 89% accuracy in all the cases.

Keywords: CT images, image validation, random forest, segmentation

Procedia PDF Downloads 272
344 Classification of Multiple Cancer Types with Deep Convolutional Neural Network

Authors: Nan Deng, Zhenqiu Liu

Abstract:

Thousands of patients with metastatic tumors were diagnosed with cancers of unknown primary sites each year. The inability to identify the primary cancer site may lead to inappropriate treatment and unexpected prognosis. Nowadays, a large amount of genomics and transcriptomics cancer data has been generated by next-generation sequencing (NGS) technologies, and The Cancer Genome Atlas (TCGA) database has accrued thousands of human cancer tumors and healthy controls, which provides an abundance of resource to differentiate cancer types. Meanwhile, deep convolutional neural networks (CNNs) have shown high accuracy on classification among a large number of image object categories. Here, we utilize 25 cancer primary tumors and 3 normal tissues from TCGA and convert their RNA-Seq gene expression profiling to color images; train, validate and test a CNN classifier directly from these images. The performance result shows that our CNN classifier can archive >80% test accuracy on most of the tumors and normal tissues. Since the gene expression pattern of distant metastases is similar to their primary tumors, the CNN classifier may provide a potential computational strategy on identifying the unknown primary origin of metastatic cancer in order to plan appropriate treatment for patients.

Keywords: bioinformatics, cancer, convolutional neural network, deep leaning, gene expression pattern

Procedia PDF Downloads 256
343 Multi-Objective Evolutionary Computation Based Feature Selection Applied to Behaviour Assessment of Children

Authors: F. Jiménez, R. Jódar, M. Martín, G. Sánchez, G. Sciavicco

Abstract:

Abstract—Attribute or feature selection is one of the basic strategies to improve the performances of data classification tasks, and, at the same time, to reduce the complexity of classifiers, and it is a particularly fundamental one when the number of attributes is relatively high. Its application to unsupervised classification is restricted to a limited number of experiments in the literature. Evolutionary computation has already proven itself to be a very effective choice to consistently reduce the number of attributes towards a better classification rate and a simpler semantic interpretation of the inferred classifiers. We present a feature selection wrapper model composed by a multi-objective evolutionary algorithm, the clustering method Expectation-Maximization (EM), and the classifier C4.5 for the unsupervised classification of data extracted from a psychological test named BASC-II (Behavior Assessment System for Children - II ed.) with two objectives: Maximizing the likelihood of the clustering model and maximizing the accuracy of the obtained classifier. We present a methodology to integrate feature selection for unsupervised classification, model evaluation, decision making (to choose the most satisfactory model according to a a posteriori process in a multi-objective context), and testing. We compare the performance of the classifier obtained by the multi-objective evolutionary algorithms ENORA and NSGA-II, and the best solution is then validated by the psychologists that collected the data.

Keywords: evolutionary computation, feature selection, classification, clustering

Procedia PDF Downloads 328
342 HRV Analysis Based Arrhythmic Beat Detection Using kNN Classifier

Authors: Onder Yakut, Oguzhan Timus, Emine Dogru Bolat

Abstract:

Health diseases have a vital significance affecting human being's life and life quality. Sudden death events can be prevented owing to early diagnosis and treatment methods. Electrical signals, taken from the human being's body using non-invasive methods and showing the heart activity is called Electrocardiogram (ECG). The ECG signal is used for following daily activity of the heart by clinicians. Heart Rate Variability (HRV) is a physiological parameter giving the variation between the heart beats. ECG data taken from MITBIH Arrhythmia Database is used in the model employed in this study. The detection of arrhythmic heart beats is aimed utilizing the features extracted from the HRV time domain parameters. The developed model provides a satisfactory performance with ~89% accuracy, 91.7 % sensitivity and 85% specificity rates for the detection of arrhythmic beats.

Keywords: arrhythmic beat detection, ECG, HRV, kNN classifier

Procedia PDF Downloads 307
341 Audio Information Retrieval in Mobile Environment with Fast Audio Classifier

Authors: Bruno T. Gomes, José A. Menezes, Giordano Cabral

Abstract:

With the popularity of smartphones, mobile apps emerge to meet the diverse needs, however the resources at the disposal are limited, either by the hardware, due to the low computing power, or the software, that does not have the same robustness of desktop environment. For example, in automatic audio classification (AC) tasks, musical information retrieval (MIR) subarea, is required a fast processing and a good success rate. However the mobile platform has limited computing power and the best AC tools are only available for desktop. To solve these problems the fast classifier suits, to mobile environments, the most widespread MIR technologies, seeking a balance in terms of speed and robustness. At the end we found that it is possible to enjoy the best of MIR for mobile environments. This paper presents the results obtained and the difficulties encountered.

Keywords: audio classification, audio extraction, environment mobile, musical information retrieval

Procedia PDF Downloads 500
340 Dynamic Gabor Filter Facial Features-Based Recognition of Emotion in Video Sequences

Authors: T. Hari Prasath, P. Ithaya Rani

Abstract:

In the world of visual technology, recognizing emotions from the face images is a challenging task. Several related methods have not utilized the dynamic facial features effectively for high performance. This paper proposes a method for emotions recognition using dynamic facial features with high performance. Initially, local features are captured by Gabor filter with different scale and orientations in each frame for finding the position and scale of face part from different backgrounds. The Gabor features are sent to the ensemble classifier for detecting Gabor facial features. The region of dynamic features is captured from the Gabor facial features in the consecutive frames which represent the dynamic variations of facial appearances. In each region of dynamic features is normalized using Z-score normalization method which is further encoded into binary pattern features with the help of threshold values. The binary features are passed to Multi-class AdaBoost classifier algorithm with the well-trained database contain happiness, sadness, surprise, fear, anger, disgust, and neutral expressions to classify the discriminative dynamic features for emotions recognition. The developed method is deployed on the Ryerson Multimedia Research Lab and Cohn-Kanade databases and they show significant performance improvement owing to their dynamic features when compared with the existing methods.

Keywords: detecting face, Gabor filter, multi-class AdaBoost classifier, Z-score normalization

Procedia PDF Downloads 238
339 Sign Language Recognition of Static Gestures Using Kinect™ and Convolutional Neural Networks

Authors: Rohit Semwal, Shivam Arora, Saurav, Sangita Roy

Abstract:

This work proposes a supervised framework with deep convolutional neural networks (CNNs) for vision-based sign language recognition of static gestures. Our approach addresses the acquisition and segmentation of correct inputs for the CNN-based classifier. Microsoft Kinect™ sensor, despite complex environmental conditions, can track hands efficiently. Skin Colour based segmentation is applied on cropped images of hands in different poses, used to depict different sign language gestures. The segmented hand images are used as an input for our classifier. The CNN classifier proposed in the paper is able to classify the input images with a high degree of accuracy. The system was trained and tested on 39 static sign language gestures, including 26 letters of the alphabet and 13 commonly used words. This paper includes a problem definition for building the proposed system, which acts as a sign language translator between deaf/mute and the rest of the society. It is then followed by a focus on reviewing existing knowledge in the area and work done by other researchers. It also describes the working principles behind different components of CNNs in brief. The architecture and system design specifications of the proposed system are discussed in the subsequent sections of the paper to give the reader a clear picture of the system in terms of the capability required. The design then gives the top-level details of how the proposed system meets the requirements.

Keywords: sign language, CNN, HCI, segmentation

Procedia PDF Downloads 103
338 Identity Verification Using k-NN Classifiers and Autistic Genetic Data

Authors: Fuad M. Alkoot

Abstract:

DNA data have been used in forensics for decades. However, current research looks at using the DNA as a biometric identity verification modality. The goal is to improve the speed of identification. We aim at using gene data that was initially used for autism detection to find if and how accurate is this data for identification applications. Mainly our goal is to find if our data preprocessing technique yields data useful as a biometric identification tool. We experiment with using the nearest neighbor classifier to identify subjects. Results show that optimal classification rate is achieved when the test set is corrupted by normally distributed noise with zero mean and standard deviation of 1. The classification rate is close to optimal at higher noise standard deviation reaching 3. This shows that the data can be used for identity verification with high accuracy using a simple classifier such as the k-nearest neighbor (k-NN). 

Keywords: biometrics, genetic data, identity verification, k nearest neighbor

Procedia PDF Downloads 216
337 Performance Analysis of Artificial Neural Network Based Land Cover Classification

Authors: Najam Aziz, Nasru Minallah, Ahmad Junaid, Kashaf Gul

Abstract:

Landcover classification using automated classification techniques, while employing remotely sensed multi-spectral imagery, is one of the promising areas of research. Different land conditions at different time are captured through satellite and monitored by applying different classification algorithms in specific environment. In this paper, a SPOT-5 image provided by SUPARCO has been studied and classified in Environment for Visual Interpretation (ENVI), a tool widely used in remote sensing. Then, Artificial Neural Network (ANN) classification technique is used to detect the land cover changes in Abbottabad district. Obtained results are compared with a pixel based Distance classifier. The results show that ANN gives the better overall accuracy of 99.20% and Kappa coefficient value of 0.98 over the Mahalanobis Distance Classifier.

Keywords: landcover classification, artificial neural network, remote sensing, SPOT 5

Procedia PDF Downloads 489
336 Learning to Translate by Learning to Communicate to an Entailment Classifier

Authors: Szymon Rutkowski, Tomasz Korbak

Abstract:

We present a reinforcement-learning-based method of training neural machine translation models without parallel corpora. The standard encoder-decoder approach to machine translation suffers from two problems we aim to address. First, it needs parallel corpora, which are scarce, especially for low-resource languages. Second, it lacks psychological plausibility of learning procedure: learning a foreign language is about learning to communicate useful information, not merely learning to transduce from one language’s 'encoding' to another. We instead pose the problem of learning to translate as learning a policy in a communication game between two agents: the translator and the classifier. The classifier is trained beforehand on a natural language inference task (determining the entailment relation between a premise and a hypothesis) in the target language. The translator produces a sequence of actions that correspond to generating translations of both the hypothesis and premise, which are then passed to the classifier. The translator is rewarded for classifier’s performance on determining entailment between sentences translated by the translator to disciple’s native language. Translator’s performance thus reflects its ability to communicate useful information to the classifier. In effect, we train a machine translation model without the need for parallel corpora altogether. While similar reinforcement learning formulations for zero-shot translation were proposed before, there is a number of improvements we introduce. While prior research aimed at grounding the translation task in the physical world by evaluating agents on an image captioning task, we found that using a linguistic task is more sample-efficient. Natural language inference (also known as recognizing textual entailment) captures semantic properties of sentence pairs that are poorly correlated with semantic similarity, thus enforcing basic understanding of the role played by compositionality. It has been shown that models trained recognizing textual entailment produce high-quality general-purpose sentence embeddings transferrable to other tasks. We use stanford natural language inference (SNLI) dataset as well as its analogous datasets for French (XNLI) and Polish (CDSCorpus). Textual entailment corpora can be obtained relatively easily for any language, which makes our approach more extensible to low-resource languages than traditional approaches based on parallel corpora. We evaluated a number of reinforcement learning algorithms (including policy gradients and actor-critic) to solve the problem of translator’s policy optimization and found that our attempts yield some promising improvements over previous approaches to reinforcement-learning based zero-shot machine translation.

Keywords: agent-based language learning, low-resource translation, natural language inference, neural machine translation, reinforcement learning

Procedia PDF Downloads 88
335 Comparison of Various Classification Techniques Using WEKA for Colon Cancer Detection

Authors: Beema Akbar, Varun P. Gopi, V. Suresh Babu

Abstract:

Colon cancer causes the deaths of about half a million people every year. The common method of its detection is histopathological tissue analysis, it leads to tiredness and workload to the pathologist. A novel method is proposed that combines both structural and statistical pattern recognition used for the detection of colon cancer. This paper presents a comparison among the different classifiers such as Multilayer Perception (MLP), Sequential Minimal Optimization (SMO), Bayesian Logistic Regression (BLR) and k-star by using classification accuracy and error rate based on the percentage split method. The result shows that the best algorithm in WEKA is MLP classifier with an accuracy of 83.333% and kappa statistics is 0.625. The MLP classifier which has a lower error rate, will be preferred as more powerful classification capability.

Keywords: colon cancer, histopathological image, structural and statistical pattern recognition, multilayer perception

Procedia PDF Downloads 538
334 Investigation of Topic Modeling-Based Semi-Supervised Interpretable Document Classifier

Authors: Dasom Kim, William Xiu Shun Wong, Yoonjin Hyun, Donghoon Lee, Minji Paek, Sungho Byun, Namgyu Kim

Abstract:

There have been many researches on document classification for classifying voluminous documents automatically. Through document classification, we can assign a specific category to each unlabeled document on the basis of various machine learning algorithms. However, providing labeled documents manually requires considerable time and effort. To overcome the limitations, the semi-supervised learning which uses unlabeled document as well as labeled documents has been invented. However, traditional document classifiers, regardless of supervised or semi-supervised ones, cannot sufficiently explain the reason or the process of the classification. Thus, in this paper, we proposed a methodology to visualize major topics and class components of each document. We believe that our methodology for visualizing topics and classes of each document can enhance the reliability and explanatory power of document classifiers.

Keywords: data mining, document classifier, text mining, topic modeling

Procedia PDF Downloads 353
333 Application of Smplify-X Algorithm with Enhanced Gender Classifier in 3D Human Pose Estimation

Authors: Jiahe Liu, Hongyang Yu, Miao Luo, Feng Qian

Abstract:

The widespread application of 3D human body reconstruction spans various fields. Smplify-X, an algorithm reliant on single-image input, employs three distinct body parameter templates, necessitating gender classification of individuals within the input image. Researchers employed a ResNet18 network to train a gender classifier within the Smplify-X framework, setting the threshold at 0.9, designating images falling below this threshold as having neutral gender. This model achieved 62.38% accurate predictions and 7.54% incorrect predictions. Our improvement involved refining the MobileNet network, resulting in a raised threshold of 0.97. Consequently, we attained 78.89% accurate predictions and a mere 0.2% incorrect predictions, markedly enhancing prediction precision and enabling more precise 3D human body reconstruction.

Keywords: SMPLX, mobileNet, gender classification, 3D human reconstruction

Procedia PDF Downloads 10
332 A Machine Learning Approach for Performance Prediction Based on User Behavioral Factors in E-Learning Environments

Authors: Naduni Ranasinghe

Abstract:

E-learning environments are getting more popular than any other due to the impact of COVID19. Even though e-learning is one of the best solutions for the teaching-learning process in the academic process, it’s not without major challenges. Nowadays, machine learning approaches are utilized in the analysis of how behavioral factors lead to better adoption and how they related to better performance of the students in eLearning environments. During the pandemic, we realized the academic process in the eLearning approach had a major issue, especially for the performance of the students. Therefore, an approach that investigates student behaviors in eLearning environments using a data-intensive machine learning approach is appreciated. A hybrid approach was used to understand how each previously told variables are related to the other. A more quantitative approach was used referred to literature to understand the weights of each factor for adoption and in terms of performance. The data set was collected from previously done research to help the training and testing process in ML. Special attention was made to incorporating different dimensionality of the data to understand the dependency levels of each. Five independent variables out of twelve variables were chosen based on their impact on the dependent variable, and by considering the descriptive statistics, out of three models developed (Random Forest classifier, SVM, and Decision tree classifier), random forest Classifier (Accuracy – 0.8542) gave the highest value for accuracy. Overall, this work met its goals of improving student performance by identifying students who are at-risk and dropout, emphasizing the necessity of using both static and dynamic data.

Keywords: academic performance prediction, e learning, learning analytics, machine learning, predictive model

Procedia PDF Downloads 108
331 Trabecular Bone Radiograph Characterization Using Fractal, Multifractal Analysis and SVM Classifier

Authors: I. Slim, H. Akkari, A. Ben Abdallah, I. Bhouri, M. Hedi Bedoui

Abstract:

Osteoporosis is a common disease characterized by low bone mass and deterioration of micro-architectural bone tissue, which provokes an increased risk of fracture. This work treats the texture characterization of trabecular bone radiographs. The aim was to analyze according to clinical research a group of 174 subjects: 87 osteoporotic patients (OP) with various bone fracture types and 87 control cases (CC). To characterize osteoporosis, Fractal and MultiFractal (MF) methods were applied to images for features (attributes) extraction. In order to improve the results, a new method of MF spectrum based on the q-stucture function calculation was proposed and a combination of Fractal and MF attributes was used. The Support Vector Machines (SVM) was applied as a classifier to distinguish between OP patients and CC subjects. The features fusion (fractal and MF) allowed a good discrimination between the two groups with an accuracy rate of 96.22%.

Keywords: fractal, micro-architecture analysis, multifractal, osteoporosis, SVM

Procedia PDF Downloads 356
330 A System to Detect Inappropriate Messages in Online Social Networks

Authors: Shivani Singh, Shantanu Nakhare, Kalyani Nair, Rohan Shetty

Abstract:

As social networking is growing at a rapid pace today it is vital that we work on improving its management. Research has shown that the content present in online social networks may have significant influence on impressionable minds. If such platforms are misused, it will lead to negative consequences. Detecting insults or inappropriate messages continues to be one of the most challenging aspects of Online Social Networks (OSNs) today. We address this problem through a Machine Learning Based Soft Text Classifier approach using Support Vector Machine algorithm. The proposed system acts as a screening mechanism the alerts the user about such messages. The messages are classified according to their subject matter and each comment is labeled for the presence of profanity and insults.

Keywords: machine learning, online social networks, soft text classifier, support vector machine

Procedia PDF Downloads 463
329 [Keynote Talk]: Applying p-Balanced Energy Technique to Solve Liouville-Type Problems in Calculus

Authors: Lina Wu, Ye Li, Jia Liu

Abstract:

We are interested in solving Liouville-type problems to explore constancy properties for maps or differential forms on Riemannian manifolds. Geometric structures on manifolds, the existence of constancy properties for maps or differential forms, and energy growth for maps or differential forms are intertwined. In this article, we concentrate on discovery of solutions to Liouville-type problems where manifolds are Euclidean spaces (i.e. flat Riemannian manifolds) and maps become real-valued functions. Liouville-type results of vanishing properties for functions are obtained. The original work in our research findings is to extend the q-energy for a function from finite in Lq space to infinite in non-Lq space by applying p-balanced technique where q = p = 2. Calculation skills such as Hölder's Inequality and Tests for Series have been used to evaluate limits and integrations for function energy. Calculation ideas and computational techniques for solving Liouville-type problems shown in this article, which are utilized in Euclidean spaces, can be universalized as a successful algorithm, which works for both maps and differential forms on Riemannian manifolds. This innovative algorithm has a far-reaching impact on research work of solving Liouville-type problems in the general settings involved with infinite energy. The p-balanced technique in this algorithm provides a clue to success on the road of q-energy extension from finite to infinite.

Keywords: differential forms, holder inequality, Liouville-type problems, p-balanced growth, p-harmonic maps, q-energy growth, tests for series

Procedia PDF Downloads 198
328 Performance Study of Classification Algorithms for Consumer Online Shopping Attitudes and Behavior Using Data Mining

Authors: Rana Alaa El-Deen Ahmed, M. Elemam Shehab, Shereen Morsy, Nermeen Mekawie

Abstract:

With the growing popularity and acceptance of e-commerce platforms, users face an ever increasing burden in actually choosing the right product from the large number of online offers. Thus, techniques for personalization and shopping guides are needed by users. For a pleasant and successful shopping experience, users need to know easily which products to buy with high confidence. Since selling a wide variety of products has become easier due to the popularity of online stores, online retailers are able to sell more products than a physical store. The disadvantage is that the customers might not find products they need. In this research the customer will be able to find the products he is searching for, because recommender systems are used in some ecommerce web sites. Recommender system learns from the information about customers and products and provides appropriate personalized recommendations to customers to find the needed product. In this paper eleven classification algorithms are comparatively tested to find the best classifier fit for consumer online shopping attitudes and behavior in the experimented dataset. The WEKA knowledge analysis tool, which is an open source data mining workbench software used in comparing conventional classifiers to get the best classifier was used in this research. In this research by using the data mining tool (WEKA) with the experimented classifiers the results show that decision table and filtered classifier gives the highest accuracy and the lowest accuracy classification via clustering and simple cart.

Keywords: classification, data mining, machine learning, online shopping, WEKA

Procedia PDF Downloads 316
327 A Dataset of Program Educational Objectives Mapped to ABET Outcomes: Data Cleansing, Exploratory Data Analysis and Modeling

Authors: Addin Osman, Anwar Ali Yahya, Mohammed Basit Kamal

Abstract:

Datasets or collections are becoming important assets by themselves and now they can be accepted as a primary intellectual output of a research. The quality and usage of the datasets depend mainly on the context under which they have been collected, processed, analyzed, validated, and interpreted. This paper aims to present a collection of program educational objectives mapped to student’s outcomes collected from self-study reports prepared by 32 engineering programs accredited by ABET. The manual mapping (classification) of this data is a notoriously tedious, time consuming process. In addition, it requires experts in the area, which are mostly not available. It has been shown the operational settings under which the collection has been produced. The collection has been cleansed, preprocessed, some features have been selected and preliminary exploratory data analysis has been performed so as to illustrate the properties and usefulness of the collection. At the end, the collection has been benchmarked using nine of the most widely used supervised multiclass classification techniques (Binary Relevance, Label Powerset, Classifier Chains, Pruned Sets, Random k-label sets, Ensemble of Classifier Chains, Ensemble of Pruned Sets, Multi-Label k-Nearest Neighbors and Back-Propagation Multi-Label Learning). The techniques have been compared to each other using five well-known measurements (Accuracy, Hamming Loss, Micro-F, Macro-F, and Macro-F). The Ensemble of Classifier Chains and Ensemble of Pruned Sets have achieved encouraging performance compared to other experimented multi-label classification methods. The Classifier Chains method has shown the worst performance. To recap, the benchmark has achieved promising results by utilizing preliminary exploratory data analysis performed on the collection, proposing new trends for research and providing a baseline for future studies.

Keywords: ABET, accreditation, benchmark collection, machine learning, program educational objectives, student outcomes, supervised multi-class classification, text mining

Procedia PDF Downloads 130
326 Modeling Activity Pattern Using XGBoost for Mining Smart Card Data

Authors: Eui-Jin Kim, Hasik Lee, Su-Jin Park, Dong-Kyu Kim

Abstract:

Smart-card data are expected to provide information on activity pattern as an alternative to conventional person trip surveys. The focus of this study is to propose a method for training the person trip surveys to supplement the smart-card data that does not contain the purpose of each trip. We selected only available features from smart card data such as spatiotemporal information on the trip and geographic information system (GIS) data near the stations to train the survey data. XGboost, which is state-of-the-art tree-based ensemble classifier, was used to train data from multiple sources. This classifier uses a more regularized model formalization to control the over-fitting and show very fast execution time with well-performance. The validation results showed that proposed method efficiently estimated the trip purpose. GIS data of station and duration of stay at the destination were significant features in modeling trip purpose.

Keywords: activity pattern, data fusion, smart-card, XGboost

Procedia PDF Downloads 204
325 Assessment of Planet Image for Land Cover Mapping Using Soft and Hard Classifiers

Authors: Lamyaa Gamal El-Deen Taha, Ashraf Sharawi

Abstract:

Planet image is a new data source from planet lab. This research is concerned with the assessment of Planet image for land cover mapping. Two pixel based classifiers and one subpixel based classifier were compared. Firstly, rectification of Planet image was performed. Secondly, a comparison between minimum distance, maximum likelihood and neural network classifications for classification of Planet image was performed. Thirdly, the overall accuracy of classification and kappa coefficient were calculated. Results indicate that neural network classification is best followed by maximum likelihood classifier then minimum distance classification for land cover mapping.

Keywords: planet image, land cover mapping, rectification, neural network classification, multilayer perceptron, soft classifiers, hard classifiers

Procedia PDF Downloads 143
324 Computer Aided Classification of Architectural Distortion in Mammograms Using Texture Features

Authors: Birmohan Singh, V.K.Jain

Abstract:

Computer aided diagnosis systems provide vital opinion to radiologists in the detection of early signs of breast cancer from mammogram images. Masses and microcalcifications, architectural distortions are the major abnormalities. In this paper, a computer aided diagnosis system has been proposed for distinguishing abnormal mammograms with architectural distortion from normal mammogram. Four types of texture features GLCM texture, GLRLM texture, fractal texture and spectral texture features for the regions of suspicion are extracted. Support Vector Machine has been used as classifier in this study. The proposed system yielded an overall sensitivity of 96.47% and accuracy of 96% for the detection of abnormalities with mammogram images collected from Digital Database for Screening Mammography (DDSM) database.

Keywords: architecture distortion, mammograms, GLCM texture features, GLRLM texture features, support vector machine classifier

Procedia PDF Downloads 446
323 Sub-Pixel Level Classification Using Remote Sensing For Arecanut Crop

Authors: S. Athiralakshmi, B.E. Bhojaraja, U. Pruthviraj

Abstract:

In agriculture, remote sensing is applied for monitoring of plant development, evaluating of physiological processes and growth conditions. Especially valuable are the spatio-temporal aspects of the remotely sensed data in detecting crop state differences and stress situations. In this study, hyperion imagery is used for classifying arecanut crops based on their age so that these maps can be used in yield estimation of crops, irrigation purposes, applying fertilizers etc. Traditional hard classifiers assigns the mixed pixels to the dominant classes. The proposed method uses a sub pixel level classifier called linear spectral unmixing available in ENVI software. It provides the relative abundance of surface materials and the context within a pixel that may be a potential solution to effectively identifying the land-cover distribution. Validation is done referring to field spectra collected using spectroradiometer and the ground control points obtained from GPS.

Keywords: FLAASH, Hyperspectral remote sensing, Linear Spectral Unmixing, Spectral Angle Mapper Classifier.

Procedia PDF Downloads 476
322 Smartphone Based Wound Assessment System for Diabetes Patients

Authors: Vaibhav V. Dixit, Shubham Ajay Karwa

Abstract:

Diabetic foot ulcers speak to a critical medical problem. Right now, clinicians and medical caretakers primarily construct their injury evaluation in light of visual examination of wound size and mending status, while the patients themselves rarely have a chance to play a dynamic part. Henceforth, love quantitative and practical examination technique that empowers the patients and their parental figures to take a more dynamic part in every day wound care possibly can quicken wound recuperating, spare travel cost and diminish human services costs. Considering the commonness of cell phones with a high-determination computerized camera, evaluating wounds by breaking down pictures of ceaseless foot ulcers is an alluring choice. In this paper, we propose a novel injury picture examination framework actualized using feature extraction and color segmentation. Here we are using the Normalized minimum distance classifier for classifying the output.

Keywords: diabetic, Gabor wavelet, normalized minimum distance classifier, quantiable parameters

Procedia PDF Downloads 234
321 Margin-Based Feed-Forward Neural Network Classifiers

Authors: Xiaohan Bookman, Xiaoyan Zhu

Abstract:

Margin-Based Principle has been proposed for a long time, it has been proved that this principle could reduce the structural risk and improve the performance in both theoretical and practical aspects. Meanwhile, feed-forward neural network is a traditional classifier, which is very hot at present with a deeper architecture. However, the training algorithm of feed-forward neural network is developed and generated from Widrow-Hoff Principle that means to minimize the squared error. In this paper, we propose a new training algorithm for feed-forward neural networks based on Margin-Based Principle, which could effectively promote the accuracy and generalization ability of neural network classifiers with less labeled samples and flexible network. We have conducted experiments on four UCI open data sets and achieved good results as expected. In conclusion, our model could handle more sparse labeled and more high-dimension data set in a high accuracy while modification from old ANN method to our method is easy and almost free of work.

Keywords: Max-Margin Principle, Feed-Forward Neural Network, classifier, structural risk

Procedia PDF Downloads 291
320 Microarray Gene Expression Data Dimensionality Reduction Using PCA

Authors: Fuad M. Alkoot

Abstract:

Different experimental technologies such as microarray sequencing have been proposed to generate high-resolution genetic data, in order to understand the complex dynamic interactions between complex diseases and the biological system components of genes and gene products. However, the generated samples have a very large dimension reaching thousands. Therefore, hindering all attempts to design a classifier system that can identify diseases based on such data. Additionally, the high overlap in the class distributions makes the task more difficult. The data we experiment with is generated for the identification of autism. It includes 142 samples, which is small compared to the large dimension of the data. The classifier systems trained on this data yield very low classification rates that are almost equivalent to a guess. We aim at reducing the data dimension and improve it for classification. Here, we experiment with applying a multistage PCA on the genetic data to reduce its dimensionality. Results show a significant improvement in the classification rates which increases the possibility of building an automated system for autism detection.

Keywords: PCA, gene expression, dimensionality reduction, classification, autism

Procedia PDF Downloads 522
319 Theater Metaphor in Event Quantification: A Corpus Study

Authors: Zhuo Jing-Schmidt, Jun Lang

Abstract:

Numeral classifiers are common in Asian languages. Research on numeral classifiers primarily focuses on noun classifiers that quantify and individuate nominal referents. There is a scarcity of research on event quantification using verb classifiers. This study aims to understand the semantic and conceptual basis of event quantification in Chinese. From a usage-based Construction Grammar perspective, this study presents a corpus analysis of event quantification in Chinese. Drawing on a large balanced corpus of contemporary Chinese, we analyze 667 NOUN col-lexemes totaling 31136 tokens of a productive numeral classifier construction in Chinese. Using collostructional analysis of the collexemes, the results show that the construction quantifies and classifies dramatic events using a theater-based conceptual metaphor. We argue that the usage patterns reflect the cultural entrenchment of theater as in Chinese conceptualization and the construal of theatricality in linguistic expression. The study has implications for cognitive semantics and construction grammar.

Keywords: event quantification, classifier, corpus, metaphor

Procedia PDF Downloads 33
318 Evaluation of Gesture-Based Password: User Behavioral Features Using Machine Learning Algorithms

Authors: Lakshmidevi Sreeramareddy, Komalpreet Kaur, Nane Pothier

Abstract:

Graphical-based passwords have existed for decades. Their major advantage is that they are easier to remember than an alphanumeric password. However, their disadvantage (especially recognition-based passwords) is the smaller password space, making them more vulnerable to brute force attacks. Graphical passwords are also highly susceptible to the shoulder-surfing effect. The gesture-based password method that we developed is a grid-free, template-free method. In this study, we evaluated the gesture-based passwords for usability and vulnerability. The results of the study are significant. We developed a gesture-based password application for data collection. Two modes of data collection were used: Creation mode and Replication mode. In creation mode (Session 1), users were asked to create six different passwords and reenter each password five times. In replication mode, users saw a password image created by some other user for a fixed duration of time. Three different duration timers, such as 5 seconds (Session 2), 10 seconds (Session 3), and 15 seconds (Session 4), were used to mimic the shoulder-surfing attack. After the timer expired, the password image was removed, and users were asked to replicate the password. There were 74, 57, 50, and 44 users participated in Session 1, Session 2, Session 3, and Session 4 respectfully. In this study, the machine learning algorithms have been applied to determine whether the person is a genuine user or an imposter based on the password entered. Five different machine learning algorithms were deployed to compare the performance in user authentication: namely, Decision Trees, Linear Discriminant Analysis, Naive Bayes Classifier, Support Vector Machines (SVMs) with Gaussian Radial Basis Kernel function, and K-Nearest Neighbor. Gesture-based password features vary from one entry to the next. It is difficult to distinguish between a creator and an intruder for authentication. For each password entered by the user, four features were extracted: password score, password length, password speed, and password size. All four features were normalized before being fed to a classifier. Three different classifiers were trained using data from all four sessions. Classifiers A, B, and C were trained and tested using data from the password creation session and the password replication with a timer of 5 seconds, 10 seconds, and 15 seconds, respectively. The classification accuracies for Classifier A using five ML algorithms are 72.5%, 71.3%, 71.9%, 74.4%, and 72.9%, respectively. The classification accuracies for Classifier B using five ML algorithms are 69.7%, 67.9%, 70.2%, 73.8%, and 71.2%, respectively. The classification accuracies for Classifier C using five ML algorithms are 68.1%, 64.9%, 68.4%, 71.5%, and 69.8%, respectively. SVMs with Gaussian Radial Basis Kernel outperform other ML algorithms for gesture-based password authentication. Results confirm that the shorter the duration of the shoulder-surfing attack, the higher the authentication accuracy. In conclusion, behavioral features extracted from the gesture-based passwords lead to less vulnerable user authentication.

Keywords: authentication, gesture-based passwords, machine learning algorithms, shoulder-surfing attacks, usability

Procedia PDF Downloads 68
317 The Wear Recognition on Guide Surface Based on the Feature of Radar Graph

Authors: Youhang Zhou, Weimin Zeng, Qi Xie

Abstract:

Abstract: In order to solve the wear recognition problem of the machine tool guide surface, a new machine tool guide surface recognition method based on the radar-graph barycentre feature is presented in this paper. Firstly, the gray mean value, skewness, projection variance, flat degrees and kurtosis features of the guide surface image data are defined as primary characteristics. Secondly, data Visualization technology based on radar graph is used. The visual barycentre graphical feature is demonstrated based on the radar plot of multi-dimensional data. Thirdly, a classifier based on the support vector machine technology is used, the radar-graph barycentre feature and wear original feature are put into the classifier separately for classification and comparative analysis of classification and experiment results. The calculation and experimental results show that the method based on the radar-graph barycentre feature can detect the guide surface effectively.

Keywords: guide surface, wear defects, feature extraction, data visualization

Procedia PDF Downloads 469