Search results for: canopy characters classification
2384 An Integrated Lightweight Naïve Bayes Based Webpage Classification Service for Smartphone Browsers
Authors: Mayank Gupta, Siba Prasad Samal, Vasu Kakkirala
Abstract:
The internet and its priorities have changed considerably in the last decade. Browsing on smartphones has increased manifold and is set to grow much further. Users spend considerable time browsing different websites, which gives a great deal of insight into their preferences. Rather than presenting plain information, classifying different aspects of browsing such as Bookmarks, History, and Download Manager into useful categories would improve the user’s experience. Most classification solutions are server-side, which involves maintaining servers and other heavy resources. This approach has security constraints and may miss contextual data during classification. On-device classification solves many such problems, but the challenge is to achieve accurate classification under resource constraints. On-device classification can be much more useful for personalization, reduces dependency on cloud connectivity, and offers better privacy and security. This approach provides more relevant results than current standalone solutions because it uses content rendered by the browser, which the content provider customizes based on the user’s profile. This paper proposes a lightweight Naive Bayes based classification engine targeted at resource-constrained devices. Our solution integrates with the web browser, which in turn triggers the classification algorithm. Whenever a user browses a webpage, the solution extracts DOM tree data from the browser’s rendering engine. This DOM data is dynamic, contextual, and secure, and cannot be replicated. The solution extracts different features of the webpage and runs them through an algorithm that classifies the page into multiple categories. A Naive Bayes based engine is chosen for its inherent advantage of using limited resources compared to other classification algorithms such as Support Vector Machines and Neural Networks. Naive Bayes classification requires a small memory footprint and little computation, which suits the smartphone environment. 
This solution can partition the model into multiple chunks, which reduces memory usage compared with loading the complete model. Classification of webpages through the integrated engine is faster, more relevant, and more energy efficient than other standalone on-device solutions. The classification engine has been tested on Samsung Z3 Tizen hardware. The engine is integrated into the Tizen Browser, which uses the Chromium rendering engine. For this solution, an extensive dataset was sourced from dmoztools.net and cleaned. The cleaned dataset has 227.5K webpages, which are divided into 8 generic categories ('education', 'games', 'health', 'entertainment', 'news', 'shopping', 'sports', 'travel'). Our browser-integrated solution resulted in 15% less memory usage (due to the partition method) and 24% less power consumption in comparison with the standalone solution. The solution used 70% of the dataset for training the model and the remaining 30% for testing. An average accuracy of ~96.3% is achieved across the 8 categories mentioned above. This engine can be further extended to suggest dynamic tags and to use the classification for different use cases to enhance the browsing experience.
Keywords: chromium, lightweight engine, mobile computing, Naive Bayes, Tizen, web browser, webpage classification
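A minimal sketch of the idea in the abstract above: a multinomial Naive Bayes classifier whose model is split into chunks so that only one chunk needs to be in memory at a time. The abstract does not say how the model is partitioned; splitting by category, the Laplace smoothing, and all function names here (train, score, classify, load_chunk) are assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Build one model chunk per category from (tokens, category) pairs."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    vocab = set()
    for tokens, category in docs:
        word_counts[category].update(tokens)
        doc_counts[category] += 1
        vocab.update(tokens)
    total_docs = sum(doc_counts.values())
    chunks = {}
    for category, n_docs in doc_counts.items():
        chunks[category] = {
            "log_prior": math.log(n_docs / total_docs),
            "counts": dict(word_counts[category]),
            "total": sum(word_counts[category].values()),
        }
    return chunks, len(vocab)


def score(tokens, chunk, vocab_size):
    """Log-posterior (up to a constant) with Laplace (add-one) smoothing."""
    result = chunk["log_prior"]
    denominator = chunk["total"] + vocab_size
    for token in tokens:
        result += math.log((chunk["counts"].get(token, 0) + 1) / denominator)
    return result


def classify(tokens, load_chunk, categories, vocab_size):
    """Score each category in turn; load_chunk (hypothetical) fetches one chunk."""
    best_category, best_score = None, -math.inf
    for category in categories:
        chunk = load_chunk(category)  # only this chunk is needed right now
        current = score(tokens, chunk, vocab_size)
        if current > best_score:
            best_category, best_score = category, current
    return best_category


# Toy data, invented for illustration; not from the paper's dataset.
docs = [
    (["football", "match", "goal"], "sports"),
    (["team", "goal", "league"], "sports"),
    (["hotel", "flight", "beach"], "travel"),
    (["flight", "ticket", "hotel"], "travel"),
]
chunks, vocab_size = train(docs)
predicted = classify(["goal", "match"], chunks.get, sorted(chunks), vocab_size)
```

On a device, load_chunk would read a chunk from storage rather than from an in-memory dictionary; the dictionary lookup here only keeps the sketch self-contained.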
Procedia PDF Downloads 165
2383 A Ratio-Weighted Decision Tree Algorithm for Imbalance Dataset Classification
Authors: Doyin Afolabi, Phillip Adewole, Oladipupo Sennaike
Abstract:
Most well-known classifiers, including the decision tree algorithm, can make predictions on balanced datasets efficiently. However, the decision tree algorithm tends to be biased toward the majority class on imbalanced datasets because of the skewed distribution of such datasets. To overcome this problem, this study proposes a weighted decision tree algorithm that aims to remove the bias toward the majority class and to prevent the reduction of majority observations in imbalanced dataset classification. The proposed weighted decision tree algorithm was tested on three imbalanced datasets: a cancer dataset, the German credit dataset, and a banknote dataset. The specificity, sensitivity, and accuracy metrics were used to evaluate the performance of the proposed decision tree algorithm on the datasets. The evaluation results show that, for some of the weights of our proposed decision tree, the specificity, sensitivity, and accuracy metrics gave better results than those of the ID3 decision tree and the decision tree induced with minority entropy for all three datasets.
Keywords: data mining, decision tree, classification, imbalance dataset
Procedia PDF Downloads 139
2382 Fiction and Reality in Animation: Taking Final Flight of the Osiris as an Example
Authors: Syong-Yang Chung, Xin-An Chen
Abstract:
This study explores the less well-known animation “Final Flight of the Osiris”. It begins with an initial exploration of the film's color, storyline, and the simulacrum meanings of the roles, which leads to a further exploration of the light-shadow contrast and the psychological images presented by the screen colors and the characters. The research is based on a literature review, and all data were compiled to analyze the evolution of the characters' visual vocabulary. Structurally, the study first examines the relationship between the animation and its historical background, including the impact of the Wachowskis and Andy Jones on the cinematic and animated versions of “The Matrix”. Through the literature review, the film's color, its meaning, and the relevant points were clarified. The research found that “Final Flight of the Osiris” separates the realistic and virtual spaces by changing the color tones; the "self" of the audience gradually dissolves into the "virtual" in the simulacra world, and the "Animatrix" has become a virtual field in which the audience can understand "existence" and "self".
Keywords: the matrix, the final flight of Osiris, Wachowski brothers, simulacres
Procedia PDF Downloads 229
2381 Land Cover Remote Sensing Classification Advanced Neural Networks Supervised Learning
Authors: Eiman Kattan
Abstract:
This study evaluates the impact of classifying labelled remote sensing images with a convolutional neural network (CNN) architecture, namely AlexNet, on different land cover scenarios based on two remotely sensed datasets, from points of view such as computational time and performance. A set of experiments was conducted to assess the effectiveness of the selected convolutional neural network using two implementation approaches: fully trained and fine-tuned. For validation purposes, two publicly available remote sensing datasets with different land cover features, AID and RSSCN7, were used in the experiments. These datasets have a wide diversity of input data, number of classes, amount of labelled data, and texture patterns. An interactive deep learning GPU training platform designed for image classification (NVIDIA DIGITS) was employed in the experiments. It proved efficient in training, validation, and testing. The fully trained approach achieved modest results on AID and RSSCN7, 73.346% and 71.857% within 24 min 1 sec and 8 min 3 sec, respectively. However, the fine-tuning approach dramatically improved classification performance, reaching 92.5% and 91% within 24 min 44 sec and 8 min 41 sec, respectively. These findings open opportunities for better classification performance in applications such as agriculture and crop remote sensing.
Keywords: convolutional neural network, remote sensing, land cover, land use
Procedia PDF Downloads 372
2380 Faster, Lighter, More Accurate: A Deep Learning Ensemble for Content Moderation
Authors: Arian Hosseini, Mahmudul Hasan
Abstract:
To address the increasing need for efficient and accurate content moderation, we propose an efficient and lightweight deep classification ensemble structure. Our approach is based on a combination of simple visual features, designed for high-accuracy classification of violent content with low false positives. Our ensemble architecture utilizes a set of lightweight models with narrowed-down color features, and we apply it to both images and videos. We evaluated our approach using a large dataset of explosion and blast content and compared its performance to popular deep learning models such as ResNet-50. Our evaluation results demonstrate significant improvements in prediction accuracy, while benefiting from 7.64x faster inference and lower computation cost. While our approach is tailored to explosion detection, it can be applied to other similar content moderation and violence detection use cases as well. Based on our experiments, we propose a "think small, think many" philosophy in classification scenarios. We argue that transforming a single, large, monolithic deep model into a verification-based step model ensemble of multiple small, simple, and lightweight models with narrowed-down visual features can possibly lead to predictions with higher accuracy.
Keywords: deep classification, content moderation, ensemble learning, explosion detection, video processing
Procedia PDF Downloads 55
2379 Improve Divers Tracking and Classification in Sonar Images Using Robust Diver Wake Detection Algorithm
Authors: Mohammad Tarek Al Muallim, Ozhan Duzenli, Ceyhun Ilguy
Abstract:
Harbor protection systems are very important, and the need for automatic protection systems has increased in recent years. Diver detection active sonar is of great significance: it is used to detect underwater threats such as divers and autonomous underwater vehicles. To detect such threats automatically, the sonar image is processed by algorithms that detect, track, and classify underwater objects. In this work, a diver tracking and classification algorithm is improved by proposing a robust wake detection method. To detect objects, the sonar images are normalized and then segmented based on a fixed threshold. Next, the centroids of the segments are found and clustered based on a distance metric. Then, to track the objects, a linear Kalman filter is applied. To reduce the effect of noise and the creation of false tracks, the Kalman tracker is fine-tuned based on our active sonar specifications. After the tracks are initialized and updated, they pass through a filtering stage that eliminates noisy and unstable tracks, as well as objects whose speed is outside the diver speed range, such as buoys and fast boats. The resulting tracks then pass to a classification stage that decides the type of the tracked object, that is, whether it is an open-circuit diver or a closed-circuit diver. At the classification stage, a small area around the object is extracted and a novel wake detection method is applied. The morphological features of the object and its wake are extracted. We used a support vector machine to find the best classifier. The sonar training and test images were collected by ARMELSAN Defense Technologies Company using the portable diver detection sonar ARAS-2023. After applying the algorithm to the test sonar data, we obtain fine and stable tracks of the divers. The total classification accuracy achieved for the diver type is 97%.
Keywords: harbor protection, diver detection, active sonar, wake detection, diver classification
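The tracking step described above can be sketched with a small linear Kalman filter. This is an illustrative constant-velocity filter for a single coordinate, not the authors' tracker: the state layout, the noise values q and r, the initial covariance p0, and the speed-gate helper are all assumptions, since the abstract only says the filter was tuned to the sonar specifications.

```python
class ConstantVelocityKalman1D:
    """Linear Kalman filter for one coordinate, state = [position, velocity]."""

    def __init__(self, q=1e-3, r=1.0, p0=100.0):
        self.pos, self.vel = 0.0, 0.0
        self.P = [[p0, 0.0], [0.0, p0]]  # state covariance
        self.q, self.r = q, r            # process and measurement noise (assumed)

    def predict(self, dt=1.0):
        """Propagate the state with x' = F x and P' = F P F^T + Q."""
        self.pos += dt * self.vel
        (a, b), (c, d) = self.P
        self.P = [
            [a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
            [c + dt * d, d + self.q],
        ]

    def update(self, z):
        """Correct the state with a position measurement z (H = [1, 0])."""
        (a, b), (c, d) = self.P
        s = a + self.r                   # innovation covariance
        k0, k1 = a / s, c / s            # Kalman gain
        y = z - self.pos                 # innovation
        self.pos += k0 * y
        self.vel += k1 * y
        self.P = [
            [(1 - k0) * a, (1 - k0) * b],
            [c - k1 * a, d - k1 * b],
        ]


def in_speed_range(speed, lo, hi):
    """Hypothetical gate for dropping tracks outside an assumed speed range."""
    return lo <= abs(speed) <= hi


# Synthetic, noise-free target moving one unit per step.
kf = ConstantVelocityKalman1D()
for z in range(20):
    kf.predict()
    kf.update(float(z))
```

A real tracker would run one such filter per coordinate (or a joint 2D state) for each clustered centroid, and apply the speed gate to the estimated velocity before the classification stage.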
Procedia PDF Downloads 238
2378 Credit Risk Assessment Using Rule Based Classifiers: A Comparative Study
Authors: Salima Smiti, Ines Gasmi, Makram Soui
Abstract:
Credit risk is the most important issue for financial institutions. Its assessment is an important task used to predict defaulting customers and to classify customers as good or bad payers. To this end, numerous techniques have been applied to credit risk assessment. However, to our knowledge, several evaluation techniques are black-box models such as neural networks and SVM. They generate applicants’ classes without any explanation. In this paper, we propose to assess credit risk using a rule-based classification method. Our output is a set of rules that describe and explain the decision. We compare seven classification algorithms (JRip, Decision Table, OneR, ZeroR, Fuzzy Rule, PART, and Genetic Programming (GP)), where the goal is to find the best rules satisfying several criteria: accuracy, sensitivity, and specificity. The results confirm the efficiency of the GP algorithm on the German and Australian datasets compared to other rule-based techniques for predicting credit risk.
Keywords: credit risk assessment, classification algorithms, data mining, rule extraction
Procedia PDF Downloads 183
2377 Robust Pattern Recognition via Correntropy Generalized Orthogonal Matching Pursuit
Authors: Yulong Wang, Yuan Yan Tang, Cuiming Zou, Lina Yang
Abstract:
This paper presents a novel sparse representation method for robust pattern classification. Generalized orthogonal matching pursuit (GOMP) is a recently proposed efficient sparse representation technique. However, GOMP adopts the mean square error (MSE) criterion and assigns the same weight to all measurements, including both severely and slightly corrupted ones. To address this limitation, we propose an information-theoretic GOMP (ITGOMP) method that exploits the correntropy induced metric. The results show that ITGOMP can adaptively assign small weights to severely contaminated measurements and large weights to clean ones. An ITGOMP based classifier is further developed for robust pattern classification. Experiments on public real datasets demonstrate the efficacy of the proposed approach.
Keywords: correntropy induced metric, matching pursuit, pattern classification, sparse representation
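The weighting behavior described above can be illustrated with a Gaussian kernel on the residuals, which is the kernel commonly used with correntropy. The abstract does not give ITGOMP's exact weight formula, so the form w_i = exp(-r_i^2 / (2 sigma^2)), the function name, and the kernel width sigma are assumptions for this sketch.

```python
import math


def correntropy_weights(residuals, sigma):
    """Gaussian-kernel weights: near 1 for small residuals, near 0 for outliers."""
    return [math.exp(-(r * r) / (2.0 * sigma * sigma)) for r in residuals]


# Residuals invented for illustration: two clean measurements and one outlier.
weights = correntropy_weights([0.0, 0.1, 5.0], sigma=1.0)
```

Under the MSE criterion every measurement would carry the same weight, so the outlier with residual 5.0 would dominate the fit; here its weight is close to zero.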
Procedia PDF Downloads 357
2376 Data Quality Enhancement with String Length Distribution
Authors: Qi Xiu, Hiromu Hota, Yohsuke Ishii, Takuya Oda
Abstract:
Recently, the volume of collectable manufacturing data has been increasing rapidly. At the same time, mega recalls are becoming a serious social problem. Under such circumstances, there is an increasing need to prevent mega recalls through defect analysis, such as root cause analysis and anomaly detection, using manufacturing data. However, traditional methods take too long to classify strings in manufacturing data to meet the requirement of quick defect analysis. Therefore, we present the String Length Distribution Classification method (SLDC) to classify strings correctly in a short time. This method learns character features, especially the string length distribution, from Product ID and Machine ID fields in the BOM and asset list. By applying the proposed method to strings in actual manufacturing data, we verified that the classification time of strings can be reduced by 80%. As a result, we estimate that the requirement of quick defect analysis can be fulfilled.
Keywords: string classification, data quality, feature selection, probability distribution, string length
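A rough sketch of classifying strings by their length distribution, in the spirit of the method described above. This is not the SLDC algorithm itself, whose details the abstract does not give: the per-class length histogram, the additive smoothing, the max_len bound, and the example IDs are all assumptions.

```python
from collections import Counter


def fit(samples):
    """Learn a length histogram per label from {label: [strings]}."""
    model = {}
    for label, strings in samples.items():
        model[label] = (Counter(len(s) for s in strings), len(strings))
    return model


def classify(text, model, max_len=64, alpha=1.0):
    """Pick the label whose smoothed length distribution best fits len(text)."""
    best_label, best_prob = None, -1.0
    for label, (length_counts, n) in model.items():
        # Additive smoothing so unseen lengths keep a small non-zero probability.
        prob = (length_counts.get(len(text), 0) + alpha) / (n + alpha * max_len)
        if prob > best_prob:
            best_label, best_prob = label, prob
    return best_label


# Hypothetical IDs, invented for illustration only.
model = fit({
    "product_id": ["P-000123", "P-004567", "P-999001"],
    "machine_id": ["M01", "M17", "M203"],
})
label = classify("P-123456", model)
```

Because only the length of each string is inspected, classification cost per string is constant, which is one plausible reason a length-based feature could shorten classification time.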
Procedia PDF Downloads 319
2375 Continual Learning Using Data Generation for Hyperspectral Remote Sensing Scene Classification
Authors: Samiah Alammari, Nassim Ammour
Abstract:
When a massive number of tasks is provided successively to a deep learning process, good model performance requires preserving the data of previous tasks in order to retrain the model for each upcoming classification. Otherwise, the model performs poorly due to the catastrophic forgetting phenomenon. To overcome this shortcoming, we developed a continual learning deep model for classifying regions of remote sensing hyperspectral images. The proposed neural network architecture encapsulates two trainable subnetworks. The first module adapts its weights by minimizing the discrimination error between the land-cover classes while learning the new task, and the second module tries to learn how to replicate the data of the previous tasks by discovering the latent data structure of the new task dataset. We conduct experiments on the Indian Pines HSI dataset. The results confirm the capability of the proposed method.
Keywords: continual learning, data reconstruction, remote sensing, hyperspectral image segmentation
Procedia PDF Downloads 268
2374 Comparing the Apparent Error Rate of Gender Specifying from Human Skeletal Remains by Using Classification and Cluster Methods
Authors: Jularat Chumnaul
Abstract:
In forensic science, corpses from homicides vary; some are complete and some incomplete, depending on the cause of death or form of homicide. For example, some corpses are cut into pieces, some are concealed by dumping them into a river, some are buried, and some are burned to destroy the evidence. If a corpse is incomplete, personal identification becomes difficult because some tissues and bones are destroyed. The most precise method for determining the gender of a corpse from skeletal remains is DNA identification. However, this method is costly and slow, so other identification techniques are used instead. The first widely used technique considers the features of the bones. In general, evidence from the corpse such as pieces of bone, especially the skull and pelvis, can be used to identify gender. This technique requires forensic scientists to have observation skills in order to distinguish male from female bones. Although the technique is uncomplicated, saves time and cost, and lets forensic scientists determine gender fairly accurately (apparently with an accuracy rate of 90% or more), its crucial disadvantage is that only some positions of the skeleton can be used to specify gender, such as the supraorbital ridge, nuchal crest, temporal lobe, mandible, and chin. Therefore, the skeletal remains used have to be complete. The other technique widely used for gender determination in forensic science and archeology is skeletal measurement. The advantage of this method is that it can be applied at several positions on a single bone, and it can be used even if the bones are incomplete. In this study, classification and cluster analysis are applied to this technique, including Kth Nearest Neighbor Classification, Classification Tree, Ward Linkage Cluster, K-mean Cluster, and Two Step Cluster. 
The data contain 507 individuals and 9 skeletal measurements (diameter measurements), and the performance of the five methods is investigated by considering the apparent error rate (APER). The results indicate that the Two Step Cluster and Kth Nearest Neighbor methods seem suitable for specifying gender from human skeletal remains because they yield small apparent error rates of 0.20% and 4.14%, respectively. On the other hand, the Classification Tree, Ward Linkage Cluster, and K-mean Cluster methods are not appropriate since they yield large apparent error rates of 10.65%, 10.65%, and 16.37%, respectively. However, there are other ways to evaluate classification performance, such as estimating the error rate with the holdout procedure or using misclassification costs, and different methods can lead to different conclusions.
Keywords: skeletal measurements, classification, cluster, apparent error rate
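The apparent error rate is the fraction of observations that a method misclassifies on the same data it was fitted to. The sketch below computes it from a confusion matrix. The matrix entries are hypothetical, chosen only so that they sum to the 507 individuals and reproduce the reported 4.14%; the abstract does not publish per-class counts. As a back-calculation (an inference, not a figure from the paper), the reported rates are consistent with 1, 21, 54, and 83 misclassified individuals out of 507.

```python
def apparent_error_rate(confusion):
    """APER = misclassified observations / all observations (same data as fit)."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return (total - correct) / total


# Hypothetical 2x2 confusion matrix (rows: actual, columns: predicted).
confusion = [
    [240, 7],
    [14, 246],
]
aper_percent = round(100 * apparent_error_rate(confusion), 2)

# Back-calculated misclassification counts out of 507 (inferred, not reported).
inferred = {n: round(100 * n / 507, 2) for n in (1, 21, 54, 83)}
```

Because APER is measured on the training data, it tends to be optimistic, which is why the abstract points to the holdout procedure as an alternative.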
Procedia PDF Downloads 252
2373 Gendered Experiences of the Urban Space in India as Portrayed by Hindi Cinema: A Quantitative Analysis
Authors: Hugo Ribadeau Dumas
Abstract:
In India, cities represent intense battlefields where patriarchal norms are simultaneously defied and reinforced. While Indian metropolises have witnessed numerous initiatives where women boldly claimed their right to the city, urban spaces still remain disproportionately unfriendly to female city-dwellers. As a result, the presence of strees (women, in Hindi) in the streets remains a socially and politically potent phenomenon. This paper explores how, in India, women engage with the city as compared to men. Borrowing analytical tools from urban geography, it uses Hindi cinema as a medium to map the extent to which activities, attitudes and experiences in urban spaces are highly gendered. The sample consists of 30 movies, both mainstream and independent, which were released between 2010 and 2020, were set in an urban environment and comprised at least one pivotal female character. The paper adopts a quantitative approach, consisting of the scrutiny of close to 3,000 minutes of footage, the labeling and time count of every scene, and the computation of regressions to identify statistical relationships between characters and the way they navigate the city. According to the analysis, female characters spend half less time in the public space than their male counterparts. When they do step out, women do it mostly for utilitarian reasons; inversely, in private spaces or in pseudo-public commercial places – like malls – they indulge in fun activities. For male characters, the pattern is the exact opposite: fun takes place in public and serious work in private. The characters’ attitudes in the streets are also greatly gendered: men spend a significant amount of time immobile, loitering, while women are usually on the move, displaying some sense of purpose. 
Likewise, body language and emotional expressiveness betray differentiated gender scripts: while women wander in the streets either smiling – in a charming role – or with a hostile face – in a defensive mode – men are more likely to adopt neutral facial expressions. These trends were observed across all movies, although some nuances were identified depending on the character's age group, social background, and city, highlighting that the urban experience is not the same for all women. The empirical evidence presented in this study helps to reflect on the meaning of public space in the context of contemporary Indian cities. The paper ends with a discussion on the link between universal access to public spaces and women's empowerment.
Keywords: cinema, Indian cities, public space, women empowerment
Procedia PDF Downloads 158
2372 Non-intrusive Hand Control of Drone Using an Inexpensive and Streamlined Convolutional Neural Network Approach
Authors: Evan Lowhorn, Rocio Alba-Flores
Abstract:
The purpose of this work is to develop a method for classifying hand signals and using the output in a drone control algorithm. To achieve this, methods based on Convolutional Neural Networks (CNN) were applied. CNN's are a subset of deep learning, which allows grid-like inputs to be processed and passed through a neural network to be trained for classification. This type of neural network allows for classification via imaging, which is less intrusive than previous methods using biosensors, such as EMG sensors. Classification CNN's operate purely from the pixel values in an image; therefore they can be used without additional exteroceptive sensors. A development bench was constructed using a desktop computer connected to a high-definition webcam mounted on a scissor arm. This allowed the camera to be pointed downwards at the desk to provide a constant solid background for the dataset and a clear detection area for the user. A MATLAB script was created to automate dataset image capture at the development bench and save the images to the desktop. This allowed the user to create their own dataset of 12,000 images within three hours. These images were evenly distributed among seven classes. The defined classes include forward, backward, left, right, idle, and land. The drone has a popular flip function which was also included as an additional class. To simplify control, the corresponding hand signals chosen were the numerical hand signs for one through five for movements, a fist for land, and the universal “ok” sign for the flip command. Transfer learning with PyTorch (Python) was performed using a pre-trained 18-layer residual learning network (ResNet-18) to retrain the network for custom classification. An algorithm was created to interpret the classification and send encoded messages to a Ryze Tello drone over its 2.4 GHz Wi-Fi connection. The drone’s movements were performed in half-meter distance increments at a constant speed. 
When combined with the drone control algorithm, the classification performed as desired with negligible latency when compared to the delay in the drone’s movement commands.
Keywords: classification, computer vision, convolutional neural networks, drone control
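The step that turns a classification into a drone message might look like the sketch below. It uses the plain-text commands of the Tello SDK (for example "forward 50", with distances in centimeters) sent over UDP to the drone's default address. The mapping from class labels to commands, the flip direction, and the function names are assumptions; the abstract does not describe the authors' encoding.

```python
import socket

TELLO_ADDRESS = ("192.168.10.1", 8889)  # default Tello command address and port
STEP_CM = 50                            # half-meter increments, as in the abstract

# Assumed mapping from the seven class labels to Tello SDK text commands.
COMMANDS = {
    "forward": f"forward {STEP_CM}",
    "backward": f"back {STEP_CM}",
    "left": f"left {STEP_CM}",
    "right": f"right {STEP_CM}",
    "land": "land",
    "flip": "flip f",   # flip direction is an assumption
    "idle": None,       # no message is sent while idle
}


def command_for(label):
    """Return the SDK command string for a predicted class, or None for idle."""
    return COMMANDS[label]


def send(sock, text):
    """Send one text command to the drone over UDP."""
    sock.sendto(text.encode("ascii"), TELLO_ADDRESS)


def make_socket():
    """Create the UDP socket used for commands (not opened at import time)."""
    return socket.socket(socket.AF_INET, socket.SOCK_DGRAM)


next_command = command_for("forward")
```

In a live session the drone must first receive the "command" message to enter SDK mode, and "takeoff" before any movement command; neither is sent here, so the sketch can be run without a drone.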
Procedia PDF Downloads 212
2371 The Crossroad of Identities in Wajdi Mouawad's 'Littoral': A Rhizomatic Approach of Identity Reconstruction through Theatre and Performance
Authors: Mai Hussein
Abstract:
'Littoral' is an original voice in Québécois theatre, spanning the cultural gaps that can exist between the playwrights’ native Lebanon, North America, Quebec, and Europe. Littoral is a 'crossroad' of cultures and themes, a 'bridge' connecting cultures and languages. It represents a new form of theatrical writing that combines the verbal, the vocal and the pantomimic, calling upon the stage to question the real, to engage characters in a quest, in a journey of mourning, of reconstructing identity and a collective memory despite ruins and wars. A theatre of witness, a theatre denouncing irrationality of racism and war, a theatre 'performing' the symptoms of the stress disorders of characters passing from resistance and anger to reconciliation and giving voice to the silenced victims, these are some of the pillars that this play has to offer. In this corrida between life and death, the identity seems like a work-in-progress that is shaped in the presence of the Self and the Other. This trajectory will lead to re-open widely the door to questions, interrogations, and reflections to show how this play is at the nexus of contemporary preoccupations of the 21st century: the importance of memory, the search for meaning, the pursuit of the infinite. It also shows how a play can create bridges between languages, cultures, societies, and movements. To what extent does it mediate between the words and the silence, and how does it burn the bridges or the gaps between the textual and the performative while investigating the power of intermediality to confront racism and segregation. It also underlines the centrality of confrontation between cultures, languages, writing and representation techniques to challenge the characters in their quest to restructure their shattered, but yet intertwined identities. The goal of this theatre would then be to invite everyone involved in the process of a journey of self-discovery away from their comfort zone. 
Everyone will have to explore the liminal space, to read in between the lines of the written text as well as in between the text and the performance to explore the gaps and the tensions that exist between what is said, and what is played, between the 'parole' and the performative body.
Keywords: identity, memory, performance, testimony, trauma
Procedia PDF Downloads 115
2370 Recommendations to Improve Classification of Grade Crossings in Urban Areas of Mexico
Authors: Javier Alfonso Bonilla-Chávez, Angélica Lozano
Abstract:
In North America, more than 2,000 people die annually in accidents related to railroad tracks. In 2020, collisions at grade crossings were the main cause of deaths related to railway accidents in Mexico. Railway networks interact constantly with motor transport users, cyclists, and pedestrians, mainly at grade crossings, where vulnerability and accident risk are greatest. Accidents at grade crossings are usually directly related to risky behavior and non-compliance with regulations by motorists, cyclists, and pedestrians, especially in developing countries. Around the world, countries classify these crossings in different ways. In Mexico, types A, B, and C have been established according to dangerousness (high, medium, or low), with different types of auditory and visual signaling and gates, as well as horizontal and vertical signaling, recommended for each. This classification is based on a weighting, but regrettably, how the weight values were obtained is not explained. A review of the variables and the current approach to grade crossing classification is required, since it is inadequate for some crossings. In contrast, North American (USA and Canada) and European countries use a broader classification, so that each crossing is addressed more precisely and equipment costs are adjusted. The lack of a proper classification could lead to equipment cost overruns and deficient operation. To exemplify the lack of a good classification, six crossings are studied, three located in rural areas of Mexico and three in Mexico City. These cases show the need for improving the current regulations, improving the existing infrastructure, and implementing technological systems, including informative signs with the nomenclature of the crossing involved and a direct telephone line for reporting emergencies. This implementation is unaffordable for most municipal governments. 
Also, an inventory of the most dangerous grade crossings in urban and rural areas must be obtained. Then, an approach for improving the classification of grade crossings is suggested. This approach must be based on design criteria, characteristics of adjacent roads or intersections that can influence traffic flow through the crossing, accidents related to motorized and non-motorized vehicles, land use and land management, type of area, and services and economic activities in the zone where the grade crossing is located. An expanded classification of grade crossings in Mexico could reduce accidents and improve the efficiency of the railroad.
Keywords: accidents, grade crossing, railroad, traffic safety
Procedia PDF Downloads 109
2369 Tensor Deep Stacking Neural Networks and Bilinear Mapping Based Speech Emotion Classification Using Facial Electromyography
Authors: P. S. Jagadeesh Kumar, Yang Yung, Wenli Hu
Abstract:
Speech emotion classification is a dominant research field concerned with finding a robust and fast classifier suitable for different real-life applications. This work focuses on classifying different emotions from the speech signal using features related to pitch, formants, energy contours, jitter, and shimmer, as well as spectral, perceptual, and temporal features. Tensor deep stacking neural networks were used to examine the factors that influence the classification success rate. Facial electromyography signals were collected from several types of subjects in a controlled environment by means of audio-visual stimuli. The acquired facial electromyography signals were pre-processed using a moving average filter, and a set of arithmetic features was extracted. The extracted features were mapped to the corresponding emotions using bilinear mapping. With facial electromyography signals, a database comprising diverse emotions will be developed with suitable fine-tuning of features and training data. A success rate of 92% can be attained without increasing the system complexity or the computation time for classifying diverse emotional states.
Keywords: speech emotion classification, tensor deep stacking neural networks, facial electromyography, bilinear mapping, audio-visual stimuli
Procedia PDF Downloads 256
2368 Research on Reservoir Lithology Prediction Based on Residual Neural Network and Squeeze-and-Excitation Neural Network
Authors: Li Kewen, Su Zhaoxin, Wang Xingmou, Zhu Jian Bing
Abstract:
Conventional reservoir prediction methods are not sufficient to explore the implicit relations between seismic attributes, and thus data utilization is low. To improve the classification accuracy of reservoir lithology prediction, this paper proposes a deep learning lithology prediction method based on ResNet (Residual Neural Network) and SENet (Squeeze-and-Excitation Neural Network). The neural network model is built and trained using seismic attribute data and lithology data from the Shengli oilfield, and the nonlinear mapping between seismic attributes and lithology markers is established. The experimental results show that this method can significantly improve the classification of reservoir lithology, with a classification accuracy close to 70%. This study can effectively predict the lithology of undrilled areas and provide support for exploration and development.
Keywords: convolutional neural network, lithology, prediction of reservoir, seismic attributes
Procedia PDF Downloads 1782367 Random Forest Classification for Population Segmentation
Authors: Regina Chua
Abstract:
To reduce the costs of re-fielding a large survey, a Random Forest classifier was applied to measure the accuracy of classifying individuals into their assigned segments with the fewest possible questions. Given a long survey, one needed to determine the most predictive ten or fewer questions that would accurately assign new individuals to custom segments. Furthermore, the solution needed to be quick in its classification and usable in non-Python environments. In this paper, a supervised Random Forest classifier was modeled on a dataset with 7,000 individuals, 60 questions, and 254 features. The Random Forest consisted of an iterative collection of individual decision trees that result in a predicted segment with robust precision and recall scores compared to a single tree. A random 70-30 stratified sampling for training the algorithm was used, and accuracy trade-offs at different depths for each segment were identified. Ultimately, the Random Forest classifier performed at 87% accuracy at a depth of 10 with 20 instead of 254 features and 10 instead of 60 questions. With an acceptable accuracy in prioritizing feature selection, new tools were developed for non-Python environments: a worksheet with a formulaic version of the algorithm and an embedded function to predict the segment of an individual in real-time. Random Forest was determined to be an optimal classification model by its feature selection, performance, processing speed, and flexible application in other environments.
Keywords: machine learning, supervised learning, data science, random forest, classification, prediction, predictive modeling
Procedia PDF Downloads 95
2366 Genetic Algorithms for Feature Generation in the Context of Audio Classification
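The 70-30 stratified sampling mentioned in the abstract can be sketched in plain Python. This is an illustration under assumptions, not the author's code: the function name, the seed, and the toy segment labels are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_idx, test_idx) so that every label keeps roughly
    the same share in both parts. All names here are illustrative."""
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(by_label):
        members = by_label[label]
        rng.shuffle(members)
        # Cut each segment separately so small segments are not lost.
        cut = round(len(members) * train_fraction)
        train_idx.extend(members[:cut])
        test_idx.extend(members[cut:])
    return sorted(train_idx), sorted(test_idx)


# Toy segment labels: 10 individuals in segment "A", 20 in segment "B".
labels = ["A"] * 10 + ["B"] * 20
train, test = stratified_split(labels)
print(len(train), len(test))
```

Splitting per segment rather than over the whole dataset keeps rare segments represented in both the training and the test part, which matters when accuracy is reported per segment.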
Authors: José A. Menezes, Giordano Cabral, Bruno T. Gomes
Abstract:
Choosing good features is an essential part of machine learning. Recent techniques aim to automate this process. For instance, feature learning intends to learn the transformation of raw data into a representation useful for machine learning tasks. In automatic audio classification tasks, this is interesting since audio, usually complex information, needs to be transformed into a computationally convenient input. Another technique tries to generate features by searching a feature space. Genetic algorithms, for instance, have been used to generate audio features by combining or modifying them. We find this approach particularly interesting and, despite the undeniable advances of feature learning approaches, we wanted to take a step forward in the use of genetic algorithms to find audio features, combining them with more conventional methods, like PCA, and inserting search control mechanisms, such as constraints over a confusion matrix. This work presents the results obtained on particular audio classification problems.
Keywords: feature generation, feature learning, genetic algorithm, music information retrieval
Procedia PDF Downloads 436
2365 Machine Learning-Enabled Classification of Climbing Using Small Data
Authors: Nicholas Milburn, Yu Liang, Dalei Wu
Abstract:
Athlete performance scoring within the climbing domain presents interesting challenges, as the sport does not have an objective way to assign skill. Assessing skill levels within any sport is valuable, as it can be used to mark progress while training, and it can help an athlete choose appropriate climbs to attempt. Machine learning-based methods are popular for complex problems like this. The dataset available was composed of dynamic force data recorded during climbing; however, this dataset came with challenges such as data scarcity, imbalance, and temporal heterogeneity. Investigated solutions to these challenges include data augmentation, temporal normalization, conversion of time series to the spectral domain, and cross-validation strategies. The investigated solutions to the classification problem included the lightweight machine learning classifiers KNN and SVM as well as deep learning with a CNN. The best performing model had an 80% accuracy. In conclusion, there seems to be enough information within climbing force data to accurately categorize climbers by skill.
Keywords: classification, climbing, data imbalance, data scarcity, machine learning, time sequence
Procedia PDF Downloads 144
2364 An Approach for Vocal Register Recognition Based on Spectral Analysis of Singing
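One possible reading of "temporal normalization" is resampling every recording to a common number of samples so that climbs of different duration become comparable. The abstract does not say how this was done; the linear interpolation below and all names in it are assumptions.

```python
def resample_linear(series, target_len):
    """Resample a 1-D series to `target_len` points by linear interpolation.

    Hypothetical helper: the abstract does not describe the actual
    normalization used for the force recordings.
    """
    if target_len < 2 or len(series) < 2:
        raise ValueError("need at least two points in and out")
    scale = (len(series) - 1) / (target_len - 1)
    out = []
    for k in range(target_len):
        pos = k * scale                      # fractional index in the input
        lo = min(int(pos), len(series) - 1)
        hi = min(lo + 1, len(series) - 1)
        frac = pos - lo
        out.append(series[lo] * (1 - frac) + series[hi] * frac)
    return out


# Two made-up force traces of different length mapped to 5 samples each.
short_climb = [0.0, 10.0]
long_climb = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
print(resample_linear(short_climb, 5))
print(resample_linear(long_climb, 5))
```

After this step every recording has the same length, so it can be fed to fixed-size classifiers such as KNN or SVM, or transformed to the spectral domain with a fixed number of bins.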
Authors: Aleksandra Zysk, Pawel Badura
Abstract:
Recognizing and controlling vocal registers during singing is a difficult task for beginner vocalists. It requires, among other things, identifying which part of the natural resonators is being used when a sound propagates through the body. Thus, an application has been designed allowing for sound recording, automatic vocal register recognition (VRR), and a graphical user interface providing real-time visualization of the signal and recognition results. Six spectral features are determined for each time frame and passed to the support vector machine classifier, yielding a binary decision on the head or chest register assignment of the segment. The classification training and testing data have been recorded by ten professional female singers (soprano, aged 19-29) performing sounds for both chest and head register. The classification accuracy exceeded 93% in each of various validation schemes. Apart from a hard two-class assignment, the support vector classifier also returns information on the distance between a particular feature vector and the discrimination hyperplane in a feature space. Such information reflects the level of certainty of the vocal register classification in a fuzzy way. Thus, the designed recognition and training application is able to assess and visualize the continuous trend in singing in a user-friendly graphical mode, providing an easy way to control the vocal emission.
Keywords: classification, singing, spectral analysis, vocal emission, vocal register
Procedia PDF Downloads 305
2363 Storage Study of Bael (Aegle marmelos Correa.) Fruit and Pulp of Cv. Pant Sujata
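The distance between a feature vector and the discrimination hyperplane can be sketched for the linear case, where it equals (w·x + b) / ||w||. The weights, the bias, the two-dimensional feature vectors, and the mapping of the sign to a register are all made-up assumptions; the abstract uses six spectral features and does not state the kernel.

```python
import math


def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    if norm == 0:
        raise ValueError("w must be non-zero")
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


# Hypothetical hyperplane parameters, not taken from the paper.
w, b = [3.0, 4.0], -5.0

for frame in ([3.0, 4.0], [0.0, 0.0]):
    d = signed_distance(w, b, frame)
    # Assumed convention: positive side = head register, negative = chest.
    register = "head" if d >= 0 else "chest"
    print(frame, register, round(d, 3))
```

The sign gives the hard decision, while the magnitude can be shown to the singer as a soft indication of how clearly the frame falls into one register.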
Authors: B. R. Jana, Madhumita Singh
Abstract:
A storage study of bael fruit and pulp was conducted at ICAR-RCER, Research Centre Ranchi, to determine a suitable storage life that extends the availability of the fruit and allows value-added products to be made from it. The cultivar under storage was Pant Sujata. CFB box packing resulted in the minimum PLW (21%) in 2010-11 over 28-35 days of storage at ambient temperature. CFB box and gunny bag packing retained the maximum total sugar (17.3-17.4 °B) after 28 days of storage. Bael pulp of cultivar Pant Sujata can be stored for up to 2 months at 4 °C in good condition. Treatment effects were highly significant for characters such as T.S.S., acidity, reducing sugar, and total sugar. The interaction between storage conditions and treatments was not significant for any character except acidity. The maximum T.S.S. of 21.87 °B was found in the sample treated with 800 ppm benzoic acid and kept for two months at 4 °C. This treatment also retained higher reducing sugar (8.09%) and total sugar (9.52%) contents than the other treatments under the same storage condition. From the present experiments, it is concluded that CFB box packing and pulp storage with 800 ppm benzoic acid at 4 °C are important for extending the availability of bael for two months.
Keywords: bael, storage, fruits, pulp, benzoic acid
Procedia PDF Downloads 247
2362 Performance Comparison of Deep Convolutional Neural Networks for Binary Classification of Fine-Grained Leaf Images
Authors: Kamal KC, Zhendong Yin, Dasen Li, Zhilu Wu
Abstract:
Intra-plant disease classification based on leaf images is a challenging computer vision task due to similarities in the texture, color, and shape of leaves with only slight variation in leaf spots, and due to external environmental changes such as lighting and background noise. The deep convolutional neural network (DCNN) has proven to be an effective tool for binary classification. In this paper, two methods for binary classification of diseased plant leaves using DCNNs are presented: a model created from scratch and transfer learning. Our main contribution is a thorough evaluation of 4 networks created from scratch and transfer learning of 5 pre-trained models. Training and testing of these models were performed on a plant leaf image dataset belonging to 16 distinct classes, containing a total of 22,265 images from 8 different plants, each with a pair of healthy and diseased leaf classes. We introduce a deep CNN model, Optimized MobileNet. This model, with depthwise separable convolution as a building block, attained an average test accuracy of 99.77%. We also present a fine-tuning method by introducing the concept of a convolutional block, which is a collection of different deep neural layers. Fine-tuned models proved to be efficient in terms of accuracy and computational cost. Fine-tuned MobileNet achieved an average test accuracy of 99.89% on 8 pairs of [healthy, diseased] leaf ImageSet.
Keywords: deep convolution neural network, depthwise separable convolution, fine-grained classification, MobileNet, plant disease, transfer learning
Procedia PDF Downloads 188
2361 Harmonic Data Preparation for Clustering and Classification
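The computational appeal of the depthwise separable building block can be seen from a weight count. The sketch below compares a standard convolution with a depthwise convolution followed by a 1x1 pointwise convolution, ignoring biases; the kernel size and channel counts are illustrative assumptions, not values from the paper.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise conv (one kernel x kernel filter per input
    channel) followed by a 1x1 pointwise conv (biases ignored)."""
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Illustrative layer shape; not taken from the paper.
k, c_in, c_out = 3, 32, 64
std = standard_conv_params(k, c_in, c_out)
sep = depthwise_separable_params(k, c_in, c_out)
print(std, sep, round(sep / std, 4))
```

The ratio of the two counts is 1/c_out + 1/k², so for 3x3 kernels the separable layer needs roughly one eighth to one ninth of the weights of the standard one when the number of output channels is large.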
Authors: Ali Asheibi
Abstract:
The rapid increase in the size of databases required to store power quality monitoring data has demanded new techniques for analysing and understanding the data. One suggested technique to assist in analysis is data mining. Preparing raw data for data mining exploration takes up most of the effort and time spent in the whole data mining process. Clustering is an important technique in data mining and machine learning in which underlying and meaningful groups of data are discovered. Large amounts of harmonic data have been collected over three years from an actual harmonic monitoring system in a distribution system in Australia. This amount of acquired data makes it difficult to identify operational events that significantly impact the harmonics generated on the system. In this paper, harmonic data preparation processes that support a better understanding of the data are presented. Underlying classes in this data have then been identified using a clustering technique based on the Minimum Message Length (MML) method. The underlying operational information contained within the clusters can be rapidly visualised by the engineers. The C5.0 algorithm was used for classification and interpretation of the generated clusters.
Keywords: data mining, harmonic data, clustering, classification
Procedia PDF Downloads 250
2360 Unravelling the Knot: Towards a Definition of ‘Digital Labor’
Authors: Marta D'Onofrio
Abstract:
The debate on the digitalization of the economy has raised questions about how both labor and the regulation of work processes are changing due to the introduction of digital technologies in the productive system. Within the literature, the term ‘digital labor’ is commonly used to identify the impact of digitalization on labor. Despite the wide use of this term, an unambiguous definition of it is still not available, and this could create confusion in the use of terminology and in attempts at classification. Consequently, the purpose of this paper is to provide a definition and to propose a classification of ‘digital labor’, drawing on the theoretical approach of organizational studies.
Keywords: digital labor, digitalization, data-driven algorithms, big data, organizational studies
Procedia PDF Downloads 156
2359 Classification of Tropical Semi-Modules
Authors: Wagneur Edouard
Abstract:
Tropical algebra is the algebra constructed over an idempotent semifield S. We show here that every m-dimensional tropical module M over S with a strongly independent basis can be embedded into S^m, and provide an algebraic invariant, the Γ-matrix of M, which characterises the isomorphy class of M. The strong independence condition also yields a significant improvement to the Whitney embedding for tropical torsion modules published earlier. We also show that the strong independence of the basis of M is equivalent to the unique representation of elements of M. Numerous examples illustrate our results.
Keywords: classification, idempotent semi-modules, strong independence, tropical algebra
Procedia PDF Downloads 371
2358 Classification of Potential Biomarkers in Breast Cancer Using Artificial Intelligence Algorithms and Anthropometric Datasets
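For readers unfamiliar with idempotent semifields, the max-plus semifield on the reals extended with minus infinity is one commonly used instance: addition is max and multiplication is ordinary +. The abstract works over a general S, so choosing max-plus here is an assumption, and the helper names are hypothetical.

```python
NEG_INF = float("-inf")  # neutral element of tropical addition in max-plus


def t_add(a, b):
    """Tropical addition: a (+) b = max(a, b). Idempotent: a (+) a = a."""
    return max(a, b)


def t_mul(a, b):
    """Tropical multiplication: a (x) b = a + b. Neutral element is 0."""
    return a + b


def t_matvec(matrix, vector):
    """Max-plus matrix-vector product, i.e. a tropical linear map into S^m."""
    result = []
    for row in matrix:
        acc = NEG_INF
        for a, x in zip(row, vector):
            acc = t_add(acc, t_mul(a, x))
        result.append(acc)
    return result


# Made-up example: each output entry is max over j of (A[i][j] + v[j]).
A = [[0.0, 2.0],
     [1.0, NEG_INF]]
v = [5.0, 1.0]
print(t_add(3.0, 3.0))
print(t_matvec(A, v))
```

Idempotency of addition is what distinguishes this setting from ordinary linear algebra: there is no subtraction, which is why notions such as independence of a basis need the stronger formulations discussed in the abstract.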
Authors: Aref Aasi, Sahar Ebrahimi Bajgani, Erfan Aasi
Abstract:
Breast cancer (BC) continues to be the most frequent cancer in females and causes the highest number of cancer-related deaths in women worldwide. Inspired by recent advances in studying the relationship between different patient attributes and features and the disease, in this paper we investigate different classification methods for better diagnosis of BC in the early stages. In this regard, datasets from the University Hospital Centre of Coimbra were chosen, and different machine learning (ML)-based and neural network (NN) classifiers have been studied. For this purpose, we selected favorable features among the nine attributes provided in the clinical dataset by using a random forest algorithm. This dataset consists of both healthy controls and BC patients, and it was noted that glucose, BMI, resistin, and age are the most important features, in that order. Moreover, we analyzed these features with various ML-based classifier methods, including Decision Tree (DT), K-Nearest Neighbors (KNN), eXtreme Gradient Boosting (XGBoost), Logistic Regression (LR), Naive Bayes (NB), and Support Vector Machine (SVM), along with the NN-based Multi-Layer Perceptron (MLP) classifier. The results revealed that, among the different techniques, the SVM and MLP classifiers achieve the highest accuracy, at 96% and 92%, respectively. These results indicate that the adopted procedure could be used effectively for the classification of cancer cells, and they encourage further experimental investigations with more collected data for other types of cancers.
Keywords: breast cancer, diagnosis, machine learning, biomarker classification, neural network
Procedia PDF Downloads 139
2357 Engineering Parameters and Classification of Marly Soils of Tabriz
Authors: Amirali Mahouti, Hooshang Katebi
Abstract:
The expansion of the Tabriz metropolis to the east and north-east has led to urban construction on marl layers, and because excavation depths are increasing, further information on these layers is essential. A review of geotechnical investigations shows that there is not enough information about Tabriz Marl and that this soil has been classified only by color. Tabriz Marl is a lacustrine carbonate sediment that outcrops around the eastern, northern, and southern regions of the city in the East Azerbaijan Province of Iran and is known as the bedrock of the city beneath alluvial sediments. This investigation aims to characterize the geotechnical parameters of this soil in order to identify it and place it in the classification system of carbonated soils. For this purpose, specimens were obtained from 80 locations across the city and subjected to physical and mechanical tests, such as Atterberg limits, density, moisture content, unconfined compression, direct shear, and consolidation. CaCO3 content, organic content, pH, XRD, XRF, TGA, and geophysical downhole tests were also carried out on some of them.
Keywords: carbonated soils, classification of soils, mineralogy, physical and mechanical tests for Marls, Tabriz Marl
Procedia PDF Downloads 318
2356 Using New Machine Algorithms to Classify Iranian Musical Instruments According to Temporal, Spectral and Coefficient Features
Authors: Ronak Khosravi, Mahmood Abbasi Layegh, Siamak Haghipour, Avin Esmaili
Abstract:
In this paper, we study the classification of musical woodwind instruments using a small set of features selected from a broad range of extracted ones by the sequential forward selection method. First, we extract 42 features for each record in a music database of 402 sound files belonging to five different groups: Flutes (end-blown and internal duct), Single-reed, Double-reed (exposed and capped), Triple-reed, and Quadruple-reed. Then, the sequential forward selection method is adopted to choose the best feature set in order to achieve very high classification accuracy. Two classification techniques, support vector machines and relevance vector machines, were tested, and an accuracy of up to 96% was achieved using 21 time, frequency, and coefficient features and a relevance vector machine with the Gaussian kernel function.
Keywords: coefficient features, relevance vector machines, spectral features, support vector machines, temporal features
Procedia PDF Downloads 322
2355 Stabilization of Clay Soil Using A-3 Soil
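Sequential forward selection, as named in the abstract, starts from an empty set and greedily adds the feature that most improves a score. In a setting like the paper's, the score would typically be a validated classifier accuracy; the toy scoring function, the feature names, and the redundancy penalty below are made up for illustration.

```python
def sequential_forward_selection(features, score, k):
    """Greedily add the feature that maximizes score(subset) until k are chosen."""
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical per-feature usefulness; not values from the paper.
USEFULNESS = {"zcr": 0.5, "centroid": 0.4, "rolloff": 0.39, "mfcc1": 0.3}
# Assumed redundant pair: together they add less than the sum of their parts.
REDUNDANT = {frozenset(("centroid", "rolloff")): 0.3}


def toy_score(subset):
    """Stand-in for cross-validated accuracy of a classifier on `subset`."""
    total = sum(USEFULNESS[f] for f in subset)
    for pair, penalty in REDUNDANT.items():
        if pair <= set(subset):
            total -= penalty
    return total


print(sequential_forward_selection(USEFULNESS, toy_score, k=3))
```

Because each step evaluates the candidate together with the features already chosen, a feature that is strong on its own but redundant with an earlier pick can lose to a weaker, complementary one. The method is greedy, so it does not guarantee the globally best subset.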
Authors: Mohammed Mustapha Alhaji, Sadiku Salawu
Abstract:
A clay soil classified as A-7-6 according to the AASHTO soil classification system and as CH according to the Unified Soil Classification System was stabilized using A-3 soil (AASHTO soil classification system). The clay soil was replaced with A-3 soil in 10% steps from 0% to 100% and compacted at both the BSL and BSH compaction energy levels, with unconfined compressive strength (UCS) used as the evaluation criterion. The MDD at both compaction energy levels increased from 0% to 40% A-3 soil replacement, after which the values decreased up to 100% A-3 soil replacement. The OMC followed the opposite trend: it decreased from 0% to 40% A-3 soil replacement and then increased up to 100% A-3 soil replacement. This trend was attributed to the observed reduction in the void ratio from 0% to 40% A-3 soil replacement, after which the void ratio increased up to 100% A-3 soil replacement. The maximum UCS increased from 272 and 770 kN/m² for the BSL and BSH compaction energy levels, respectively, at 0% A-3 soil replacement to 295 and 795 kN/m² at 10% A-3 soil replacement, after which the values decreased to 22 and 60 kN/m² at 70% A-3 soil replacement. Beyond 70% A-3 soil replacement, the mixture could not be moulded for the UCS test.
Keywords: A-3 soil, clay minerals, pozzolanic action, stabilization
Procedia PDF Downloads 445