Search results for: MFCC feature warping
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1603

883 Using New Machine Algorithms to Classify Iranian Musical Instruments According to Temporal, Spectral and Coefficient Features

Authors: Ronak Khosravi, Mahmood Abbasi Layegh, Siamak Haghipour, Avin Esmaili

Abstract:

In this paper, we study the classification of musical woodwind instruments using a small set of features selected from a broad range of extracted ones by the sequential forward selection method. First, we extract 42 features for each record in a music database of 402 sound files belonging to five groups: flutes (end-blown and internal duct), single-reed, double-reed (exposed and capped), triple-reed and quadruple-reed instruments. Then, the sequential forward selection method is used to choose the best feature set in order to achieve very high classification accuracy. Two classification techniques, support vector machines and relevance vector machines, have been tested, and an accuracy of up to 96% can be achieved by using 21 time, frequency and coefficient features and a relevance vector machine with the Gaussian kernel function.
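The sequential forward selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-centroid scorer, the leave-one-out evaluation and the function names are assumptions chosen to keep the example self-contained (the paper itself uses support vector and relevance vector machines).

```python
from statistics import mean


def nearest_centroid_accuracy(rows, labels, feature_idx):
    """Leave-one-out accuracy of a nearest-centroid classifier (assumed scorer)."""
    correct = 0
    for i, (row, label) in enumerate(zip(rows, labels)):
        # Build class centroids from every sample except the held-out one.
        centroids = {}
        for cls in set(labels):
            members = [r for j, (r, l) in enumerate(zip(rows, labels))
                       if l == cls and j != i]
            if members:
                centroids[cls] = [mean(m[f] for m in members) for f in feature_idx]
        point = [row[f] for f in feature_idx]
        predicted = min(
            centroids,
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
        )
        correct += predicted == label
    return correct / len(rows)


def sequential_forward_selection(rows, labels, n_select):
    """Greedily add the feature that most improves the score until n_select are chosen."""
    selected = []
    remaining = list(range(len(rows[0])))
    while remaining and len(selected) < n_select:
        best = max(
            remaining,
            key=lambda f: nearest_centroid_accuracy(rows, labels, selected + [f]),
        )
        selected.append(best)
        remaining.remove(best)
    return selected
```

Calling `sequential_forward_selection(rows, labels, 21)` on a 42-column feature table would mirror the feature counts quoted in the abstract, although the selected subset depends entirely on the scorer used.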

Keywords: coefficient features, relevance vector machines, spectral features, support vector machines, temporal features

Procedia PDF Downloads 318
882 UWB Open Spectrum Access for a Smart Software Radio

Authors: Hemalatha Rallapalli, K. Lal Kishore

Abstract:

In comparison to systems that are typically designed to provide capabilities over a narrow frequency range through hardware elements, next-generation cognitive radios are intended to implement a broader range of capabilities through efficient spectrum exploitation. This offers the user the promise of greater flexibility and seamless roaming across different networks, countries, frequencies, etc. It requires a true paradigm shift, i.e., liberalization over a wide band of spectrum, as well as a growth path to more and greater capability. This work contributes towards the design and implementation of an open spectrum access (OSA) feature for unlicensed users, thus offering a frequency-agile radio platform that is capable of performing spectrum sensing over a wide band. An ultra-wideband (UWB) radio that has the intelligence of spectrum sensing only, unlike the cognitive radio with complete intelligence, is named a Smart Software Radio (SSR). The spectrum sensing mechanism is implemented based on energy detection. Simulation results show the accuracy and validity of this method.
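Energy detection, the sensing mechanism named above, can be sketched in a few lines: a band is declared occupied when the average energy of its samples exceeds a threshold. The function names, the band labels and the fixed threshold are illustrative assumptions; the abstract does not state how the threshold is chosen.

```python
def average_energy(samples):
    """Mean of |x|^2 over the samples (works for real or complex samples)."""
    return sum(abs(x) ** 2 for x in samples) / len(samples)


def band_is_occupied(samples, threshold):
    """Energy detector: occupied if the average energy exceeds the threshold."""
    return average_energy(samples) > threshold


def free_bands(band_samples, threshold):
    """Return the names of bands (hypothetical labels) judged to be unoccupied."""
    return [band for band, samples in band_samples.items()
            if not band_is_occupied(samples, threshold)]
```

In practice the threshold is usually derived from an estimate of the noise power and a target false-alarm rate; that step is left out of this sketch.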

Keywords: cognitive radio, energy detection, software radio, spectrum sensing

Procedia PDF Downloads 427
881 Measuring Text-Based Semantics Relatedness Using WordNet

Authors: Madiha Khan, Sidrah Ramzan, Seemab Khan, Shahzad Hassan, Kamran Saeed

Abstract:

Measuring semantic similarity between texts means calculating the semantic relatedness between them using various techniques. Our web application (Measuring Relatedness of Concepts, MRC) allows the user to input two text corpora and obtain a semantic similarity percentage between them using WordNet. The application goes through five stages to compute semantic relatedness: Preprocessing (extracts keywords from the content), Feature Extraction (classifies words into parts of speech), Synonyms Extraction (retrieves synonyms for each keyword), Measuring Similarity (measures similarity using keywords and synonyms) and Visualization (graphical representation of the similarity measure). Hence, the user can also measure similarity on the basis of features. The end result is a percentage score and the word(s) that form the basis of the similarity between the two texts, obtained with different tools on the same platform. In future work, we look forward to a Web-as-a-live-corpus application that provides a simpler and more user-friendly tool to compare documents and extract useful information.
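The keyword, synonym and similarity stages listed above can be sketched as follows. This is only an illustration under stated assumptions: a small hand-written synonym dictionary stands in for WordNet, whitespace splitting stands in for the preprocessing stage, and Jaccard overlap is assumed as the similarity measure because the abstract does not specify one.

```python
def extract_keywords(text, stopwords):
    """Preprocessing stand-in: lowercase, split on whitespace, drop stopwords."""
    return {word for word in text.lower().split() if word not in stopwords}


def expand_with_synonyms(keywords, synonyms):
    """Add the synonyms of every keyword (synonyms is a hypothetical lookup table)."""
    expanded = set(keywords)
    for word in keywords:
        expanded.update(synonyms.get(word, ()))
    return expanded


def similarity_percent(text_a, text_b, synonyms, stopwords):
    """Jaccard overlap of the synonym-expanded keyword sets, as a percentage."""
    set_a = expand_with_synonyms(extract_keywords(text_a, stopwords), synonyms)
    set_b = expand_with_synonyms(extract_keywords(text_b, stopwords), synonyms)
    union = set_a | set_b
    if not union:
        return 0.0
    return 100.0 * len(set_a & set_b) / len(union)
```

The intersection `set_a & set_b` also gives the words that form the basis of the similarity, which is the second output the abstract mentions.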

Keywords: Graphviz representation, semantic relatedness, similarity measurement, WordNet similarity

Procedia PDF Downloads 235
880 Smartphone Based Wound Assessment System for Diabetes Patients

Authors: Vaibhav V. Dixit, Shubham Ajay Karwa

Abstract:

Diabetic foot ulcers represent a significant health problem. Currently, clinicians and nurses mainly base their wound assessment on visual examination of wound size and healing status, while the patients themselves seldom have an opportunity to play an active role. Hence, a more quantitative and practical assessment method that enables patients and their caregivers to take a more active role in daily wound care can potentially accelerate wound healing, save travel costs and reduce healthcare expenses. Considering the prevalence of smartphones with high-resolution digital cameras, assessing wounds by analyzing images of chronic foot ulcers is an attractive option. In this paper, we propose a novel wound image analysis system implemented using feature extraction and color segmentation. We use the normalized minimum distance classifier to classify the output.
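A minimum distance classifier assigns a sample to the class whose mean feature vector is nearest. The sketch below assumes that "normalized" means each feature difference is divided by the per-class standard deviation; the abstract does not define the normalization, and the class labels and feature values in the usage note are hypothetical.

```python
from statistics import mean, pstdev


def fit_class_statistics(training):
    """Compute per-class (means, standard deviations) for each feature column."""
    stats = {}
    for label, rows in training.items():
        columns = list(zip(*rows))
        means = [mean(col) for col in columns]
        # Guard against zero spread so the normalization never divides by zero.
        stds = [pstdev(col) or 1.0 for col in columns]
        stats[label] = (means, stds)
    return stats


def classify(sample, stats):
    """Return the class with the smallest normalized Euclidean distance."""
    def distance(label):
        means, stds = stats[label]
        return sum(((x - m) / s) ** 2 for x, m, s in zip(sample, means, stds)) ** 0.5

    return min(stats, key=distance)
```

For example, after `stats = fit_class_statistics({"healthy": [[10, 1.0], [12, 1.2]], "wound": [[50, 5.0], [54, 5.4]]})`, a new feature vector is labeled with `classify([11, 1.1], stats)`.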

Keywords: diabetic, Gabor wavelet, normalized minimum distance classifier, quantifiable parameters

Procedia PDF Downloads 266
879 Prosodic Characteristics of Post Traumatic Stress Disorder Induced Speech Changes

Authors: Jarek Krajewski, Andre Wittenborn, Martin Sauerland

Abstract:

This abstract describes a promising approach for estimating post-traumatic stress disorder (PTSD) based on prosodic speech characteristics. It illustrates the validity of this method by briefly discussing results from an Arabic refugee sample (N = 47, 32 m, 15 f). A well-established standardized self-report scale, “Reaction of Adolescents to Traumatic Stress” (RATS), was used to determine the ground truth level of PTSD. The speech material was elicited by asking participants to talk about autobiographical, sadness-inducing experiences (sampling rate 16 kHz, 8-bit resolution). In order to investigate PTSD-induced speech changes, a self-developed set of 136 prosodic speech features was extracted from the .wav files. This set was adapted to capture traumatization-related speech phenomena. An artificial neural network (ANN) machine learning model was applied to determine the PTSD level and reached a correlation of r = .37. These results indicate that our classifiers can achieve results similar to those seen in speech-based stress research.

Keywords: speech prosody, PTSD, machine learning, feature extraction

Procedia PDF Downloads 89
878 Investigation on the Properties of Particulate Reinforced AA2014 Metal Matrix Composite Materials Produced by Vacuum Infiltration Method

Authors: Isil Kerti, Onur Okur, Sibel Daglilar, Recep Calin

Abstract:

Particulate reinforced aluminium matrix composites have gained importance in the automotive, aeronautical and defense industries due to their specific properties, such as low density, high strength and stiffness, good fatigue strength, dimensional stability at high temperature and acceptable tribological properties. In this study, 2014 aluminium alloy was used as the matrix material, and B₄C and SiC were selected as reinforcement components. The vacuum infiltration method was used to produce the composite materials. In the experimental studies, the reinforcement volume ratios were defined by mixing B₄C and SiC to a total of 10%. Aging treatment (T6) was applied to the specimens. The effect of T6 treatment on hardness was determined using the Brinell hardness test method. The effects of the aging treatment on microstructure and chemical structure were analysed by performing XRD, SEM and EDS analyses on the specimens.

Keywords: metal matrix composite, vacuum infiltration method, aluminum metal matrix, mechanical feature

Procedia PDF Downloads 312
877 Spin-Dependent Transport Signatures of Bound States: From Finger to Top Gates

Authors: Yun-Hsuan Yu, Chi-Shung Tang, Nzar Rauf Abdullah, Vidar Gudmundsson

Abstract:

Spin-orbit gap feature in energy dispersion of one-dimensional devices is revealed via strong spin-orbit interaction (SOI) effects under Zeeman field. We describe the utilization of a finger-gate or a top-gate to control the spin-dependent transport characteristics in the SOI-Zeeman influenced split-gate devices by means of a generalized spin-mixed propagation matrix method. For the finger-gate system, we find a bound state in continuum for incident electrons within the ultra-low energy regime. For the top-gate system, we observe more bound-state features in conductance associated with the formation of spin-associated hole-like or electron-like quasi-bound states around band thresholds, as well as hole bound states around the reverse point of the energy dispersion. We demonstrate that the spin-dependent transport behavior of a top-gate system is similar to that of a finger-gate system only if the top-gate length is less than the effective Fermi wavelength.

Keywords: spin-orbit, zeeman, top-gate, finger-gate, bound state

Procedia PDF Downloads 268
876 Comparison of Linear Discriminant Analysis and Support Vector Machine Classifications for Electromyography Signals Acquired at Five Positions of Elbow Joint

Authors: Amna Khan, Zareena Kausar, Saad Malik

Abstract:

Bio Mechatronics has extended applications in the field of rehabilitation. Since World War II, it has been contributing to improving the applicability of prostheses and assistive devices in real-life scenarios. In this paper, classification accuracies have been compared for two classifiers against five positions of the elbow. Electromyography (EMG) signals have been acquired directly from skeletal muscles of the human forearm for each of the three defined positions and at modified extreme positions of elbow flexion and extension using an 8-electrode Myo armband sensor. Features were extracted from filtered EMG signals for each position. The performance of two classifiers, support vector machine (SVM) and linear discriminant analysis (LDA), has been compared by analyzing the classification accuracies. SVM achieved classification accuracies of 90-96%, in contrast to 84-87% for LDA, for the five defined positions of the elbow, keeping the number of samples and the selected features the same for both SVM and LDA.

Keywords: classification accuracies, electromyography, linear discriminant analysis (LDA), Myo armband sensor, support vector machine (SVM)

Procedia PDF Downloads 366
875 A Predictive Machine Learning Model of the Survival of Female-led and Co-Led Small and Medium Enterprises in the UK

Authors: Mais Khader, Xingjie Wei

Abstract:

This research sheds light on female entrepreneurs by providing new insights on the survival predictions of companies led by females in the UK. This study aims to build a predictive machine learning model of the survival of female-led & co-led small & medium enterprises (SMEs) in the UK over the period 2000-2020. The predictive model built utilised a combination of financial and non-financial features related to both companies and their directors to predict SMEs' survival. These features were studied in terms of their contribution to the resultant predictive model. Five machine learning models are used in the modelling: Decision tree, AdaBoost, Naïve Bayes, Logistic regression and SVM. The AdaBoost model had the highest performance of the five models, with an accuracy of 73% and an AUC of 80%. The results show high feature importance in predicting companies' survival for company size, management experience, financial performance, industry, region, and females' percentage in management.

Keywords: company survival, entrepreneurship, females, machine learning, SMEs

Procedia PDF Downloads 99
874 Transition to Hydrogen Cities in Korea and Japan

Authors: Minhee Son, Kyung Nam Kim

Abstract:

This study explores the plans of the Korean and Japanese governments to transition to the hydrogen economy. Two motor companies, Hyundai Motor Company from Korea and Toyota from Japan, released hydrogen fuel cell vehicles to monopolize the green energy automobile market. Although both countries are major emitters of greenhouse gases, hydrogen energy can be obtained from certain industrial sites, such as chemical plants and steel mills. Recently, the two countries have been focusing on the hydrogen industry, including fuel cell vehicles, hydrogen stations, fuel cell plants and residential fuel cells. The purpose of this paper is to identify the differences between the two countries' policies for becoming hydrogen societies. We analyze the behavior of the public and private sectors in Korea and Japan regarding hydrogen energy and fuel cells in the transition to the hydrogen economy. Finally, we show the similarities and differences between the two countries in hydrogen fuel cells. Some cities also have the features of hydrogen cities. Hydrogen energy can have an impact on environmental sustainability.

Keywords: fuel cell, hydrogen city, hydrogen fuel cell vehicle, hydrogen station, hydrogen energy

Procedia PDF Downloads 486
873 An Auxiliary Technique for Coronary Heart Disease Prediction by Analyzing Electrocardiogram Based on ResNet and Bi-Long Short-Term Memory

Authors: Yang Zhang, Jian He

Abstract:

Heart disease is one of the leading causes of death in the world, and coronary heart disease (CHD) is one of the major heart diseases. The electrocardiogram (ECG) is widely used in the detection of heart diseases, but the traditional manual method of predicting CHD by analyzing ECGs requires doctors to have extensive professional knowledge. This paper uses a sliding window and the continuous wavelet transform (CWT) to transform ECG signals into images, and then ResNet and Bi-LSTM are used to build the ECG feature extraction network (namely ECGNet). Finally, an auxiliary system for coronary heart disease prediction was developed based on a modified ResNet18 and Bi-LSTM, and the public ECG dataset of CHD from MIMIC-3 was used to train and test the system. The experimental results show that the accuracy of the method is 83% and the F1-score is 83%. Compared with the available methods for CHD prediction based on ECG, such as kNN, decision tree, VGGNet, etc., this method not only improves the prediction accuracy but can also avoid the degradation phenomenon of the deep learning network.
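The sliding-window step mentioned above can be sketched as a simple segmentation of a one-dimensional signal into overlapping windows, each of which would then be passed to the CWT to produce an image. The window size and step are hypothetical parameters; the abstract does not give the values used, and the CWT and network stages are not shown here.

```python
def sliding_windows(signal, size, step):
    """Split a 1-D sequence into windows of `size` samples, advancing by `step`.

    Trailing samples that do not fill a complete window are dropped.
    """
    return [signal[start:start + size]
            for start in range(0, len(signal) - size + 1, step)]
```

With `size=4` and `step=3`, a ten-sample signal yields three windows that overlap by one sample, which is the kind of overlap a sliding window is typically chosen to provide.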

Keywords: Bi-LSTM, CHD, ECG, ResNet, sliding window

Procedia PDF Downloads 88
872 Transient Free Laminar Convection in the Vicinity of a Thermal Conductive Vertical Plate

Authors: Anna Bykalyuk, Frédéric Kuznik, Kévyn Johannes

Abstract:

In this paper, the influence of a vertical plate’s thermal capacity is numerically investigated in order to evaluate the evolution of the thermal boundary layer structure, as well as the convective heat transfer coefficient and the velocity and temperature profiles. The heat flux of the heated vertical plate is evaluated under time-dependent boundary conditions. The most important feature of this problem is the unsteadiness of the physical phenomena. A 2D CFD model is developed in the Ansys Fluent 14.0 environment and is validated using unsteady data obtained for plasterboard studied under a dynamic temperature evolution. All the phenomena produced in the vicinity of the thermally conductive vertical plate (plasterboard) are analyzed and discussed. This work is the first stage of holistic research on transient free convection that aims, in the future, to study natural convection in the vicinity of a vertical plate containing Phase Change Materials (PCM).

Keywords: CFD modeling, natural convection, thermal conductive plate, time-depending boundary conditions

Procedia PDF Downloads 274
871 Enhancement Dynamic Cars Detection Based on Optimized HOG Descriptor

Authors: Mansouri Nabila, Ben Jemaa Yousra, Motamed Cina, Watelain Eric

Abstract:

Research and development efforts in intelligent Advanced Driver Assistance Systems (ADAS) seek to save lives and reduce the number of on-road fatalities. For traffic and emergency monitoring, the essential but challenging task is vehicle detection and tracking in a reasonably short time. This requires, first of all, a powerful dynamic car detector model. This paper presents an optimized HOG process based on the fusion of shape and motion parameters. Our proposed approach aims to compute the HOG-by-bloc feature from foreground blobs using a configurable search window and pathway, in order to overcome the shortcomings of the HOG descriptor in terms of computing time and improve its performance in dynamic applications. We show in this paper that the HOG-by-bloc descriptor combined with motion parameters is a very suitable car detector, which reaches a satisfactory recognition rate in record time in dynamic outdoor areas and outperforms several popular works without using sophisticated and expensive architectures such as GPU and FPGA.

Keywords: car-detector, HOG, motion, computing time

Procedia PDF Downloads 322
870 Requirement Engineering Within Open Source Software Development: A Case Study

Authors: Kars Beek, Remco Groeneveld, Sjaak Brinkkemper

Abstract:

Although there is much literature available on requirement documentation in traditional software development, few studies have been conducted about this topic in open source software development. While open-source software development is becoming more important, the software development processes are often not as structured as corporate software development processes. Papers show that communities, creating open-source software, often lack structure and documentation. However, most recent studies about this topic are often ten or more years old. Therefore, this research has been conducted to determine if the lack of structure and documentation in requirement engineering is currently still the situation in these communities. Three open-source products have been chosen as subjects for conducting this research. The data for this research was gathered based on interviews, observations, and analyses of feature proposals and issue tracking tools. In this paper, we present a comparison and an analysis of the different methods used for requirements documentation to understand the current practices of requirements documentation in open source software development.

Keywords: case study, open source software, open source software development, requirement elicitation, requirement engineering

Procedia PDF Downloads 102
869 On the Possibility of Real Time Characterisation of Ambient Toxicity Using Multi-Wavelength Photoacoustic Instrument

Authors: Tibor Ajtai, Máté Pintér, Noémi Utry, Gergely Kiss-Albert, Andrea Palágyi, László Manczinger, Csaba Vágvölgyi, Gábor Szabó, Zoltán Bozóki

Abstract:

To the best of the authors' knowledge, we experimentally demonstrate here, for the first time, a quantified correlation between the real-time measured optical feature of the ambient and the off-line measured toxicity data. Using these correlations, we present a novel methodology for real-time characterisation of ambient toxicity based on multi-wavelength aerosol-phase photoacoustic measurement. Ambient carbonaceous particulate matter is one of the most intensively studied atmospheric constituents in climate science nowadays. Beyond its climatic impact, atmospheric soot also plays an important role as an air pollutant that harms human health. Moreover, according to the latest scientific assessments, ambient soot is the second most important anthropogenic emission source, while in terms of health it is also one of the most harmful atmospheric constituents. Despite its importance, a generally accepted standard methodology for the quantitative determination of ambient toxicology is not yet available. Ambient toxicology measurement is predominantly based on the posterior analysis of filter-accumulated aerosol with limited time resolution. Most toxicological studies are based on operational definitions using different measurement protocols; therefore, comprehensive analysis of the existing data sets is very limited in many cases. The situation is further complicated by the fact that, even during its relatively short residence time, the physicochemical features of the aerosol can be masked significantly by the actual ambient factors. Therefore, improving the time resolution of the existing methodology and developing real-time methodology for air quality monitoring are pressing issues in air pollution research.
During the last decades, many experimental studies have verified that there is a relation between the chemical composition and the absorption feature, quantified by the Absorption Angström Exponent (AAE), of carbonaceous particulate matter. Although the scientific community agrees that PhotoAcoustic Spectroscopy (PAS) is so far the only methodology that can measure light absorption by aerosol accurately and reliably, multi-wavelength PAS instruments, which are able to selectively characterise the wavelength dependency of absorption, have become available only in the last decade. In this study, the first results of an intensive measurement campaign focusing on the physicochemical and toxicological characterisation of ambient particulate matter are presented. We demonstrate the complete microphysical characterisation of wintertime urban ambient, including optical absorption and scattering as well as size distribution, using our recently developed state-of-the-art multi-wavelength photoacoustic instrument (4λ-PAS), an integrating nephelometer (Aurora 3000), and a single mobility particle sizer and optical particle counter (SMPS+C). Beyond this on-line characterisation of the ambient, we also demonstrate the results of the eco-, cyto- and genotoxicity measurements of ambient aerosol based on the posterior analysis of filter-accumulated aerosol with 6 h time resolution. We demonstrate a diurnal variation of toxicities and of AAE data deduced directly from the multi-wavelength absorption measurement results.
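For readers unfamiliar with the AAE, it is commonly computed from absorption coefficients measured at two wavelengths, assuming a power-law dependence of absorption on wavelength. The sketch below uses that standard two-wavelength definition; the function and parameter names are illustrative, and the abstract does not state which wavelength pairs of the 4λ-PAS are used.

```python
import math


def absorption_angstrom_exponent(abs_1, abs_2, wavelength_1, wavelength_2):
    """Two-wavelength AAE: -ln(abs_1 / abs_2) / ln(wavelength_1 / wavelength_2).

    abs_1 and abs_2 are absorption coefficients measured at wavelength_1 and
    wavelength_2 (same units for both coefficients, same units for both wavelengths).
    """
    return -math.log(abs_1 / abs_2) / math.log(wavelength_1 / wavelength_2)
```

An aerosol whose absorption halves when the wavelength doubles has an AAE of 1, while stronger wavelength dependence gives larger values.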

Keywords: photoacoustic spectroscopy, absorption Angström exponent, toxicity, Ames-test

Procedia PDF Downloads 300
868 Hierarchical Tree Long Short-Term Memory for Sentence Representations

Authors: Xiuying Wang, Changliang Li, Bo Xu

Abstract:

A fixed-length feature vector is required for many machine learning algorithms in the NLP field. Word embeddings have been very successful at learning lexical information. However, they cannot capture the compositional meaning of sentences, which prevents a deeper understanding of language. In this paper, we introduce a novel hierarchical tree long short-term memory (HTLSTM) model that learns vector representations for sentences of arbitrary syntactic type and length. We propose to split a sentence into three hierarchies: short phrase, long phrase and full sentence level. The HTLSTM model gives our algorithm the potential to fully consider the hierarchical information and long-term dependencies of language. We design experiments on both English and Chinese corpora to evaluate our model on the sentiment analysis task. The results show that our model significantly outperforms several existing state-of-the-art approaches.

Keywords: deep learning, hierarchical tree long short-term memory, sentence representation, sentiment analysis

Procedia PDF Downloads 348
867 Secure E-Voting Using Blockchain Technology

Authors: Barkha Ramteke, Sonali Ridhorkar

Abstract:

An election is an important event in all countries. Traditional voting has several drawbacks, including the time and effort required for tallying and counting results and the cost of paper, arrangements, and everything else required to complete a voting process. Many countries are now considering online e-voting systems, but traditional e-voting systems suffer from a lack of trust. It is not known whether a vote has been counted correctly or tampered with. A lack of transparency means that voters have no assurance that their votes will be counted as cast. As blockchain technology grows in popularity, electronic voting systems are increasingly using it as an underlying storage mechanism to make the voting process more transparent and to assure data immutability. The transparent feature, on the other hand, may reveal critical information about applicants because all system users have the same entitlement to their data. Furthermore, because of blockchain's pseudo-anonymity, voters' privacy will be revealed, and third parties involved in the voting process, such as registration institutions, will be able to tamper with data. To overcome these difficulties, we apply Ethereum smart contracts to blockchain-based voting systems.

Keywords: blockchain, AMV chain, electronic voting, decentralized

Procedia PDF Downloads 133
866 Design and Optimization Fire Alarm System to Protect Gas Condensate Reservoirs With the Use of Nano-Technology

Authors: Hefzollah Mohammadian, Ensieh Hajeb, Mohamad Baqer Heidari

Abstract:

In this paper, for the protection and safety of tanks containing gases (flammable materials), and in view of the considerable economic value of the reservoir, a new system for protection, conservation and fire fighting has been designed. The system consists of several parts: sensors that detect heat and fire using nanotechnology (nano sensors); a barrier for isolation and protection across two electronic zones; an analyzer for accurately detecting and locating the point of fire; and a main electronic board that announces fire, diagnoses faults in different locations, raises the relevant alarms and activates different devices for fire extinguishing and announcement. An important feature of this system is the high speed and capability of its fire detection: it is able to detect an ambient temperature value that can be adjusted. Another advantage is that the system is autonomous and does not require a human operator on site. Using nanotechnology, in addition to speeding up the work, reduces the construction cost of the sensor as well as of the notification and fire extinguishing system.

Keywords: analyser, barrier, heat resistance, general fault, general alarm, nano sensor

Procedia PDF Downloads 454
865 Machine Learning Framework: Competitive Intelligence and Key Drivers Identification of Market Share Trends among Healthcare Facilities

Authors: Anudeep Appe, Bhanu Poluparthi, Lakshmi Kasivajjula, Udai Mv, Sobha Bagadi, Punya Modi, Aditya Singh, Hemanth Gunupudi, Spenser Troiano, Jeff Paul, Justin Stovall, Justin Yamamoto

Abstract:

The necessity of data-driven decisions in healthcare strategy formulation is rapidly increasing. A reliable framework that helps identify factors impacting the market share of a healthcare provider facility or hospital (from here on termed a facility) is of key importance. This pilot study aims to develop a data-driven machine learning regression framework that aids strategists in formulating key decisions to improve the facility’s market share, which in turn helps improve the quality of healthcare services. The US (United States) healthcare business is chosen for the study, and data spanning 60 key facilities in Washington State and about 3 years of history is considered. In the current analysis, market share is defined as the ratio of the facility’s encounters to the total encounters among the group of potential competitor facilities. The current study proposes a two-pronged approach of competitor identification and regression to evaluate and predict market share, respectively. A model-agnostic technique, SHAP, is leveraged to quantify the relative importance of features impacting the market share. Typical techniques in the literature for quantifying the degree of competitiveness among facilities use an empirical method to calculate a competitive factor that expresses the severity of competition. The proposed method identifies a pool of competitors, develops Directed Acyclic Graphs (DAGs) and feature-level word vectors, and evaluates the key connected components at the facility level. This technique is robust since it is data-driven, which minimizes the bias of empirical techniques. The DAGs factor in partial correlations at various segregations and key demographics of facilities, along with a placeholder to factor in various business rules (e.g., quantifying the patient exchanges, provider references, and sister facilities). Multiple groups of competitors among facilities are identified.
Using the identified competitors, a Random Forest Regression model was developed and fine-tuned to predict the market share. To identify key drivers of market share at an overall level, the permutation feature importance of the attributes was calculated. For relative quantification of features at the facility level, SHAP (SHapley Additive exPlanations), a model-agnostic explainer, was incorporated. This helped to identify and rank the attributes at each facility that impact the market share. This approach proposes an amalgamation of two popular and efficient modeling practices, viz., machine learning with graphs and tree-based regression techniques, to reduce the bias. With these, we helped to drive strategic business decisions.
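Permutation feature importance, used above to find the overall drivers of market share, measures how much a model's error grows when one feature column is shuffled. The following is a generic sketch rather than the authors' pipeline: the mean-squared-error metric, the `model` callable and the function names are assumptions, and any fitted regressor (such as a random forest) could be wrapped as `model`.

```python
import random


def mean_squared_error(model, rows, targets):
    """Average squared difference between model(row) and the target."""
    return sum((model(row) - t) ** 2 for row, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, n_repeats=10, seed=0):
    """Mean increase in error per feature when that feature's column is shuffled."""
    rng = random.Random(seed)
    baseline = mean_squared_error(model, rows, targets)
    importances = []
    for col in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in rows]
            rng.shuffle(shuffled)
            # Rebuild each row (a list) with only this column replaced.
            permuted = [row[:col] + [value] + row[col + 1:]
                        for row, value in zip(rows, shuffled)]
            increases.append(mean_squared_error(model, permuted, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances
```

A feature the model ignores receives an importance of zero, because shuffling it leaves every prediction unchanged; features the model relies on receive positive values.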

Keywords: competition, DAGs, facility, healthcare, machine learning, market share, random forest, SHAP

Procedia PDF Downloads 89
864 Predicting Machine-Down of Woodworking Industrial Machines

Authors: Matteo Calabrese, Martin Cimmino, Dimos Kapetis, Martina Manfrin, Donato Concilio, Giuseppe Toscano, Giovanni Ciandrini, Giancarlo Paccapeli, Gianluca Giarratana, Marco Siciliano, Andrea Forlani, Alberto Carrotta

Abstract:

In this paper, we describe a machine learning methodology for Predictive Maintenance (PdM) applied to woodworking industrial machines. PdM is a prominent strategy consisting of all the operational techniques and actions required to ensure machine availability and to prevent a machine-down failure. One of the challenges of the PdM approach is to design and develop an embedded smart system to assess the health status of the machine. The proposed approach allows multiple connected machines to be screened simultaneously, thus providing real-time monitoring that can be adopted within maintenance management. This is achieved by applying temporal feature engineering techniques and training an ensemble of classification algorithms to predict the Remaining Useful Lifetime of woodworking machines. The effectiveness of the methodology is demonstrated by testing on an independent sample of additional woodworking machines that did not present a machine-down event.

Keywords: predictive maintenance, machine learning, connected machines, artificial intelligence

Procedia PDF Downloads 222
863 Detection and Classification of Rubber Tree Leaf Diseases Using Machine Learning

Authors: Kavyadevi N., Kaviya G., Gowsalya P., Janani M., Mohanraj S.

Abstract:

Hevea brasiliensis, also known as the rubber tree, is one of the foremost crop assets in the world. One of the most significant advantages of the rubber plant in terms of air oxygenation is its capacity to reduce the likelihood of an individual developing respiratory allergies such as asthma. To construct a system that can properly identify crop diseases and pests and then create a database of insecticides for each pest and disease, we must first give treatment for the illness that has been detected. In this article, we primarily examine three major leaf diseases because of their economic impact: bird's eye spot, algal spot and powdery mildew. The recommended work focuses on disease identification on rubber tree leaves. It will be accomplished by employing one of the superior algorithms. The processing technique follows the stages of input, preprocessing, image segmentation, feature extraction and classification. We will use time-consuming procedures that they use to detect the sickness. As a consequence, the main ailments, underlying causes, and signs and symptoms of diseases that harm the rubber tree are covered in this study.

Keywords: image processing, python, convolution neural network (CNN), machine learning

Procedia PDF Downloads 76
862 A Clustering-Based Approach for Weblog Data Cleaning

Authors: Amine Ganibardi, Cherif Arab Ali

Abstract:

This paper addresses the data cleaning issue as a part of web usage data preprocessing within the scope of Web Usage Mining. Weblog data recorded by web servers within log files reflect usage activity, i.e., end-users’ clicks and underlying user-agents’ hits. As Web Usage Mining is interested in end-users’ behavior, user-agents’ hits are regarded as noise to be cleaned off before mining. Filtering hits from clicks is not trivial for two reasons: a server records requests interlaced in sequential order regardless of their source or type, and website resources may be set up as requestable interchangeably by end-users and user-agents. The current methods are content-centric, based on filtering heuristics of relevant/irrelevant items in terms of some cleaning attributes, i.e., website’s resources filetype extensions, website’s resources pointed by hyperlinks/URIs, HTTP methods, user-agents, etc. These methods need exhaustive extra-weblog data and prior knowledge of the relevant and/or irrelevant items to be assumed as clicks or hits within the filtering heuristics. Such methods are not appropriate for the dynamic/responsive Web for three reasons: resources may be set up as clickable by end-users regardless of their type, website’s resources are indexed by frame names without filetype extensions, and web contents are generated and cancelled differently from one end-user to another. In order to overcome these constraints, a clustering-based cleaning method centered on the logging structure is proposed. This method focuses on the statistical properties of the logging structure at the requested and referring resources attributes levels. It is insensitive to logging content and does not need extra-weblog data. The statistical property used is the structure of the logging feature generated by webpage requests in terms of clicks and hits.
Since a webpage consists of its single URI and several components, this feature results in a single-click to multiple-hits ratio in terms of the requested and referring resources. Thus, the clustering-based method is meant to identify two clusters by applying an appropriate distance to the frequency matrix of the requested and referring resources levels. As the ratio of clicks to hits is single to multiple, the clicks’ cluster is the smaller one in number of requests. Hierarchical Agglomerative Clustering based on a pairwise distance (Gower) and average linkage has been applied to four logfiles of dynamic/responsive websites whose click-to-hit ratios range from 1/2 to 1/15. The optimal clustering, set on the basis of average linkage and maximum inter-cluster inertia, always results in two clusters. The evaluation of the smaller cluster, referred to as the clicks cluster, in terms of confusion matrix indicators yields a true positive rate of 97%. The content-centric cleaning methods, i.e., conventional and advanced cleaning, resulted in a lower rate of 91%. Thus, the proposed clustering-based cleaning outperforms the content-centric methods within dynamic and responsive web design without the need for any extra-weblog data. Such an improvement in cleaning quality is likely to refine dependent analysis.
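The clustering step described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: it assumes every request is described by two numeric frequencies (requested resource, referring resource), uses the numeric form of the Gower distance (mean range-normalised absolute difference), and simply stops merging when two clusters remain instead of choosing the cut by inter-cluster inertia. The toy rows are invented.

```python
def gower_distance(a, b, ranges):
    """Gower distance for numeric features: mean range-normalised difference."""
    parts = [abs(x - y) / r if r else 0.0 for x, y, r in zip(a, b, ranges)]
    return sum(parts) / len(parts)


def average_linkage(c1, c2, rows, ranges):
    """Mean pairwise Gower distance between the members of two clusters."""
    total = sum(gower_distance(rows[i], rows[j], ranges) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def cluster_into_two(rows):
    """Agglomerative clustering with average linkage, stopped at two clusters."""
    ranges = [max(col) - min(col) for col in zip(*rows)]
    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > 2:
        # Find the closest pair of clusters and merge them.
        _, p, q = min(
            (average_linkage(clusters[p], clusters[q], rows, ranges), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters


# Hypothetical (requested-frequency, referring-frequency) rows:
# the first three behave like clicks, the remaining six like hits.
rows = [(1, 1), (2, 1), (1, 2),
        (9, 10), (10, 9), (10, 10), (9, 9), (8, 10), (10, 8)]
clusters = cluster_into_two(rows)
clicks_cluster = sorted(min(clusters, key=len))  # smaller cluster = clicks
```

Taking the smaller cluster as the clicks follows the abstract's argument that one click gives rise to several hits.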

Keywords: clustering approach, data cleaning, data preprocessing, weblog data, web usage data

Procedia PDF Downloads 168
861 Parkinson's Disease Gene Identification Using Physicochemical Properties of Amino Acids

Authors: Priya Arora, Ashutosh Mishra

Abstract:

Identifying the mutated genes that lead to Parkinson’s disease is a challenge on the way to a proactive cure for the disorder. Computational analysis is an effective technique for exploring genes in the form of protein sequences, as theoretical and manual analysis is infeasible. The limitations and effectiveness of a particular computational method are entirely dependent on the previous data that is available for disease identification. The article presents a sequence-based classification method for the identification of genes responsible for Parkinson’s disease. In the first phase, protein sequences are transformed into feature vectors using the physicochemical properties of amino acids. The second phase of the method employs Jaccard distances to select negative genes from the candidate population. The third phase uses artificial neural networks to make the final predictions. The proposed approach is compared with state-of-the-art methods on the basis of F-measure. The results confirm and estimate the efficiency of the method.
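The second phase can be illustrated with a small sketch of Jaccard-distance-based negative selection. The abstract does not give the selection rule, so the rule below is an assumption: a candidate is kept as a negative when its distance to the nearest known positive is at least a threshold. The set-valued features, gene names, and threshold are likewise hypothetical.

```python
def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| for two sets; defined as 0.0 when both are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def select_negatives(positives, candidates, threshold):
    """Keep candidates whose nearest positive is at least `threshold` away."""
    selected = []
    for name, feats in candidates.items():
        nearest = min(jaccard_distance(feats, p) for p in positives)
        if nearest >= threshold:
            selected.append(name)
    return selected


# Hypothetical feature sets (letters stand in for discretised properties).
positives = [{"a", "b", "c"}, {"b", "c", "d"}]
candidates = {
    "g1": {"a", "b", "c", "d"},   # close to the positives, so rejected
    "g2": {"x", "y"},             # shares nothing with the positives
    "g3": {"c", "x", "y", "z"},   # shares one feature only
}
negatives = select_negatives(positives, candidates, threshold=0.8)
```

With these toy sets, `g1` lies at distance 0.25 from its nearest positive and is excluded, while `g2` and `g3` pass the threshold.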

Keywords: disease gene identification, Parkinson’s disease, physicochemical properties of amino acid, protein sequences

Procedia PDF Downloads 139
860 Optimized Preprocessing for Accurate and Efficient Bioassay Prediction with Machine Learning Algorithms

Authors: Jeff Clarine, Chang-Shyh Peng, Daisy Sang

Abstract:

Bioassay is the measurement of the potency of a chemical substance by its effect on a living animal or plant tissue. Bioassay data and chemical structures from pharmacokinetic and drug metabolism screening are mined from and housed in multiple databases. Bioassay prediction is calculated accordingly to determine further advancement. This paper proposes a four-step preprocessing of datasets for improving bioassay predictions. The first step is instance selection, in which the dataset is partitioned into training, testing, and validation sets. The second step is discretization, which partitions the data in consideration of accuracy vs. precision. The third step is normalization, where data are normalized between 0 and 1 for subsequent machine learning processing. The fourth step is feature selection, where key chemical properties and attributes are generated. The streamlined results are then analyzed for the prediction of effectiveness by various machine learning algorithms implemented in Pipeline Pilot, R, Weka, and Excel. Experiments and evaluations reveal the effectiveness of various combinations of preprocessing steps and machine learning algorithms in producing more consistent and accurate predictions.
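The normalization step (scaling data to the range 0 to 1) is commonly done with min-max scaling. The sketch below assumes that is what is meant; the abstract does not name the formula, and the function name and sample values are illustrative only.

```python
def min_max_normalize(values):
    """Scale a list of numbers linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to 0.0 throughout.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical descriptor column (e.g. some measured property of four compounds).
raw = [2.0, 4.0, 6.0, 10.0]
scaled = min_max_normalize(raw)
```

In practice the minimum and maximum would usually be taken from the training set only and reused for the testing and validation sets, so that no information leaks between the partitions created in the first step.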

Keywords: bioassay, machine learning, preprocessing, virtual screen

Procedia PDF Downloads 273
859 A Geospatial Consumer Marketing Campaign Optimization Strategy: Case of Fuzzy Approach in Nigeria Mobile Market

Authors: Adeolu O. Dairo

Abstract:

Getting the consumer marketing strategy right is a crucial and complex task for firms with a large customer base, such as mobile operators in a competitive mobile market. While empirical studies have made efforts to identify key constructs, no geospatial model has been developed to comprehensively assess the viability and interdependency of ground realities regarding the customer, competition, channel and the network quality of mobile operators. With this research, a geo-analytic framework is proposed for strategy formulation and allocation for mobile operators. Firstly, a fuzzy analytic network, which depicts the interrelationships amongst ground realities, is developed using a self-organizing feature map clustering technique based on inputs from managers and the literature. The model is tested with a mobile operator in the Nigerian mobile market. As a result, a customer-centric geospatial and visualization solution is developed. This provides a consolidated and integrated insight that serves as a transparent, logical and practical guide for strategic, tactical and operational decision making.

Keywords: geospatial, geo-analytics, self-organizing map, customer-centric

Procedia PDF Downloads 182
858 A Framework for Automated Nuclear Waste Classification

Authors: Seonaid Hume, Gordon Dobie, Graeme West

Abstract:

Detecting and localizing radioactive sources is a necessity for safe and secure decommissioning of nuclear facilities. An important aspect for the management of the sort-and-segregation process is establishing the spatial distributions and quantities of the waste radionuclides, their type, corresponding activity, and ultimately classification for disposal. The data received from surveys directly informs decommissioning plans, on-site incident management strategies, the approach needed for a new cell, as well as protecting the workforce and the public. Manual classification of nuclear waste from a nuclear cell is time-consuming, expensive, and requires significant expertise to make the classification judgment call. Also, in-cell decommissioning is still in its relative infancy, and few techniques are well-developed. As with any repetitive and routine task, there is the opportunity to improve the task of classifying nuclear waste using autonomous systems. Hence, this paper proposes a new framework for the automatic classification of nuclear waste. This framework consists of five main stages: 3D spatial mapping and object detection, object classification, radiological mapping, source localisation based on gathered evidence and, finally, waste classification. The first stage of the framework, 3D visual mapping, involves object detection from point cloud data. A review of related applications in other industries is provided, and recommendations for approaches for waste classification are made. Object detection focusses initially on cylindrical objects since pipework is significant in nuclear cells and indeed any industrial site. The approach can be extended to other commonly occurring primitives such as spheres and cubes. This is in preparation for stage two, characterizing the point cloud data and estimating the dimensions, material, degradation, and mass of the objects detected in order to feature match them to an inventory of possible items found in that nuclear cell.
Many items in nuclear cells are one-offs, have limited or poor drawings available, or have been modified since installation, and have complex interiors, which often and inadvertently pose difficulties when accessing certain zones and identifying waste remotely. Hence, this may require expert input to feature match objects. The third stage, radiological mapping, similarly facilitates the characterization of the nuclear cell in terms of radiation fields, including the type of radiation, activity, and location within the nuclear cell. The fourth stage of the framework takes the visual map from stage 1, the object characterization from stage 2, and the radiation map from stage 3 and fuses them together, providing a more detailed scene of the nuclear cell by identifying the location of radioactive materials in three dimensions. The last stage involves combining the evidence from the fused data sets to reveal the classification of the waste in Bq/kg, thus enabling better decision making and monitoring for in-cell decommissioning. The presentation of the framework is supported by representative case study data drawn from a decommissioning application at a UK nuclear facility. This framework utilises recent advances in the detection and mapping of complex radiation fields in three dimensions to make the process of classifying nuclear waste faster, more reliable, more cost-effective and safer.

Keywords: nuclear decommissioning, radiation detection, object detection, waste classification

Procedia PDF Downloads 200
857 A Hybrid Fuzzy Clustering Approach for Fertile and Unfertile Analysis

Authors: Shima Soltanzadeh, Mohammad Hosain Fazel Zarandi, Mojtaba Barzegar Astanjin

Abstract:

Diagnosis of male infertility by laboratory tests is expensive and sometimes intolerable for patients. Filling out a questionnaire and then applying a classification method can be the first step in the decision-making process, so that laboratory tests are used only in cases with a high probability of infertility. In this paper, we evaluated the performance of four classification methods, namely naive Bayesian, neural network, logistic regression and fuzzy c-means clustering used as a classifier, in the diagnosis of male infertility due to environmental factors. Since the data are unbalanced, ROC curves are the most suitable method for the comparison. We also selected the more important features using a filtering method and examined the impact of this feature reduction on the performance of each method; in general, most of the methods performed better after applying the filter. We have shown that fuzzy c-means clustering used as a classifier performs well according to the ROC curves and that its performance is comparable to other classification methods such as logistic regression.
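To make the ROC-based comparison concrete, the sketch below computes the area under the ROC curve (AUC) from labels and classifier scores using the pairwise-ranking formulation: the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half. The abstract does not say how the curves were summarised, and the labels and scores here are invented for illustration.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical unbalanced sample: 2 positive cases among 6 questionnaires.
labels = [1, 0, 0, 0, 0, 1]
scores = [0.9, 0.1, 0.4, 0.35, 0.8, 0.7]
area = auc(labels, scores)
```

Because the measure depends only on how positives rank against negatives, it is not inflated by a classifier that simply predicts the majority class, which is why ROC analysis suits unbalanced data.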

Keywords: classification, fuzzy c-means, logistic regression, Naive Bayesian, neural network, ROC curve

Procedia PDF Downloads 335
856 Geometric Simplification Method of Building Energy Model Based on Building Performance Simulation

Authors: Yan Lyu, Yiqun Pan, Zhizhong Huang

Abstract:

In the design stage of a new building, an energy model of the building is often required to analyze its energy efficiency. In practice, a certain degree of geometric simplification is needed when establishing building energy models, since the detailed geometric features of a real building are hard to describe perfectly in most energy simulation engines, such as ESP-r, eQuest or EnergyPlus. In fact, a detailed description is not necessary when extremely high accuracy is not demanded. Therefore, this paper analyzes the relationship between the error of the simulation result from building energy models and the geometric simplification of the models. The following two parameters are selected as the indices to characterize the geometric features of a building in energy simulation: the southward projected area and the total side surface area of the building. Based on this parameterization method, an arbitrary column building can be simplified to a building of typical shape (a cuboid) for energy modeling. The results of this study indicate that this simplification leads to an error of less than 7% for buildings whose ratio of southward projection length to total perimeter of the bottom is between 0.25 and 0.35, which covers most situations.
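One way to read the parameterization is sketched below: given the southward projected area, the total side surface area, and the building height, derive a cuboid that preserves both areas. This mapping is an assumption made for illustration (the abstract does not give the authors' formulas), and it presumes a column building of constant height; the function and argument names are hypothetical.

```python
def equivalent_cuboid(south_projected_area, side_surface_area, height):
    """Cuboid with the same southward projected area and side surface area."""
    width = south_projected_area / height    # length of the south-facing facade
    perimeter = side_surface_area / height   # footprint perimeter
    depth = perimeter / 2 - width            # from perimeter = 2 * (width + depth)
    if depth <= 0:
        raise ValueError("areas are inconsistent with a cuboid footprint")
    return {
        "width": width,
        "depth": depth,
        "height": height,
        # Ratio of southward projection length to footprint perimeter.
        "south_ratio": width / perimeter,
    }


# Illustrative values in m^2 and m (not taken from the paper).
cuboid = equivalent_cuboid(south_projected_area=300.0,
                           side_surface_area=1000.0,
                           height=10.0)
```

For this example the ratio is 0.3, which falls inside the 0.25 to 0.35 band for which the abstract reports an error below 7%.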

Keywords: building energy model, simulation, geometric simplification, design, regression

Procedia PDF Downloads 178
855 Transfer Learning for Protein Structure Classification at Low Resolution

Authors: Alexander Hudson, Shaogang Gong

Abstract:

Structure determination is key to understanding protein function at a molecular level. Whilst significant advances have been made in predicting structure and function from amino acid sequence, researchers must still rely on expensive, time-consuming analytical methods to visualise detailed protein conformation. In this study, we demonstrate that it is possible to make accurate (≥80%) predictions of protein class and architecture from structures determined at low (>3 Å) resolution, using a deep convolutional neural network trained on high-resolution (≤3 Å) structures represented as 2D matrices. Thus, we provide proof of concept for high-speed, low-cost protein structure classification at low resolution, and a basis for extension to prediction of function. We investigate the impact of the input representation on classification performance, showing that side-chain information may not be necessary for fine-grained structure predictions. Finally, we confirm that high-resolution, low-resolution and NMR-determined structures inhabit a common feature space, and thus provide a theoretical foundation for boosting with single-image super-resolution.
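The "2D matrices" mentioned here, together with the keyword "protein distance maps", suggest a pairwise distance matrix over residues. The sketch below builds such a matrix from 3D coordinates. Using one point per residue (for example the alpha carbon) is an assumption of this example, as are the toy coordinates.

```python
import math


def distance_map(coords):
    """Symmetric matrix of Euclidean distances between all pairs of 3D points."""
    return [[math.dist(a, b) for b in coords] for a in coords]


# Hypothetical coordinates (in Å) for three residues.
coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
dmap = distance_map(coords)
```

The resulting square matrix can be treated as a single-channel image, which is what makes a convolutional network a natural choice of classifier, and it is invariant to rotation and translation of the structure.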

Keywords: transfer learning, protein distance maps, protein structure classification, neural networks

Procedia PDF Downloads 134
854 Protein Remote Homology Detection by Using Profile-Based Matrix Transformation Approaches

Authors: Bin Liu

Abstract:

As one of the most important tasks in protein sequence analysis, protein remote homology detection has been studied for decades. Currently, profile-based methods show state-of-the-art performance. The Position-Specific Frequency Matrix (PSFM) is a widely used profile. However, the profiles contain noise introduced by amino acids with low frequencies. In this study, we propose a method that removes this noise from the PSFM by discarding the amino acids with low frequencies, yielding what we call the Top frequency profile (TFP). Three new matrix transformation methods, namely Auto-cross covariance (ACC) transformation, Tri-gram, and K-separated bigram (KSB), are applied to these profiles to convert them into fixed-length feature vectors. The predictors are constructed by combining these vectors with Support Vector Machines (SVMs). Experimental results on two benchmark datasets show that the proposed methods outperform other state-of-the-art predictors.
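The idea behind the Top frequency profile can be sketched as a per-position filter that keeps only the highest frequencies in each PSFM row. How many entries are kept and whether the row is renormalised afterwards are not stated in the abstract; the version below keeps the top `n` values, zeroes the rest, and does not renormalise. A four-letter alphabet is used to keep the toy matrix small, whereas a real PSFM has 20 columns.

```python
def top_frequency_profile(psfm, n):
    """Keep the n largest frequencies in each row of the profile; zero the others."""
    tfp = []
    for row in psfm:
        # Column indices of the n highest-frequency entries at this position.
        keep = set(sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:n])
        tfp.append([v if i in keep else 0.0 for i, v in enumerate(row)])
    return tfp


# Hypothetical profile: 2 sequence positions x 4 residue types.
psfm = [
    [0.5, 0.1, 0.3, 0.1],
    [0.05, 0.6, 0.05, 0.3],
]
tfp = top_frequency_profile(psfm, n=2)
```

The filtered matrix has the same shape as the input, so the ACC, Tri-gram, or KSB transformations could be applied to it in the same way as to the original profile.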

Keywords: protein remote homology detection, protein fold recognition, top frequency profile, support vector machines

Procedia PDF Downloads 124