Search results for: support vector machine
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 9532

Search results for: support vector machine

9322 A Decision Support System to Detect the Lumbar Disc Disease on the Basis of Clinical MRI

Authors: Yavuz Unal, Kemal Polat, H. Erdinc Kocer

Abstract:

In this study, a decision support system comprising three stages has been proposed to detect the disc abnormalities of the lumbar region. In the first stage named the feature extraction, T2-weighted sagittal and axial Magnetic Resonance Images (MRI) were taken from 55 people and then 27 appearance and shape features were acquired from both sagittal and transverse images. In the second stage named the feature weighting process, k-means clustering based feature weighting (KMCBFW) proposed by Gunes et al. Finally, in the third stage named the classification process, the classifier algorithms including multi-layer perceptron (MLP- neural network), support vector machine (SVM), Naïve Bayes, and decision tree have been used to classify whether the subject has lumbar disc or not. In order to test the performance of the proposed method, the classification accuracy (%), sensitivity, specificity, precision, recall, f-measure, kappa value, and computation times have been used. The best hybrid model is the combination of k-means clustering based feature weighting and decision tree in the detecting of lumbar disc disease based on both sagittal and axial MR images.

Keywords: lumbar disc abnormality, lumbar MRI, lumbar spine, hybrid models, hybrid features, k-means clustering based feature weighting

Procedia PDF Downloads 487
9321 Isolation and Classification of Red Blood Cells in Anemic Microscopic Images

Authors: Jameela Ali Alkrimi, Abdul Rahim Ahmad, Azizah Suliman, Loay E. George

Abstract:

Red blood cells (RBCs) are among the most commonly and intensively studied type of blood cells in cell biology. The lack of RBCs is a condition characterized by lower than normal hemoglobin level; this condition is referred to as 'anemia'. In this study, a software was developed to isolate RBCs by using a machine learning approach to classify anemic RBCs in microscopic images. Several features of RBCs were extracted using image processing algorithms, including principal component analysis (PCA). With the proposed method, RBCs were isolated in 34 second from an image containing 18 to 27 cells. We also proposed that PCA could be performed to increase the speed and efficiency of classification. Our classifier algorithm yielded accuracy rates of 100%, 99.99%, and 96.50% for K-nearest neighbor (K-NN) algorithm, support vector machine (SVM), and neural network ANN, respectively. Classification was evaluated in highly sensitivity, specificity, and kappa statistical parameters. In conclusion, the classification results were obtained for a short time period with more efficient when PCA was used.

Keywords: red blood cells, pre-processing image algorithms, classification algorithms, principal component analysis PCA, confusion matrix, kappa statistical parameters, ROC

Procedia PDF Downloads 370
9320 Performance Comparison of Different Regression Methods for a Polymerization Process with Adaptive Sampling

Authors: Florin Leon, Silvia Curteanu

Abstract:

Developing complete mechanistic models for polymerization reactors is not easy, because complex reactions occur simultaneously; there is a large number of kinetic parameters involved and sometimes the chemical and physical phenomena for mixtures involving polymers are poorly understood. To overcome these difficulties, empirical models based on sampled data can be used instead, namely regression methods typical of machine learning field. They have the ability to learn the trends of a process without any knowledge about its particular physical and chemical laws. Therefore, they are useful for modeling complex processes, such as the free radical polymerization of methyl methacrylate achieved in a batch bulk process. The goal is to generate accurate predictions of monomer conversion, numerical average molecular weight and gravimetrical average molecular weight. This process is associated with non-linear gel and glass effects. For this purpose, an adaptive sampling technique is presented, which can select more samples around the regions where the values have a higher variation. Several machine learning methods are used for the modeling and their performance is compared: support vector machines, k-nearest neighbor, k-nearest neighbor and random forest, as well as an original algorithm, large margin nearest neighbor regression. The suggested method provides very good results compared to the other well-known regression algorithms.

Keywords: batch bulk methyl methacrylate polymerization, adaptive sampling, machine learning, large margin nearest neighbor regression

Procedia PDF Downloads 256
9319 Advancements in Predicting Diabetes Biomarkers: A Machine Learning Epigenetic Approach

Authors: James Ladzekpo

Abstract:

Background: The urgent need to identify new pharmacological targets for diabetes treatment and prevention has been amplified by the disease's extensive impact on individuals and healthcare systems. A deeper insight into the biological underpinnings of diabetes is crucial for the creation of therapeutic strategies aimed at these biological processes. Current predictive models based on genetic variations fall short of accurately forecasting diabetes. Objectives: Our study aims to pinpoint key epigenetic factors that predispose individuals to diabetes. These factors will inform the development of an advanced predictive model that estimates diabetes risk from genetic profiles, utilizing state-of-the-art statistical and data mining methods. Methodology: We have implemented a recursive feature elimination with cross-validation using the support vector machine (SVM) approach for refined feature selection. Building on this, we developed six machine learning models, including logistic regression, k-Nearest Neighbors (k-NN), Naive Bayes, Random Forest, Gradient Boosting, and Multilayer Perceptron Neural Network, to evaluate their performance. Findings: The Gradient Boosting Classifier excelled, achieving a median recall of 92.17% and outstanding metrics such as area under the receiver operating characteristics curve (AUC) with a median of 68%, alongside median accuracy and precision scores of 76%. Through our machine learning analysis, we identified 31 genes significantly associated with diabetes traits, highlighting their potential as biomarkers and targets for diabetes management strategies. Conclusion: Particularly noteworthy were the Gradient Boosting Classifier and Multilayer Perceptron Neural Network, which demonstrated potential in diabetes outcome prediction. We recommend future investigations to incorporate larger cohorts and a wider array of predictive variables to enhance the models' predictive capabilities.

Keywords: diabetes, machine learning, prediction, biomarkers

Procedia PDF Downloads 8
9318 Adaptive Neuro Fuzzy Inference System Model Based on Support Vector Regression for Stock Time Series Forecasting

Authors: Anita Setianingrum, Oki S. Jaya, Zuherman Rustam

Abstract:

Forecasting stock price is a challenging task due to the complex time series of the data. The complexity arises from many variables that affect the stock market. Many time series models have been proposed before, but those previous models still have some problems: 1) put the subjectivity of choosing the technical indicators, and 2) rely upon some assumptions about the variables, so it is limited to be applied to all datasets. Therefore, this paper studied a novel Adaptive Neuro-Fuzzy Inference System (ANFIS) time series model based on Support Vector Regression (SVR) for forecasting the stock market. In order to evaluate the performance of proposed models, stock market transaction data of TAIEX and HIS from January to December 2015 is collected as experimental datasets. As a result, the method has outperformed its counterparts in terms of accuracy.

Keywords: ANFIS, fuzzy time series, stock forecasting, SVR

Procedia PDF Downloads 205
9317 Investigation of Different Machine Learning Algorithms in Large-Scale Land Cover Mapping within the Google Earth Engine

Authors: Amin Naboureh, Ainong Li, Jinhu Bian, Guangbin Lei, Hamid Ebrahimy

Abstract:

Large-scale land cover mapping has become a new challenge in land change and remote sensing field because of involving a big volume of data. Moreover, selecting the right classification method, especially when there are different types of landscapes in the study area is quite difficult. This paper is an attempt to compare the performance of different machine learning (ML) algorithms for generating a land cover map of the China-Central Asia–West Asia Corridor that is considered as one of the main parts of the Belt and Road Initiative project (BRI). The cloud-based Google Earth Engine (GEE) platform was used for generating a land cover map for the study area from Landsat-8 images (2017) by applying three frequently used ML algorithms including random forest (RF), support vector machine (SVM), and artificial neural network (ANN). The selected ML algorithms (RF, SVM, and ANN) were trained and tested using reference data obtained from MODIS yearly land cover product and very high-resolution satellite images. The finding of the study illustrated that among three frequently used ML algorithms, RF with 91% overall accuracy had the best result in producing a land cover map for the China-Central Asia–West Asia Corridor whereas ANN showed the worst result with 85% overall accuracy. The great performance of the GEE in applying different ML algorithms and handling huge volume of remotely sensed data in the present study showed that it could also help the researchers to generate reliable long-term land cover change maps. The finding of this research has great importance for decision-makers and BRI’s authorities in strategic land use planning.

Keywords: land cover, google earth engine, machine learning, remote sensing

Procedia PDF Downloads 85
9316 A Data-Mining Model for Protection of FACTS-Based Transmission Line

Authors: Ashok Kalagura

Abstract:

This paper presents a data-mining model for fault-zone identification of flexible AC transmission systems (FACTS)-based transmission line including a thyristor-controlled series compensator (TCSC) and unified power-flow controller (UPFC), using ensemble decision trees. Given the randomness in the ensemble of decision trees stacked inside the random forests model, it provides an effective decision on the fault-zone identification. Half-cycle post-fault current and voltage samples from the fault inception are used as an input vector against target output ‘1’ for the fault after TCSC/UPFC and ‘1’ for the fault before TCSC/UPFC for fault-zone identification. The algorithm is tested on simulated fault data with wide variations in operating parameters of the power system network, including noisy environment providing a reliability measure of 99% with faster response time (3/4th cycle from fault inception). The results of the presented approach using the RF model indicate the reliable identification of the fault zone in FACTS-based transmission lines.

Keywords: distance relaying, fault-zone identification, random forests, RFs, support vector machine, SVM, thyristor-controlled series compensator, TCSC, unified power-flow controller, UPFC

Procedia PDF Downloads 393
9315 Single Imputation for Audiograms

Authors: Sarah Beaver, Renee Bryce

Abstract:

Audiograms detect hearing impairment, but missing values pose problems. This work explores imputations in an attempt to improve accuracy. This work implements Linear Regression, Lasso, Linear Support Vector Regression, Bayesian Ridge, K Nearest Neighbors (KNN), and Random Forest machine learning techniques to impute audiogram frequencies ranging from 125Hz to 8000Hz. The data contains patients who had or were candidates for cochlear implants. Accuracy is compared across two different Nested Cross-Validation k values. Over 4000 audiograms were used from 800 unique patients. Additionally, training on data combines and compares left and right ear audiograms versus single ear side audiograms. The accuracy achieved using Root Mean Square Error (RMSE) values for the best models for Random Forest ranges from 4.74 to 6.37. The R\textsuperscript{2} values for the best models for Random Forest ranges from .91 to .96. The accuracy achieved using RMSE values for the best models for KNN ranges from 5.00 to 7.72. The R\textsuperscript{2} values for the best models for KNN ranges from .89 to .95. The best imputation models received R\textsuperscript{2} between .89 to .96 and RMSE values less than 8dB. We also show that the accuracy of classification predictive models performed better with our best imputation models versus constant imputations by a two percent increase.

Keywords: machine learning, audiograms, data imputations, single imputations

Procedia PDF Downloads 42
9314 Principal Component Analysis Combined Machine Learning Techniques on Pharmaceutical Samples by Laser Induced Breakdown Spectroscopy

Authors: Kemal Efe Eseller, Göktuğ Yazici

Abstract:

Laser-induced breakdown spectroscopy (LIBS) is a rapid optical atomic emission spectroscopy which is used for material identification and analysis with the advantages of in-situ analysis, elimination of intensive sample preparation, and micro-destructive properties for the material to be tested. LIBS delivers short pulses of laser beams onto the material in order to create plasma by excitation of the material to a certain threshold. The plasma characteristics, which consist of wavelength value and intensity amplitude, depends on the material and the experiment’s environment. In the present work, medicine samples’ spectrum profiles were obtained via LIBS. Medicine samples’ datasets include two different concentrations for both paracetamol based medicines, namely Aferin and Parafon. The spectrum data of the samples were preprocessed via filling outliers based on quartiles, smoothing spectra to eliminate noise and normalizing both wavelength and intensity axis. Statistical information was obtained and principal component analysis (PCA) was incorporated to both the preprocessed and raw datasets. The machine learning models were set based on two different train-test splits, which were 70% training – 30% test and 80% training – 20% test. Cross-validation was preferred to protect the models against overfitting; thus the sample amount is small. The machine learning results of preprocessed and raw datasets were subjected to comparison for both splits. This is the first time that all supervised machine learning classification algorithms; consisting of Decision Trees, Discriminant, naïve Bayes, Support Vector Machines (SVM), k-NN(k-Nearest Neighbor) Ensemble Learning and Neural Network algorithms; were incorporated to LIBS data of paracetamol based pharmaceutical samples, and their different concentrations on preprocessed and raw dataset in order to observe the effect of preprocessing.

Keywords: machine learning, laser-induced breakdown spectroscopy, medicines, principal component analysis, preprocessing

Procedia PDF Downloads 58
9313 Selecting Answers for Questions with Multiple Answer Choices in Arabic Question Answering Based on Textual Entailment Recognition

Authors: Anes Enakoa, Yawei Liang

Abstract:

Question Answering (QA) system is one of the most important and demanding tasks in the field of Natural Language Processing (NLP). In QA systems, the answer generation task generates a list of candidate answers to the user's question, in which only one answer is correct. Answer selection is one of the main components of the QA, which is concerned with selecting the best answer choice from the candidate answers suggested by the system. However, the selection process can be very challenging especially in Arabic due to its particularities. To address this challenge, an approach is proposed to answer questions with multiple answer choices for Arabic QA systems based on Textual Entailment (TE) recognition. The developed approach employs a Support Vector Machine that considers lexical, semantic and syntactic features in order to recognize the entailment between the generated hypotheses (H) and the text (T). A set of experiments has been conducted for performance evaluation and the overall performance of the proposed method reached an accuracy of 67.5% with C@1 score of 80.46%. The obtained results are promising and demonstrate that the proposed method is effective for TE recognition task.

Keywords: information retrieval, machine learning, natural language processing, question answering, textual entailment

Procedia PDF Downloads 112
9312 The Boundary Element Method in Excel for Teaching Vector Calculus and Simulation

Authors: Stephen Kirkup

Abstract:

This paper discusses the implementation of the boundary element method (BEM) on an Excel spreadsheet and how it can be used in teaching vector calculus and simulation. There are two separate spreadheets, within which Laplace equation is solved by the BEM in two dimensions (LIBEM2) and axisymmetric three dimensions (LBEMA). The main algorithms are implemented in the associated programming language within Excel, Visual Basic for Applications (VBA). The BEM only requires a boundary mesh and hence it is a relatively accessible method. The BEM in the open spreadsheet environment is demonstrated as being useful as an aid to teaching and learning. The application of the BEM implemented on a spreadsheet for educational purposes in introductory vector calculus and simulation is explored. The development of assignment work is discussed, and sample results from student work are given. The spreadsheets were found to be useful tools in developing the students’ understanding of vector calculus and in simulating heat conduction.

Keywords: boundary element method, Laplace’s equation, vector calculus, simulation, education

Procedia PDF Downloads 123
9311 Multivariate Output-Associative RVM for Multi-Dimensional Affect Predictions

Authors: Achut Manandhar, Kenneth D. Morton, Peter A. Torrione, Leslie M. Collins

Abstract:

The current trends in affect recognition research are to consider continuous observations from spontaneous natural interactions in people using multiple feature modalities, and to represent affect in terms of continuous dimensions, incorporate spatio-temporal correlation among affect dimensions, and provide fast affect predictions. These research efforts have been propelled by a growing effort to develop affect recognition system that can be implemented to enable seamless real-time human-computer interaction in a wide variety of applications. Motivated by these desired attributes of an affect recognition system, in this work a multi-dimensional affect prediction approach is proposed by integrating multivariate Relevance Vector Machine (MVRVM) with a recently developed Output-associative Relevance Vector Machine (OARVM) approach. The resulting approach can provide fast continuous affect predictions by jointly modeling the multiple affect dimensions and their correlations. Experiments on the RECOLA database show that the proposed approach performs competitively with the OARVM while providing faster predictions during testing.

Keywords: dimensional affect prediction, output-associative RVM, multivariate regression, fast testing

Procedia PDF Downloads 251
9310 Performance of Total Vector Error of an Estimated Phasor within Local Area Networks

Authors: Ahmed Abdolkhalig, Rastko Zivanovic

Abstract:

This paper evaluates the Total Vector Error of an estimated Phasor as define in IEEE C37.118 standard within different medium access in Local Area Networks (LAN). Three different LAN models (CSMA/CD, CSMA/AMP, and Switched Ethernet) are evaluated. The Total Vector Error of the estimated Phasor has been evaluated for the effect of Nodes Number under the standardized network Band-width values defined in IEC 61850-9-2 communication standard (i.e. 0.1, 1, and 10 Gbps).

Keywords: phasor, local area network, total vector error, IEEE C37.118, IEC 61850

Procedia PDF Downloads 272
9309 Protein Remote Homology Detection by Using Profile-Based Matrix Transformation Approaches

Authors: Bin Liu

Abstract:

As one of the most important tasks in protein sequence analysis, protein remote homology detection has been studied for decades. Currently, the profile-based methods show state-of-the-art performance. Position-Specific Frequency Matrix (PSFM) is widely used profile. However, there exists noise information in the profiles introduced by the amino acids with low frequencies. In this study, we propose a method to remove the noise information in the PSFM by removing the amino acids with low frequencies called Top frequency profile (TFP). Three new matrix transformation methods, including Autocross covariance (ACC) transformation, Tri-gram, and K-separated bigram (KSB), are performed on these profiles to convert them into fixed length feature vectors. Combined with Support Vector Machines (SVMs), the predictors are constructed. Evaluated on two benchmark datasets, and experimental results show that these proposed methods outperform other state-of-the-art predictors.

Keywords: protein remote homology detection, protein fold recognition, top frequency profile, support vector machines

Procedia PDF Downloads 92
9308 Role of Machine Learning in Internet of Things Enabled Smart Cities

Authors: Amit Prakash Singh, Shyamli Singh, Chavi Srivastav

Abstract:

This paper presents the idea of Internet of Thing (IoT) for the infrastructure of smart cities. Internet of Thing has been visualized as a communication prototype that incorporates myriad of digital services. The various component of the smart cities shall be implemented using microprocessor, microcontroller, sensors for network communication and protocols. IoT enabled systems have been devised to support the smart city vision, of which aim is to exploit the currently available precocious communication technologies to support the value-added services for function of the city. Due to volume, variety, and velocity of data, it requires analysis using Big Data concept. This paper presented the various techniques used to analyze big data using machine learning.

Keywords: IoT, smart city, embedded systems, sustainable environment

Procedia PDF Downloads 535
9307 3D Classification Optimization of Low-Density Airborne Light Detection and Ranging Point Cloud by Parameters Selection

Authors: Baha Eddine Aissou, Aichouche Belhadj Aissa

Abstract:

Light detection and ranging (LiDAR) is an active remote sensing technology used for several applications. Airborne LiDAR is becoming an important technology for the acquisition of a highly accurate dense point cloud. A classification of airborne laser scanning (ALS) point cloud is a very important task that still remains a real challenge for many scientists. Support vector machine (SVM) is one of the most used statistical learning algorithms based on kernels. SVM is a non-parametric method, and it is recommended to be used in cases where the data distribution cannot be well modeled by a standard parametric probability density function. Using a kernel, it performs a robust non-linear classification of samples. Often, the data are rarely linearly separable. SVMs are able to map the data into a higher-dimensional space to become linearly separable, which allows performing all the computations in the original space. This is one of the main reasons that SVMs are well suited for high-dimensional classification problems. Only a few training samples, called support vectors, are required. SVM has also shown its potential to cope with uncertainty in data caused by noise and fluctuation, and it is computationally efficient as compared to several other methods. Such properties are particularly suited for remote sensing classification problems and explain their recent adoption. In this poster, the SVM classification of ALS LiDAR data is proposed. Firstly, connected component analysis is applied for clustering the point cloud. Secondly, the resulting clusters are incorporated in the SVM classifier. Radial basic function (RFB) kernel is used due to the few numbers of parameters (C and γ) that needs to be chosen, which decreases the computation time. In order to optimize the classification rates, the parameters selection is explored. It consists to find the parameters (C and γ) leading to the best overall accuracy using grid search and 5-fold cross-validation. The exploited LiDAR point cloud is provided by the German Society for Photogrammetry, Remote Sensing, and Geoinformation. The ALS data used is characterized by a low density (4-6 points/m²) and is covering an urban area located in residential parts of the city Vaihingen in southern Germany. The class ground and three other classes belonging to roof superstructures are considered, i.e., a total of 4 classes. The training and test sets are selected randomly several times. The obtained results demonstrated that a parameters selection can orient the selection in a restricted interval of (C and γ) that can be further explored but does not systematically lead to the optimal rates. The SVM classifier with hyper-parameters is compared with the most used classifiers in literature for LiDAR data, random forest, AdaBoost, and decision tree. The comparison showed the superiority of the SVM classifier using parameters selection for LiDAR data compared to other classifiers.

Keywords: classification, airborne LiDAR, parameters selection, support vector machine

Procedia PDF Downloads 117
9306 Comparison of Different Artificial Intelligence-Based Protein Secondary Structure Prediction Methods

Authors: Jamerson Felipe Pereira Lima, Jeane Cecília Bezerra de Melo

Abstract:

The difficulty and cost related to obtaining of protein tertiary structure information through experimental methods, such as X-ray crystallography or NMR spectroscopy, helped raising the development of computational methods to do so. An approach used in these last is prediction of tridimensional structure based in the residue chain, however, this has been proved an NP-hard problem, due to the complexity of this process, explained by the Levinthal paradox. An alternative solution is the prediction of intermediary structures, such as the secondary structure of the protein. Artificial Intelligence methods, such as Bayesian statistics, artificial neural networks (ANN), support vector machines (SVM), among others, were used to predict protein secondary structure. Due to its good results, artificial neural networks have been used as a standard method to predict protein secondary structure. Recent published methods that use this technique, in general, achieved a Q3 accuracy between 75% and 83%, whereas the theoretical accuracy limit for protein prediction is 88%. Alternatively, to achieve better results, support vector machines prediction methods have been developed. The statistical evaluation of methods that use different AI techniques, such as ANNs and SVMs, for example, is not a trivial problem, since different training sets, validation techniques, as well as other variables can influence the behavior of a prediction method. In this study, we propose a prediction method based on artificial neural networks, which is then compared with a selected SVM method. The chosen SVM protein secondary structure prediction method is the one proposed by Huang in his work Extracting Physico chemical Features to Predict Protein Secondary Structure (2013). The developed ANN method has the same training and testing process that was used by Huang to validate his method, which comprises the use of the CB513 protein data set and three-fold cross-validation, so that the comparative analysis of the results can be made comparing directly the statistical results of each method.

Keywords: artificial neural networks, protein secondary structure, protein structure prediction, support vector machines

Procedia PDF Downloads 575
9305 A Medical Resource Forecasting Model for Emergency Room Patients with Acute Hepatitis

Authors: R. J. Kuo, W. C. Cheng, W. C. Lien, T. J. Yang

Abstract:

Taiwan is a hyper endemic area for the Hepatitis B virus (HBV). The estimated total number of HBsAg carriers in the general population who are more than 20 years old is more than 3 million. Therefore, a case record review is conducted from January 2003 to June 2007 for all patients with a diagnosis of acute hepatitis who were admitted to the Emergency Department (ED) of a well-known teaching hospital. The cost for the use of medical resources is defined as the total medical fee. In this study, principal component analysis (PCA) is firstly employed to reduce the number of dimensions. Support vector regression (SVR) and artificial neural network (ANN) are then used to develop the forecasting model. A total of 117 patients meet the inclusion criteria. 61% patients involved in this study are hepatitis B related. The computational result shows that the proposed PCA-SVR model has superior performance than other compared algorithms. In conclusion, the Child-Pugh score and echogram can both be used to predict the cost of medical resources for patients with acute hepatitis in the ED.

Keywords: acute hepatitis, medical resource cost, artificial neural network, support vector regression

Procedia PDF Downloads 390
9304 Development of Prediction Models of Day-Ahead Hourly Building Electricity Consumption and Peak Power Demand Using the Machine Learning Method

Authors: Dalin Si, Azizan Aziz, Bertrand Lasternas

Abstract:

To encourage building owners to purchase electricity at the wholesale market and reduce building peak demand, this study aims to develop models that predict day-ahead hourly electricity consumption and demand using artificial neural network (ANN) and support vector machine (SVM). All prediction models are built in Python, with tool Scikit-learn and Pybrain. The input data for both consumption and demand prediction are time stamp, outdoor dry bulb temperature, relative humidity, air handling unit (AHU), supply air temperature and solar radiation. Solar radiation, which is unavailable a day-ahead, is predicted at first, and then this estimation is used as an input to predict consumption and demand. Models to predict consumption and demand are trained in both SVM and ANN, and depend on cooling or heating, weekdays or weekends. The results show that ANN is the better option for both consumption and demand prediction. It can achieve 15.50% to 20.03% coefficient of variance of root mean square error (CVRMSE) for consumption prediction and 22.89% to 32.42% CVRMSE for demand prediction, respectively. To conclude, the presented models have potential to help building owners to purchase electricity at the wholesale market, but they are not robust when used in demand response control.

Keywords: building energy prediction, data mining, demand response, electricity market

Procedia PDF Downloads 279
9303 Ontology-Driven Knowledge Discovery and Validation from Admission Databases: A Structural Causal Model Approach for Polytechnic Education in Nigeria

Authors: Bernard Igoche Igoche, Olumuyiwa Matthew, Peter Bednar, Alexander Gegov

Abstract:

This study presents an ontology-driven approach for knowledge discovery and validation from admission databases in Nigerian polytechnic institutions. The research aims to address the challenges of extracting meaningful insights from vast amounts of admission data and utilizing them for decision-making and process improvement. The proposed methodology combines the knowledge discovery in databases (KDD) process with a structural causal model (SCM) ontological framework. The admission database of Benue State Polytechnic Ugbokolo (Benpoly) is used as a case study. The KDD process is employed to mine and distill knowledge from the database, while the SCM ontology is designed to identify and validate the important features of the admission process. The SCM validation is performed using the conditional independence test (CIT) criteria, and an algorithm is developed to implement the validation process. The identified features are then used for machine learning (ML) modeling and prediction of admission status. The results demonstrate the adequacy of the SCM ontological framework in representing the admission process and the high predictive accuracies achieved by the ML models, with k-nearest neighbors (KNN) and support vector machine (SVM) achieving 92% accuracy. The study concludes that the proposed ontology-driven approach contributes to the advancement of educational data mining and provides a foundation for future research in this domain.

Keywords: admission databases, educational data mining, machine learning, ontology-driven knowledge discovery, polytechnic education, structural causal model

Procedia PDF Downloads 12
9302 A Survey on Intelligent Techniques Based Modelling of Size Enlargement Process for Fine Materials

Authors: Mohammad Nadeem, Haider Banka, R. Venugopal

Abstract:

Granulation or agglomeration is a size enlargement process to transform the fine particulates into larger aggregates since the fine size of available materials and minerals poses difficulty in their utilization. Though a long list of methods is available in the literature for the modeling of granulation process to facilitate the in-depth understanding and interpretation of the system, there is still scope of improvements using novel tools and techniques. Intelligent techniques, such as artificial neural network, fuzzy logic, self-organizing map, support vector machine and others, have emerged as compelling alternatives for dealing with imprecision and complex non-linearity of the systems. The present study tries to review the applications of intelligent techniques in the modeling of size enlargement process for fine materials.

Keywords: fine material, granulation, intelligent technique, modelling

Procedia PDF Downloads 333
9301 Predicting Daily Patient Hospital Visits Using Machine Learning

Authors: Shreya Goyal

Abstract:

The study aims to build user-friendly software to understand patient arrival patterns and compute the number of potential patients who will visit a particular health facility for a given period by using a machine learning algorithm. The underlying machine learning algorithm used in this study is the Support Vector Machine (SVM). Accurate prediction of patient arrival allows hospitals to operate more effectively, providing timely and efficient care while optimizing resources and improving patient experience. It allows for better allocation of staff, equipment, and other resources. If there's a projected surge in patients, additional staff or resources can be allocated to handle the influx, preventing bottlenecks or delays in care. Understanding patient arrival patterns can also help streamline processes to minimize waiting times for patients and ensure timely access to care for patients in need. Another big advantage of using this software is adhering to strict data protection regulations such as the Health Insurance Portability and Accountability Act (HIPAA) in the United States as the hospital will not have to share the data with any third party or upload it to the cloud because the software can read data locally from the machine. The data needs to be arranged in. a particular format and the software will be able to read the data and provide meaningful output. Using software that operates locally can facilitate compliance with these regulations by minimizing data exposure. Keeping patient data within the hospital's local systems reduces the risk of unauthorized access or breaches associated with transmitting data over networks or storing it in external servers. This can help maintain the confidentiality and integrity of sensitive patient information. Historical patient data is used in this study. The input variables used to train the model include patient age, time of day, day of the week, seasonal variations, and local events. The algorithm uses a Supervised learning method to optimize the objective function and find the global minima. The algorithm stores the values of the local minima after each iteration and at the end compares all the local minima to find the global minima. The strength of this study is the transfer function used to calculate the number of patients. The model has an output accuracy of >95%. The method proposed in this study could be used for better management planning of personnel and medical resources.

Keywords: machine learning, SVM, HIPAA, data

Procedia PDF Downloads 35
9300 Perception and Implementation of Machine Translation Applications by the Iranian English Translators

Authors: Abdul Amir Hazbavi

Abstract:

The present study is an attempt to provide a relatively comprehensive preview of the Iranian English translators’ perception on Machine Translation. Furthermore, the study tries to shed light on the status of implementation of Machine Translation among the Iranian English Translators. To reach the aforementioned objectives, the Localization Industry Standards Association’s questioner for measuring perceptions with regard to the adoption of a technology innovation was adapted and used to investigate three parameter among the participants of the study, namely familiarity with Machine Translation, general perception on Machine Translation and implementation of Machine Translation systems in translation tasks. The participants of the study were 224 last-year undergraduate Iranian students of English translation at 10 universities across the country. The study revealed a very low level of adoption and a very high level of willingness to get familiar with and learn about Machine Translation, as well as a positive perception of and attitude toward Machine Translation by the Iranian English translators.

Keywords: translation technology, machine translation, perception, implementation

Procedia PDF Downloads 486
9299 A Review of Machine Learning for Big Data

Authors: Devatha Kalyan Kumar, Aravindraj D., Sadathulla A.

Abstract:

Big data are now rapidly expanding in all engineering and science and many other domains. The potential of large or massive data is undoubtedly significant, make sense to require new ways of thinking and learning techniques to address the various big data challenges. Machine learning is continuously unleashing its power in a wide range of applications. In this paper, the latest advances and advancements in the researches on machine learning for big data processing. First, the machine learning techniques methods in recent studies, such as deep learning, representation learning, transfer learning, active learning and distributed and parallel learning. Then focus on the challenges and possible solutions of machine learning for big data.

Keywords: active learning, big data, deep learning, machine learning

Procedia PDF Downloads 395
9298 Myanmar Consonants Recognition System Based on Lip Movements Using Active Contour Model

Authors: T. Thein, S. Kalyar Myo

Abstract:

Human uses visual information for understanding the speech contents in noisy conditions or in situations where the audio signal is not available. The primary advantage of visual information is that it is not affected by the acoustic noise and cross talk among speakers. Using visual information from the lip movements can improve the accuracy and robustness of automatic speech recognition. However, a major challenge with most automatic lip reading system is to find a robust and efficient method for extracting the linguistically relevant speech information from a lip image sequence. This is a difficult task due to variation caused by different speakers, illumination, camera setting and the inherent low luminance and chrominance contrast between lip and non-lip region. Several researchers have been developing methods to overcome these problems; the one is lip reading. Moreover, it is well known that visual information about speech through lip reading is very useful for human speech recognition system. Lip reading is the technique of a comprehensive understanding of underlying speech by processing on the movement of lips. Therefore, lip reading system is one of the different supportive technologies for hearing impaired or elderly people, and it is an active research area. The need for lip reading system is ever increasing for every language. This research aims to develop a visual teaching method system for the hearing impaired persons in Myanmar, how to pronounce words precisely by identifying the features of lip movement. The proposed research will work a lip reading system for Myanmar Consonants, one syllable consonants (င (Nga)၊ ည (Nya)၊ မ (Ma)၊ လ (La)၊ ၀ (Wa)၊ သ (Tha)၊ ဟ (Ha)၊ အ (Ah) ) and two syllable consonants ( က(Ka Gyi)၊ ခ (Kha Gway)၊ ဂ (Ga Nge)၊ ဃ (Ga Gyi)၊ စ (Sa Lone)၊ ဆ (Sa Lain)၊ ဇ (Za Gwe) ၊ ဒ (Da Dway)၊ ဏ (Na Gyi)၊ န (Na Nge)၊ ပ (Pa Saug)၊ ဘ (Ba Gone)၊ ရ (Ya Gaug)၊ ဠ (La Gyi) ). In the proposed system, there are three subsystems, the first one is the lip localization system, which localizes the lips in the digital inputs. The next one is the feature extraction system, which extracts features of lip movement suitable for visual speech recognition. And the final one is the classification system. In the proposed research, Two Dimensional Discrete Cosine Transform (2D-DCT) and Linear Discriminant Analysis (LDA) with Active Contour Model (ACM) will be used for lip movement features extraction. Support Vector Machine (SVM) classifier is used for finding class parameter and class number in training set and testing set. Then, experiments will be carried out for the recognition accuracy of Myanmar consonants using the only visual information on lip movements which are useful for visual speech of Myanmar languages. The result will show the effectiveness of the lip movement recognition for Myanmar Consonants. This system will help the hearing impaired persons to use as the language learning application. This system can also be useful for normal hearing persons in noisy environments or conditions where they can find out what was said by other people without hearing voice.

Keywords: feature extraction, lip reading, lip localization, Active Contour Model (ACM), Linear Discriminant Analysis (LDA), Support Vector Machine (SVM), Two Dimensional Discrete Cosine Transform (2D-DCT)

Procedia PDF Downloads 252
9297 A Machine Learning Framework Based on Biometric Measurements for Automatic Fetal Head Anomalies Diagnosis in Ultrasound Images

Authors: Hanene Sahli, Aymen Mouelhi, Marwa Hajji, Amine Ben Slama, Mounir Sayadi, Farhat Fnaiech, Radhwane Rachdi

Abstract:

Fetal abnormality is still a public health problem of interest to both mother and baby. Head defect is one of the most high-risk fetal deformities. Fetal head categorization is a sensitive task that needs a massive attention from neurological experts. In this sense, biometrical measurements can be extracted by gynecologist doctors and compared with ground truth charts to identify normal or abnormal growth. The fetal head biometric measurements such as Biparietal Diameter (BPD), Occipito-Frontal Diameter (OFD) and Head Circumference (HC) needs to be monitored, and expert should carry out its manual delineations. This work proposes a new approach to automatically compute BPD, OFD and HC based on morphological characteristics extracted from head shape. Hence, the studied data selected at the same Gestational Age (GA) from the fetal Ultrasound images (US) are classified into two categories: Normal and abnormal. The abnormal subjects include hydrocephalus, microcephaly and dolichocephaly anomalies. By the use of a support vector machines (SVM) method, this study achieved high classification for automated detection of anomalies. The proposed method is promising although it doesn't need expert interventions.

Keywords: biometric measurements, fetal head malformations, machine learning methods, US images

Procedia PDF Downloads 246
9296 Application of Supervised Deep Learning-based Machine Learning to Manage Smart Homes

Authors: Ahmed Al-Adaileh

Abstract:

Renewable energy sources, domestic storage systems, controllable loads and machine learning technologies will be key components of future smart homes management systems. An energy management scheme that uses a Deep Learning (DL) approach to support the smart home management systems, which consist of a standalone photovoltaic system, storage unit, heating ventilation air-conditioning system and a set of conventional and smart appliances, is presented. The objective of the proposed scheme is to apply DL-based machine learning to predict various running parameters within a smart home's environment to achieve maximum comfort levels for occupants, reduced electricity bills, and less dependency on the public grid. The problem is using Reinforcement learning, where decisions are taken based on applying the Continuous-time Markov Decision Process. The main contribution of this research is the proposed framework that applies DL to enhance the system's supervised dataset to offer unlimited chances to effectively support smart home systems. A case study involving a set of conventional and smart appliances with dedicated processing units in an inhabited building can demonstrate the validity of the proposed framework. A visualization graph can show "before" and "after" results.

Keywords: smart homes systems, machine learning, deep learning, Markov Decision Process

Procedia PDF Downloads 149
9295 Reducing the Imbalance Penalty Through Artificial Intelligence Methods Geothermal Production Forecasting: A Case Study for Turkey

Authors: Hayriye Anıl, Görkem Kar

Abstract:

In addition to being rich in renewable energy resources, Turkey is one of the countries that promise potential in geothermal energy production with its high installed power, cheapness, and sustainability. Increasing imbalance penalties become an economic burden for organizations since geothermal generation plants cannot maintain the balance of supply and demand due to the inadequacy of the production forecasts given in the day-ahead market. A better production forecast reduces the imbalance penalties of market participants and provides a better imbalance in the day ahead market. In this study, using machine learning, deep learning, and, time series methods, the total generation of the power plants belonging to Zorlu Natural Electricity Generation, which has a high installed capacity in terms of geothermal, was estimated for the first one and two weeks of March, then the imbalance penalties were calculated with these estimates and compared with the real values. These modeling operations were carried out on two datasets, the basic dataset and the dataset created by extracting new features from this dataset with the feature engineering method. According to the results, Support Vector Regression from traditional machine learning models outperformed other models and exhibited the best performance. In addition, the estimation results in the feature engineering dataset showed lower error rates than the basic dataset. It has been concluded that the estimated imbalance penalty calculated for the selected organization is lower than the actual imbalance penalty, optimum and profitable accounts.

Keywords: machine learning, deep learning, time series models, feature engineering, geothermal energy production forecasting

Procedia PDF Downloads 66
9294 A Computationally Intelligent Framework to Support Youth Mental Health in Australia

Authors: Nathaniel Carpenter

Abstract:

Web-enabled systems for supporting youth mental health management in Australia are pioneering in their field; however, with their success, these systems are experiencing exponential growth in demand which is straining an already stretched service. Supporting youth mental is critical as the lack of support is associated with significant and lasting negative consequences. To meet this growing demand, and provide critical support, investigations are needed on evaluating and improving existing online support services. Improvements should focus on developing frameworks capable of augmenting and scaling service provisions. There are few investigations informing best-practice frameworks when implementing e-mental health support systems for youth mental health; there are fewer which implement machine learning or artificially intelligent systems to facilitate the delivering of services. This investigation will use a case study methodology to highlight the design features which are important for systems to enable young people to self-manage their mental health. The investigation will also highlight the current information system challenges, to include challenges associated with service quality, provisioning, and scaling. This work will propose methods of meeting these challenges through improved design, service augmentation and automation, service quality, and through artificially intelligent inspired solutions. The results of this study will inform a framework for supporting youth mental health with intelligent and scalable web-enabled technologies to support an ever-growing user base.

Keywords: artificial intelligence, information systems, machine learning, youth mental health

Procedia PDF Downloads 80
9293 Texture-Based Image Forensics from Video Frame

Authors: Li Zhou, Yanmei Fang

Abstract:

With current technology, images and videos can be obtained more easily than ever. It is so easy to manipulate these digital multimedia information when obtained, and that the content or source of the image and video could be easily tampered. In this paper, we propose to identify the image and video frame by the texture-based approach, e.g. Markov Transition Probability (MTP), which is in space domain, DCT domain and DWT domain, respectively. In the experiment, image and video frame database is constructed, and is used to train and test the classifier Support Vector Machine (SVM). Experiment results show that the texture-based approach has good performance. In order to verify the experiment result, and testify the universality and robustness of algorithm, we build a random testing dataset, the random testing result is in keeping with above experiment.

Keywords: multimedia forensics, video frame, LBP, MTP, SVM

Procedia PDF Downloads 388