Search results for: supports vectors machine
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3871

Search results for: supports vectors machine

3601 Identification of Biological Pathways Causative for Breast Cancer Using Unsupervised Machine Learning

Authors: Karthik Mittal

Abstract:

This study performs an unsupervised machine learning analysis to find clusters of related SNPs which highlight biological pathways that are important for the biological mechanisms of breast cancer. Studying genetic variations in isolation is illogical because these genetic variations are known to modulate protein production and function; the downstream effects of these modifications on biological outcomes are highly interconnected. After extracting the SNPs and their effect on different types of breast cancer using the MRBase library, two unsupervised machine learning clustering algorithms were implemented on the genetic variants: a k-means clustering algorithm and a hierarchical clustering algorithm; furthermore, principal component analysis was executed to visually represent the data. These algorithms specifically used the SNP’s beta value on the three different types of breast cancer tested in this project (estrogen-receptor positive breast cancer, estrogen-receptor negative breast cancer, and breast cancer in general) to perform this clustering. Two significant genetic pathways validated the clustering produced by this project: the MAPK signaling pathway and the connection between the BRCA2 gene and the ESR1 gene. This study provides the first proof of concept showing the importance of unsupervised machine learning in interpreting GWAS summary statistics.

Keywords: breast cancer, computational biology, unsupervised machine learning, k-means, PCA

Procedia PDF Downloads 117
3600 A Method to Predict the Thermo-Elastic Behavior of Laser-Integrated Machine Tools

Authors: C. Brecher, M. Fey, F. Du Bois-Reymond, S. Neus

Abstract:

Additive manufacturing has emerged into a fast-growing section within the manufacturing technologies. Established machine tool manufacturers, such as DMG MORI, recently presented machine tools combining milling and laser welding. By this, machine tools can realize a higher degree of flexibility and a shorter production time. Still there are challenges that have to be accounted for in terms of maintaining the necessary machining accuracy - especially due to thermal effects arising through the use of high power laser processing units. To study the thermal behavior of laser-integrated machine tools, it is essential to analyze and simulate the thermal behavior of machine components, individual and assembled. This information will help to design a geometrically stable machine tool under the influence of high power laser processes. This paper presents an approach to decrease the loss of machining precision due to thermal impacts. Real effects of laser machining processes are considered and thus enable an optimized design of the machine tool, respective its components, in the early design phase. Core element of this approach is a matched FEM model considering all relevant variables arising, e.g. laser power, angle of laser beam, reflective coefficients and heat transfer coefficient. Hence, a systematic approach to obtain this matched FEM model is essential. Indicating the thermal behavior of structural components as well as predicting the laser beam path, to determine the relevant beam intensity on the structural components, there are the two constituent aspects of the method. To match the model both aspects of the method have to be combined and verified empirically. In this context, an essential machine component of a five axis machine tool, the turn-swivel table, serves as the demonstration object for the verification process. Therefore, a turn-swivel table test bench as well as an experimental set-up to measure the beam propagation were developed and are described in the paper. In addition to the empirical investigation, a simulative approach of the described types of experimental examination is presented. Concluding, it is shown that the method and a good understanding of the two core aspects, the thermo-elastic machine behavior and the laser beam path, as well as their combination helps designers to minimize the loss of precision in the early stages of the design phase.

Keywords: additive manufacturing, laser beam machining, machine tool, thermal effects

Procedia PDF Downloads 241
3599 Intrusion Detection Based on Graph Oriented Big Data Analytics

Authors: Ahlem Abid, Farah Jemili

Abstract:

Intrusion detection has been the subject of numerous studies in industry and academia, but cyber security analysts always want greater precision and global threat analysis to secure their systems in cyberspace. To improve intrusion detection system, the visualisation of the security events in form of graphs and diagrams is important to improve the accuracy of alerts. In this paper, we propose an approach of an IDS based on cloud computing, big data technique and using a machine learning graph algorithm which can detect in real time different attacks as early as possible. We use the MAWILab intrusion detection dataset . We choose Microsoft Azure as a unified cloud environment to load our dataset on. We implement the k2 algorithm which is a graphical machine learning algorithm to classify attacks. Our system showed a good performance due to the graphical machine learning algorithm and spark structured streaming engine.

Keywords: Apache Spark Streaming, Graph, Intrusion detection, k2 algorithm, Machine Learning, MAWILab, Microsoft Azure Cloud

Procedia PDF Downloads 113
3598 Heart Attack Prediction Using Several Machine Learning Methods

Authors: Suzan Anwar, Utkarsh Goyal

Abstract:

Heart rate (HR) is a predictor of cardiovascular, cerebrovascular, and all-cause mortality in the general population, as well as in patients with cardio and cerebrovascular diseases. Machine learning (ML) significantly improves the accuracy of cardiovascular risk prediction, increasing the number of patients identified who could benefit from preventive treatment while avoiding unnecessary treatment of others. This research examines relationship between the individual's various heart health inputs like age, sex, cp, trestbps, thalach, oldpeaketc, and the likelihood of developing heart disease. Machine learning techniques like logistic regression and decision tree, and Python are used. The results of testing and evaluating the model using the Heart Failure Prediction Dataset show the chance of a person having a heart disease with variable accuracy. Logistic regression has yielded an accuracy of 80.48% without data handling. With data handling (normalization, standardscaler), the logistic regression resulted in improved accuracy of 87.80%, decision tree 100%, random forest 100%, and SVM 100%.

Keywords: heart rate, machine learning, SVM, decision tree, logistic regression, random forest

Procedia PDF Downloads 113
3597 Effect of the Tooling Conditions on the Machining Stability of a Milling Machine

Authors: Jui-Pui Hung, Yong-Run Chen, Wei-Cheng Shih, Shen-He Tsui, Kung-Da Wu

Abstract:

This paper presents the effect on the tooling conditions on the machining stabilities of a milling machine tool. The machining stability was evaluated in different feeding direction in the X-Y plane, which was referred as the orientation-dependent machining stability. According to the machining mechanics, the machining stability was determined by the frequency response function of the cutter. Thus, we first conducted the vibration tests on the spindle tool of the milling machine to assess the tool tip frequency response functions along the principal direction of the machine tool. Then, basing on the orientation dependent stability analysis model proposed in this study, we evaluated the variation of the dynamic characteristics of the spindle tool and the corresponding machining stabilities at a specific feeding direction. Current results demonstrate that the stability boundaries and limited axial cutting depth of a specific cutter were affected to vary when it was fixed in the tool holder with different overhang length. The flute of the cutter also affects the stability boundary. When a two flute cutter was used, the critical cutting depth can be increased by 47 % as compared with the four flute cutter. The results presented in study provide valuable references for the selection of the tooling conditions for achieving high milling performance.

Keywords: tooling condition, machining stability, milling machine, chatter

Procedia PDF Downloads 400
3596 A Study on Big Data Analytics, Applications and Challenges

Authors: Chhavi Rana

Abstract:

The aim of the paper is to highlight the existing development in the field of big data analytics. Applications like bioinformatics, smart infrastructure projects, Healthcare, and business intelligence contain voluminous and incremental data, which is hard to organise and analyse and can be dealt with using the framework and model in this field of study. An organization's decision-making strategy can be enhanced using big data analytics and applying different machine learning techniques and statistical tools on such complex data sets that will consequently make better things for society. This paper reviews the current state of the art in this field of study as well as different application domains of big data analytics. It also elaborates on various frameworks in the process of Analysis using different machine-learning techniques. Finally, the paper concludes by stating different challenges and issues raised in existing research.

Keywords: big data, big data analytics, machine learning, review

Procedia PDF Downloads 50
3595 A Study on Big Data Analytics, Applications, and Challenges

Authors: Chhavi Rana

Abstract:

The aim of the paper is to highlight the existing development in the field of big data analytics. Applications like bioinformatics, smart infrastructure projects, healthcare, and business intelligence contain voluminous and incremental data which is hard to organise and analyse and can be dealt with using the framework and model in this field of study. An organisation decision-making strategy can be enhanced by using big data analytics and applying different machine learning techniques and statistical tools to such complex data sets that will consequently make better things for society. This paper reviews the current state of the art in this field of study as well as different application domains of big data analytics. It also elaborates various frameworks in the process of analysis using different machine learning techniques. Finally, the paper concludes by stating different challenges and issues raised in existing research.

Keywords: big data, big data analytics, machine learning, review

Procedia PDF Downloads 65
3594 Insider Theft Detection in Organizations Using Keylogger and Machine Learning

Authors: Shamatha Shetty, Sakshi Dhabadi, Prerana M., Indushree B.

Abstract:

About 66% of firms claim that insider attacks are more likely to happen. The frequency of insider incidents has increased by 47% in the last two years. The goal of this work is to prevent dangerous employee behavior by using keyloggers and the Machine Learning (ML) model. Every keystroke that the user enters is recorded by the keylogging program, also known as keystroke logging. Keyloggers are used to stop improper use of the system. This enables us to collect all textual data, save it in a CSV file, and analyze it using an ML algorithm and the VirusTotal API. Many large companies use it to methodically monitor how their employees use computers, the internet, and email. We are utilizing the SVM algorithm and the VirusTotal API to improve overall efficiency and accuracy in identifying specific patterns and words to automate and offer the report for improved monitoring.

Keywords: cyber security, machine learning, cyclic process, email notification

Procedia PDF Downloads 25
3593 A Case Study on the Condition Monitoring of a Critical Machine in a Tyre Manufacturing Plant

Authors: Ramachandra C. G., Amarnath. M., Prashanth Pai M., Nagesh S. N.

Abstract:

The machine's performance level drops down over a period of time due to the wear and tear of its components. The early detection of an emergent fault becomes very vital in order to obtain uninterrupted production in a plant. Maintenance is an activity that helps to keep the machine's performance at an anticipated level, thereby ensuring the availability of the machine to perform its intended function. At present, a number of modern maintenance techniques are available, such as preventive maintenance, predictive maintenance, condition-based maintenance, total productive maintenance, etc. Condition-based maintenance or condition monitoring is one such modern maintenance technique in which the machine's condition or health is checked by the measurement of certain parameters such as sound level, temperature, velocity, displacement, vibration, etc. It can recognize most of the factors restraining the usefulness and efficacy of the total manufacturing unit. This research work is conducted on a Batch Mill in a tire production unit located in the Southern Karnataka region. The health of the mill is assessed using amplitude of vibration as a parameter of measurement. Most commonly, the vibration level is assessed using various points on the machine bearing. The normal or standard level is fixed using reference materials such as manuals or catalogs supplied by the manufacturers and also by referring vibration standards. The Rio-Vibro meter is placed in different locations on the batch-off mill to record the vibration data. The data collected are analyzed to identify the malfunctioning components in the batch off the mill, and corrective measures are suggested.

Keywords: availability, displacement, vibration, rio-vibro, condition monitoring

Procedia PDF Downloads 37
3592 Design Modification in CNC Milling Machine to Reduce the Weight of Structure

Authors: Harshkumar K. Desai, Anuj K. Desai, Jay P. Patel, Snehal V. Trivedi, Yogendrasinh Parmar

Abstract:

The need of continuous improvement in a product or process in this era of global competition leads to apply value engineering for functional and aesthetic improvement in consideration with economic aspect too. Solar industries located at G.I.D.C., Makarpura, Vadodara, Gujarat, India; a manufacturer of variety of CNC Machines had a challenge to analyze the structural design of column, base, carriage and table of CNC Milling Machine in the account of reduction of overall weight of a machine without affecting the rigidity and accuracy at the time of operation. The identified task is the first attempt to validate and optimize the proposed design of ribbed structure statically using advanced modeling and analysis tools in a systematic way. Results of stress and deformation obtained using analysis software are validated with theoretical analysis and found quite satisfactory. Such optimized results offer a weight reduction of the final assembly which is desired by manufacturers in favor of reduction of material cost, processing cost and handling cost finally.

Keywords: CNC milling machine, optimization, finite element analysis (FEA), weight reduction

Procedia PDF Downloads 244
3591 Machine Learning Approach for Yield Prediction in Semiconductor Production

Authors: Heramb Somthankar, Anujoy Chakraborty

Abstract:

This paper presents a classification study on yield prediction in semiconductor production using machine learning approaches. A complicated semiconductor production process is generally monitored continuously by signals acquired from sensors and measurement sites. A monitoring system contains a variety of signals, all of which contain useful information, irrelevant information, and noise. In the case of each signal being considered a feature, "Feature Selection" is used to find the most relevant signals. The open-source UCI SECOM Dataset provides 1567 such samples, out of which 104 fail in quality assurance. Feature extraction and selection are performed on the dataset, and useful signals were considered for further study. Afterward, common machine learning algorithms were employed to predict whether the signal yields pass or fail. The most relevant algorithm is selected for prediction based on the accuracy and loss of the ML model.

Keywords: deep learning, feature extraction, feature selection, machine learning classification algorithms, semiconductor production monitoring, signal processing, time-series analysis

Procedia PDF Downloads 76
3590 A Unique Multi-Class Support Vector Machine Algorithm Using MapReduce

Authors: Aditi Viswanathan, Shree Ranjani, Aruna Govada

Abstract:

With data sizes constantly expanding, and with classical machine learning algorithms that analyze such data requiring larger and larger amounts of computation time and storage space, the need to distribute computation and memory requirements among several computers has become apparent. Although substantial work has been done in developing distributed binary SVM algorithms and multi-class SVM algorithms individually, the field of multi-class distributed SVMs remains largely unexplored. This research seeks to develop an algorithm that implements the Support Vector Machine over a multi-class data set and is efficient in a distributed environment. For this, we recursively choose the best binary split of a set of classes using a greedy technique. Much like the divide and conquer approach. Our algorithm has shown better computation time during the testing phase than the traditional sequential SVM methods (One vs. One, One vs. Rest) and out-performs them as the size of the data set grows. This approach also classifies the data with higher accuracy than the traditional multi-class algorithms.

Keywords: distributed algorithm, MapReduce, multi-class, support vector machine

Procedia PDF Downloads 367
3589 Combined Automatic Speech Recognition and Machine Translation in Business Correspondence Domain for English-Croatian

Authors: Sanja Seljan, Ivan Dunđer

Abstract:

The paper presents combined automatic speech recognition (ASR) for English and machine translation (MT) for English and Croatian in the domain of business correspondence. The first part presents results of training the ASR commercial system on two English data sets, enriched by error analysis. The second part presents results of machine translation performed by online tool Google Translate for English and Croatian and Croatian-English language pairs. Human evaluation in terms of usability is conducted and internal consistency calculated by Cronbach's alpha coefficient, enriched by error analysis. Automatic evaluation is performed by WER (Word Error Rate) and PER (Position-independent word Error Rate) metrics, followed by investigation of Pearson’s correlation with human evaluation.

Keywords: automatic machine translation, integrated language technologies, quality evaluation, speech recognition

Procedia PDF Downloads 452
3588 Friction and Wear, Including Mechanisms, Modeling,Characterization, Measurement and Testing (Bangladesh Case)

Authors: Gor Muradyan

Abstract:

The paper is about friction and wear, including mechanisms, modeling, characterization, measurement and testing case in Bangladesh. Bangladesh is a country under development, A lot of people live here, approximately 145 million. The territory of this country is very small. Therefore buildings are very close to each other. As the pipe lines are very old, and people get almost dirty water, there are a lot of ongoing projects under ADB. In those projects the contractors using HDD machines (Horizontal Directional Drilling ) and grundoburst. These machines are working underground. As ground in Bangladesh is very sludge, machine can't work relevant because of big friction in the soil. When drilling works are finished machine is pulling the pipe underground. Very often the pulling of the pipes becomes very complicated because of the friction. Therefore long section of the pipe laying can’t be done because of a big friction. In that case, additional problems rise, as well as additional work must be done. As we mentioned above it is not possible to do big section of the pipe laying because of big friction in the soil, Because of this it is coming out that contractors must do more joints, more pressure test. It is always connected with additional expenditure and losing time. This machine can pull in 75 mm to 500 mm pipes connected with the soil condition. Length is possible till 500m related how much friction it will had on the puller. As less as much it can pull. Another machine grundoburst is not working at this soil condition at all. The machine is working with air compressor. This machine are using for the smaller diameter pipes, 20 mm to 63 mm. Most of the cases these machines are being used for the installing of the house connection pipes, for making service connection. To make a friction less contractors using bigger pulling had then the pipe. It is taking down the friction, But the problem of this machine is that it can't work at sludge. Because of mentioned reasons the friction has a big mining during this kind of works. There are a lot of ways to reduce the friction. In this paper we'll introduce the ways that we have researched during our practice in Bangladesh.

Keywords: Bangladesh, friction and wear, HDD machines, reducing friction

Procedia PDF Downloads 277
3587 Calcined Tertiaries Hydrotalcites as Supports of Cobalt-Molybdenum Based Catalysts for the Hydrodesulfurization Reaction of Dibenzothiophene

Authors: Edwin Oviedo, Carlos Linares, Philippe Ayrault, Sylvette Brunet

Abstract:

Nowadays, light conventional crude oils are going down. Therefore, the exploitation of heavy crude oils has been increasing. Hence, a major quantity of refractory sulfur compounds such as dibenzothiophene (DBT) should be removed. Many efforts have been carried out to modify hydrotreatment typical supports in order to increase hydrodesulfurization (HDS) reactions. The present work shows the synthesis of tertiaries MgFeAl(0.16), MgFeAl(0.32), CoFeAl, ZnFeAl hydrotalcites, as supports of CoMo based catalysts, where 0.16 and 0.32 are the Fe3+/Al3+ molar ratio. Solids were characterized by different techniques (XRD, CO2-TPD, H2-TPR, FT-IR, BET, Chemical Analysis and HRTEM) and tested in the DBT HDS reaction. The reactions conditions were: Temp=325°C, P=40 Bar, H2/feed=475. Results show that the catalysts CoMo/MgFeAl(0.16) and CoMo/MgFeAl(0.32), which were the most basics, reduced the sulfur content from 500ppm to less than 1 ppm, increasing the cyclohexylbenzene content, i.e. presented a higher selective toward the HYD pathway than reference catalyst CoMo/γ- Al2O3. This is suitable for improving the fuel quality due to the increase of the cetane number. These catalysts were also more active to the HDS reaction increasing the direct desulfurization (DDS) way and presented a good stability. It is advantageous when the gas oil centane number should be improved. Cobalt, iron or zinc species inside support could avoid the Co and Mo dispersion or form spinel species which could be less active to hydrodesulfuration reactions, while hydrotalcites containing Mg increases the HDS activity probably due to improved Co/Mo ratio.

Keywords: catalyst, cetane number, dibenzothiophene, diesel, hydrodesulfurization, hydrotreatment, MoS2

Procedia PDF Downloads 135
3586 Electrical Machine Winding Temperature Estimation Using Stateful Long Short-Term Memory Networks (LSTM) and Truncated Backpropagation Through Time (TBPTT)

Authors: Yujiang Wu

Abstract:

As electrical machine (e-machine) power density re-querulents become more stringent in vehicle electrification, mounting a temperature sensor for e-machine stator windings becomes increasingly difficult. This can lead to higher manufacturing costs, complicated harnesses, and reduced reliability. In this paper, we propose a deep-learning method for predicting electric machine winding temperature, which can either replace the sensor entirely or serve as a backup to the existing sensor. We compare the performance of our method, the stateful long short-term memory networks (LSTM) with truncated backpropagation through time (TBTT), with that of linear regression, as well as stateless LSTM with/without residual connection. Our results demonstrate the strength of combining stateful LSTM and TBTT in tackling nonlinear time series prediction problems with long sequence lengths. Additionally, in industrial applications, high-temperature region prediction accuracy is more important because winding temperature sensing is typically used for derating machine power when the temperature is high. To evaluate the performance of our algorithm, we developed a temperature-stratified MSE. We propose a simple but effective data preprocessing trick to improve the high-temperature region prediction accuracy. Our experimental results demonstrate the effectiveness of our proposed method in accurately predicting winding temperature, particularly in high-temperature regions, while also reducing manufacturing costs and improving reliability.

Keywords: deep learning, electrical machine, functional safety, long short-term memory networks (LSTM), thermal management, time series prediction

Procedia PDF Downloads 59
3585 Experimental Investigation of Stain Removal Performance of Different Types of Top Load Washing Machines with Textile Mechanical Damage Consideration

Authors: Ehsan Tuzcuoğlu, Muhammed Emin Çoban, Songül Byraktar

Abstract:

One of the main targets of the washing machine is to remove any dirt and stains from the clothes. Especially, the stain removal is significantly important in the Far East market, where the high percentage of the consumers use the top load washing machines as washing appliance. They use all pretreatment methods (i.e. soaking, prewash, and heavy functions) to eliminate the stains from their clothes. Therefore, with this study it is aimed to study experimentally the stain removal performance of 3 different Top-Loading washing machines of the Far East market with 24 different types of stains which are mostly related to Far East culture. In the meanwhile, the mechanical damge on laundry is examined for each machine to see the mechanical effect of the related stain programs on the textile load of the machines. The test machines vary according to have a heater, moving part(s)on their impeller, and to be in different height/width ratio of the drum. The results indicate that decreasing the water level inside the washing machine might result in better soil removal as well as less textile damage. Beside this, the experimental results reveal that heating has the main effect on stain removal. Two-step (or delayed) heating and a lower amount of water can also be considered as the further parameters

Keywords: laundry, washing machine, top load washing machine, stain removal, textile damage, mechanical textile damage

Procedia PDF Downloads 100
3584 Smart Production Planning: The Case of Aluminium Foundry

Authors: Samira Alvandi

Abstract:

In the context of the circular economy, production planning aims to eliminate waste and emissions and maximize resource efficiency. Historically production planning is challenged through arrays of uncertainty and complexity arising from the interdependence and variability of products, processes, and systems. Manufacturers worldwide are facing new challenges in tackling various environmental issues such as climate change, resource depletion, and land degradation. In managing the inherited complexity and uncertainty and yet maintaining profitability, the manufacturing sector is in need of a holistic framework that supports energy efficiency and carbon emission reduction schemes. The proposed framework addresses the current challenges and integrates simulation modeling with optimization for finding optimal machine-job allocation to maximize throughput and total energy consumption while minimizing lead time. The aluminium refinery facility in western Sydney, Australia, is used as an exemplar to validate the proposed framework.

Keywords: smart production planning, simulation-optimisation, energy aware capacity planning, energy intensive industries

Procedia PDF Downloads 32
3583 Advanced Machine Learning Algorithm for Credit Card Fraud Detection

Authors: Manpreet Kaur

Abstract:

When legitimate credit card users are mistakenly labelled as fraudulent in numerous financial delated applications, there are numerous ethical problems. The innovative machine learning approach we have suggested in this research outperforms the current models and shows how to model a data set for credit card fraud detection while minimizing false positives. As a result, we advise using random forests as the best machine learning method for predicting and identifying credit card transaction fraud. The majority of victims of these fraudulent transactions were discovered to be credit card users over the age of 60, with a higher percentage of fraudulent transactions taking place between the specific hours.

Keywords: automated fraud detection, isolation forest method, local outlier factor, ML algorithm, credit card

Procedia PDF Downloads 78
3582 Optimization of 3D Printing Parameters Using Machine Learning to Enhance Mechanical Properties in Fused Deposition Modeling (FDM) Technology

Authors: Darwin Junnior Sabino Diego, Brando Burgos Guerrero, Diego Arroyo Villanueva

Abstract:

Additive manufacturing, commonly known as 3D printing, has revolutionized modern manufacturing by enabling the agile creation of complex objects. However, challenges persist in the consistency and quality of printed parts, particularly in their mechanical properties. This study focuses on addressing these challenges through the optimization of printing parameters in FDM technology, using Machine Learning techniques. Our aim is to improve the mechanical properties of printed objects by optimizing parameters such as speed, temperature, and orientation. We implement a methodology that combines experimental data collection with Machine Learning algorithms to identify relationships between printing parameters and mechanical properties. The results demonstrate the potential of this methodology to enhance the quality and consistency of 3D printed products, with significant applications across various industrial fields. This research not only advances understanding of additive manufacturing but also opens new avenues for practical implementation in industrial settings.

Keywords: 3D printing, additive manufacturing, machine learning, mechanical properties

Procedia PDF Downloads 7
3581 Machine Learning Analysis of Student Success in Introductory Calculus Based Physics I Course

Authors: Chandra Prayaga, Aaron Wade, Lakshmi Prayaga, Gopi Shankar Mallu

Abstract:

This paper presents the use of machine learning algorithms to predict the success of students in an introductory physics course. Data having 140 rows pertaining to the performance of two batches of students was used. The lack of sufficient data to train robust machine learning models was compensated for by generating synthetic data similar to the real data. CTGAN and CTGAN with Gaussian Copula (Gaussian) were used to generate synthetic data, with the real data as input. To check the similarity between the real data and each synthetic dataset, pair plots were made. The synthetic data was used to train machine learning models using the PyCaret package. For the CTGAN data, the Ada Boost Classifier (ADA) was found to be the ML model with the best fit, whereas the CTGAN with Gaussian Copula yielded Logistic Regression (LR) as the best model. Both models were then tested for accuracy with the real data. ROC-AUC analysis was performed for all the ten classes of the target variable (Grades A, A-, B+, B, B-, C+, C, C-, D, F). The ADA model with CTGAN data showed a mean AUC score of 0.4377, but the LR model with the Gaussian data showed a mean AUC score of 0.6149. ROC-AUC plots were obtained for each Grade value separately. The LR model with Gaussian data showed consistently better AUC scores compared to the ADA model with CTGAN data, except in two cases of the Grade value, C- and A-.

Keywords: machine learning, student success, physics course, grades, synthetic data, CTGAN, gaussian copula CTGAN

Procedia PDF Downloads 16
3580 A Supervised Approach for Word Sense Disambiguation Based on Arabic Diacritics

Authors: Alaa Alrakaf, Sk. Md. Mizanur Rahman

Abstract:

Since the last two decades’ Arabic natural language processing (ANLP) has become increasingly much more important. One of the key issues related to ANLP is ambiguity. In Arabic language different pronunciation of one word may have a different meaning. Furthermore, ambiguity also has an impact on the effectiveness and efficiency of Machine Translation (MT). The issue of ambiguity has limited the usefulness and accuracy of the translation from Arabic to English. The lack of Arabic resources makes ambiguity problem more complicated. Additionally, the orthographic level of representation cannot specify the exact meaning of the word. This paper looked at the diacritics of Arabic language and used them to disambiguate a word. The proposed approach of word sense disambiguation used Diacritizer application to Diacritize Arabic text then found the most accurate sense of an ambiguous word using Naïve Bayes Classifier. Our Experimental study proves that using Arabic Diacritics with Naïve Bayes Classifier enhances the accuracy of choosing the appropriate sense by 23% and also decreases the ambiguity in machine translation.

Keywords: Arabic natural language processing, machine learning, machine translation, Naive bayes classifier, word sense disambiguation

Procedia PDF Downloads 327
3579 Machine Learning Framework: Competitive Intelligence and Key Drivers Identification of Market Share Trends among Healthcare Facilities

Authors: Anudeep Appe, Bhanu Poluparthi, Lakshmi Kasivajjula, Udai Mv, Sobha Bagadi, Punya Modi, Aditya Singh, Hemanth Gunupudi, Spenser Troiano, Jeff Paul, Justin Stovall, Justin Yamamoto

Abstract:

The necessity of data-driven decisions in healthcare strategy formulation is rapidly increasing. A reliable framework which helps identify factors impacting a healthcare provider facility or a hospital (from here on termed as facility) market share is of key importance. This pilot study aims at developing a data-driven machine learning-regression framework which aids strategists in formulating key decisions to improve the facility’s market share which in turn impacts in improving the quality of healthcare services. The US (United States) healthcare business is chosen for the study, and the data spanning 60 key facilities in Washington State and about 3 years of historical data is considered. In the current analysis, market share is termed as the ratio of the facility’s encounters to the total encounters among the group of potential competitor facilities. The current study proposes a two-pronged approach of competitor identification and regression approach to evaluate and predict market share, respectively. Leveraged model agnostic technique, SHAP, to quantify the relative importance of features impacting the market share. Typical techniques in literature to quantify the degree of competitiveness among facilities use an empirical method to calculate a competitive factor to interpret the severity of competition. The proposed method identifies a pool of competitors, develops Directed Acyclic Graphs (DAGs) and feature level word vectors, and evaluates the key connected components at the facility level. This technique is robust since its data-driven, which minimizes the bias from empirical techniques. The DAGs factor in partial correlations at various segregations and key demographics of facilities along with a placeholder to factor in various business rules (for ex. quantifying the patient exchanges, provider references, and sister facilities). Identified are the multiple groups of competitors among facilities. Leveraging the competitors' identified developed and fine-tuned Random Forest Regression model to predict the market share. To identify key drivers of market share at an overall level, permutation feature importance of the attributes was calculated. For relative quantification of features at a facility level, incorporated SHAP (SHapley Additive exPlanations), a model agnostic explainer. This helped to identify and rank the attributes at each facility which impacts the market share. This approach proposes an amalgamation of the two popular and efficient modeling practices, viz., machine learning with graphs and tree-based regression techniques to reduce the bias. With these, we helped to drive strategic business decisions.

Keywords: competition, DAGs, facility, healthcare, machine learning, market share, random forest, SHAP

Procedia PDF Downloads 55
3578 Transformer Fault Diagnostic Predicting Model Using Support Vector Machine with Gradient Decent Optimization

Authors: R. O. Osaseri, A. R. Usiobaifo

Abstract:

The power transformer which is responsible for the voltage transformation is of great relevance in the power system and oil-immerse transformer is widely used all over the world. A prompt and proper maintenance of the transformer is of utmost importance. The dissolved gasses content in power transformer, oil is of enormous importance in detecting incipient fault of the transformer. There is a need for accurate prediction of the incipient fault in transformer oil in order to facilitate the prompt maintenance and reducing the cost and error minimization. Study on fault prediction and diagnostic has been the center of many researchers and many previous works have been reported on the use of artificial intelligence to predict incipient failure of transformer faults. In this study machine learning technique was employed by using gradient decent algorithms and Support Vector Machine (SVM) in predicting incipient fault diagnosis of transformer. The method focuses on creating a system that improves its performance on previous result and historical data. The system design approach is basically in two phases; training and testing phase. The gradient decent algorithm is trained with a training dataset while the learned algorithm is applied to a set of new data. This two dataset is used to prove the accuracy of the proposed model. In this study a transformer fault diagnostic model based on Support Vector Machine (SVM) and gradient decent algorithms has been presented with a satisfactory diagnostic capability with high percentage in predicting incipient failure of transformer faults than existing diagnostic methods.

Keywords: diagnostic model, gradient decent, machine learning, support vector machine (SVM), transformer fault

Procedia PDF Downloads 287
3577 Advancing the Analysis of Physical Activity Behaviour in Diverse, Rapidly Evolving Populations: Using Unsupervised Machine Learning to Segment and Cluster Accelerometer Data

Authors: Christopher Thornton, Niina Kolehmainen, Kianoush Nazarpour

Abstract:

Background: Accelerometers are widely used to measure physical activity behavior, including in children. The traditional method for processing acceleration data uses cut points, relying on calibration studies that relate the quantity of acceleration to energy expenditure. As these relationships do not generalise across diverse populations, they must be parametrised for each subpopulation, including different age groups, which is costly and makes studies across diverse populations difficult. A data-driven approach that allows physical activity intensity states to emerge from the data under study without relying on parameters derived from external populations offers a new perspective on this problem and potentially improved results. We evaluated the data-driven approach in a diverse population with a range of rapidly evolving physical and mental capabilities, namely very young children (9-38 months old), where this new approach may be particularly appropriate. Methods: We applied an unsupervised machine learning approach (a hidden semi-Markov model - HSMM) to segment and cluster the accelerometer data recorded from 275 children with a diverse range of physical and cognitive abilities. The HSMM was configured to identify a maximum of six physical activity intensity states and the output of the model was the time spent by each child in each of the states. For comparison, we also processed the accelerometer data using published cut points with available thresholds for the population. This provided us with time estimates for each child’s sedentary (SED), light physical activity (LPA), and moderate-to-vigorous physical activity (MVPA). Data on the children’s physical and cognitive abilities were collected using the Paediatric Evaluation of Disability Inventory (PEDI-CAT). Results: The HSMM identified two inactive states (INS, comparable to SED), two lightly active long duration states (LAS, comparable to LPA), and two short-duration high-intensity states (HIS, comparable to MVPA). Overall, the children spent on average 237/392 minutes per day in INS/SED, 211/129 minutes per day in LAS/LPA, and 178/168 minutes in HIS/MVPA. We found that INS overlapped with 53% of SED, LAS overlapped with 37% of LPA and HIS overlapped with 60% of MVPA. We also looked at the correlation between the time spent by a child in either HIS or MVPA and their physical and cognitive abilities. We found that HIS was more strongly correlated with physical mobility (R²HIS =0.5, R²MVPA= 0.28), cognitive ability (R²HIS =0.31, R²MVPA= 0.15), and age (R²HIS =0.15, R²MVPA= 0.09), indicating increased sensitivity to key attributes associated with a child’s mobility. Conclusion: An unsupervised machine learning technique can segment and cluster accelerometer data according to the intensity of movement at a given time. It provides a potentially more sensitive, appropriate, and cost-effective approach to analysing physical activity behavior in diverse populations, compared to the current cut points approach. This, in turn, supports research that is more inclusive across diverse populations.

Keywords: physical activity, machine learning, under 5s, disability, accelerometer

Procedia PDF Downloads 176
3576 Developing an Intelligent Table Tennis Ball Machine with Human Play Simulation for Technical Training

Authors: Chen-Chi An, Jun-Yi He, Cheng-Han Hsieh, Chen-Ching Ting

Abstract:

This research has successfully developed an intelligent table tennis ball machine with human play simulate all situations of human play to take the service. It is well known; an excellent ball machine can help the table tennis coach to provide more efficient teaching, also give players the good technical training and entertainment. An excellent ball machine should be able to service all balls based on human play simulation due to the conventional competitions are today all taken place for people. In this work, two counter-rotating wheels are used to service the balls, where changing the absolute rotating speeds of the two wheels and the differences of rotating speeds between the two wheels can adjust the struck forces and the rotating speeds of the ball. The relationships between the absolute rotating speed of the two wheels and the struck forces of the ball as well as the differences rotating speeds between the two wheels and the rotating speeds of the ball are experimentally determined for technical development. The outlet speed, the ejected distance, and the rotating speed of the ball were measured by changing the absolute rotating speeds of the two wheels in terms of a series of differences in rotating speed between the two wheels for calibration of the ball machine; where the outlet speed and the ejected distance of the ball were further converted to the struck forces of the ball. In process, the balls serviced by the intelligent ball machine were based on the received calibration curves with help of the computer. Experiments technically used photosensitive devices to detect the outlet and rotating speed of the ball. Finally, this research developed some teaching programs for technical training using three ball machines and received more efficient training.

Keywords: table tennis, ball machine, human play simulation, counter-rotating wheels

Procedia PDF Downloads 399
3575 Study of Porous Metallic Support for Intermediate-Temperature Solid Oxide Fuel Cells

Authors: S. Belakry, D. Fasquelle, A. Rolle, E. Capoen, R. N. Vannier, J. C. Carru

Abstract:

Solid oxide fuel cells (SOFCs) are promising devices for energy conversion due to their high electrical efficiency and eco-friendly behavior. Their performance is not only influenced by the microstructural and electrical properties of the electrodes and electrolyte but also depends on the interactions at the interfaces. Nowadays, commercial SOFCs are electrically efficient at high operating temperatures, typically between 800 and 1000 °C, which restricts their real-life applications. The present work deals with the objectives to reduce the operating temperature and to develop cost-effective intermediate-temperature solid oxide fuel cells (IT-SOFCs). This work focuses on the development of metal-supported solid oxide fuel cells (MS-IT-SOFCs) that would provide cheaper SOFC cells with increased lifetime and reduced operating temperature. In the framework, the local company TIBTECH brings its skills for the manufacturing of porous metal supports. This part of the work focuses on the physical, chemical, and electrical characterizations of porous metallic supports (stainless steel 316 L and FeCrAl alloy) under different exposure conditions of temperature and atmosphere by studying oxidation, mechanical resistance, and electrical conductivity of the materials. Within the target operating temperature (i.e., 500 to 700 ° C), the stainless steel 316 L and FeCrAl alloy slightly oxidize in the air and H2, but don’t deform; whereas under Ar atmosphere, they oxidize more than with previously mentioned atmospheres. Above 700 °C under air and Ar, the two metallic supports undergo high oxidation. From 500 to 700 °C, the resistivity of FeCrAl increases by 55%. But nevertheless, the FeCrAl resistivity increases more slowly than the stainless steel 316L resistivity. This study allows us to verify the compatibility of electrodes and electrolyte materials with metallic support at the operating requirements of the IT-SOFC cell. The characterizations made in this context will also allow us to choose the most suitable fabrication process for all functional layers in order to limit the oxidation of the metallic supports.

Keywords: stainless steel 316L, FeCrAl alloy, solid oxide fuel cells, porous metallic support

Procedia PDF Downloads 67
3574 Gradient Boosted Trees on Spark Platform for Supervised Learning in Health Care Big Data

Authors: Gayathri Nagarajan, L. D. Dhinesh Babu

Abstract:

Health care is one of the prominent industries that generate voluminous data thereby finding the need of machine learning techniques with big data solutions for efficient processing and prediction. Missing data, incomplete data, real time streaming data, sensitive data, privacy, heterogeneity are few of the common challenges to be addressed for efficient processing and mining of health care data. In comparison with other applications, accuracy and fast processing are of higher importance for health care applications as they are related to the human life directly. Though there are many machine learning techniques and big data solutions used for efficient processing and prediction in health care data, different techniques and different frameworks are proved to be effective for different applications largely depending on the characteristics of the datasets. In this paper, we present a framework that uses ensemble machine learning technique gradient boosted trees for data classification in health care big data. The framework is built on Spark platform which is fast in comparison with other traditional frameworks. Unlike other works that focus on a single technique, our work presents a comparison of six different machine learning techniques along with gradient boosted trees on datasets of different characteristics. Five benchmark health care datasets are considered for experimentation, and the results of different machine learning techniques are discussed in comparison with gradient boosted trees. The metric chosen for comparison is misclassification error rate and the run time of the algorithms. The goal of this paper is to i) Compare the performance of gradient boosted trees with other machine learning techniques in Spark platform specifically for health care big data and ii) Discuss the results from the experiments conducted on datasets of different characteristics thereby drawing inference and conclusion. The experimental results show that the accuracy is largely dependent on the characteristics of the datasets for other machine learning techniques whereas gradient boosting trees yields reasonably stable results in terms of accuracy without largely depending on the dataset characteristics.

Keywords: big data analytics, ensemble machine learning, gradient boosted trees, Spark platform

Procedia PDF Downloads 216
3573 Design and Implementation of a Memory Safety Isolation Method Based on the Xen Cloud Environment

Authors: Dengpan Wu, Dan Liu

Abstract:

In view of the present cloud security problem has increasingly become one of the major obstacles hindering the development of the cloud computing, put forward a kind of memory based on Xen cloud environment security isolation technology implementation. And based on Xen virtual machine monitor system, analysis of the model of memory virtualization is implemented, using Xen memory virtualization system mechanism of super calls and grant table, based on the virtual machine manager internal implementation of access control module (ACM) to design the security isolation system memory. Experiments show that, the system can effectively isolate different customer domain OS between illegal access to memory data.

Keywords: cloud security, memory isolation, xen, virtual machine

Procedia PDF Downloads 365
3572 Classification of Cochannel Signals Using Cyclostationary Signal Processing and Deep Learning

Authors: Bryan Crompton, Daniel Giger, Tanay Mehta, Apurva Mody

Abstract:

The task of classifying radio frequency (RF) signals has seen recent success in employing deep neural network models. In this work, we present a combined signal processing and machine learning approach to signal classification for cochannel anomalous signals. The power spectral density and cyclostationary signal processing features of a captured signal are computed and fed into a neural net to produce a classification decision. Our combined signal preprocessing and machine learning approach allows for simpler neural networks with fast training times and small computational resource requirements for inference with longer preprocessing time.

Keywords: signal processing, machine learning, cyclostationary signal processing, signal classification

Procedia PDF Downloads 69