Search results for: engineering dataset
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 4190

3890 Developing Primary Care Datasets for a National Asthma Audit

Authors: Rachael Andrews, Viktoria McMillan, Shuaib Nasser, Christopher M. Roberts

Abstract:

Background and objective: The National Review of Asthma Deaths (NRAD) found that asthma management and care were inadequate in 26% of cases reviewed. Major shortfalls identified were adherence to national guidelines and standards and, particularly, the organisation of care, including supervision and monitoring in primary care, with 70% of cases reviewed having at least one avoidable factor in this area. 5.4 million people in the UK are diagnosed with and actively treated for asthma, and approximately 60,000 are admitted to hospital with acute exacerbations each year. The majority of people with asthma receive management and treatment solely in primary care. This has therefore created concern that many people within the UK are receiving sub-optimal asthma care, resulting in unnecessary morbidity and risk of adverse outcome. NRAD concluded that a national asthma audit programme should be established to measure and improve processes, organisation, and outcomes of asthma care. Objective: To develop a primary care dataset enabling extraction of information from GP practices in Wales and providing robust data from which results and lessons could be drawn to drive service development and improvement. Methods: A multidisciplinary group of experts, including general practitioners, primary care organisation representatives, and asthma patients, was formed and used as a source of governance and guidance. A review of asthma literature, guidance, and standards took place and was used to identify areas of asthma care which, if improved, would lead to better patient outcomes. Modified Delphi methodology was used to gain consensus from the expert group on which of the areas identified were to be prioritised, and an asthma patient and carer focus group was held to seek views and feedback on areas of asthma care that were important to them. Areas of asthma care identified by both groups were mapped to asthma guidelines and standards to inform and develop primary and secondary care datasets covering both adult and paediatric care. Dataset development consisted of expert review and a targeted consultation process in order to seek broad stakeholder views and feedback. Results: Areas of asthma care identified as requiring prioritisation by the National Asthma Audit were: (i) prescribing, (ii) asthma diagnosis, (iii) asthma reviews, (iv) Personalised Asthma Action Plans (PAAPs), and (v) primary care follow-up after discharge from hospital. Methodologies and primary care queries were developed to cover each of the areas of poor and variable asthma care identified, and the queries were designed to extract information directly from patients’ electronic records. Conclusion: This paper describes the methodological approach followed to develop primary care datasets for a National Asthma Audit. It sets out the principles behind the establishment of a National Asthma Audit programme in response to a national asthma mortality review and describes the development activities undertaken. Key process elements included: (i) mapping identified areas of poor and variable asthma care to national guidelines and standards, (ii) early engagement of experts, including clinicians and patients, in the process, and (iii) targeted consultation on the queries to provide further insight into measures that were collectable, reproducible, and relevant.

Keywords: asthma, primary care, general practice, dataset development

Procedia PDF Downloads 177
3889 DEEPMOTILE: Motility Analysis of Human Spermatozoa Using Deep Learning in Sri Lankan Population

Authors: Chamika Chiran Perera, Dananjaya Perera, Chirath Dasanayake, Banuka Athuraliya

Abstract:

Male infertility is a major problem worldwide, and it is a neglected and sensitive health issue in Sri Lanka. It can be assessed by analyzing human semen samples, and sperm motility is one of several factors used to evaluate male fertility potential. In Sri Lanka, this analysis is performed manually. Manual methods are time-consuming and operator-dependent, and their reliability rests on the expertise of the analyst. Machine learning and deep learning technologies are being investigated to automate spermatozoa motility analysis, but existing automatic methods are often unreliable, tending to produce false positives and missed detections. Current automatic methods rely on a range of techniques, and some of them are very expensive. Due to geographical variance in spermatozoa characteristics, current automatic methods are not reliable for motility analysis in Sri Lanka. The suggested system, DeepMotile, explores a method to analyze the motility of human spermatozoa automatically and presents it to andrology laboratories to overcome these issues. DeepMotile is a novel deep learning method for analyzing spermatozoa motility parameters in the Sri Lankan population. To implement the approach, Sri Lankan patient data were collected anonymously as a dataset, and glass slides were used as a low-cost technique for analyzing semen samples. The problem was framed as microscopic object detection and tracking. YOLOv5 was customized and used as the object detector, achieving 94% mAP (mean average precision), 86% precision, and 90% recall on the gathered dataset. StrongSORT was used as the object tracker and was validated with andrology experts due to the unavailability of annotated ground-truth data. Furthermore, this research has identified many potential avenues for further investigation, and andrology experts can use this system to analyze motility parameters with realistic accuracy.
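
As a rough illustration of the detect-then-track pipeline described above, the following Python sketch wires a custom-trained YOLOv5 detector into a StrongSORT-style tracker. The checkpoint path, the tracker object, and its update signature are assumptions for illustration, not the authors' code.

```python
# Minimal sketch of the detect-then-track pipeline. Assumes a custom-trained
# YOLOv5 checkpoint ("sperm_yolov5.pt") and a StrongSORT-style tracker object
# with an `update` method; both are placeholders, not the authors' code.
import cv2
import torch

model = torch.hub.load("ultralytics/yolov5", "custom", path="sperm_yolov5.pt")

def track_video(video_path, tracker):
    """Run detection on each frame and hand detections to the tracker."""
    cap = cv2.VideoCapture(video_path)
    tracks_per_frame = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        results = model(frame)                      # YOLOv5 inference
        detections = results.xyxy[0].cpu().numpy()  # [x1, y1, x2, y2, conf, cls]
        tracks_per_frame.append(tracker.update(detections, frame))
    cap.release()
    return tracks_per_frame  # per-frame track IDs -> motility paths
```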

Keywords: computer vision, deep learning, convolutional neural networks, multi-target tracking, microscopic object detection and tracking, male infertility detection, motility analysis of human spermatozoa

Procedia PDF Downloads 111
3888 Comparative Analysis of Feature Extraction and Classification Techniques

Authors: R. L. Ujjwal, Abhishek Jain

Abstract:

In the field of computer vision, most facial variations, such as identity, expression, emotion, and gender, have been extensively studied, whereas automatic age estimation has rarely been explored. As a human ages, the features of the face change. This paper presents a new comparative study of algorithms for feature extraction (hybrid features using the Haar cascade and HOG) and classification (KNN and SVM) on a training dataset. By comparing these algorithms, we aim to identify the best-performing classifier; the same comparison is carried out on the feature extraction side, where features are extracted using the Haar cascade and HOG. This work is carried out in the context of an age-group classification model.
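
A minimal sketch of the compared pipeline follows: Haar-cascade face detection, HOG description of the detected face, and a KNN-versus-SVM comparison. The image size, HOG parameters, and the synthetic stand-in data are assumptions, since the abstract does not specify them.

```python
# Illustrative pipeline: Haar-cascade face detection, HOG features, then
# KNN vs. SVM comparison. Parameters and data are assumptions.
import cv2
import numpy as np
from skimage.feature import hog
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def face_hog(image):
    """Detect a face with the Haar cascade and describe it with HOG."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    x, y, w, h = face_cascade.detectMultiScale(gray, 1.1, 5)[0]
    face = cv2.resize(gray[y:y + h, x:x + w], (64, 64))
    return hog(face, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2))

# Stand-in data shaped like stacked HOG vectors with age-group labels
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 1764))   # 1764 = HOG length for the settings above
y = rng.integers(0, 3, size=300)   # e.g. 0=child, 1=adult, 2=senior

for name, clf in [("KNN", KNeighborsClassifier(n_neighbors=5)),
                  ("SVM", SVC(kernel="rbf"))]:
    print(name, cross_val_score(clf, X, y, cv=5).mean())
```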

Keywords: computer vision, age group, face detection

Procedia PDF Downloads 373
3887 Early Prediction of Cognitive Impairment in Adults Aged 20 Years and Older using Machine Learning and Biomarkers of Heavy Metal Exposure

Authors: Ali Nabavi, Farimah Safari, Mohammad Kashkooli, Sara Sadat Nabavizadeh, Hossein Molavi Vardanjani

Abstract:

Cognitive impairment presents a significant and increasing health concern as populations age. Environmental risk factors such as heavy metal exposure are suspected contributors, but their specific roles remain incompletely understood. Machine learning offers a promising approach to integrate multi-factorial data and improve the prediction of cognitive outcomes. This study aimed to develop and validate machine learning models to predict early risk of cognitive impairment by incorporating demographic, clinical, and biomarker data, including measures of heavy metal exposure. A retrospective analysis was conducted using 2011-2014 National Health and Nutrition Examination Survey (NHANES) data. The dataset included participants aged 20 years and older who underwent cognitive testing. Variables encompassed demographic information, medical history, lifestyle factors, and biomarkers such as blood and urine levels of lead, cadmium, manganese, and other metals. Machine learning algorithms were trained on 90% of the data and evaluated on the remaining 10%, with performance assessed through metrics such as accuracy, area under curve (AUC), and sensitivity. Analysis included 2,933 participants. The stacking ensemble model demonstrated the highest predictive performance, achieving an AUC of 0.778 and a sensitivity of 0.879 on the test dataset. Key predictors included age, gender, hypertension, education level, urinary cadmium, and blood manganese levels. The findings indicate that machine learning can effectively predict the risk of cognitive impairment using a comprehensive set of clinical and environmental exposure data. Incorporating biomarkers of heavy metal exposure improved prediction accuracy and highlighted the role of environmental factors in cognitive decline. Further prospective studies are recommended to validate the models and assess their utility over time.
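
The stacking ensemble reported above could be set up along these lines in scikit-learn. The 90/10 split follows the abstract; the choice of base learners and the synthetic stand-in data are assumptions, since the paper does not list its constituent models here.

```python
# Hedged sketch of the stacking approach; base learners are assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Stand-in for the NHANES table (age, gender, hypertension, urinary cadmium,
# blood manganese, ...); n matches the analysed sample of 2,933 participants
X, y = make_classification(n_samples=2933, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.10, stratify=y, random_state=0)  # 90/10 split as in the study

stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=300)),
                ("gb", GradientBoostingClassifier())],
    final_estimator=LogisticRegression(max_iter=1000))
stack.fit(X_tr, y_tr)

print("AUC:", roc_auc_score(y_te, stack.predict_proba(X_te)[:, 1]))  # paper: 0.778
print("Sensitivity:", recall_score(y_te, stack.predict(X_te)))       # paper: 0.879
```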

Keywords: cognitive impairment, heavy metal exposure, predictive models, aging

Procedia PDF Downloads 8
3886 Predicting Success and Failure in Drug Development Using Text Analysis

Authors: Zhi Hao Chow, Cian Mulligan, Jack Walsh, Antonio Garzon Vico, Dimitar Krastev

Abstract:

Drug development is resource-intensive, time-consuming, and increasingly expensive with each developmental stage. Success rates in drug development are also relatively low, and the resources committed are wasted with each failed candidate. As such, a reliable method of predicting the success of drug development is in demand. Our hypothesis was that some failed drug candidates are pushed through developmental pipelines on the basis of false confidence and may possess common linguistic features identifiable through sentiment analysis. Here, we explored the concept of using text analysis to discover such features in research publications and investor reports as predictors of success. RStudio was used to perform text mining and lexicon-based sentiment analysis to identify affective phrases and determine their frequency in each document, and SPSS was then used to determine the relationship between our defined variables and the accuracy of predicting outcomes. A total of 161 publications were collected and categorised into 4 groups: (i) cancer treatment, (ii) neurodegenerative disease treatment, (iii) vaccines, and (iv) others (containing all other drugs that do not fit into the 3 categories). Text analysis was then performed on each document within its drug category using 2 separate lexicons (Bing and AFINN) in R to determine the frequency of positive or negative phrases in each document. Relative positivity and negativity values were then calculated by dividing the frequency of phrases by the word count of each document. Regression analysis was then performed with SPSS statistical software on each dataset (values from using the Bing or AFINN lexicon during text analysis) using a random selection of 61 documents to construct a model. The remaining documents were then used to determine the predictive power of the models. The model constructed from the Bing lexicon predicts the outcome of drug performance in clinical trials with an overall accuracy of 65.3%. The AFINN model had a lower accuracy at predicting outcomes than the Bing model, at 62.5%, and was not effective at predicting the failure of drugs in clinical trials. Overall, the study did not show significant efficacy of the model at predicting the outcomes of drugs in development. Many improvements may need to be made in later iterations of the model to sufficiently increase the accuracy.
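
The lexicon-scoring step, counting positive and negative phrase hits and dividing by the document word count, can be sketched in a few lines. The study used the Bing and AFINN lexicons in R; the tiny lexicon below is a hypothetical stand-in used only to show the calculation.

```python
# Language-agnostic sketch of the lexicon-based scoring step. The three-entry
# lexicon is a hypothetical stand-in for AFINN/Bing.
import re

AFINN_LIKE = {"significant": 2, "promising": 3, "failed": -2}  # stand-in lexicon

def relative_sentiment(text, lexicon):
    """Count lexicon hits and normalise by document word count."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(1 for w in words if lexicon.get(w, 0) > 0)
    neg = sum(1 for w in words if lexicon.get(w, 0) < 0)
    n = len(words)
    return pos / n, neg / n  # relative positivity, relative negativity

print(relative_sentiment("The trial showed promising, significant results.",
                         AFINN_LIKE))
```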

Keywords: data analysis, drug development, sentiment analysis, text-mining

Procedia PDF Downloads 162
3885 Wireless Sensor Anomaly Detection Using Soft Computing

Authors: Mouhammd Alkasassbeh, Alaa Lasasmeh

Abstract:

We live in an era of rapid development as a result of significant scientific growth. Like other technologies, wireless sensor networks (WSNs) are playing one of the main roles. Built on WSNs, ZigBee adds many features to devices, such as minimal cost and power consumption and increased range and connectivity of sensor nodes. ZigBee technology has come to be used in various fields, including science, engineering, and networking, and even in medical applications and intelligent buildings. In this work, we generated two main datasets, the first based on a tree topology and the second on a star topology. The datasets were evaluated with three machine learning (ML) algorithms: J48, meta.J48, and multilayer perceptron (MLP). Traffic in each topology was classified as normal or abnormal (attack) network traffic. The datasets used in our work contained simulated data from network simulator 2 (NS2). On each dataset, the meta.J48 classifier achieved the highest accuracy among the classifiers, at 99.7% and 99.2%, respectively.
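
WEKA's J48 is a C4.5-style decision tree, so the comparison can be approximated with scikit-learn analogues: a plain tree, a boosted tree standing in for meta.J48, and an MLP. The synthetic traffic features below are a stand-in, since the abstract does not list the NS2 feature set.

```python
# Approximation of the WEKA comparison with scikit-learn analogues on a
# synthetic stand-in for the NS2 traffic features.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=10, weights=[0.9, 0.1],
                           random_state=0)  # normal vs. attack traffic stand-in

models = {
    "J48-like tree": DecisionTreeClassifier(),
    "meta.J48-like (boosted tree)": AdaBoostClassifier(
        DecisionTreeClassifier(max_depth=3)),
    "MLP": MLPClassifier(hidden_layer_sizes=(32,), max_iter=500),
}
for name, clf in models.items():
    print(name, cross_val_score(clf, X, y, cv=5, scoring="accuracy").mean())
```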

Keywords: IDS, Machine learning, WSN, ZigBee technology

Procedia PDF Downloads 546
3884 Data Model to Predict Customize Skin Care Product Using Biosensor

Authors: Ashi Gautam, Isha Shukla, Akhil Seghal

Abstract:

Biosensors are analytical devices that use a biological sensing element to detect and measure a specific chemical substance or biomolecule in a sample. These devices are widely used in various fields, including medical diagnostics, environmental monitoring, and food analysis, due to their high specificity, sensitivity, and selectivity. In this research paper, a machine learning model is proposed for predicting the suitability of skin care products based on biosensor readings. The proposed model takes in features extracted from biosensor readings, such as biomarker concentration, skin hydration level, inflammation presence, sensitivity, and free radicals, and outputs the most appropriate skin care product for an individual. This model is trained on a dataset of biosensor readings and corresponding skin care product information. The model's performance is evaluated using several metrics, including accuracy, precision, recall, and F1 score. The aim of this research is to develop a personalised skin care product recommendation system using biosensor data. By leveraging the power of machine learning, the proposed model can accurately predict the most suitable skin care product for an individual based on their biosensor readings. This is particularly useful in the skin care industry, where personalised recommendations can lead to better outcomes for consumers. The developed model is based on supervised learning, which means that it is trained on a labeled dataset of biosensor readings and corresponding skin care product information. The model uses these labeled data to learn patterns and relationships between the biosensor readings and skin care products. Once trained, the model can predict the most suitable skin care product for an individual based on their biosensor readings. The results of this study show that the proposed machine learning model can accurately predict the most appropriate skin care product for an individual based on their biosensor readings. The evaluation metrics used in this study demonstrate the effectiveness of the model in predicting skin care products. This model has significant potential for practical use in the skin care industry for personalised skin care product recommendations. The proposed machine learning model for predicting the suitability of skin care products based on biosensor readings is a promising development in the skin care industry. The model's ability to accurately predict the most appropriate skin care product for an individual based on their biosensor readings can lead to better outcomes for consumers. Further research can be done to improve the model's accuracy and effectiveness.

Keywords: biosensors, data model, machine learning, skin care

Procedia PDF Downloads 102
3883 Conversational Assistive Technology of Visually Impaired Person for Social Interaction

Authors: Komal Ghafoor, Tauqir Ahmad, Murtaza Hanif, Hira Zaheer

Abstract:

Assistive technology has been developed to support visually impaired people in their social interactions. Conversation assistive technology is designed to enhance communication skills, facilitate social interaction, and improve the quality of life of visually impaired individuals. This technology includes speech recognition, text-to-speech features, and other communication devices that enable users to communicate with others in real time. The technology uses natural language processing and machine learning algorithms to analyze spoken language and provide appropriate responses. It also includes features such as voice commands and audio feedback to provide users with a more immersive experience. These technologies have been shown to increase the confidence and independence of visually impaired individuals in social situations and have the potential to improve their social skills and relationships with others. Overall, conversation-assistive technology is a promising tool for empowering visually impaired people and improving their social interactions. One of the key benefits of conversation-assistive technology is that it allows visually impaired individuals to overcome communication barriers that they may face in social situations. It can help them to communicate more effectively with friends, family, and colleagues, as well as strangers in public spaces. By providing a more seamless and natural way to communicate, this technology can help to reduce feelings of isolation and improve overall quality of life. The main objective of this research is to give blind users the capability to move around in unfamiliar environments through a user-friendly device with face, object, and activity recognition. The model evaluates the accuracy of activity recognition. The device captures the front view of the blind user, detects objects, recognizes activities, and answers the user's queries. It is implemented using the front-facing camera view. A local dataset was collected that includes various first-person human activities. The system identifies the activities on which the VGG-16 model was trained, such as hugging, shaking hands, talking, walking, and waving.
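
A plausible shape for the recognition backbone is a frozen VGG-16 base with a small classification head over the five first-person activity classes. The head layers, input size, and training setup below are assumptions, not the authors' exact configuration.

```python
# Hedged sketch: VGG-16 base fine-tuned for first-person activity classes.
# The dataset pipeline (`frame_dataset`) is assumed, not shown.
import tensorflow as tf

NUM_ACTIVITIES = 5  # hugging, shaking hands, talking, walking, waving

base = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                   input_shape=(224, 224, 3))
base.trainable = False  # freeze ImageNet features; train only the new head

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(NUM_ACTIVITIES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(frame_dataset, epochs=10)  # frame_dataset: labeled 1st-person frames
```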

Keywords: dataset, visually impaired person, natural language process, human activity recognition

Procedia PDF Downloads 67
3882 Analysis of Different Classification Techniques Using WEKA for Diabetic Disease

Authors: Usama Ahmed

Abstract:

Data mining is the process of analyzing data to extract useful, predictive information. It is a field of research that addresses many types of problems. In data mining, classification is an important technique for categorising different kinds of data. Diabetes is one of the most common diseases. This paper implements different classification techniques using the Waikato Environment for Knowledge Analysis (WEKA) on a diabetes dataset to find which algorithm is best suited to the task. The best classification algorithm on the diabetic data is Naïve Bayes, which achieves an accuracy of 76.31% and takes 0.06 seconds to build the model.
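
For readers outside WEKA, the same experiment can be approximated with scikit-learn's Gaussian Naive Bayes; the synthetic 8-feature table below merely stands in for the diabetes dataset.

```python
# Approximation of the WEKA Naive Bayes run; data are a synthetic stand-in.
import time
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=768, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

start = time.perf_counter()
nb = GaussianNB().fit(X_tr, y_tr)
print(f"build time: {time.perf_counter() - start:.3f}s")  # paper: 0.06 s in WEKA
print(f"accuracy:   {accuracy_score(y_te, nb.predict(X_te)):.2%}")  # paper: 76.31%
```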

Keywords: data mining, classification, diabetes, WEKA

Procedia PDF Downloads 152
3881 Performance Comparison of Cooperative Banks in the EU, USA and Canada

Authors: Matěj Kuc

Abstract:

This paper compares different profitability measures of cooperative banks from two developed regions: the European Union, and the United States of America together with Canada. We created a balanced dataset of more than 200 cooperative banks covering the 2011-2016 period, ran a series of tests, and estimated a random effects model on the panel data. We found that American and Canadian cooperatives are more profitable in terms of return on assets (ROA) and return on equity (ROE), with no significant difference in net interest margin (NIM). Our results suggest that the North American cooperative banks have accommodated better to the current market environment.
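
A random effects estimation of this kind might look as follows with the Python `linearmodels` package (the paper's own tooling is not stated). The regressors and the synthetic balanced panel are illustrative only.

```python
# Sketch of a random-effects panel estimation. Variable names (roa, size,
# equity_ratio) are illustrative; the paper's regressors are not listed.
import numpy as np
import pandas as pd
from linearmodels.panel import RandomEffects

# Stand-in balanced panel: 200 banks x 6 years (2011-2016)
idx = pd.MultiIndex.from_product([range(200), range(2011, 2017)],
                                 names=["bank", "year"])
rng = np.random.default_rng(0)
df = pd.DataFrame({"roa": rng.normal(0.5, 0.2, len(idx)),
                   "size": rng.normal(10, 1, len(idx)),
                   "equity_ratio": rng.normal(8, 2, len(idx))}, index=idx)

res = RandomEffects.from_formula("roa ~ 1 + size + equity_ratio", data=df).fit()
print(res)
```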

Keywords: cooperative banking, panel data, profitability measures, random effects

Procedia PDF Downloads 115
3880 Automated Localization of Palpebral Conjunctiva and Hemoglobin Determination Using Smart Phone Camera

Authors: Faraz Tahir, M. Usman Akram, Albab Ahmad Khan, Mujahid Abbass, Ahmad Tariq, Nuzhat Qaiser

Abstract:

The objective of this study was to evaluate the degree of anemia from a picture of the palpebral conjunctiva taken with a smartphone camera. We first localized the region of interest in the image, extracted features from that region, and trained an SVM classifier on those features; as a result, our system classifies images in real time by hemoglobin level. The proposed system achieved an accuracy of 70%. The classifier was trained on a locally gathered dataset of 30 patients.

Keywords: anemia, palpebral conjunctiva, SVM, smartphone

Procedia PDF Downloads 509
3879 A Mean–Variance–Skewness Portfolio Optimization Model

Authors: Kostas Metaxiotis

Abstract:

Portfolio optimization is one of the most important topics in finance. This paper proposes a mean–variance–skewness (MVS) portfolio optimization model. Traditionally, the portfolio optimization problem is solved using the mean–variance (MV) framework. In this study, we formulate the proposed model as a three-objective optimization problem, in which the portfolio's expected return and skewness are maximized and the portfolio risk is minimized. To solve the proposed three-objective portfolio optimization model, we apply an adapted version of the non-dominated sorting genetic algorithm (NSGA-II). Finally, we use a real dataset from the FTSE 100 to validate the proposed model.
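
The three objectives can be written down compactly; the sketch below sets up the MVS problem for the `pymoo` implementation of NSGA-II as a stand-in for the authors' adapted version, with synthetic returns in place of the FTSE 100 data.

```python
# MVS problem solved with pymoo's NSGA-II (a stand-in for the adapted NSGA-II).
# Returns are synthetic; real use would plug in FTSE 100 return series.
import numpy as np
from pymoo.algorithms.moo.nsga2 import NSGA2
from pymoo.core.problem import Problem
from pymoo.optimize import minimize

rng = np.random.default_rng(0)
R = rng.normal(0.001, 0.02, size=(500, 10))  # 500 days x 10 assets (stand-in)

class MVSProblem(Problem):
    def __init__(self):
        super().__init__(n_var=10, n_obj=3, xl=0.0, xu=1.0)

    def _evaluate(self, W, out, *args, **kwargs):
        W = W / (W.sum(axis=1, keepdims=True) + 1e-12)  # weights sum to 1
        port = R @ W.T                                  # portfolio return series
        mean, std = port.mean(axis=0), port.std(axis=0)
        skew = ((port - mean) ** 3).mean(axis=0) / std ** 3
        # minimise -return and -skewness (i.e. maximise both), minimise variance
        out["F"] = np.column_stack([-mean, port.var(axis=0), -skew])

res = minimize(MVSProblem(), NSGA2(pop_size=100), ("n_gen", 100), seed=1)
print(res.F[:5])  # sample of the Pareto front
```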

Keywords: evolutionary algorithms, portfolio optimization, skewness, stock selection

Procedia PDF Downloads 205
3878 Prediction of Mental Health: Heuristic Subjective Well-Being Model on Perceived Stress Scale

Authors: Ahmet Karakuş, Akif Can Kilic, Emre Alptekin

Abstract:

A growing number of studies have investigated how well-being can be predicted using well-designed models. It is necessary to examine the background of each feature in order to construct a viable Subjective Well-Being (SWB) model. We selected suitable variables from the SWB literature that are appropriate for real-world data. The goal of this work is to evaluate the model by feeding it SWB features and then categorizing stress levels using machine learning methods to see how well it performs on a real dataset. Although this is a multiclass classification problem, we achieved strong metric scores, which may be taken into account for this specific task.

Keywords: machine learning, multiclassification problem, subjective well-being, perceived stress scale

Procedia PDF Downloads 138
3877 Analysis of Brownfield Soil Contamination Using Local Government Planning Data

Authors: Emma E. Hellawell, Susan J. Hughes

Abstract:

Brownfield sites are currently being redeveloped for residential use. Information on soil contamination on these former industrial sites is collected as part of the planning process by local government. This research project analyses this untapped resource of environmental data, using site investigation data submitted to a local Borough Council in Surrey, UK. Over 150 site investigation reports were collected and interrogated to extract relevant information. The study involved three phases. Phase 1 was the development of a database for soil contamination information from local government reports; this database contained information on the source, history, and quality of the data together with chemical information on the soil that was sampled. Phase 2 involved obtaining site investigation reports for developments within the study area and extracting the required information for the database. Phase 3 was the data analysis and interpretation of key contaminants to evaluate typical contaminant levels and their distribution within the study area, and to relate these results to current guideline levels of risk for future site users. Preliminary results for a pilot study using a sample of the dataset have been obtained. The pilot study showed some inconsistency in the quality of the reports and measured data, and careful interpretation of the data is required. Analysis of the information found high levels of lead in shallow soil samples, with mean and median levels exceeding current guidance for residential use. The data also showed elevated (but below guidance) levels of potentially carcinogenic polycyclic aromatic hydrocarbons. Of particular concern was the high detection rate for asbestos fibres, which were found at low concentrations in 25% of the soil samples tested (although the sample set was small). Contamination levels of the remaining chemicals tested were all below the guidance levels for residential site use. These preliminary pilot study results will be expanded, and results for the whole local government area will be presented at the conference. The pilot study has demonstrated the potential for this extensive dataset to provide greater information on local contamination levels, which can help inform regulators and developers and lead to more targeted site investigations, improved risk assessments, and better brownfield development.

Keywords: Brownfield development, contaminated land, local government planning data, site investigation

Procedia PDF Downloads 142
3876 An Improvement of ComiR Algorithm for MicroRNA Target Prediction by Exploiting Coding Region Sequences of mRNAs

Authors: Giorgio Bertolazzi, Panayiotis Benos, Michele Tumminello, Claudia Coronnello

Abstract:

MicroRNAs are small non-coding RNAs that post-transcriptionally regulate the expression levels of messenger RNAs. MicroRNA regulation activity depends on the recognition of binding sites located on mRNA molecules. ComiR (Combinatorial miRNA targeting) is a user-friendly web tool developed to predict the targets of a set of microRNAs, starting from their expression profile. ComiR incorporates miRNA expression in a thermodynamic binding model, and it associates each gene with the probability of being a target of a set of miRNAs. ComiR algorithms were trained with information regarding binding sites in the 3’UTR region, using a reliable dataset containing the targets of endogenously expressed microRNAs in D. melanogaster S2 cells. This dataset was obtained by comparing the results from two different experimental approaches, i.e., inhibition and immunoprecipitation of the AGO1 protein, a component of the microRNA-induced silencing complex. In this work, we tested whether including coding-region binding sites in the ComiR algorithm improves the performance of the tool in predicting microRNA targets. We focused the analysis on the D. melanogaster species and updated the ComiR underlying database with the currently available releases of mRNA and microRNA sequences. We find that the ComiR algorithm trained with information from the coding regions is more efficient in predicting microRNA targets than the algorithm trained with 3’UTR information. On the other hand, we show that 3’UTR-based predictions can be seen as complementary to the coding-region-based predictions, which suggests that both predictions, from the 3’UTR and coding regions, should be considered in a comprehensive analysis. Furthermore, we observed that the lists of targets obtained by analyzing data from only one experimental approach, that is, inhibition or immunoprecipitation of AGO1, are not reliable enough to test the performance of our microRNA target prediction algorithm. Further analysis will be conducted to investigate the effectiveness of the tool with data from other species, provided that validated datasets, as obtained from the comparison of RISC protein inhibition and immunoprecipitation experiments, become available for the same samples. Finally, we propose to upgrade the existing ComiR web tool by including the coding-region-based trained model, available together with the 3’UTR-based one.

Keywords: AGO1, coding region, Drosophila melanogaster, microRNA target prediction

Procedia PDF Downloads 455
3875 Leveraging Unannotated Data to Improve Question Answering for French Contract Analysis

Authors: Touila Ahmed, Elie Louis, Hamza Gharbi

Abstract:

State-of-the-art question answering models have recently shown impressive performance, especially in a zero-shot setting. This approach is particularly useful when confronted with a highly diverse domain such as the legal field, in which it is increasingly difficult to have a dataset covering every notion and concept. In this work, we propose a flexible generative question answering approach to contract analysis as well as a weakly supervised procedure to leverage unannotated data and boost our models’ performance in general, and their zero-shot performance in particular.
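
As a simple illustration of zero-shot question answering on a French contract clause, one could run an off-the-shelf pipeline; note this sketch is extractive rather than the authors' generative approach, and the model name and clause are illustrative choices, not theirs.

```python
# Zero-shot QA on a French contract clause with the Hugging Face pipeline.
# The multilingual model is one plausible choice, not the authors' model,
# and the clause is invented.
from transformers import pipeline

qa = pipeline("question-answering",
              model="deepset/xlm-roberta-base-squad2")  # handles French text

contract = ("Le présent contrat prend effet le 1er janvier 2023 et est conclu "
            "pour une durée de trois ans, renouvelable par tacite reconduction.")
print(qa(question="Quelle est la durée du contrat ?", context=contract))
```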

Keywords: question answering, contract analysis, zero-shot, natural language processing, generative models, self-supervision

Procedia PDF Downloads 200
3874 Talent-to-Vec: Using Network Graphs to Validate Models with Data Sparsity

Authors: Shaan Khosla, Jon Krohn

Abstract:

In a recruiting context, machine learning models are valuable for recommendations: to predict the best candidates for a vacancy, to match the best vacancies for a candidate, and to compile a set of similar candidates for any given candidate. While it is useful to create these models, validating their accuracy in a recommendation context is difficult due to data sparsity. In this report, we use network graph data to generate useful representations for candidates and vacancies. We treat candidates and vacancies as network nodes and designate a bi-directional link between them when the candidate has interviewed for the vacancy. After applying node2vec, the embeddings are used to construct a ranked validation dataset, which will help validate new recommender systems.
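
The graph construction and embedding step could be sketched with `networkx` and the `node2vec` package as follows; node names and hyperparameters are illustrative.

```python
# Candidates and vacancies as nodes, an edge per interview, then node2vec
# embeddings. Names and hyperparameters are illustrative.
import networkx as nx
from node2vec import Node2Vec

G = nx.Graph()
G.add_edges_from([("cand_1", "vac_a"), ("cand_2", "vac_a"),
                  ("cand_2", "vac_b"), ("cand_3", "vac_b")])  # interview links

n2v = Node2Vec(G, dimensions=64, walk_length=20, num_walks=50)
model = n2v.fit(window=5, min_count=1)  # gensim Word2Vec under the hood

# Similar candidates = nearest neighbours in embedding space
print(model.wv.most_similar("cand_2"))
```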

Keywords: AI, machine learning, NLP, recruiting

Procedia PDF Downloads 89
3873 Using Greywolf Optimized Machine Learning Algorithms to Improve Accuracy for Predicting Hospital Readmission for Diabetes

Authors: Vincent Liu

Abstract:

Machine learning (ML) algorithms can achieve high accuracy in predicting outcomes compared to classical models. Metaheuristic, nature-inspired algorithms can enhance traditional ML algorithms by optimizing them, for example by performing feature selection. We compare ten ML algorithms for predicting 30-day hospital readmission rates for diabetes patients in the US, using a dataset from the UCI Machine Learning Repository with feature selection performed by the Greywolf nature-inspired algorithm. The baseline accuracy of the initial random forest model was 65%. After feature engineering, SMOTE for class balancing, and Greywolf optimization, the machine learning algorithms showed better metrics, including F1 scores, accuracy, and confusion matrices, with improvements ranging from 10% to 30%, and a best model of XGBoost with an accuracy of 95%. Applying machine learning in this way can improve patient outcomes, as unnecessary rehospitalizations can be prevented by focusing on patients who are at higher risk of readmission.
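
A sketch of the balancing-and-boosting stage appears below. The Greywolf feature-selection step is abstracted as a boolean mask, since the abstract does not give its encoding; SMOTE from `imbalanced-learn` and XGBoost cover the rest, on synthetic stand-in data.

```python
# Sketch of SMOTE balancing followed by XGBoost. The Greywolf feature
# selection is abstracted as a boolean mask (stand-in).
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = make_classification(n_samples=5000, n_features=30, weights=[0.85, 0.15],
                           random_state=0)          # readmission-like imbalance
selected = np.ones(X.shape[1], dtype=bool)          # <- stand-in for the GWO mask
X_tr, X_te, y_tr, y_te = train_test_split(X[:, selected], y, stratify=y,
                                          random_state=0)

X_bal, y_bal = SMOTE(random_state=0).fit_resample(X_tr, y_tr)  # balance classes
clf = XGBClassifier(n_estimators=300, eval_metric="logloss").fit(X_bal, y_bal)

pred = clf.predict(X_te)
print("accuracy:", accuracy_score(y_te, pred), "F1:", f1_score(y_te, pred))
```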

Keywords: diabetes, machine learning, 30-day readmission, metaheuristic

Procedia PDF Downloads 65
3872 Application of a New Efficient Normal Parameter Reduction Algorithm of Soft Sets in Online Shopping

Authors: Xiuqin Ma, Hongwu Qin

Abstract:

A new efficient normal parameter reduction algorithm for soft sets in decision making was recently proposed. However, up to the present, few studies have focused on real-life applications of this algorithm. Accordingly, we apply the new efficient normal parameter reduction algorithm to real-life online shopping datasets, such as the Blackberry Mobile Phone Dataset. Experimental results show that this algorithm is not only suitable but also feasible for dealing with online shopping data.

Keywords: soft sets, parameter reduction, normal parameter reduction, online shopping

Procedia PDF Downloads 517
3871 EQMamba - Method Suggestion for Earthquake Detection and Phase Picking

Authors: Noga Bregman

Abstract:

Accurate and efficient earthquake detection and phase picking are crucial for seismic hazard assessment and emergency response. This study introduces EQMamba, a deep-learning method that combines the strengths of the Earthquake Transformer and the Mamba model for simultaneous earthquake detection and phase picking. EQMamba leverages the computational efficiency of Mamba layers to process longer seismic sequences while maintaining a manageable model size. The proposed architecture integrates convolutional neural networks (CNNs), bidirectional long short-term memory (BiLSTM) networks, and Mamba blocks. The model employs an encoder composed of convolutional layers and max pooling operations, followed by residual CNN blocks for feature extraction. Mamba blocks are applied to the outputs of BiLSTM blocks, efficiently capturing long-range dependencies in seismic data. Separate decoders are used for earthquake detection, P-wave picking, and S-wave picking. We trained and evaluated EQMamba using a subset of the STEAD dataset, a comprehensive collection of labeled seismic waveforms. The model was trained using a weighted combination of binary cross-entropy loss functions for each task, with the Adam optimizer and a scheduled learning rate. Data augmentation techniques were employed to enhance the model's robustness. Performance comparisons were conducted between EQMamba and the EQTransformer over 20 epochs on this modest-sized STEAD subset. Results demonstrate that EQMamba achieves superior performance, with higher F1 scores and faster convergence compared to EQTransformer. EQMamba reached F1 scores of 0.8 by epoch 5 and maintained higher scores throughout training. The model also exhibited more stable validation performance, indicating good generalization capabilities. While both models showed lower accuracy in phase-picking tasks compared to detection, EQMamba's overall performance suggests significant potential for improving seismic data analysis. The rapid convergence and superior F1 scores of EQMamba, even on a modest-sized dataset, indicate promising scalability for larger datasets. This study contributes to the field of earthquake engineering by presenting a computationally efficient and accurate method for simultaneous earthquake detection and phase picking. Future work will focus on incorporating Mamba layers into the P and S pickers and further optimizing the architecture for seismic data specifics. The EQMamba method holds the potential for enhancing real-time earthquake monitoring systems and improving our understanding of seismic events.

Keywords: earthquake, detection, phase picking, s waves, p waves, transformer, deep learning, seismic waves

Procedia PDF Downloads 63
3870 Intelligent Software Architecture and Automatic Re-Architecting Based on Machine Learning

Authors: Gebremeskel Hagos Gebremedhin, Feng Chong, Heyan Huang

Abstract:

A software system is the combination of an architecture and organized components that accomplish a specific function or set of functions. A good software architecture facilitates application system development, promotes achievement of functional requirements, and supports system reconfiguration. We describe three studies demonstrating the utility of our architecture in the subdomain of mobile office robots and identify software engineering principles embodied in the architecture. The main aim of this paper is to analyze software architecture design and automatic re-architecting using machine learning. The intelligent software architecture and automatic re-architecting process reorganizes the software's organizational structure into a more suitable one, using a user-access dataset to create relationships among the components of the system. A three-step data mining approach was used to analyze effective recovery, transformation, and implementation with the use of a clustering algorithm. Automatic re-architecting without changing the source code therefore becomes possible, addressing the software complexity problem and enabling system software reuse.

Keywords: intelligence, software architecture, re-architecting, software reuse, high-level design

Procedia PDF Downloads 124
3869 Evaluating the Effects of a Positive Bitcoin Shock on the U.S Economy: A TVP-FAVAR Model with Stochastic Volatility

Authors: Olfa Kaabia, Ilyes Abid, Khaled Guesmi

Abstract:

This pioneering paper studies whether and how Bitcoin shocks are transmitted to the U.S. economy. We employ a new methodology: a TVP-FAVAR model with stochastic volatility. We use a large dataset of 111 major U.S. variables from 1959:m1 to 2016:m12. The results show that Bitcoin shocks significantly impact the U.S. economy, and this impact is more pronounced when the U.S. economy is volatile and expanding. Bitcoin shocks are positively related to U.S. real activity and negatively related to U.S. prices and interest rates. Effects on monetary policy operate via the interest rate and the money, credit, and finance transmission channels.

Keywords: bitcoin, US economy, FAVAR models, stochastic volatility

Procedia PDF Downloads 249
3868 Intelligent Prediction System for Diagnosis of Heart Attack

Authors: Oluwaponmile David Alao

Abstract:

Due to the increase in the death rate resulting from heart attacks, there is a need to develop a system that can be useful in diagnosing the disease at the medical centre. Such a system will help prevent the misdiagnosis that may occur on the part of medical practitioners or physicians. In this research work, a heart disease dataset obtained from the UCI repository has been used to develop an intelligent prediction and diagnosis system. The system is modeled as a feedforward neural network and trained with back-propagation. A recognition rate of 86% was obtained from testing the network.
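
The described feedforward network trained with back-propagation corresponds closely to scikit-learn's MLPClassifier; the sketch below uses a synthetic table shaped like the 13-attribute UCI heart disease data, with a hidden-layer size chosen for illustration.

```python
# Stand-in for the feedforward network trained with back-propagation,
# using MLPClassifier on synthetic 13-attribute data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=303, n_features=13, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

net = make_pipeline(StandardScaler(),
                    MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000,
                                  random_state=0))  # backprop-trained MLP
net.fit(X_tr, y_tr)
print(f"recognition rate: {net.score(X_te, y_te):.0%}")  # paper reports 86%
```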

Keywords: heart disease, artificial neural network, diagnosis, prediction system

Procedia PDF Downloads 453
3867 Do the Health Benefits of Oil-Led Economic Development Outweigh the Potential Health Harms from Environmental Pollution in Nigeria?

Authors: Marian Emmanuel Okon

Abstract:

Introduction: The Niger Delta region of Nigeria has a vast reserve of oil and gas, which has globally positioned the nation as the sixth largest exporter of crude oil. Production rose rapidly following oil discovery. In most oil-producing nations of the world, the wealth generated from oil production and export has propelled economic advancement, enabling the development of industries and other relevant infrastructure. It can therefore be assumed that oil wealth such as Nigeria's has the potential to improve the health of the population via job creation and derived revenues. However, the health benefits of this economic development might be offset by the environmental consequences of oil exploitation and production. Objective: This research aims to evaluate the balance between the health benefits of oil-led economic development and the harmful environmental consequences of crude oil exploitation in Nigeria. Study Design: A pathway has been designed to guide the data search and this study. The model created will assess the relationship between oil-led economic development and population health development via job creation, improvement of education, development of infrastructure, and other forms of development, as well as through the harmful environmental consequences of oil activities. Data/Emerging Findings: Diverse, potentially suitable datasets at different geographical scales have been identified, obtained, or applied for, and the dataset from the World Bank has been the most thoroughly explored. This large dataset contains information that would enable a longitudinal assessment of both the health benefits and the harms of oil exploitation in Nigeria, as well as identification of the disparities that exist between communities, states, and regions. However, these data do not extend far back enough in time to capture the start of crude oil production, so the maximum economic benefits and health harms could be missed. To deal with this shortcoming, the potential for a comparative study with countries such as the United Kingdom, Morocco, and Côte d’Ivoire has also been taken into consideration, so as to evaluate the differences between these countries and identify areas of improvement in Nigeria's environmental and health policies. Nevertheless, these data have shown differences in each country's economic, environmental, and health state over time, together with corresponding summary statistics. Conclusion: In theory, the beneficial effects of oil exploitation on the health of the population may be substantial, as large swaths of the 'wider determinants' of population health are influenced by the wealth of a nation. However, if uncontrolled, the consequences of environmental pollution and degradation may outweigh these benefits. There is thus a need to address this in order to improve environmental and population health in Nigeria.

Keywords: environmental pollution, health benefits, oil-led economic development, petroleum exploitation

Procedia PDF Downloads 343
3866 Machine Learning Approach for Automating Electronic Component Error Classification and Detection

Authors: Monica Racha, Siva Chandrasekaran, Alex Stojcevski

Abstract:

Engineering programs focus on promoting students' personal and professional development by ensuring that students acquire technical and professional competencies during their four-year studies. The traditional engineering laboratory provides an opportunity for students to "practice by doing," and laboratory facilities aid them in obtaining insight into and understanding of their discipline. Due to rapid technological advancements and the COVID-19 outbreak, traditional labs have been transforming into virtual learning environments. Aim: To address the limitations of the physical laboratory, this research study uses a Machine Learning (ML) algorithm that interfaces with the Augmented Reality HoloLens and predicts image content to classify and detect electronic components. The automated electronic component error classification and detection system automatically detects and classifies the position of all components on a breadboard using the ML algorithm. This research will assist first-year undergraduate engineering students in conducting laboratory practices without supervision. With the help of the HoloLens and the ML algorithm, students will reduce component placement errors on a breadboard and increase the efficiency of simple laboratory practices carried out virtually. Method: Images of breadboards, resistors, capacitors, transistors, and other electrical components will be collected using the HoloLens 2 and stored in a database. The collected image dataset will then be used to train a machine learning model, with the raw images cleaned, processed, and labeled to facilitate component error classification and detection. For instance, when students conduct laboratory experiments, the HoloLens captures images of students placing different components on a breadboard, and the images are forwarded to the server for detection in the background. A hybrid of Convolutional Neural Networks (CNNs) and Support Vector Machines (SVMs) will be trained on the dataset for object recognition and classification: the convolution layers extract image features, which are then classified using a Support Vector Machine (SVM). By adequately labeling the training data and classifying it, the model will predict, categorize, and assess whether students place components correctly. As a result, the data acquired through the HoloLens include images of students assembling electronic components. The system constantly checks whether students position components appropriately on the breadboard and connect them so that the circuit functions. When students misplace a component, the HoloLens predicts the error before the component is connected incorrectly and encourages students to correct their mistakes. This hybrid CNN and SVM approach to automating electronic component error classification and detection eliminates component connection problems and minimizes the risk of component damage. Conclusion: These augmented reality smart glasses powered by machine learning provide a wide range of benefits to supervisors, professionals, and students. They help customize the learning experience, which is particularly beneficial in large classes with limited time, and they determine the accuracy with which machine learning algorithms can forecast whether students are making the correct decisions and completing their laboratory tasks.
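
The hybrid CNN + SVM idea, a convolutional feature extractor feeding an SVM classifier, can be sketched as below. The architecture, image size, and component classes are illustrative assumptions, not the trained system.

```python
# Sketch of the hybrid CNN + SVM idea: a small convolutional network extracts
# image features, and an SVM classifies them. Everything here is illustrative.
import numpy as np
import tensorflow as tf
from sklearn.svm import SVC

# Convolutional feature extractor (untrained here; the real system would be
# trained on the HoloLens breadboard images)
extractor = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(128, 128, 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
])

images = np.random.rand(100, 128, 128, 3).astype("float32")  # stand-in images
labels = np.random.randint(0, 4, 100)  # resistor / capacitor / transistor / wire

features = extractor.predict(images)           # CNN feature vectors
svm = SVC(kernel="rbf").fit(features, labels)  # SVM classifies CNN features
print(svm.predict(features[:5]))
```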

Keywords: augmented reality, machine learning, object recognition, virtual laboratories

Procedia PDF Downloads 142
3865 Empirical Study of Partitions Similarity Measures

Authors: Abdelkrim Alfalah, Lahcen Ouarbya, John Howroyd

Abstract:

This paper investigates and compares the performance of four existing distances and similarity measures between partitions. The partition measures considered are the Rand Index (RI), the Adjusted Rand Index (ARI), Variation of Information (VI), and Normalised Variation of Information (NVI). This work investigates the ability of these partition measures to capture three predefined intuitions: the variation within randomly generated partitions, the sensitivity to small perturbations, and the independence from dataset scale. It is shown that the Adjusted Rand Index performed well overall with regard to these three intuitions.
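
For concreteness, the four measures can be computed on a toy pair of partitions as follows; RI and ARI come from scikit-learn, VI from the identity VI = H(U) + H(V) - 2I(U, V), and NVI here is VI normalised by the joint entropy, which is one common convention (the paper's exact normalisation is not stated).

```python
# The four partition measures on a toy example. The NVI normalisation by
# joint entropy is one common convention, assumed here.
import numpy as np
from scipy.stats import entropy
from sklearn.metrics import adjusted_rand_score, mutual_info_score, rand_score

u = [0, 0, 1, 1, 2, 2]
v = [0, 0, 1, 2, 2, 2]

def variation_of_information(u, v):
    """VI = H(U) + H(V) - 2 * I(U, V), with entropies from label counts."""
    hu = entropy(np.bincount(u))
    hv = entropy(np.bincount(v))
    return hu + hv - 2 * mutual_info_score(u, v)

vi = variation_of_information(u, v)
joint_h = vi + mutual_info_score(u, v)  # H(U, V) = VI + I(U, V)
print("RI:", rand_score(u, v), "ARI:", adjusted_rand_score(u, v))
print("VI:", vi, "NVI:", vi / joint_h)
```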

Keywords: clustering, comparing partitions, similarity measure, partition distance, partition metric, similarity between partitions, clustering comparison

Procedia PDF Downloads 209
3864 Predicting Costs in Construction Projects with Machine Learning: A Detailed Study Based on Activity-Level Data

Authors: Soheila Sadeghi

Abstract:

Construction projects are complex and often subject to significant cost overruns due to the multifaceted nature of the activities involved. Accurate cost estimation is crucial for effective budget planning and resource allocation. Traditional methods for predicting overruns often rely on expert judgment or analysis of historical data, which can be time-consuming, subjective, and may fail to consider important factors. However, with the increasing availability of data from construction projects, machine learning techniques can be leveraged to improve the accuracy of overrun predictions. This study applied machine learning algorithms to enhance the prediction of cost overruns in a case study of a construction project. The methodology involved the development and evaluation of two machine learning models: Random Forest and Neural Networks. Random Forest can handle high-dimensional data, capture complex relationships, and provide feature importance estimates. Neural Networks, particularly Deep Neural Networks (DNNs), are capable of automatically learning and modeling complex, non-linear relationships between input features and the target variable. These models can adapt to new data, reduce human bias, and uncover hidden patterns in the dataset. The findings of this study demonstrate that both Random Forest and Neural Networks can significantly improve the accuracy of cost overrun predictions compared to traditional methods. The Random Forest model also identified key cost drivers and risk factors, such as changes in the scope of work and delays in material delivery, which can inform better project risk management. However, the study acknowledges several limitations. First, the findings are based on a single construction project, which may limit the generalizability of the results to other projects or contexts. Second, the dataset, although comprehensive, may not capture all relevant factors influencing cost overruns, such as external economic conditions or political factors. Third, the study focuses primarily on cost overruns, while schedule overruns are not explicitly addressed. Future research should explore the application of machine learning techniques to a broader range of projects, incorporate additional data sources, and investigate the prediction of both cost and schedule overruns simultaneously.
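
The Random Forest stage, including the feature-importance readout that surfaced drivers such as scope changes and material-delivery delays, might look as follows; the feature names echo the abstract, and the data are synthetic.

```python
# Sketch of the Random Forest stage with a feature-importance readout.
# Feature names follow the abstract; data are synthetic stand-ins.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
cols = ["scope_changes", "material_delay_days", "crew_size", "activity_duration"]
X = pd.DataFrame(rng.normal(size=(400, 4)), columns=cols)
y = 2.0 * X["scope_changes"] + 1.5 * X["material_delay_days"] + rng.normal(size=400)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_tr, y_tr)

print("R^2 on held-out activities:", rf.score(X_te, y_te))
print(pd.Series(rf.feature_importances_, index=cols).sort_values(ascending=False))
```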

Keywords: cost prediction, machine learning, project management, random forest, neural networks

Procedia PDF Downloads 65
3863 A Machine Learning Approach for Efficient Resource Management in Construction Projects

Authors: Soheila Sadeghi

Abstract:

Construction projects are complex and often subject to significant cost overruns due to the multifaceted nature of the activities involved. Accurate cost estimation is crucial for effective budget planning and resource allocation. Traditional methods for predicting overruns often rely on expert judgment or analysis of historical data, which can be time-consuming, subjective, and may fail to consider important factors. However, with the increasing availability of data from construction projects, machine learning techniques can be leveraged to improve the accuracy of overrun predictions. This study applied machine learning algorithms to enhance the prediction of cost overruns in a case study of a construction project. The methodology involved the development and evaluation of two machine learning models: Random Forest and Neural Networks. Random Forest can handle high-dimensional data, capture complex relationships, and provide feature importance estimates. Neural Networks, particularly Deep Neural Networks (DNNs), are capable of automatically learning and modeling complex, non-linear relationships between input features and the target variable. These models can adapt to new data, reduce human bias, and uncover hidden patterns in the dataset. The findings of this study demonstrate that both Random Forest and Neural Networks can significantly improve the accuracy of cost overrun predictions compared to traditional methods. The Random Forest model also identified key cost drivers and risk factors, such as changes in the scope of work and delays in material delivery, which can inform better project risk management. However, the study acknowledges several limitations. First, the findings are based on a single construction project, which may limit the generalizability of the results to other projects or contexts. Second, the dataset, although comprehensive, may not capture all relevant factors influencing cost overruns, such as external economic conditions or political factors. Third, the study focuses primarily on cost overruns, while schedule overruns are not explicitly addressed. Future research should explore the application of machine learning techniques to a broader range of projects, incorporate additional data sources, and investigate the prediction of both cost and schedule overruns simultaneously.

Keywords: resource allocation, machine learning, optimization, data-driven decision-making, project management

Procedia PDF Downloads 45
3862 Causal Relationship between Corporate Governance and Financial Information Transparency: A Simultaneous Equations Approach

Authors: Maali Kachouri, Anis Jarboui

Abstract:

We focus on the causal relationship between governance and information transparency as well as the interrelations among the various governance mechanisms. This paper employs a simultaneous equations approach to examine this relationship in the Tunisian context. Based on an 8-year dataset, our sample covers 28 listed companies over 2006-2013. Our findings suggest that internal and external governance mechanisms are interdependent. Moreover, in analyzing the causal effect between information transparency and governance mechanisms, we found evidence that information transparency tends to increase good corporate governance practices.

Keywords: simultaneous equations approach, transparency, causal relationship, corporate governance

Procedia PDF Downloads 359
3861 Deep Learning to Enhance Mathematics Education for Secondary Students in Sri Lanka

Authors: Selvavinayagan Babiharan

Abstract:

This research aims to develop a deep learning platform to enhance mathematics education for secondary students in Sri Lanka. The platform will be designed to incorporate interactive and user-friendly features to engage students in active learning and promote their mathematical skills. The proposed platform will be developed using TensorFlow and Keras, two widely used deep learning frameworks. The system will be trained on a large dataset of math problems, which will be collected from Sri Lankan school curricula. The results of this research will contribute to the improvement of mathematics education in Sri Lanka and provide a valuable tool for teachers to enhance the learning experience of their students.

Keywords: information technology, education, machine learning, mathematics

Procedia PDF Downloads 87