Search results for: engineering dataset
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3975

Search results for: engineering dataset

3705 Deep Learning-Based Liver 3D Slicer for Image-Guided Therapy: Segmentation and Needle Aspiration

Authors: Ahmedou Moulaye Idriss, Tfeil Yahya, Tamas Ungi, Gabor Fichtinger

Abstract:

Image-guided therapy (IGT) plays a crucial role in minimally invasive procedures for liver interventions. Accurate segmentation of the liver and precise needle placement is essential for successful interventions such as needle aspiration. In this study, we propose a deep learning-based liver 3D slicer designed to enhance segmentation accuracy and facilitate needle aspiration procedures. The developed 3D slicer leverages state-of-the-art convolutional neural networks (CNNs) for automatic liver segmentation in medical images. The CNN model is trained on a diverse dataset of liver images obtained from various imaging modalities, including computed tomography (CT) and magnetic resonance imaging (MRI). The trained model demonstrates robust performance in accurately delineating liver boundaries, even in cases with anatomical variations and pathological conditions. Furthermore, the 3D slicer integrates advanced image registration techniques to ensure accurate alignment of preoperative images with real-time interventional imaging. This alignment enhances the precision of needle placement during aspiration procedures, minimizing the risk of complications and improving overall intervention outcomes. To validate the efficacy of the proposed deep learning-based 3D slicer, a comprehensive evaluation is conducted using a dataset of clinical cases. Quantitative metrics, including the Dice similarity coefficient and Hausdorff distance, are employed to assess the accuracy of liver segmentation. Additionally, the performance of the 3D slicer in guiding needle aspiration procedures is evaluated through simulated and clinical interventions. Preliminary results demonstrate the effectiveness of the developed 3D slicer in achieving accurate liver segmentation and guiding needle aspiration procedures with high precision. The integration of deep learning techniques into the IGT workflow shows great promise for enhancing the efficiency and safety of liver interventions, ultimately contributing to improved patient outcomes.

Keywords: deep learning, liver segmentation, 3D slicer, image guided therapy, needle aspiration

Procedia PDF Downloads 15
3704 Comparison of Existing Predictor and Development of Computational Method for S- Palmitoylation Site Identification in Arabidopsis Thaliana

Authors: Ayesha Sanjana Kawser Parsha

Abstract:

S-acylation is an irreversible bond in which cysteine residues are linked to fatty acids palmitate (74%) or stearate (22%), either at the COOH or NH2 terminal, via a thioester linkage. There are several experimental methods that can be used to identify the S-palmitoylation site; however, since they require a lot of time, computational methods are becoming increasingly necessary. There aren't many predictors, however, that can locate S- palmitoylation sites in Arabidopsis Thaliana with sufficient accuracy. This research is based on the importance of building a better prediction tool. To identify the type of machine learning algorithm that predicts this site more accurately for the experimental dataset, several prediction tools were examined in this research, including the GPS PALM 6.0, pCysMod, GPS LIPID 1.0, CSS PALM 4.0, and NBA PALM. These analyses were conducted by constructing the receiver operating characteristics plot and the area under the curve score. An AI-driven deep learning-based prediction tool has been developed utilizing the analysis and three sequence-based input data, such as the amino acid composition, binary encoding profile, and autocorrelation features. The model was developed using five layers, two activation functions, associated parameters, and hyperparameters. The model was built using various combinations of features, and after training and validation, it performed better when all the features were present while using the experimental dataset for 8 and 10-fold cross-validations. While testing the model with unseen and new data, such as the GPS PALM 6.0 plant and pCysMod mouse, the model performed better, and the area under the curve score was near 1. It can be demonstrated that this model outperforms the prior tools in predicting the S- palmitoylation site in the experimental data set by comparing the area under curve score of 10-fold cross-validation of the new model with the established tools' area under curve score with their respective training sets. The objective of this study is to develop a prediction tool for Arabidopsis Thaliana that is more accurate than current tools, as measured by the area under the curve score. Plant food production and immunological treatment targets can both be managed by utilizing this method to forecast S- palmitoylation sites.

Keywords: S- palmitoylation, ROC PLOT, area under the curve, cross- validation score

Procedia PDF Downloads 46
3703 Developing Primary Care Datasets for a National Asthma Audit

Authors: Rachael Andrews, Viktoria McMillan, Shuaib Nasser, Christopher M. Roberts

Abstract:

Background and objective: The National Review of Asthma Deaths (NRAD) found that asthma management and care was inadequate in 26% of cases reviewed. Major shortfalls identified were adherence to national guidelines and standards and, particularly, the organisation of care, including supervision and monitoring in primary care, with 70% of cases reviewed having at least one avoidable factor in this area. 5.4 million people in the UK are diagnosed with and actively treated for asthma, and approximately 60,000 are admitted to hospital with acute exacerbations each year. The majority of people with asthma receive management and treatment solely in primary care. This has therefore created concern that many people within the UK are receiving sub-optimal asthma care resulting in unnecessary morbidity and risk of adverse outcome. NRAD concluded that a national asthma audit programme should be established to measure and improve processes, organisation, and outcomes of asthma care. Objective: To develop a primary care dataset enabling extraction of information from GP practices in Wales and providing robust data by which results and lessons could be drawn and drive service development and improvement. Methods: A multidisciplinary group of experts, including general practitioners, primary care organisation representatives, and asthma patients was formed and used as a source of governance and guidance. A review of asthma literature, guidance, and standards took place and was used to identify areas of asthma care which, if improved, would lead to better patient outcomes. Modified Delphi methodology was used to gain consensus from the expert group on which of the areas identified were to be prioritised, and an asthma patient and carer focus group held to seek views and feedback on areas of asthma care that were important to them. Areas of asthma care identified by both groups were mapped to asthma guidelines and standards to inform and develop primary and secondary care datasets covering both adult and pediatric care. Dataset development consisted of expert review and a targeted consultation process in order to seek broad stakeholder views and feedback. Results: Areas of asthma care identified as requiring prioritisation by the National Asthma Audit were: (i) Prescribing, (ii) Asthma diagnosis (iii) Asthma Reviews (iv) Personalised Asthma Action Plans (PAAPs) (v) Primary care follow-up after discharge from hospital (vi) Methodologies and primary care queries were developed to cover each of the areas of poor and variable asthma care identified and the queries designed to extract information directly from electronic patients’ records. Conclusion: This paper describes the methodological approach followed to develop primary care datasets for a National Asthma Audit. It sets out the principles behind the establishment of a National Asthma Audit programme in response to a national asthma mortality review and describes the development activities undertaken. Key process elements included: (i) mapping identified areas of poor and variable asthma care to national guidelines and standards, (ii) early engagement of experts, including clinicians and patients in the process, and (iii) targeted consultation of the queries to provide further insight into measures that were collectable, reproducible and relevant.

Keywords: asthma, primary care, general practice, dataset development

Procedia PDF Downloads 143
3702 DEEPMOTILE: Motility Analysis of Human Spermatozoa Using Deep Learning in Sri Lankan Population

Authors: Chamika Chiran Perera, Dananjaya Perera, Chirath Dasanayake, Banuka Athuraliya

Abstract:

Male infertility is a major problem in the world, and it is a neglected and sensitive health issue in Sri Lanka. It can be determined by analyzing human semen samples. Sperm motility is one of many factors that can evaluate male’s fertility potential. In Sri Lanka, this analysis is performed manually. Manual methods are time consuming and depend on the person, but they are reliable and it can depend on the expert. Machine learning and deep learning technologies are currently being investigated to automate the spermatozoa motility analysis, and these methods are unreliable. These automatic methods tend to produce false positive results and false detection. Current automatic methods support different techniques, and some of them are very expensive. Due to the geographical variance in spermatozoa characteristics, current automatic methods are not reliable for motility analysis in Sri Lanka. The suggested system, DeepMotile, is to explore a method to analyze motility of human spermatozoa automatically and present it to the andrology laboratories to overcome current issues. DeepMotile is a novel deep learning method for analyzing spermatozoa motility parameters in the Sri Lankan population. To implement the current approach, Sri Lanka patient data were collected anonymously as a dataset, and glass slides were used as a low-cost technique to analyze semen samples. Current problem was identified as microscopic object detection and tackling the problem. YOLOv5 was customized and used as the object detector, and it achieved 94 % mAP (mean average precision), 86% Precision, and 90% Recall with the gathered dataset. StrongSORT was used as the object tracker, and it was validated with andrology experts due to the unavailability of annotated ground truth data. Furthermore, this research has identified many potential ways for further investigation, and andrology experts can use this system to analyze motility parameters with realistic accuracy.

Keywords: computer vision, deep learning, convolutional neural networks, multi-target tracking, microscopic object detection and tracking, male infertility detection, motility analysis of human spermatozoa

Procedia PDF Downloads 69
3701 Comparative Analysis of Feature Extraction and Classification Techniques

Authors: R. L. Ujjwal, Abhishek Jain

Abstract:

In the field of computer vision, most facial variations such as identity, expression, emotions and gender have been extensively studied. Automatic age estimation has been rarely explored. With age progression of a human, the features of the face changes. This paper is providing a new comparable study of different type of algorithm to feature extraction [Hybrid features using HAAR cascade & HOG features] & classification [KNN & SVM] training dataset. By using these algorithms we are trying to find out one of the best classification algorithms. Same thing we have done on the feature selection part, we extract the feature by using HAAR cascade and HOG. This work will be done in context of age group classification model.

Keywords: computer vision, age group, face detection

Procedia PDF Downloads 331
3700 Predicting Success and Failure in Drug Development Using Text Analysis

Authors: Zhi Hao Chow, Cian Mulligan, Jack Walsh, Antonio Garzon Vico, Dimitar Krastev

Abstract:

Drug development is resource-intensive, time-consuming, and increasingly expensive with each developmental stage. The success rates of drug development are also relatively low, and the resources committed are wasted with each failed candidate. As such, a reliable method of predicting the success of drug development is in demand. The hypothesis was that some examples of failed drug candidates are pushed through developmental pipelines based on false confidence and may possess common linguistic features identifiable through sentiment analysis. Here, the concept of using text analysis to discover such features in research publications and investor reports as predictors of success was explored. R studios were used to perform text mining and lexicon-based sentiment analysis to identify affective phrases and determine their frequency in each document, then using SPSS to determine the relationship between our defined variables and the accuracy of predicting outcomes. A total of 161 publications were collected and categorised into 4 groups: (i) Cancer treatment, (ii) Neurodegenerative disease treatment, (iii) Vaccines, and (iv) Others (containing all other drugs that do not fit into the 3 categories). Text analysis was then performed on each document using 2 separate datasets (BING and AFINN) in R within the category of drugs to determine the frequency of positive or negative phrases in each document. A relative positivity and negativity value were then calculated by dividing the frequency of phrases with the word count of each document. Regression analysis was then performed with SPSS statistical software on each dataset (values from using BING or AFINN dataset during text analysis) using a random selection of 61 documents to construct a model. The remaining documents were then used to determine the predictive power of the models. Model constructed from BING predicts the outcome of drug performance in clinical trials with an overall percentage of 65.3%. AFINN model had a lower accuracy at predicting outcomes compared to the BING model at 62.5% but was not effective at predicting the failure of drugs in clinical trials. Overall, the study did not show significant efficacy of the model at predicting outcomes of drugs in development. Many improvements may need to be made to later iterations of the model to sufficiently increase the accuracy.

Keywords: data analysis, drug development, sentiment analysis, text-mining

Procedia PDF Downloads 124
3699 Drug-Drug Interaction Prediction in Diabetes Mellitus

Authors: Rashini Maduka, C. R. Wijesinghe, A. R. Weerasinghe

Abstract:

Drug-drug interactions (DDIs) can happen when two or more drugs are taken together. Today DDIs have become a serious health issue due to adverse drug effects. In vivo and in vitro methods for identifying DDIs are time-consuming and costly. Therefore, in-silico-based approaches are preferred in DDI identification. Most machine learning models for DDI prediction are used chemical and biological drug properties as features. However, some drug features are not available and costly to extract. Therefore, it is better to make automatic feature engineering. Furthermore, people who have diabetes already suffer from other diseases and take more than one medicine together. Then adverse drug effects may happen to diabetic patients and cause unpleasant reactions in the body. In this study, we present a model with a graph convolutional autoencoder and a graph decoder using a dataset from DrugBank version 5.1.3. The main objective of the model is to identify unknown interactions between antidiabetic drugs and the drugs taken by diabetic patients for other diseases. We considered automatic feature engineering and used Known DDIs only as the input for the model. Our model has achieved 0.86 in AUC and 0.86 in AP.

Keywords: drug-drug interaction prediction, graph embedding, graph convolutional networks, adverse drug effects

Procedia PDF Downloads 68
3698 Data Model to Predict Customize Skin Care Product Using Biosensor

Authors: Ashi Gautam, Isha Shukla, Akhil Seghal

Abstract:

Biosensors are analytical devices that use a biological sensing element to detect and measure a specific chemical substance or biomolecule in a sample. These devices are widely used in various fields, including medical diagnostics, environmental monitoring, and food analysis, due to their high specificity, sensitivity, and selectivity. In this research paper, a machine learning model is proposed for predicting the suitability of skin care products based on biosensor readings. The proposed model takes in features extracted from biosensor readings, such as biomarker concentration, skin hydration level, inflammation presence, sensitivity, and free radicals, and outputs the most appropriate skin care product for an individual. This model is trained on a dataset of biosensor readings and corresponding skin care product information. The model's performance is evaluated using several metrics, including accuracy, precision, recall, and F1 score. The aim of this research is to develop a personalised skin care product recommendation system using biosensor data. By leveraging the power of machine learning, the proposed model can accurately predict the most suitable skin care product for an individual based on their biosensor readings. This is particularly useful in the skin care industry, where personalised recommendations can lead to better outcomes for consumers. The developed model is based on supervised learning, which means that it is trained on a labeled dataset of biosensor readings and corresponding skin care product information. The model uses these labeled data to learn patterns and relationships between the biosensor readings and skin care products. Once trained, the model can predict the most suitable skin care product for an individual based on their biosensor readings. The results of this study show that the proposed machine learning model can accurately predict the most appropriate skin care product for an individual based on their biosensor readings. The evaluation metrics used in this study demonstrate the effectiveness of the model in predicting skin care products. This model has significant potential for practical use in the skin care industry for personalised skin care product recommendations. The proposed machine learning model for predicting the suitability of skin care products based on biosensor readings is a promising development in the skin care industry. The model's ability to accurately predict the most appropriate skin care product for an individual based on their biosensor readings can lead to better outcomes for consumers. Further research can be done to improve the model's accuracy and effectiveness.

Keywords: biosensors, data model, machine learning, skin care

Procedia PDF Downloads 48
3697 Conversational Assistive Technology of Visually Impaired Person for Social Interaction

Authors: Komal Ghafoor, Tauqir Ahmad, Murtaza Hanif, Hira Zaheer

Abstract:

Assistive technology has been developed to support visually impaired people in their social interactions. Conversation assistive technology is designed to enhance communication skills, facilitate social interaction, and improve the quality of life of visually impaired individuals. This technology includes speech recognition, text-to-speech features, and other communication devices that enable users to communicate with others in real time. The technology uses natural language processing and machine learning algorithms to analyze spoken language and provide appropriate responses. It also includes features such as voice commands and audio feedback to provide users with a more immersive experience. These technologies have been shown to increase the confidence and independence of visually impaired individuals in social situations and have the potential to improve their social skills and relationships with others. Overall, conversation-assistive technology is a promising tool for empowering visually impaired people and improving their social interactions. One of the key benefits of conversation-assistive technology is that it allows visually impaired individuals to overcome communication barriers that they may face in social situations. It can help them to communicate more effectively with friends, family, and colleagues, as well as strangers in public spaces. By providing a more seamless and natural way to communicate, this technology can help to reduce feelings of isolation and improve overall quality of life. The main objective of this research is to give blind users the capability to move around in unfamiliar environments through a user-friendly device by face, object, and activity recognition system. This model evaluates the accuracy of activity recognition. This device captures the front view of the blind, detects the objects, recognizes the activities, and answers the blind query. It is implemented using the front view of the camera. The local dataset is collected that includes different 1st-person human activities. The results obtained are the identification of the activities that the VGG-16 model was trained on, where Hugging, Shaking Hands, Talking, Walking, Waving video, etc.

Keywords: dataset, visually impaired person, natural language process, human activity recognition

Procedia PDF Downloads 31
3696 Analysis of Different Classification Techniques Using WEKA for Diabetic Disease

Authors: Usama Ahmed

Abstract:

Data mining is the process of analyze data which are used to predict helpful information. It is the field of research which solve various type of problem. In data mining, classification is an important technique to classify different kind of data. Diabetes is most common disease. This paper implements different classification technique using Waikato Environment for Knowledge Analysis (WEKA) on diabetes dataset and find which algorithm is suitable for working. The best classification algorithm based on diabetic data is Naïve Bayes. The accuracy of Naïve Bayes is 76.31% and take 0.06 seconds to build the model.

Keywords: data mining, classification, diabetes, WEKA

Procedia PDF Downloads 121
3695 Performance Comparison of Cooperative Banks in the EU, USA and Canada

Authors: Matěj Kuc

Abstract:

This paper compares different types of profitability measures of cooperative banks from two developed regions: the European Union and the United States of America together with Canada. We created balanced dataset of more than 200 cooperative banks covering 2011-2016 period. We made series of tests and run Random Effects estimation on panel data. We found that American and Canadian cooperatives are more profitable in terms of return on assets (ROA) and return on equity (ROE). There is no significant difference in net interest margin (NIM). Our results show that the North American cooperative banks accommodated better to the current market environment.

Keywords: cooperative banking, panel data, profitability measures, random effects

Procedia PDF Downloads 93
3694 Automated Localization of Palpebral Conjunctiva and Hemoglobin Determination Using Smart Phone Camera

Authors: Faraz Tahir, M. Usman Akram, Albab Ahmad Khan, Mujahid Abbass, Ahmad Tariq, Nuzhat Qaiser

Abstract:

The objective of this study was to evaluate the Degree of anemia by taking the picture of the palpebral conjunctiva using Smartphone Camera. We have first localized the region of interest from the image and then extracted certain features from that Region of interest and trained SVM classifier on those features and then, as a result, our system classifies the image in real-time on their level of hemoglobin. The proposed system has given an accuracy of 70%. We have trained our classifier on a locally gathered dataset of 30 patients.

Keywords: anemia, palpebral conjunctiva, SVM, smartphone

Procedia PDF Downloads 455
3693 AutoML: Comprehensive Review and Application to Engineering Datasets

Authors: Parsa Mahdavi, M. Amin Hariri-Ardebili

Abstract:

The development of accurate machine learning and deep learning models traditionally demands hands-on expertise and a solid background to fine-tune hyperparameters. With the continuous expansion of datasets in various scientific and engineering domains, researchers increasingly turn to machine learning methods to unveil hidden insights that may elude classic regression techniques. This surge in adoption raises concerns about the adequacy of the resultant meta-models and, consequently, the interpretation of the findings. In response to these challenges, automated machine learning (AutoML) emerges as a promising solution, aiming to construct machine learning models with minimal intervention or guidance from human experts. AutoML encompasses crucial stages such as data preparation, feature engineering, hyperparameter optimization, and neural architecture search. This paper provides a comprehensive overview of the principles underpinning AutoML, surveying several widely-used AutoML platforms. Additionally, the paper offers a glimpse into the application of AutoML on various engineering datasets. By comparing these results with those obtained through classical machine learning methods, the paper quantifies the uncertainties inherent in the application of a single ML model versus the holistic approach provided by AutoML. These examples showcase the efficacy of AutoML in extracting meaningful patterns and insights, emphasizing its potential to revolutionize the way we approach and analyze complex datasets.

Keywords: automated machine learning, uncertainty, engineering dataset, regression

Procedia PDF Downloads 34
3692 A Mean–Variance–Skewness Portfolio Optimization Model

Authors: Kostas Metaxiotis

Abstract:

Portfolio optimization is one of the most important topics in finance. This paper proposes a mean–variance–skewness (MVS) portfolio optimization model. Traditionally, the portfolio optimization problem is solved by using the mean–variance (MV) framework. In this study, we formulate the proposed model as a three-objective optimization problem, where the portfolio's expected return and skewness are maximized whereas the portfolio risk is minimized. For solving the proposed three-objective portfolio optimization model we apply an adapted version of the non-dominated sorting genetic algorithm (NSGAII). Finally, we use a real dataset from FTSE-100 for validating the proposed model.

Keywords: evolutionary algorithms, portfolio optimization, skewness, stock selection

Procedia PDF Downloads 153
3691 Prediction of Mental Health: Heuristic Subjective Well-Being Model on Perceived Stress Scale

Authors: Ahmet Karakuş, Akif Can Kilic, Emre Alptekin

Abstract:

A growing number of studies have been conducted to determine how well-being may be predicted using well-designed models. It is necessary to investigate the backgrounds of features in order to construct a viable Subjective Well-Being (SWB) model. We have picked the suitable variables from the literature on SWB that are acceptable for real-world data instructions. The goal of this work is to evaluate the model by feeding it with SWB characteristics and then categorizing the stress levels using machine learning methods to see how well it performs on a real dataset. Despite the fact that it is a multiclass classification issue, we have achieved significant metric scores, which may be taken into account for a specific task.

Keywords: machine learning, multiclassification problem, subjective well-being, perceived stress scale

Procedia PDF Downloads 93
3690 Analysis of Brownfield Soil Contamination Using Local Government Planning Data

Authors: Emma E. Hellawell, Susan J. Hughes

Abstract:

BBrownfield sites are currently being redeveloped for residential use. Information on soil contamination on these former industrial sites is collected as part of the planning process by the local government. This research project analyses this untapped resource of environmental data, using site investigation data submitted to a local Borough Council, in Surrey, UK. Over 150 site investigation reports were collected and interrogated to extract relevant information. This study involved three phases. Phase 1 was the development of a database for soil contamination information from local government reports. This database contained information on the source, history, and quality of the data together with the chemical information on the soil that was sampled. Phase 2 involved obtaining site investigation reports for development within the study area and extracting the required information for the database. Phase 3 was the data analysis and interpretation of key contaminants to evaluate typical levels of contaminants, their distribution within the study area, and relating these results to current guideline levels of risk for future site users. Preliminary results for a pilot study using a sample of the dataset have been obtained. This pilot study showed there is some inconsistency in the quality of the reports and measured data, and careful interpretation of the data is required. Analysis of the information has found high levels of lead in shallow soil samples, with mean and median levels exceeding the current guidance for residential use. The data also showed elevated (but below guidance) levels of potentially carcinogenic polyaromatic hydrocarbons. Of particular concern from the data was the high detection rate for asbestos fibers. These were found at low concentrations in 25% of the soil samples tested (however, the sample set was small). Contamination levels of the remaining chemicals tested were all below the guidance level for residential site use. These preliminary pilot study results will be expanded, and results for the whole local government area will be presented at the conference. The pilot study has demonstrated the potential for this extensive dataset to provide greater information on local contamination levels. This can help inform regulators and developers and lead to more targeted site investigations, improving risk assessments, and brownfield development.

Keywords: Brownfield development, contaminated land, local government planning data, site investigation

Procedia PDF Downloads 109
3689 Wireless Sensor Anomaly Detection Using Soft Computing

Authors: Mouhammd Alkasassbeh, Alaa Lasasmeh

Abstract:

We live in an era of rapid development as a result of significant scientific growth. Like other technologies, wireless sensor networks (WSNs) are playing one of the main roles. Based on WSNs, ZigBee adds many features to devices, such as minimum cost and power consumption, and increasing the range and connect ability of sensor nodes. ZigBee technology has come to be used in various fields, including science, engineering, and networks, and even in medicinal aspects of intelligence building. In this work, we generated two main datasets, the first being based on tree topology and the second on star topology. The datasets were evaluated by three machine learning (ML) algorithms: J48, meta.j48 and multilayer perceptron (MLP). Each topology was classified into normal and abnormal (attack) network traffic. The dataset used in our work contained simulated data from network simulation 2 (NS2). In each database, the Bayesian network meta.j48 classifier achieved the highest accuracy level among other classifiers, of 99.7% and 99.2% respectively.

Keywords: IDS, Machine learning, WSN, ZigBee technology

Procedia PDF Downloads 516
3688 An Improvement of ComiR Algorithm for MicroRNA Target Prediction by Exploiting Coding Region Sequences of mRNAs

Authors: Giorgio Bertolazzi, Panayiotis Benos, Michele Tumminello, Claudia Coronnello

Abstract:

MicroRNAs are small non-coding RNAs that post-transcriptionally regulate the expression levels of messenger RNAs. MicroRNA regulation activity depends on the recognition of binding sites located on mRNA molecules. ComiR (Combinatorial miRNA targeting) is a user friendly web tool realized to predict the targets of a set of microRNAs, starting from their expression profile. ComiR incorporates miRNA expression in a thermodynamic binding model, and it associates each gene with the probability of being a target of a set of miRNAs. ComiR algorithms were trained with the information regarding binding sites in the 3’UTR region, by using a reliable dataset containing the targets of endogenously expressed microRNA in D. melanogaster S2 cells. This dataset was obtained by comparing the results from two different experimental approaches, i.e., inhibition, and immunoprecipitation of the AGO1 protein; this protein is a component of the microRNA induced silencing complex. In this work, we tested whether including coding region binding sites in the ComiR algorithm improves the performance of the tool in predicting microRNA targets. We focused the analysis on the D. melanogaster species and updated the ComiR underlying database with the currently available releases of mRNA and microRNA sequences. As a result, we find that the ComiR algorithm trained with the information related to the coding regions is more efficient in predicting the microRNA targets, with respect to the algorithm trained with 3’utr information. On the other hand, we show that 3’utr based predictions can be seen as complementary to the coding region based predictions, which suggests that both predictions, from 3'UTR and coding regions, should be considered in a comprehensive analysis. Furthermore, we observed that the lists of targets obtained by analyzing data from one experimental approach only, that is, inhibition or immunoprecipitation of AGO1, are not reliable enough to test the performance of our microRNA target prediction algorithm. Further analysis will be conducted to investigate the effectiveness of the tool with data from other species, provided that validated datasets, as obtained from the comparison of RISC proteins inhibition and immunoprecipitation experiments, will be available for the same samples. Finally, we propose to upgrade the existing ComiR web-tool by including the coding region based trained model, available together with the 3’UTR based one.

Keywords: AGO1, coding region, Drosophila melanogaster, microRNA target prediction

Procedia PDF Downloads 412
3687 Leveraging Unannotated Data to Improve Question Answering for French Contract Analysis

Authors: Touila Ahmed, Elie Louis, Hamza Gharbi

Abstract:

State of the art question answering models have recently shown impressive performance especially in a zero-shot setting. This approach is particularly useful when confronted with a highly diverse domain such as the legal field, in which it is increasingly difficult to have a dataset covering every notion and concept. In this work, we propose a flexible generative question answering approach to contract analysis as well as a weakly supervised procedure to leverage unannotated data and boost our models’ performance in general, and their zero-shot performance in particular.

Keywords: question answering, contract analysis, zero-shot, natural language processing, generative models, self-supervision

Procedia PDF Downloads 145
3686 Talent-to-Vec: Using Network Graphs to Validate Models with Data Sparsity

Authors: Shaan Khosla, Jon Krohn

Abstract:

In a recruiting context, machine learning models are valuable for recommendations: to predict the best candidates for a vacancy, to match the best vacancies for a candidate, and compile a set of similar candidates for any given candidate. While useful to create these models, validating their accuracy in a recommendation context is difficult due to a sparsity of data. In this report, we use network graph data to generate useful representations for candidates and vacancies. We use candidates and vacancies as network nodes and designate a bi-directional link between them based on the candidate interviewing for the vacancy. After using node2vec, the embeddings are used to construct a validation dataset with a ranked order, which will help validate new recommender systems.

Keywords: AI, machine learning, NLP, recruiting

Procedia PDF Downloads 58
3685 Application of a New Efficient Normal Parameter Reduction Algorithm of Soft Sets in Online Shopping

Authors: Xiuqin Ma, Hongwu Qin

Abstract:

A new efficient normal parameter reduction algorithm of soft set in decision making was proposed. However, up to the present, few documents have focused on real-life applications of this algorithm. Accordingly, we apply a New Efficient Normal Parameter Reduction algorithm into real-life datasets of online shopping, such as Blackberry Mobile Phone Dataset. Experimental results show that this algorithm is not only suitable but feasible for dealing with the online shopping.

Keywords: soft sets, parameter reduction, normal parameter reduction, online shopping

Procedia PDF Downloads 485
3684 Evaluating the Effects of a Positive Bitcoin Shock on the U.S Economy: A TVP-FAVAR Model with Stochastic Volatility

Authors: Olfa Kaabia, Ilyes Abid, Khaled Guesmi

Abstract:

This pioneer paper studies whether and how Bitcoin shocks are transmitted to the U.S economy. We employ a new methodology: TVP FAVAR model with stochastic volatility. We use a large dataset of 111 major U.S variables from 1959:m1 to 2016:m12. The results show that Bitcoin shocks significantly impact the U.S. economy. This significant impact is pronounced in a volatile and increasing U.S economy. The Bitcoin has a positive relationship on the U.S real activity, and a negative one on U.S prices and interest rates. Effects on the Monetary Policy exist via the inter-est rates and the Money, Credit and Finance transmission channels.

Keywords: bitcoin, US economy, FAVAR models, stochastic volatility

Procedia PDF Downloads 220
3683 Intelligent Prediction System for Diagnosis of Heart Attack

Authors: Oluwaponmile David Alao

Abstract:

Due to an increase in the death rate as a result of heart attack. There is need to develop a system that can be useful in the diagnosis of the disease at the medical centre. This system will help in preventing misdiagnosis that may occur from the medical practitioner or the physicians. In this research work, heart disease dataset obtained from UCI repository has been used to develop an intelligent prediction diagnosis system. The system is modeled on a feedforwad neural network and trained with back propagation neural network. A recognition rate of 86% is obtained from the testing of the network.

Keywords: heart disease, artificial neural network, diagnosis, prediction system

Procedia PDF Downloads 421
3682 Do the Health Benefits of Oil-Led Economic Development Outweigh the Potential Health Harms from Environmental Pollution in Nigeria?

Authors: Marian Emmanuel Okon

Abstract:

Introduction: The Niger Delta region of Nigeria has a vast reserve of oil and gas, which has globally positioned the nation as the sixth largest exporter of crude oil. Production rapidly rose following oil discovery. In most oil producing nations of the world, the wealth generated from oil production and export has propelled economic advancement, enabling the development of industries and other relevant infrastructures. Therefore, it can be assumed that majority of the oil resource such as Nigeria’s, has the potential to improve the health of the population via job creation and derived revenues. However, the health benefits of this economic development might be offset by the environmental consequences of oil exploitation and production. Objective: This research aims to evaluate the balance between the health benefits of oil-led economic development and harmful environmental consequences of crude oil exploitation in Nigeria. Study Design: A pathway has been designed to guide data search and this study. The model created will assess the relationship between oil-led economic development and population health development via job creation, improvement of education, development of infrastructure and other forms of development as well as through harmful environmental consequences from oil activities. Data/Emerging Findings: Diverse potentially suitable datasets which are at different geographical scales have been identified, obtained or applied for and the dataset from the World Bank has been the most thoroughly explored. This large dataset contains information that would enable the longitudinal assessment of both the health benefits and harms from oil exploitation in Nigeria as well as identify the disparities that exist between the communities, states and regions. However, these data do not extend far back enough in time to capture the start of crude oil production. Thus, it is possible that the maximum economic benefits and health harms could be missed. To deal with this shortcoming, the potential for a comparative study with countries like United Kingdom, Morocco and Cote D’ivoire has also been taken into consideration, so as to evaluate the differences between these countries as well as identify the areas of improvement in Nigeria’s environmental and health policies. Notwithstanding, these data have shown some differences in each country’s economic, environmental and health state over time as well as a corresponding summary statistics. Conclusion: In theory, the beneficial effects of oil exploitation to the health of the population may be substantial as large swaths of the ‘wider determinants’ of population heath are influenced by the wealth of a nation. However, if uncontrolled, the consequences from environmental pollution and degradation may outweigh these benefits. Thus, there is a need to address this, in order to improve environmental and population health in Nigeria.

Keywords: environmental pollution, health benefits, oil-led economic development, petroleum exploitation

Procedia PDF Downloads 300
3681 Empirical Study of Partitions Similarity Measures

Authors: Abdelkrim Alfalah, Lahcen Ouarbya, John Howroyd

Abstract:

This paper investigates and compares the performance of four existing distances and similarity measures between partitions. The partition measures considered are Rand Index (RI), Adjusted Rand Index (ARI), Variation of Information (VI), and Normalised Variation of Information (NVI). This work investigates the ability of these partition measures to capture three predefined intuitions: the variation within randomly generated partitions, the sensitivity to small perturbations, and finally the independence from the dataset scale. It has been shown that the Adjusted Rand Index performed well overall, with regards to these three intuitions.

Keywords: clustering, comparing partitions, similarity measure, partition distance, partition metric, similarity between partitions, clustering comparison.

Procedia PDF Downloads 148
3680 Causal Relationship between Corporate Governance and Financial Information Transparency: A Simultaneous Equations Approach

Authors: Maali Kachouri, Anis Jarboui

Abstract:

We focus on the causal relationship between governance and information transparency as well as interrelation among the various governance mechanisms. This paper employs a simultaneous equations approach to show this relationship in the Tunisian context. Based on an 8-year dataset, our sample covers 28 listed companies over 2006-2013. Our findings suggest that internal and external governance mechanisms are interdependent. Moreover, in order to analyze the causal effect between information transparency and governance mechanisms, we found evidence that information transparency tends to increase good corporate governance practices.

Keywords: simultaneous equations approach, transparency, causal relationship, corporate governance

Procedia PDF Downloads 327
3679 Deep Learning to Enhance Mathematics Education for Secondary Students in Sri Lanka

Authors: Selvavinayagan Babiharan

Abstract:

This research aims to develop a deep learning platform to enhance mathematics education for secondary students in Sri Lanka. The platform will be designed to incorporate interactive and user-friendly features to engage students in active learning and promote their mathematical skills. The proposed platform will be developed using TensorFlow and Keras, two widely used deep learning frameworks. The system will be trained on a large dataset of math problems, which will be collected from Sri Lankan school curricula. The results of this research will contribute to the improvement of mathematics education in Sri Lanka and provide a valuable tool for teachers to enhance the learning experience of their students.

Keywords: information technology, education, machine learning, mathematics

Procedia PDF Downloads 56
3678 Proposed Anticipating Learning Classifier System for Cloud Intrusion Detection (ALCS-CID)

Authors: Wafa' Slaibi Alsharafat

Abstract:

Cloud computing is a modern approach in network environment. According to increased number of network users and online systems, there is a need to help these systems to be away from unauthorized resource access and detect any attempts for privacy contravention. For that purpose, Intrusion Detection System is an effective security mechanism to detect any attempts of attacks for cloud resources and their information. In this paper, Cloud Intrusion Detection System has been proposed in term of reducing or eliminating any attacks. This model concerns about achieving high detection rate after conducting a set of experiments using benchmarks dataset called KDD'99.

Keywords: IDS, cloud computing, anticipating classifier system, intrusion detection

Procedia PDF Downloads 446
3677 Building and Tree Detection Using Multiscale Matched Filtering

Authors: Abdullah H. Özcan, Dilara Hisar, Yetkin Sayar, Cem Ünsalan

Abstract:

In this study, an automated building and tree detection method is proposed using DSM data and true orthophoto image. A multiscale matched filtering is used on DSM data. Therefore, first watershed transform is applied. Then, Otsu’s thresholding method is used as an adaptive threshold to segment each watershed region. Detected objects are masked with NDVI to separate buildings and trees. The proposed method is able to detect buildings and trees without entering any elevation threshold. We tested our method on ISPRS semantic labeling dataset and obtained promising results.

Keywords: building detection, local maximum filtering, matched filtering, multiscale

Procedia PDF Downloads 295
3676 Urdu Text Extraction Method from Images

Authors: Samabia Tehsin, Sumaira Kausar

Abstract:

Due to the vast increase in the multimedia data in recent years, efficient and robust retrieval techniques are needed to retrieve and index images/ videos. Text embedded in the images can serve as the strong retrieval tool for images. This is the reason that text extraction is an area of research with increasing attention. English text extraction is the focus of many researchers but very less work has been done on other languages like Urdu. This paper is focusing on Urdu text extraction from video frames. This paper presents a text detection feature set, which has the ability to deal up with most of the problems connected with the text extraction process. To test the validity of the method, it is tested on Urdu news dataset, which gives promising results.

Keywords: caption text, content-based image retrieval, document analysis, text extraction

Procedia PDF Downloads 481