Search results for: imbalanced datasets
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 746

Search results for: imbalanced datasets

266 Complete Ensemble Empirical Mode Decomposition with Adaptive Noise Temporal Convolutional Network for Remaining Useful Life Prediction of Lithium Ion Batteries

Authors: Jing Zhao, Dayong Liu, Shihao Wang, Xinghua Zhu, Delong Li

Abstract:

Uhumanned Underwater Vehicles generally operate in the deep sea, which has its own unique working conditions. Lithium-ion power batteries should have the necessary stability and endurance for use as an underwater vehicle’s power source. Therefore, it is essential to accurately forecast how long lithium-ion batteries will last in order to maintain the system’s reliability and safety. In order to model and forecast lithium battery Remaining Useful Life (RUL), this research suggests a model based on Complete Ensemble Empirical Mode Decomposition with Adaptive noise-Temporal Convolutional Net (CEEMDAN-TCN). In this study, two datasets, NASA and CALCE, which have a specific gap in capacity data fluctuation, are used to verify the model and examine the experimental results in order to demonstrate the generalizability of the concept. The experiments demonstrate the network structure’s strong universality and ability to achieve good fitting outcomes on the test set for various battery dataset types. The evaluation metrics reveal that the CEEMDAN-TCN prediction performance of TCN is 25% to 35% better than that of a single neural network, proving that feature expansion and modal decomposition can both enhance the model’s generalizability and be extremely useful in industrial settings.

Keywords: lithium-ion battery, remaining useful life, complete EEMD with adaptive noise, temporal convolutional net

Procedia PDF Downloads 114
265 Vision-Based Daily Routine Recognition for Healthcare with Transfer Learning

Authors: Bruce X. B. Yu, Yan Liu, Keith C. C. Chan

Abstract:

We propose to record Activities of Daily Living (ADLs) of elderly people using a vision-based system so as to provide better assistive and personalization technologies. Current ADL-related research is based on data collected with help from non-elderly subjects in laboratory environments and the activities performed are predetermined for the sole purpose of data collection. To obtain more realistic datasets for the application, we recorded ADLs for the elderly with data collected from real-world environment involving real elderly subjects. Motivated by the need to collect data for more effective research related to elderly care, we chose to collect data in the room of an elderly person. Specifically, we installed Kinect, a vision-based sensor on the ceiling, to capture the activities that the elderly subject performs in the morning every day. Based on the data, we identified 12 morning activities that the elderly person performs daily. To recognize these activities, we created a HARELCARE framework to investigate into the effectiveness of existing Human Activity Recognition (HAR) algorithms and propose the use of a transfer learning algorithm for HAR. We compared the performance, in terms of accuracy, and training progress. Although the collected dataset is relatively small, the proposed algorithm has a good potential to be applied to all daily routine activities for healthcare purposes such as evidence-based diagnosis and treatment.

Keywords: daily activity recognition, healthcare, IoT sensors, transfer learning

Procedia PDF Downloads 110
264 A Case Study for User Rating Prediction on Automobile Recommendation System Using Mapreduce

Authors: Jiao Sun, Li Pan, Shijun Liu

Abstract:

Recommender systems have been widely used in contemporary industry, and plenty of work has been done in this field to help users to identify items of interest. Collaborative Filtering (CF, for short) algorithm is an important technology in recommender systems. However, less work has been done in automobile recommendation system with the sharp increase of the amount of automobiles. What’s more, the computational speed is a major weakness for collaborative filtering technology. Therefore, using MapReduce framework to optimize the CF algorithm is a vital solution to this performance problem. In this paper, we present a recommendation of the users’ comment on industrial automobiles with various properties based on real world industrial datasets of user-automobile comment data collection, and provide recommendation for automobile providers and help them predict users’ comment on automobiles with new-coming property. Firstly, we solve the sparseness of matrix using previous construction of score matrix. Secondly, we solve the data normalization problem by removing dimensional effects from the raw data of automobiles, where different dimensions of automobile properties bring great error to the calculation of CF. Finally, we use the MapReduce framework to optimize the CF algorithm, and the computational speed has been improved times. UV decomposition used in this paper is an often used matrix factorization technology in CF algorithm, without calculating the interpolation weight of neighbors, which will be more convenient in industry.

Keywords: collaborative filtering, recommendation, data normalization, mapreduce

Procedia PDF Downloads 187
263 Identification of Hepatocellular Carcinoma Using Supervised Learning Algorithms

Authors: Sagri Sharma

Abstract:

Analysis of diseases integrating multi-factors increases the complexity of the problem and therefore, development of frameworks for the analysis of diseases is an issue that is currently a topic of intense research. Due to the inter-dependence of the various parameters, the use of traditional methodologies has not been very effective. Consequently, newer methodologies are being sought to deal with the problem. Supervised Learning Algorithms are commonly used for performing the prediction on previously unseen data. These algorithms are commonly used for applications in fields ranging from image analysis to protein structure and function prediction and they get trained using a known dataset to come up with a predictor model that generates reasonable predictions for the response to new data. Gene expression profiles generated by DNA analysis experiments can be quite complex since these experiments can involve hypotheses involving entire genomes. The application of well-known machine learning algorithm - Support Vector Machine - to analyze the expression levels of thousands of genes simultaneously in a timely, automated and cost effective way is thus used. The objectives to undertake the presented work are development of a methodology to identify genes relevant to Hepatocellular Carcinoma (HCC) from gene expression dataset utilizing supervised learning algorithms and statistical evaluations along with development of a predictive framework that can perform classification tasks on new, unseen data.

Keywords: artificial intelligence, biomarker, gene expression datasets, hepatocellular carcinoma, machine learning, supervised learning algorithms, support vector machine

Procedia PDF Downloads 403
262 Autism Disease Detection Using Transfer Learning Techniques: Performance Comparison between Central Processing Unit vs. Graphics Processing Unit Functions for Neural Networks

Authors: Mst Shapna Akter, Hossain Shahriar

Abstract:

Neural network approaches are machine learning methods used in many domains, such as healthcare and cyber security. Neural networks are mostly known for dealing with image datasets. While training with the images, several fundamental mathematical operations are carried out in the Neural Network. The operation includes a number of algebraic and mathematical functions, including derivative, convolution, and matrix inversion and transposition. Such operations require higher processing power than is typically needed for computer usage. Central Processing Unit (CPU) is not appropriate for a large image size of the dataset as it is built with serial processing. While Graphics Processing Unit (GPU) has parallel processing capabilities and, therefore, has higher speed. This paper uses advanced Neural Network techniques such as VGG16, Resnet50, Densenet, Inceptionv3, Xception, Mobilenet, XGBOOST-VGG16, and our proposed models to compare CPU and GPU resources. A system for classifying autism disease using face images of an autistic and non-autistic child was used to compare performance during testing. We used evaluation matrices such as Accuracy, F1 score, Precision, Recall, and Execution time. It has been observed that GPU runs faster than the CPU in all tests performed. Moreover, the performance of the Neural Network models in terms of accuracy increases on GPU compared to CPU.

Keywords: autism disease, neural network, CPU, GPU, transfer learning

Procedia PDF Downloads 86
261 Traffic Prediction with Raw Data Utilization and Context Building

Authors: Zhou Yang, Heli Sun, Jianbin Huang, Jizhong Zhao, Shaojie Qiao

Abstract:

Traffic prediction is essential in a multitude of ways in modern urban life. The researchers of earlier work in this domain carry out the investigation chiefly with two major focuses: (1) the accurate forecast of future values in multiple time series and (2) knowledge extraction from spatial-temporal correlations. However, two key considerations for traffic prediction are often missed: the completeness of raw data and the full context of the prediction timestamp. Concentrating on the two drawbacks of earlier work, we devise an approach that can address these issues in a two-phase framework. First, we utilize the raw trajectories to a greater extent through building a VLA table and data compression. We obtain the intra-trajectory features with graph-based encoding and the intertrajectory ones with a grid-based model and the technique of back projection that restore their surrounding high-resolution spatial-temporal environment. To the best of our knowledge, we are the first to study direct feature extraction from raw trajectories for traffic prediction and attempt the use of raw data with the least degree of reduction. In the prediction phase, we provide a broader context for the prediction timestamp by taking into account the information that are around it in the training dataset. Extensive experiments on several well-known datasets have verified the effectiveness of our solution that combines the strength of raw trajectory data and prediction context. In terms of performance, our approach surpasses several state-of-the-art methods for traffic prediction.

Keywords: traffic prediction, raw data utilization, context building, data reduction

Procedia PDF Downloads 99
260 Multi-Level Air Quality Classification in China Using Information Gain and Support Vector Machine

Authors: Bingchun Liu, Pei-Chann Chang, Natasha Huang, Dun Li

Abstract:

Machine Learning and Data Mining are the two important tools for extracting useful information and knowledge from large datasets. In machine learning, classification is a wildly used technique to predict qualitative variables and is generally preferred over regression from an operational point of view. Due to the enormous increase in air pollution in various countries especially China, Air Quality Classification has become one of the most important topics in air quality research and modelling. This study aims at introducing a hybrid classification model based on information theory and Support Vector Machine (SVM) using the air quality data of four cities in China namely Beijing, Guangzhou, Shanghai and Tianjin from Jan 1, 2014 to April 30, 2016. China's Ministry of Environmental Protection has classified the daily air quality into 6 levels namely Serious Pollution, Severe Pollution, Moderate Pollution, Light Pollution, Good and Excellent based on their respective Air Quality Index (AQI) values. Using the information theory, information gain (IG) is calculated and feature selection is done for both categorical features and continuous numeric features. Then SVM Machine Learning algorithm is implemented on the selected features with cross-validation. The final evaluation reveals that the IG and SVM hybrid model performs better than SVM (alone), Artificial Neural Network (ANN) and K-Nearest Neighbours (KNN) models in terms of accuracy as well as complexity.

Keywords: machine learning, air quality classification, air quality index, information gain, support vector machine, cross-validation

Procedia PDF Downloads 202
259 In-Context Meta Learning for Automatic Designing Pretext Tasks for Self-Supervised Image Analysis

Authors: Toktam Khatibi

Abstract:

Self-supervised learning (SSL) includes machine learning models that are trained on one aspect and/or one part of the input to learn other aspects and/or part of it. SSL models are divided into two different categories, including pre-text task-based models and contrastive learning ones. Pre-text tasks are some auxiliary tasks learning pseudo-labels, and the trained models are further fine-tuned for downstream tasks. However, one important disadvantage of SSL using pre-text task solving is defining an appropriate pre-text task for each image dataset with a variety of image modalities. Therefore, it is required to design an appropriate pretext task automatically for each dataset and each downstream task. To the best of our knowledge, the automatic designing of pretext tasks for image analysis has not been considered yet. In this paper, we present a framework based on In-context learning that describes each task based on its input and output data using a pre-trained image transformer. Our proposed method combines the input image and its learned description for optimizing the pre-text task design and its hyper-parameters using Meta-learning models. The representations learned from the pre-text tasks are fine-tuned for solving the downstream tasks. We demonstrate that our proposed framework outperforms the compared ones on unseen tasks and image modalities in addition to its superior performance for previously known tasks and datasets.

Keywords: in-context learning (ICL), meta learning, self-supervised learning (SSL), vision-language domain, transformers

Procedia PDF Downloads 50
258 Carbon Skimming: Towards an Application to Summarise and Compare Embodied Carbon to Aid Early-Stage Decision Making

Authors: Rivindu Nethmin Bandara Menik Hitihamy Mudiyanselage, Matthias Hank Haeusler, Ben Doherty

Abstract:

Investors and clients in the Architectural, Engineering and Construction industry find it difficult to understand complex datasets and reports with little to no graphic representation. The stakeholders examined in this paper include designers, design clients and end-users. Communicating embodied carbon information graphically and concisely can aid with decision support early in a building's life cycle. It is essential to create a common visualisation approach as the level of knowledge about embodied carbon varies between stakeholders. The tool, designed in conjunction with Bates Smart, condenses Tally Life Cycle Assessment data to a carbon hot-spotting visualisation, highlighting the sections with the highest amounts of embodied carbon. This allows stakeholders at every stage of a given project to have a better understanding of the carbon implications with minimal effort. It further allows stakeholders to differentiate building elements by their carbon values, which enables the evaluation of the cost-effectiveness of the selected materials at an early stage. To examine and build a decision-support tool, an action-design research methodology of cycles of iterations was used along with precedents of embodied carbon visualising tools. Accordingly, the importance of visualisation and Building Information Modelling are also explored to understand the best format for relaying these results.

Keywords: embodied carbon, visualisation, summarisation, data filtering, early-stage decision-making, materiality

Procedia PDF Downloads 56
257 3D Object Detection for Autonomous Driving: A Comprehensive Review

Authors: Ahmed Soliman Nagiub, Mahmoud Fayez, Heba Khaled, Said Ghoniemy

Abstract:

Accurate perception is a critical component in enabling autonomous vehicles to understand their driving environment. The acquisition of 3D information about objects, including their location and pose, is essential for achieving this understanding. This survey paper presents a comprehensive review of 3D object detection techniques specifically tailored for autonomous vehicles. The survey begins with an introduction to 3D object detection, elucidating the significance of the third dimension in perceiving the driving environment. It explores the types of sensors utilized in this context and the corresponding data extracted from these sensors. Additionally, the survey investigates the different types of datasets employed, including their formats, sizes, and provides a comparative analysis. Furthermore, the paper categorizes and thoroughly examines the perception methods employed for 3D object detection based on the diverse range of sensors utilized. Each method is evaluated based on its effectiveness in accurately detecting objects in a three-dimensional space. Additionally, the evaluation metrics used to assess the performance of these methods are discussed. By offering a comprehensive overview of 3D object detection techniques for autonomous vehicles, this survey aims to advance the field of perception systems. It serves as a valuable resource for researchers and practitioners, providing insights into the techniques, sensors, and evaluation metrics employed in 3D object detection for autonomous vehicles.

Keywords: computer vision, 3D object detection, autonomous vehicles, deep learning

Procedia PDF Downloads 32
256 Exploring Digital Media’s Impact on Sports Sponsorship: A Global Perspective

Authors: Sylvia Chan-Olmsted, Lisa-Charlotte Wolter

Abstract:

With the continuous proliferation of media platforms, there have been tremendous changes in media consumption behaviors. From the perspective of sports sponsorship, while there is now a multitude of platforms to create brand associations, the changing media landscape and shift of message control also mean that sports sponsors will have to take into account the nature of and consumer responses toward these emerging digital media to devise effective marketing strategies. Utilizing the personal interview methodology, this study is qualitative and exploratory in nature. A total of 18 experts from European and American academics, sports marketing industry, and sports leagues/teams were interviewed to address three main research questions: 1) What are the major changes in digital technologies that are relevant to sports sponsorship; 2) How have digital media influenced the channels and platforms of sports sponsorship; and 3) How have these technologies affected the goals, strategies, and measurement of sports sponsorship. The study found that sports sponsorship has moved from consumer engagement, engagement measurement, and consequences of engagement on brand behaviors to micro-targeting one on one, engagement by context, time, and space, and activation and leveraging based on tracking and databases. From the perspective of platforms and channels, the use of mobile devices is prominent during sports content consumption. Increasing multiscreen media consumption means that sports sponsors need to optimize their investment decisions in leagues, teams, or game-related content sources, as they need to go where the fans are most engaged in. The study observed an imbalanced strategic leveraging of technology and digital infrastructure. While sports leagues have had less emphasis on brand value management via technology, sports sponsors have been much more active in utilizing technologies like mobile/LBS tools, big data/user info, real-time marketing and programmatic, and social media activation. Regardless of the new media/platforms, the study found that integration and contextualization are the two essential means of improving sports sponsorship effectiveness through technology. That is, how sponsors effectively integrate social media/mobile/second screen into their existing legacy media sponsorship plan so technology works for the experience/message instead of distracting fans. Additionally, technological advancement and attention economy amplify the importance of consumer data gathering, but sports consumer data does not mean loyalty or engagement. This study also affirms the benefit of digital media as they offer viral and pre-event activations through storytelling way before the actual event, which is critical for leveraging brand association before and after. That is, sponsors now have multiple opportunities and platforms to tell stories about their brands for longer time period. In summary, digital media facilitate fan experience, access to the brand message, multiplatform/channel presentations, storytelling, and content sharing. Nevertheless, rather than focusing on technology and media, today’s sponsors need to define what they want to focus on in terms of content themes that connect with their brands and then identify the channels/platforms. The big challenge for sponsors is to play to the venues/media’s specificity and its fit with the target audience and not uniformly deliver the same message in the same format on different platforms/channels.

Keywords: digital media, mobile media, social media, technology, sports sponsorship

Procedia PDF Downloads 264
255 Machine Learning and Deep Learning Approach for People Recognition and Tracking in Crowd for Safety Monitoring

Authors: A. Degale Desta, Cheng Jian

Abstract:

Deep learning application in computer vision is rapidly advancing, giving it the ability to monitor the public and quickly identify potentially anomalous behaviour from crowd scenes. Therefore, the purpose of the current work is to improve the performance of safety of people in crowd events from panic behaviour through introducing the innovative idea of Aggregation of Ensembles (AOE), which makes use of the pre-trained ConvNets and a pool of classifiers to find anomalies in video data with packed scenes. According to the theory of algorithms that applied K-means, KNN, CNN, SVD, and Faster-CNN, YOLOv5 architectures learn different levels of semantic representation from crowd videos; the proposed approach leverages an ensemble of various fine-tuned convolutional neural networks (CNN), allowing for the extraction of enriched feature sets. In addition to the above algorithms, a long short-term memory neural network to forecast future feature values and a handmade feature that takes into consideration the peculiarities of the crowd to understand human behavior. On well-known datasets of panic situations, experiments are run to assess the effectiveness and precision of the suggested method. Results reveal that, compared to state-of-the-art methodologies, the system produces better and more promising results in terms of accuracy and processing speed.

Keywords: action recognition, computer vision, crowd detecting and tracking, deep learning

Procedia PDF Downloads 124
254 Enhancement of Underwater Haze Image with Edge Reveal Using Pixel Normalization

Authors: M. Dhana Lakshmi, S. Sakthivel Murugan

Abstract:

As light passes from source to observer in the water medium, it is scattered by the suspended particulate matter. This scattering effect will plague the captured images with non-uniform illumination, blurring details, halo artefacts, weak edges, etc. To overcome this, pixel normalization with an Amended Unsharp Mask (AUM) filter is proposed to enhance the degraded image. To validate the robustness of the proposed technique irrespective of atmospheric light, the considered datasets are collected on dual locations. For those images, the maxima and minima pixel intensity value is computed and normalized; then the AUM filter is applied to strengthen the blurred edges. Finally, the enhanced image is obtained with good illumination and contrast. Thus, the proposed technique removes the effect of scattering called de-hazing and restores the perceptual information with enhanced edge detail. Both qualitative and quantitative analyses are done on considering the standard non-reference metric called underwater image sharpness measure (UISM), and underwater image quality measure (UIQM) is used to measure color, sharpness, and contrast for both of the location images. It is observed that the proposed technique has shown overwhelming performance compared to other deep-based enhancement networks and traditional techniques in an adaptive manner.

Keywords: underwater drone imagery, pixel normalization, thresholding, masking, unsharp mask filter

Procedia PDF Downloads 162
253 Multimodal Database of Retina Images for Africa: The First Open Access Digital Repository for Retina Images in Sub Saharan Africa

Authors: Simon Arunga, Teddy Kwaga, Rita Kageni, Michael Gichangi, Nyawira Mwangi, Fred Kagwa, Rogers Mwavu, Amos Baryashaba, Luis F. Nakayama, Katharine Morley, Michael Morley, Leo A. Celi, Jessica Haberer, Celestino Obua

Abstract:

Purpose: The main aim for creating the Multimodal Database of Retinal Images for Africa (MoDRIA) was to provide a publicly available repository of retinal images for responsible researchers to conduct algorithm development in a bid to curb the challenges of ophthalmic artificial intelligence (AI) in Africa. Methods: Data and retina images were ethically sourced from sites in Uganda and Kenya. Data on medical history, visual acuity, ocular examination, blood pressure, and blood sugar were collected. Retina images were captured using fundus cameras (Foru3-nethra and Canon CR-Mark-1). Images were stored on a secure online database. Results: The database consists of 7,859 retinal images in portable network graphics format from 1,988 participants. Images from patients with human immunodeficiency virus were 18.9%, 18.2% of images were from hypertensive patients, 12.8% from diabetic patients, and the rest from normal’ participants. Conclusion: Publicly available data repositories are a valuable asset in the development of AI technology. Therefore, is a need for the expansion of MoDRIA so as to provide larger datasets that are more representative of Sub-Saharan data.

Keywords: retina images, MoDRIA, image repository, African database

Procedia PDF Downloads 89
252 The “Bright Side” of COVID-19: Effects of Livestream Affordances on Consumer Purchase Willingness: Explicit IT Affordances Perspective

Authors: Isaac Owusu Asante, Yushi Jiang, Hailin Tao

Abstract:

Live streaming marketing, the new electronic commerce element, became an optional marketing channel following the COVID-19 pandemic. Many sellers have leveraged the features presented by live streaming to increase sales. Studies on live streaming have focused on gaming and consumers’ loyalty to brands through live streaming, using interview questionnaires. This study, however, was conducted to measure real-time observable interactions between consumers and sellers. Based on the affordance theory, this study conceptualized constructs representing the interactive features and examined how they drive consumers’ purchase willingness during live streaming sessions using 1238 datasets from Amazon Live, following the manual observation of transaction records. Using structural equation modeling, the ordinary least square regression suggests that live viewers, new followers, live chats, and likes positively affect purchase willingness. The Sobel and Monte Carlo tests show that new followers, live chats, and likes significantly mediate the relationship between live viewers and purchase willingness. The study introduces a new way of measuring interactions in live streaming commerce and proposes a way to manually gather data on consumer behaviors in live streaming platforms when the application programming interface (API) of such platforms does not support data mining algorithms.

Keywords: livestreaming marketing, live chats, live viewers, likes, new followers, purchase willingness

Procedia PDF Downloads 45
251 A Spatio-Temporal Analysis and Change Detection of Wetlands in Diamond Harbour, West Bengal, India Using Normalized Difference Water Index

Authors: Lopita Pal, Suresh V. Madha

Abstract:

Wetlands are areas of marsh, fen, peat land or water, whether natural or artificial, permanent or temporary, with water that is static or flowing, fresh, brackish or salt, including areas of marine water the depth of which at low tide does not exceed six metres. The rapidly expanding human population, large scale changes in land use/land cover, burgeoning development projects and improper use of watersheds all has caused a substantial decline of wetland resources in the world. Major degradations have been impacted from agricultural, industrial and urban developments leading to various types of pollutions and hydrological perturbations. Regular fishing activities and unsustainable grazing of animals are degrading the wetlands in a slow pace. The paper focuses on the spatio-temporal change detection of the area of the water body and the main cause of this depletion. The total area under study (22°19’87’’ N, 88°20’23’’ E) is a wetland region in West Bengal of 213 sq.km. The procedure used is the Normalized Difference Water Index (NDWI) from multi-spectral imagery and Landsat to detect the presence of surface water, and the datasets have been compared of the years 2016, 2006 and 1996. The result shows a sharp decline in the area of water body due to a rapid increase in the agricultural practices and the growing urbanization.

Keywords: spatio-temporal change, NDWI, urbanization, wetland

Procedia PDF Downloads 260
250 Advanced Data Visualization Techniques for Effective Decision-making in Oil and Gas Exploration and Production

Authors: Deepak Singh, Rail Kuliev

Abstract:

This research article explores the significance of advanced data visualization techniques in enhancing decision-making processes within the oil and gas exploration and production domain. With the oil and gas industry facing numerous challenges, effective interpretation and analysis of vast and diverse datasets are crucial for optimizing exploration strategies, production operations, and risk assessment. The article highlights the importance of data visualization in managing big data, aiding the decision-making process, and facilitating communication with stakeholders. Various advanced data visualization techniques, including 3D visualization, augmented reality (AR), virtual reality (VR), interactive dashboards, and geospatial visualization, are discussed in detail, showcasing their applications and benefits in the oil and gas sector. The article presents case studies demonstrating the successful use of these techniques in optimizing well placement, real-time operations monitoring, and virtual reality training. Additionally, the article addresses the challenges of data integration and scalability, emphasizing the need for future developments in AI-driven visualization. In conclusion, this research emphasizes the immense potential of advanced data visualization in revolutionizing decision-making processes, fostering data-driven strategies, and promoting sustainable growth and improved operational efficiency within the oil and gas exploration and production industry.

Keywords: augmented reality (AR), virtual reality (VR), interactive dashboards, real-time operations monitoring

Procedia PDF Downloads 51
249 Impact of Drought on Agriculture in the Upper Middle Gangetic Plain in India

Authors: Reshmita Nath

Abstract:

In this study, we investigate the spatiotemporal characteristics of drought in India and its impact on agriculture during the summer season (April to September). For our analysis, we have used Standardized Precipitation Evapotranspiration Index (SPEI) datasets between 1982 and 2012 at six-month timescale. Based on the criteria SPEI<-1 we obtain the vulnerability map and have found that the Humid subtropical Upper Middle Gangetic Plain (UMGP) region is highly drought prone with an occurrence frequency of 40-45%. This UMGP region contributes at least 18-20% of India’s annual cereal production. Not only the probability, but the region becomes more and more drought-prone in the recent decades. Moreover, the cereal production in the UMGP has experienced a gradual declining trend from 2000 onwards and this feature is consistent with the increase in drought affected areas from 20-25% to 50-60%, before and after 2000, respectively. The higher correlation coefficient (-0.69) between the changes in cereal production and drought affected areas confirms that at least 50% of the agricultural (cereal) losses is associated with drought. While analyzing the individual impact of precipitation and surface temperature anomalies on SPEI (6), we have found that in the UMGP region surface temperature plays the primary role in lowering of SPEI. The linkage is further confirmed by the correlation analysis between the SPEI (6) and surface temperature rise, which exhibits strong negative values in the UMGP region. Higher temperature might have caused more evaporation and drying, which therefore increases the area affected by drought in the recent decade.

Keywords: drought, agriculture, SPEI, Indo-Gangetic plain

Procedia PDF Downloads 228
248 Comparative Evaluation of Accuracy of Selected Machine Learning Classification Techniques for Diagnosis of Cancer: A Data Mining Approach

Authors: Rajvir Kaur, Jeewani Anupama Ginige

Abstract:

With recent trends in Big Data and advancements in Information and Communication Technologies, the healthcare industry is at the stage of its transition from clinician oriented to technology oriented. Many people around the world die of cancer because the diagnosis of disease was not done at an early stage. Nowadays, the computational methods in the form of Machine Learning (ML) are used to develop automated decision support systems that can diagnose cancer with high confidence in a timely manner. This paper aims to carry out the comparative evaluation of a selected set of ML classifiers on two existing datasets: breast cancer and cervical cancer. The ML classifiers compared in this study are Decision Tree (DT), Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), Logistic Regression, Ensemble (Bagged Tree) and Artificial Neural Networks (ANN). The evaluation is carried out based on standard evaluation metrics Precision (P), Recall (R), F1-score and Accuracy. The experimental results based on the evaluation metrics show that ANN showed the highest-level accuracy (99.4%) when tested with breast cancer dataset. On the other hand, when these ML classifiers are tested with the cervical cancer dataset, Ensemble (Bagged Tree) technique gave better accuracy (93.1%) in comparison to other classifiers.

Keywords: artificial neural networks, breast cancer, classifiers, cervical cancer, f-score, machine learning, precision, recall

Procedia PDF Downloads 252
247 Human Immunodeficiency Virus (HIV) Test Predictive Modeling and Identify Determinants of HIV Testing for People with Age above Fourteen Years in Ethiopia Using Data Mining Techniques: EDHS 2011

Authors: S. Abera, T. Gidey, W. Terefe

Abstract:

Introduction: Testing for HIV is the key entry point to HIV prevention, treatment, and care and support services. Hence, predictive data mining techniques can greatly benefit to analyze and discover new patterns from huge datasets like that of EDHS 2011 data. Objectives: The objective of this study is to build a predictive modeling for HIV testing and identify determinants of HIV testing for adults with age above fourteen years using data mining techniques. Methods: Cross-Industry Standard Process for Data Mining (CRISP-DM) was used to predict the model for HIV testing and explore association rules between HIV testing and the selected attributes among adult Ethiopians. Decision tree, Naïve-Bayes, logistic regression and artificial neural networks of data mining techniques were used to build the predictive models. Results: The target dataset contained 30,625 study participants; of which 16, 515 (53.9%) were women. Nearly two-fifth; 17,719 (58%), have never been tested for HIV while the rest 12,906 (42%) had been tested. Ethiopians with higher wealth index, higher educational level, belonging 20 to 29 years old, having no stigmatizing attitude towards HIV positive person, urban residents, having HIV related knowledge, information about family planning on mass media and knowing a place where to get testing for HIV showed an increased patterns with respect to HIV testing. Conclusion and Recommendation: Public health interventions should consider the identified determinants to promote people to get testing for HIV.

Keywords: data mining, HIV, testing, ethiopia

Procedia PDF Downloads 467
246 Detection of Atrial Fibrillation Using Wearables via Attentional Two-Stream Heterogeneous Networks

Authors: Huawei Bai, Jianguo Yao, Fellow, IEEE

Abstract:

Atrial fibrillation (AF) is the most common form of heart arrhythmia and is closely associated with mortality and morbidity in heart failure, stroke, and coronary artery disease. The development of single spot optical sensors enables widespread photoplethysmography (PPG) screening, especially for AF, since it represents a more convenient and noninvasive approach. To our knowledge, most existing studies based on public and unbalanced datasets can barely handle the multiple noises sources in the real world and, also, lack interpretability. In this paper, we construct a large- scale PPG dataset using measurements collected from PPG wrist- watch devices worn by volunteers and propose an attention-based two-stream heterogeneous neural network (TSHNN). The first stream is a hybrid neural network consisting of a three-layer one-dimensional convolutional neural network (1D-CNN) and two-layer attention- based bidirectional long short-term memory (Bi-LSTM) network to learn representations from temporally sampled signals. The second stream extracts latent representations from the PPG time-frequency spectrogram using a five-layer CNN. The outputs from both streams are fed into a fusion layer for the outcome. Visualization of the attention weights learned demonstrates the effectiveness of the attention mechanism against noise. The experimental results show that the TSHNN outperforms all the competitive baseline approaches and with 98.09% accuracy, achieves state-of-the-art performance.

Keywords: PPG wearables, atrial fibrillation, feature fusion, attention mechanism, hyber network

Procedia PDF Downloads 87
245 Predicting the Compressive Strength of Geopolymer Concrete Using Machine Learning Algorithms: Impact of Chemical Composition and Curing Conditions

Authors: Aya Belal, Ahmed Maher Eltair, Maggie Ahmed Mashaly

Abstract:

Geopolymer concrete is gaining recognition as a sustainable alternative to conventional Portland Cement concrete due to its environmentally friendly nature, which is a key goal for Smart City initiatives. It has demonstrated its potential as a reliable material for the design of structural elements. However, the production of Geopolymer concrete is hindered by batch-to-batch variations, which presents a significant challenge to the widespread adoption of Geopolymer concrete. To date, Machine learning has had a profound impact on various fields by enabling models to learn from large datasets and predict outputs accurately. This paper proposes an integration between the current drift to Artificial Intelligence and the composition of Geopolymer mixtures to predict their mechanical properties. This study employs Python software to develop machine learning model in specific Decision Trees. The research uses the percentage oxides and the chemical composition of the Alkali Solution along with the curing conditions as the input independent parameters, irrespective of the waste products used in the mixture yielding the compressive strength of the mix as the output parameter. The results showed 90 % agreement of the predicted values to the actual values having the ratio of the Sodium Silicate to the Sodium Hydroxide solution being the dominant parameter in the mixture.

Keywords: decision trees, geopolymer concrete, machine learning, smart cities, sustainability

Procedia PDF Downloads 56
244 Decision Making System for Clinical Datasets

Authors: P. Bharathiraja

Abstract:

Computer Aided decision making system is used to enhance diagnosis and prognosis of diseases and also to assist clinicians and junior doctors in clinical decision making. Medical Data used for decision making should be definite and consistent. Data Mining and soft computing techniques are used for cleaning the data and for incorporating human reasoning in decision making systems. Fuzzy rule based inference technique can be used for classification in order to incorporate human reasoning in the decision making process. In this work, missing values are imputed using the mean or mode of the attribute. The data are normalized using min-ma normalization to improve the design and efficiency of the fuzzy inference system. The fuzzy inference system is used to handle the uncertainties that exist in the medical data. Equal-width-partitioning is used to partition the attribute values into appropriate fuzzy intervals. Fuzzy rules are generated using Class Based Associative rule mining algorithm. The system is trained and tested using heart disease data set from the University of California at Irvine (UCI) Machine Learning Repository. The data was split using a hold out approach into training and testing data. From the experimental results it can be inferred that classification using fuzzy inference system performs better than trivial IF-THEN rule based classification approaches. Furthermore it is observed that the use of fuzzy logic and fuzzy inference mechanism handles uncertainty and also resembles human decision making. The system can be used in the absence of a clinical expert to assist junior doctors and clinicians in clinical decision making.

Keywords: decision making, data mining, normalization, fuzzy rule, classification

Procedia PDF Downloads 490
243 Alpha: A Groundbreaking Avatar Merging User Dialogue with OpenAI's GPT-3.5 for Enhanced Reflective Thinking

Authors: Jonas Colin

Abstract:

Standing at the vanguard of AI development, Alpha represents an unprecedented synthesis of logical rigor and human abstraction, meticulously crafted to mirror the user's unique persona and personality, a feat previously unattainable in AI development. Alpha, an avant-garde artefact in the realm of artificial intelligence, epitomizes a paradigmatic shift in personalized digital interaction, amalgamating user-specific dialogic patterns with the sophisticated algorithmic prowess of OpenAI's GPT-3.5 to engender a platform for enhanced metacognitive engagement and individualized user experience. Underpinned by a sophisticated algorithmic framework, Alpha integrates vast datasets through a complex interplay of neural network models and symbolic AI, facilitating a dynamic, adaptive learning process. This integration enables the system to construct a detailed user profile, encompassing linguistic preferences, emotional tendencies, and cognitive styles, tailoring interactions to align with individual characteristics and conversational contexts. Furthermore, Alpha incorporates advanced metacognitive elements, enabling real-time reflection and adaptation in communication strategies. This self-reflective capability ensures continuous refinement of its interaction model, positioning Alpha not just as a technological marvel but as a harbinger of a new era in human-computer interaction, where machines engage with us on a deeply personal and cognitive level, transforming our interaction with the digital world.

Keywords: chatbot, GPT 3.5, metacognition, symbiose

Procedia PDF Downloads 33
242 Web Proxy Detection via Bipartite Graphs and One-Mode Projections

Authors: Zhipeng Chen, Peng Zhang, Qingyun Liu, Li Guo

Abstract:

With the Internet becoming the dominant channel for business and life, many IPs are increasingly masked using web proxies for illegal purposes such as propagating malware, impersonate phishing pages to steal sensitive data or redirect victims to other malicious targets. Moreover, as Internet traffic continues to grow in size and complexity, it has become an increasingly challenging task to detect the proxy service due to their dynamic update and high anonymity. In this paper, we present an approach based on behavioral graph analysis to study the behavior similarity of web proxy users. Specifically, we use bipartite graphs to model host communications from network traffic and build one-mode projections of bipartite graphs for discovering social-behavior similarity of web proxy users. Based on the similarity matrices of end-users from the derived one-mode projection graphs, we apply a simple yet effective spectral clustering algorithm to discover the inherent web proxy users behavior clusters. The web proxy URL may vary from time to time. Still, the inherent interest would not. So, based on the intuition, by dint of our private tools implemented by WebDriver, we examine whether the top URLs visited by the web proxy users are web proxies. Our experiment results based on real datasets show that the behavior clusters not only reduce the number of URLs analysis but also provide an effective way to detect the web proxies, especially for the unknown web proxies.

Keywords: bipartite graph, one-mode projection, clustering, web proxy detection

Procedia PDF Downloads 223
241 Using Risk Management Indicators in Decision Tree Analysis

Authors: Adel Ali Elshaibani

Abstract:

Risk management indicators augment the reporting infrastructure, particularly for the board and senior management, to identify, monitor, and manage risks. This enhancement facilitates improved decision-making throughout the banking organization. Decision tree analysis is a tool that visually outlines potential outcomes, costs, and consequences of complex decisions. It is particularly beneficial for analyzing quantitative data and making decisions based on numerical values. By calculating the expected value of each outcome, decision tree analysis can help assess the best course of action. In the context of banking, decision tree analysis can assist lenders in evaluating a customer’s creditworthiness, thereby preventing losses. However, applying these tools in developing countries may face several limitations, such as data availability, lack of technological infrastructure and resources, lack of skilled professionals, cultural factors, and cost. Moreover, decision trees can create overly complex models that do not generalize well to new data, known as overfitting. They can also be sensitive to small changes in the data, which can result in different tree structures and can become computationally expensive when dealing with large datasets. In conclusion, while risk management indicators and decision tree analysis are beneficial for decision-making in banks, their effectiveness is contingent upon how they are implemented and utilized by the board of directors, especially in the context of developing countries. It’s important to consider these limitations when planning to implement these tools in developing countries.

Keywords: risk management indicators, decision tree analysis, developing countries, board of directors, bank performance, risk management strategy, banking institutions

Procedia PDF Downloads 32
240 Heterogeneity of Soil Moisture and Its Impacts on the Mountainous Watershed Hydrology in Northwest China

Authors: Chansheng He, Zhongfu Wang, Xiao Bai, Jie Tian, Xin Jin

Abstract:

Heterogeneity of soil hydraulic properties directly affects hydrological processes at different scales. Understanding heterogeneity of soil hydraulic properties such as soil moisture is therefore essential for modeling watershed ecohydrological processes, particularly in hard to access, topographically complex mountainous watersheds. This study maps spatial variations of soil moisture by in situ observation network that consists of sampling points, zones, and tributaries, and monitors corresponding hydrological variables of air and soil temperatures, evapotranspiration, infiltration, and runoff in the Upper Reach of the Heihe River Watershed, a second largest inland river (terminal lake) with a drainage area of over 128,000 km² in Northwest China. Subsequently, the study uses a hydrological model, SWAT (Soil and Water Assessment Tool) to simulate the effects of heterogeneity of soil moisture on watershed hydrological processes. The spatial clustering method, Full-Order-CLK was employed to derive five soil heterogeneous zones (Configuration 97, 80, 65, 40, and 20) for soil input to SWAT. Results show the simulations by the SWAT model with the spatially clustered soil hydraulic information from the field sampling data had much better representation of the soil heterogeneity and more accurate performance than the model using the average soil property values for each soil type derived from the coarse soil datasets. Thus, incorporating detailed field sampling soil heterogeneity data greatly improves performance in hydrologic modeling.

Keywords: heterogeneity, soil moisture, SWAT, up-scaling

Procedia PDF Downloads 320
239 Extracellular Enzymes as Promising Soil Health Indicators: Assessing Response to Different Land Uses Using Long-Term Experiments

Authors: Munisath Khandoker, Stephan Haefele, Andy Gregory

Abstract:

Extracellular enzymes play a key role in soil organic carbon (SOC) decomposition and nutrient cycling and are known indicators for soil health; however, it is not understood how these enzymes respond to different land uses and their relationships to other soil properties have not been extensively reviewed. The relationships among the activities of three soil enzymes: β-glucosaminidase (NAG), phosphomonoesterase (PHO) and β-glucosidase (GLU), were examined. The impact of soil organic amendments, soil types and land management on soil enzyme activities were reviewed, and it was hypothesized that soils with increased SOC have increased enzyme activity. Long-term experiments at Rothamsted Research Woburn and Harpenden sites in the UK were used to evaluate how different management practices affect enzyme activity involved in carbon (C) and nitrogen (N) cycling in the soil. Samples were collected from soils with different organic treatments such as straw, farmyard manure (FYM), compost additions, cover crops and permanent grass cover to assess whether SOC can be linked with increased levels of enzymatic activity and what influence, if any, enzymatic activity has on total C and N in the soil. Investigating the interactions of important enzymes with soil characteristics and SOC can help to better understand the health of soils. Studies on long-term experiments with known histories and large datasets can better help with this. SOC tends to decrease during land use changes from natural ecosystems to agricultural systems; therefore, it is imperative that agricultural lands find ways to increase and/or maintain SOC in the soil.

Keywords: biological soil health indicators, extracellular enzymes, soil health, soil, microbiology

Procedia PDF Downloads 47
238 Combustion Variability and Uniqueness in Cylinders of a Radial Aircraft Piston Engine

Authors: Michal Geca, Grzegorz Baranski, Ksenia Siadkowska

Abstract:

The work is a part of the project which aims at developing innovative power and control systems for the high power aircraft piston engine ASz62IR. Developed electronically controlled ignition system will reduce emissions of toxic compounds as a result of lowered fuel consumption, optimized combustion and engine capability of efficient combustion of ecological fuels. The tested unit is an air-cooled four-stroke gasoline engine of 9 cylinders in a radial setup, mechanically charged by a radial compressor powered by the engine crankshaft. The total engine cubic capac-ity is 29.87 dm3, and the compression ratio is 6.4:1. The maximum take-off power is 1000 HP at 2200 rpm. The maximum fuel consumption is 280 kg/h. Engine powers aircrafts: An-2, M-18 „Dromader”, DHC-3 „OTTER”, DC-3 „Dakota”, GAF-125 „HAWK” i Y5. The main problems of the engine includes the imbalanced work of cylinders. The non-uniformity value in each cylinder results in non-uniformity of their work. In radial engine cylinders arrangement causes that the mixture movement that takes place in accordance (lower cylinder) or the opposite (upper cylinders) to the direction of gravity. Preliminary tests confirmed the presence of uneven workflow of individual cylinders. The phenomenon is most intense at low speed. The non-uniformity is visible on the waveform of cylinder pressure. Therefore two studies were conducted to determine the impact of this phenomenon on the engine performance: simulation and real tests. Simplified simulation was conducted on the element of the intake system coated with fuel film. The study shows that there is an effect of gravity on the movement of the fuel film inside the radial engine intake channels. Both in the lower and the upper inlet channels the film flows downwards. It follows from the fact that gravity assists the movement of the film in the lower cylinder channels and prevents the movement in the upper cylinder channels. Real tests on aircraft engine ASz62IR was conducted in transients condition (rapid change of the excess air in each cylinder were performed. Calculations were conducted for mass of fuel reaching the cylinders theoretically and really and on this basis, the factors of fuel evaporation “x” were determined. Therefore a simplified model of the fuel supply to cylinder was adopted. Model includes time constant of the fuel film τ, the number of engine transport cycles of non-evaporating fuel along the intake pipe γ and time between next cycles Δt. The calculation results of identification of the model parameters are presented in the form of radar graphs. The figures shows the averages declines and increases of the injection time and the average values for both types of stroke. These studies shown, that the change of the position of the cylinder will cause changes in the formation of fuel-air mixture and thus changes in the combustion process. Based on the results of the work of simulation and experiments was possible to develop individual algorithms for ignition control. This work has been financed by the Polish National Centre for Research and Development, INNOLOT, under Grant Agreement No. INNOLOT/I/1/NCBR/2013.

Keywords: radial engine, ignition system, non-uniformity, combustion process

Procedia PDF Downloads 337
237 A Convolutional Neural Network-Based Model for Lassa fever Virus Prediction Using Patient Blood Smear Image

Authors: A. M. John-Otumu, M. M. Rahman, M. C. Onuoha, E. P. Ojonugwa

Abstract:

A Convolutional Neural Network (CNN) model for predicting Lassa fever was built using Python 3.8.0 programming language, alongside Keras 2.2.4 and TensorFlow 2.6.1 libraries as the development environment in order to reduce the current high risk of Lassa fever in West Africa, particularly in Nigeria. The study was prompted by some major flaws in existing conventional laboratory equipment for diagnosing Lassa fever (RT-PCR), as well as flaws in AI-based techniques that have been used for probing and prognosis of Lassa fever based on literature. There were 15,679 blood smear microscopic image datasets collected in total. The proposed model was trained on 70% of the dataset and tested on 30% of the microscopic images in avoid overfitting. A 3x3x3 convolution filter was also used in the proposed system to extract features from microscopic images. The proposed CNN-based model had a recall value of 96%, a precision value of 93%, an F1 score of 95%, and an accuracy of 94% in predicting and accurately classifying the images into clean or infected samples. Based on empirical evidence from the results of the literature consulted, the proposed model outperformed other existing AI-based techniques evaluated. If properly deployed, the model will assist physicians, medical laboratory scientists, and patients in making accurate diagnoses for Lassa fever cases, allowing the mortality rate due to the Lassa fever virus to be reduced through sound decision-making.

Keywords: artificial intelligence, ANN, blood smear, CNN, deep learning, Lassa fever

Procedia PDF Downloads 86