Search results for: categorical datasets
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 814

Search results for: categorical datasets

274 Traffic Prediction with Raw Data Utilization and Context Building

Authors: Zhou Yang, Heli Sun, Jianbin Huang, Jizhong Zhao, Shaojie Qiao

Abstract:

Traffic prediction is essential in a multitude of ways in modern urban life. The researchers of earlier work in this domain carry out the investigation chiefly with two major focuses: (1) the accurate forecast of future values in multiple time series and (2) knowledge extraction from spatial-temporal correlations. However, two key considerations for traffic prediction are often missed: the completeness of raw data and the full context of the prediction timestamp. Concentrating on the two drawbacks of earlier work, we devise an approach that can address these issues in a two-phase framework. First, we utilize the raw trajectories to a greater extent through building a VLA table and data compression. We obtain the intra-trajectory features with graph-based encoding and the intertrajectory ones with a grid-based model and the technique of back projection that restore their surrounding high-resolution spatial-temporal environment. To the best of our knowledge, we are the first to study direct feature extraction from raw trajectories for traffic prediction and attempt the use of raw data with the least degree of reduction. In the prediction phase, we provide a broader context for the prediction timestamp by taking into account the information that are around it in the training dataset. Extensive experiments on several well-known datasets have verified the effectiveness of our solution that combines the strength of raw trajectory data and prediction context. In terms of performance, our approach surpasses several state-of-the-art methods for traffic prediction.

Keywords: traffic prediction, raw data utilization, context building, data reduction

Procedia PDF Downloads 101
273 In-Context Meta Learning for Automatic Designing Pretext Tasks for Self-Supervised Image Analysis

Authors: Toktam Khatibi

Abstract:

Self-supervised learning (SSL) includes machine learning models that are trained on one aspect and/or one part of the input to learn other aspects and/or part of it. SSL models are divided into two different categories, including pre-text task-based models and contrastive learning ones. Pre-text tasks are some auxiliary tasks learning pseudo-labels, and the trained models are further fine-tuned for downstream tasks. However, one important disadvantage of SSL using pre-text task solving is defining an appropriate pre-text task for each image dataset with a variety of image modalities. Therefore, it is required to design an appropriate pretext task automatically for each dataset and each downstream task. To the best of our knowledge, the automatic designing of pretext tasks for image analysis has not been considered yet. In this paper, we present a framework based on In-context learning that describes each task based on its input and output data using a pre-trained image transformer. Our proposed method combines the input image and its learned description for optimizing the pre-text task design and its hyper-parameters using Meta-learning models. The representations learned from the pre-text tasks are fine-tuned for solving the downstream tasks. We demonstrate that our proposed framework outperforms the compared ones on unseen tasks and image modalities in addition to its superior performance for previously known tasks and datasets.

Keywords: in-context learning (ICL), meta learning, self-supervised learning (SSL), vision-language domain, transformers

Procedia PDF Downloads 51
272 Carbon Skimming: Towards an Application to Summarise and Compare Embodied Carbon to Aid Early-Stage Decision Making

Authors: Rivindu Nethmin Bandara Menik Hitihamy Mudiyanselage, Matthias Hank Haeusler, Ben Doherty

Abstract:

Investors and clients in the Architectural, Engineering and Construction industry find it difficult to understand complex datasets and reports with little to no graphic representation. The stakeholders examined in this paper include designers, design clients and end-users. Communicating embodied carbon information graphically and concisely can aid with decision support early in a building's life cycle. It is essential to create a common visualisation approach as the level of knowledge about embodied carbon varies between stakeholders. The tool, designed in conjunction with Bates Smart, condenses Tally Life Cycle Assessment data to a carbon hot-spotting visualisation, highlighting the sections with the highest amounts of embodied carbon. This allows stakeholders at every stage of a given project to have a better understanding of the carbon implications with minimal effort. It further allows stakeholders to differentiate building elements by their carbon values, which enables the evaluation of the cost-effectiveness of the selected materials at an early stage. To examine and build a decision-support tool, an action-design research methodology of cycles of iterations was used along with precedents of embodied carbon visualising tools. Accordingly, the importance of visualisation and Building Information Modelling are also explored to understand the best format for relaying these results.

Keywords: embodied carbon, visualisation, summarisation, data filtering, early-stage decision-making, materiality

Procedia PDF Downloads 59
271 3D Object Detection for Autonomous Driving: A Comprehensive Review

Authors: Ahmed Soliman Nagiub, Mahmoud Fayez, Heba Khaled, Said Ghoniemy

Abstract:

Accurate perception is a critical component in enabling autonomous vehicles to understand their driving environment. The acquisition of 3D information about objects, including their location and pose, is essential for achieving this understanding. This survey paper presents a comprehensive review of 3D object detection techniques specifically tailored for autonomous vehicles. The survey begins with an introduction to 3D object detection, elucidating the significance of the third dimension in perceiving the driving environment. It explores the types of sensors utilized in this context and the corresponding data extracted from these sensors. Additionally, the survey investigates the different types of datasets employed, including their formats, sizes, and provides a comparative analysis. Furthermore, the paper categorizes and thoroughly examines the perception methods employed for 3D object detection based on the diverse range of sensors utilized. Each method is evaluated based on its effectiveness in accurately detecting objects in a three-dimensional space. Additionally, the evaluation metrics used to assess the performance of these methods are discussed. By offering a comprehensive overview of 3D object detection techniques for autonomous vehicles, this survey aims to advance the field of perception systems. It serves as a valuable resource for researchers and practitioners, providing insights into the techniques, sensors, and evaluation metrics employed in 3D object detection for autonomous vehicles.

Keywords: computer vision, 3D object detection, autonomous vehicles, deep learning

Procedia PDF Downloads 33
270 Machine Learning and Deep Learning Approach for People Recognition and Tracking in Crowd for Safety Monitoring

Authors: A. Degale Desta, Cheng Jian

Abstract:

Deep learning application in computer vision is rapidly advancing, giving it the ability to monitor the public and quickly identify potentially anomalous behaviour from crowd scenes. Therefore, the purpose of the current work is to improve the performance of safety of people in crowd events from panic behaviour through introducing the innovative idea of Aggregation of Ensembles (AOE), which makes use of the pre-trained ConvNets and a pool of classifiers to find anomalies in video data with packed scenes. According to the theory of algorithms that applied K-means, KNN, CNN, SVD, and Faster-CNN, YOLOv5 architectures learn different levels of semantic representation from crowd videos; the proposed approach leverages an ensemble of various fine-tuned convolutional neural networks (CNN), allowing for the extraction of enriched feature sets. In addition to the above algorithms, a long short-term memory neural network to forecast future feature values and a handmade feature that takes into consideration the peculiarities of the crowd to understand human behavior. On well-known datasets of panic situations, experiments are run to assess the effectiveness and precision of the suggested method. Results reveal that, compared to state-of-the-art methodologies, the system produces better and more promising results in terms of accuracy and processing speed.

Keywords: action recognition, computer vision, crowd detecting and tracking, deep learning

Procedia PDF Downloads 129
269 Enhancement of Underwater Haze Image with Edge Reveal Using Pixel Normalization

Authors: M. Dhana Lakshmi, S. Sakthivel Murugan

Abstract:

As light passes from source to observer in the water medium, it is scattered by the suspended particulate matter. This scattering effect will plague the captured images with non-uniform illumination, blurring details, halo artefacts, weak edges, etc. To overcome this, pixel normalization with an Amended Unsharp Mask (AUM) filter is proposed to enhance the degraded image. To validate the robustness of the proposed technique irrespective of atmospheric light, the considered datasets are collected on dual locations. For those images, the maxima and minima pixel intensity value is computed and normalized; then the AUM filter is applied to strengthen the blurred edges. Finally, the enhanced image is obtained with good illumination and contrast. Thus, the proposed technique removes the effect of scattering called de-hazing and restores the perceptual information with enhanced edge detail. Both qualitative and quantitative analyses are done on considering the standard non-reference metric called underwater image sharpness measure (UISM), and underwater image quality measure (UIQM) is used to measure color, sharpness, and contrast for both of the location images. It is observed that the proposed technique has shown overwhelming performance compared to other deep-based enhancement networks and traditional techniques in an adaptive manner.

Keywords: underwater drone imagery, pixel normalization, thresholding, masking, unsharp mask filter

Procedia PDF Downloads 166
268 Multimodal Database of Retina Images for Africa: The First Open Access Digital Repository for Retina Images in Sub Saharan Africa

Authors: Simon Arunga, Teddy Kwaga, Rita Kageni, Michael Gichangi, Nyawira Mwangi, Fred Kagwa, Rogers Mwavu, Amos Baryashaba, Luis F. Nakayama, Katharine Morley, Michael Morley, Leo A. Celi, Jessica Haberer, Celestino Obua

Abstract:

Purpose: The main aim for creating the Multimodal Database of Retinal Images for Africa (MoDRIA) was to provide a publicly available repository of retinal images for responsible researchers to conduct algorithm development in a bid to curb the challenges of ophthalmic artificial intelligence (AI) in Africa. Methods: Data and retina images were ethically sourced from sites in Uganda and Kenya. Data on medical history, visual acuity, ocular examination, blood pressure, and blood sugar were collected. Retina images were captured using fundus cameras (Foru3-nethra and Canon CR-Mark-1). Images were stored on a secure online database. Results: The database consists of 7,859 retinal images in portable network graphics format from 1,988 participants. Images from patients with human immunodeficiency virus were 18.9%, 18.2% of images were from hypertensive patients, 12.8% from diabetic patients, and the rest from normal’ participants. Conclusion: Publicly available data repositories are a valuable asset in the development of AI technology. Therefore, is a need for the expansion of MoDRIA so as to provide larger datasets that are more representative of Sub-Saharan data.

Keywords: retina images, MoDRIA, image repository, African database

Procedia PDF Downloads 90
267 Improved Classification Procedure for Imbalanced and Overlapped Situations

Authors: Hankyu Lee, Seoung Bum Kim

Abstract:

The issue with imbalance and overlapping in the class distribution becomes important in various applications of data mining. The imbalanced dataset is a special case in classification problems in which the number of observations of one class (i.e., major class) heavily exceeds the number of observations of the other class (i.e., minor class). Overlapped dataset is the case where many observations are shared together between the two classes. Imbalanced and overlapped data can be frequently found in many real examples including fraud and abuse patients in healthcare, quality prediction in manufacturing, text classification, oil spill detection, remote sensing, and so on. The class imbalance and overlap problem is the challenging issue because this situation degrades the performance of most of the standard classification algorithms. In this study, we propose a classification procedure that can effectively handle imbalanced and overlapped datasets by splitting data space into three parts: nonoverlapping, light overlapping, and severe overlapping and applying the classification algorithm in each part. These three parts were determined based on the Hausdorff distance and the margin of the modified support vector machine. An experiments study was conducted to examine the properties of the proposed method and compared it with other classification algorithms. The results showed that the proposed method outperformed the competitors under various imbalanced and overlapped situations. Moreover, the applicability of the proposed method was demonstrated through the experiment with real data.

Keywords: classification, imbalanced data with class overlap, split data space, support vector machine

Procedia PDF Downloads 282
266 The Effectiveness of Psychosocial Interventions for Survivors of Natural Disasters: A Systematic Review

Authors: Santhani M. Selveindran

Abstract:

Background: Natural disasters are traumatic global events that are becoming increasing more common, with significant psychosocial impact on survivors. This impact results not only in psychosocial distress but, for many, can lead to psychosocial disorders and chronic psychopathology. While there are currently available interventions that seek to prevent and treat these psychosocial sequelae, their effectiveness is uncertain. The evidence-base is emerging with more primary studies evaluating the effectiveness of various psychosocial interventions for survivors of natural disasters, which remains to be synthesized. Aim of Review: To identify, critically appraise and synthesize the current evidence-base on the effectiveness of psychosocial interventions in preventing or treating Post-Traumatic Stress Disorder (PTSD), Major Depressive Disorder (MDD) and/or Generalized Anxiety Disorder (GAD) in adults and children who are survivors of natural disasters. Methods: A protocol was developed as a guide to carry out this review. A systematic search was conducted in eight international electronic databases, three grey literature databases, one dissertation and thesis repository, websites of six humanitarian and non-governmental organizations renowned for their work on natural disasters, as well as bibliographic and citation searching for eligible articles. Papers meeting the specific inclusion criteria underwent quality assessment using the Downs and Black checklist. Data were extracted from the included papers and analysed by way of narrative synthesis. Results: Database and website searching returned 3777 papers where 31 met the criteria for inclusion. Additional 2 papers were obtained through bibliographic and citation searching. Methodological quality of most papers was fair. Twenty-five studies evaluated psychological interventions, five, social interventions whereas three studies evaluated ‘mixed’ psychological and social interventions. All studies, irrespective of methodological quality, reported post-intervention reductions in symptom scores for PTSD, depression and/or anxiety and where assessed, reduced diagnosis of PTSD and MDD, and produced improvements in self-efficacy and quality of life. Statistically significant results were seen in 27 studies. However, three studies demonstrated that the evaluated interventions may not have been very beneficial. Conclusions: The overall positive results suggest that any psychosocial interventions are favourable and should be delivered to all natural disaster survivors, irrespective of age, country, and phase of disaster. Yet, heterogeneity and methodological shortcomings of the current evidence-base makes it difficult to draw definite conclusions needed to formulate categorical guidance or frameworks. Further, rigorously conducted research is needed in this area, although the feasibility of such, given the context and nature of the problem, is also recognized.

Keywords: psychosocial interventions, natural disasters, survivors, effectiveness

Procedia PDF Downloads 131
265 The “Bright Side” of COVID-19: Effects of Livestream Affordances on Consumer Purchase Willingness: Explicit IT Affordances Perspective

Authors: Isaac Owusu Asante, Yushi Jiang, Hailin Tao

Abstract:

Live streaming marketing, the new electronic commerce element, became an optional marketing channel following the COVID-19 pandemic. Many sellers have leveraged the features presented by live streaming to increase sales. Studies on live streaming have focused on gaming and consumers’ loyalty to brands through live streaming, using interview questionnaires. This study, however, was conducted to measure real-time observable interactions between consumers and sellers. Based on the affordance theory, this study conceptualized constructs representing the interactive features and examined how they drive consumers’ purchase willingness during live streaming sessions using 1238 datasets from Amazon Live, following the manual observation of transaction records. Using structural equation modeling, the ordinary least square regression suggests that live viewers, new followers, live chats, and likes positively affect purchase willingness. The Sobel and Monte Carlo tests show that new followers, live chats, and likes significantly mediate the relationship between live viewers and purchase willingness. The study introduces a new way of measuring interactions in live streaming commerce and proposes a way to manually gather data on consumer behaviors in live streaming platforms when the application programming interface (API) of such platforms does not support data mining algorithms.

Keywords: livestreaming marketing, live chats, live viewers, likes, new followers, purchase willingness

Procedia PDF Downloads 47
264 A Spatio-Temporal Analysis and Change Detection of Wetlands in Diamond Harbour, West Bengal, India Using Normalized Difference Water Index

Authors: Lopita Pal, Suresh V. Madha

Abstract:

Wetlands are areas of marsh, fen, peat land or water, whether natural or artificial, permanent or temporary, with water that is static or flowing, fresh, brackish or salt, including areas of marine water the depth of which at low tide does not exceed six metres. The rapidly expanding human population, large scale changes in land use/land cover, burgeoning development projects and improper use of watersheds all has caused a substantial decline of wetland resources in the world. Major degradations have been impacted from agricultural, industrial and urban developments leading to various types of pollutions and hydrological perturbations. Regular fishing activities and unsustainable grazing of animals are degrading the wetlands in a slow pace. The paper focuses on the spatio-temporal change detection of the area of the water body and the main cause of this depletion. The total area under study (22°19’87’’ N, 88°20’23’’ E) is a wetland region in West Bengal of 213 sq.km. The procedure used is the Normalized Difference Water Index (NDWI) from multi-spectral imagery and Landsat to detect the presence of surface water, and the datasets have been compared of the years 2016, 2006 and 1996. The result shows a sharp decline in the area of water body due to a rapid increase in the agricultural practices and the growing urbanization.

Keywords: spatio-temporal change, NDWI, urbanization, wetland

Procedia PDF Downloads 261
263 Advanced Data Visualization Techniques for Effective Decision-making in Oil and Gas Exploration and Production

Authors: Deepak Singh, Rail Kuliev

Abstract:

This research article explores the significance of advanced data visualization techniques in enhancing decision-making processes within the oil and gas exploration and production domain. With the oil and gas industry facing numerous challenges, effective interpretation and analysis of vast and diverse datasets are crucial for optimizing exploration strategies, production operations, and risk assessment. The article highlights the importance of data visualization in managing big data, aiding the decision-making process, and facilitating communication with stakeholders. Various advanced data visualization techniques, including 3D visualization, augmented reality (AR), virtual reality (VR), interactive dashboards, and geospatial visualization, are discussed in detail, showcasing their applications and benefits in the oil and gas sector. The article presents case studies demonstrating the successful use of these techniques in optimizing well placement, real-time operations monitoring, and virtual reality training. Additionally, the article addresses the challenges of data integration and scalability, emphasizing the need for future developments in AI-driven visualization. In conclusion, this research emphasizes the immense potential of advanced data visualization in revolutionizing decision-making processes, fostering data-driven strategies, and promoting sustainable growth and improved operational efficiency within the oil and gas exploration and production industry.

Keywords: augmented reality (AR), virtual reality (VR), interactive dashboards, real-time operations monitoring

Procedia PDF Downloads 58
262 Impact of Drought on Agriculture in the Upper Middle Gangetic Plain in India

Authors: Reshmita Nath

Abstract:

In this study, we investigate the spatiotemporal characteristics of drought in India and its impact on agriculture during the summer season (April to September). For our analysis, we have used Standardized Precipitation Evapotranspiration Index (SPEI) datasets between 1982 and 2012 at six-month timescale. Based on the criteria SPEI<-1 we obtain the vulnerability map and have found that the Humid subtropical Upper Middle Gangetic Plain (UMGP) region is highly drought prone with an occurrence frequency of 40-45%. This UMGP region contributes at least 18-20% of India’s annual cereal production. Not only the probability, but the region becomes more and more drought-prone in the recent decades. Moreover, the cereal production in the UMGP has experienced a gradual declining trend from 2000 onwards and this feature is consistent with the increase in drought affected areas from 20-25% to 50-60%, before and after 2000, respectively. The higher correlation coefficient (-0.69) between the changes in cereal production and drought affected areas confirms that at least 50% of the agricultural (cereal) losses is associated with drought. While analyzing the individual impact of precipitation and surface temperature anomalies on SPEI (6), we have found that in the UMGP region surface temperature plays the primary role in lowering of SPEI. The linkage is further confirmed by the correlation analysis between the SPEI (6) and surface temperature rise, which exhibits strong negative values in the UMGP region. Higher temperature might have caused more evaporation and drying, which therefore increases the area affected by drought in the recent decade.

Keywords: drought, agriculture, SPEI, Indo-Gangetic plain

Procedia PDF Downloads 231
261 A Measurement Instrument to Determine Curricula Competency of Licensure Track Graduate Psychotherapy Programs in the United States

Authors: Laith F. Gulli, Nicole M. Mallory

Abstract:

We developed a novel measurement instrument to assess Knowledge of Educational Programs in Professional Psychotherapy Programs (KEP-PPP or KEP-Triple P) within the United States. The instrument was designed by a Panel of Experts (PoE) that consisted of Licensed Psychotherapists and Medical Care Providers. Licensure track psychotherapy programs are listed in the databases of the Commission on Accreditation for Marriage and Family Therapy Education (COAMFTE); American Psychological Association (APA); Council on Social Work Education (CSWE); and the Council for Accreditation of Counseling & Related Educational Programs (CACREP). A complete list of psychotherapy programs can be obtained from these professional databases, selecting search fields of (All Programs) in (All States). Each program has a Web link that electronically and directly connects to the institutional program, which can be researched using the KEP-Triple P. The 29-item KEP Triple P was designed to consist of six categorical fields; Institutional Type: Degree: Educational Delivery: Accreditation: Coursework Competency: and Special Program Considerations. The KEP-Triple P was designed to determine whether a specific course(s) is offered in licensure track psychotherapy programs. The KEP-Triple P is designed to be modified to assess any part or the entire curriculum of licensure graduate programs. We utilized the KEP-Triple P instrument to study whether a graduate course in Addictions was offered in Marriage and Family Therapy (MFT) programs. Marriage and Family Therapists are likely to commonly encounter patients with Addiction(s) due to the broad treatment scope providing psychotherapy services to individuals, couples and families of all age groups. Our study of 124 MFT programs which concluded at the end of 2016 found that we were able to assess 61 % of programs (N = 76) since 27 % (N = 34) of programs were inaccessible due to broken Web links. From the total of all MFT programs 11 % (N = 14) did not have a published curriculum on their Institutional Web site. From the sample study, we found that 66 % (N = 50) of curricula did not offer a course in Addiction Treatment and that 34 % (N =26) of curricula did require a mandatory course in Addiction Treatment. From our study sample, we determined that 15 % (N = 11) of MFT doctorate programs did not require an Addictions Treatment course and that 1 % (N = 1) did require such a course. We found that 99 % of our study sample offered a Campus based program and 1 % offered a hybrid program with both online and residential components. From the total sample studied, we determined that 84 % of programs would be able to obtain reaccreditation within a five-year period. We recommend that MFT programs initiate procedures to revise curricula to include a required course in Addiction Treatment prior to their next accreditation cycle, to improve the escalating addiction crisis in the United States. This disparity in MFT curricula raises serious ethical and legal consideration for national and Federal stakeholders as well as for patients seeking a competently trained psychotherapist.

Keywords: addiction, competency, curriculum, psychotherapy

Procedia PDF Downloads 129
260 Comparative Evaluation of Accuracy of Selected Machine Learning Classification Techniques for Diagnosis of Cancer: A Data Mining Approach

Authors: Rajvir Kaur, Jeewani Anupama Ginige

Abstract:

With recent trends in Big Data and advancements in Information and Communication Technologies, the healthcare industry is at the stage of its transition from clinician oriented to technology oriented. Many people around the world die of cancer because the diagnosis of disease was not done at an early stage. Nowadays, the computational methods in the form of Machine Learning (ML) are used to develop automated decision support systems that can diagnose cancer with high confidence in a timely manner. This paper aims to carry out the comparative evaluation of a selected set of ML classifiers on two existing datasets: breast cancer and cervical cancer. The ML classifiers compared in this study are Decision Tree (DT), Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), Logistic Regression, Ensemble (Bagged Tree) and Artificial Neural Networks (ANN). The evaluation is carried out based on standard evaluation metrics Precision (P), Recall (R), F1-score and Accuracy. The experimental results based on the evaluation metrics show that ANN showed the highest-level accuracy (99.4%) when tested with breast cancer dataset. On the other hand, when these ML classifiers are tested with the cervical cancer dataset, Ensemble (Bagged Tree) technique gave better accuracy (93.1%) in comparison to other classifiers.

Keywords: artificial neural networks, breast cancer, classifiers, cervical cancer, f-score, machine learning, precision, recall

Procedia PDF Downloads 254
259 Human Immunodeficiency Virus (HIV) Test Predictive Modeling and Identify Determinants of HIV Testing for People with Age above Fourteen Years in Ethiopia Using Data Mining Techniques: EDHS 2011

Authors: S. Abera, T. Gidey, W. Terefe

Abstract:

Introduction: Testing for HIV is the key entry point to HIV prevention, treatment, and care and support services. Hence, predictive data mining techniques can greatly benefit to analyze and discover new patterns from huge datasets like that of EDHS 2011 data. Objectives: The objective of this study is to build a predictive modeling for HIV testing and identify determinants of HIV testing for adults with age above fourteen years using data mining techniques. Methods: Cross-Industry Standard Process for Data Mining (CRISP-DM) was used to predict the model for HIV testing and explore association rules between HIV testing and the selected attributes among adult Ethiopians. Decision tree, Naïve-Bayes, logistic regression and artificial neural networks of data mining techniques were used to build the predictive models. Results: The target dataset contained 30,625 study participants; of which 16, 515 (53.9%) were women. Nearly two-fifth; 17,719 (58%), have never been tested for HIV while the rest 12,906 (42%) had been tested. Ethiopians with higher wealth index, higher educational level, belonging 20 to 29 years old, having no stigmatizing attitude towards HIV positive person, urban residents, having HIV related knowledge, information about family planning on mass media and knowing a place where to get testing for HIV showed an increased patterns with respect to HIV testing. Conclusion and Recommendation: Public health interventions should consider the identified determinants to promote people to get testing for HIV.

Keywords: data mining, HIV, testing, ethiopia

Procedia PDF Downloads 468
258 Prediction of Outcome after Endovascular Thrombectomy for Anterior and Posterior Ischemic Stroke: ASPECTS on CT

Authors: Angela T. H. Kwan, Wenjun Liang, Jack Wellington, Mohammad Mofatteh, Thanh N. Nguyen, Pingzhong Fu, Juanmei Chen, Zile Yan, Weijuan Wu, Yongting Zhou, Shuiquan Yang, Sijie Zhou, Yimin Chen

Abstract:

Background: Endovascular Therapy (EVT)—in the form of mechanical thrombectomy—following intravenous thrombolysis is the standard gold treatment for patients with acute ischemic stroke (AIS) due to large vessel occlusion (LVO). It is well established that an ASPECTS ≥ 7 is associated with an increased likelihood of positive post-EVT outcomes, as compared to an ASPECTS < 7. There is also prognostic utility in coupling posterior circulation ASPECTS (pc-ASPECTS) with magnetic resonance imaging for evaluating the post-EVT functional outcome. However, the value of pc-ASPECTS applied to CT must be explored further to determine its usefulness in predicting functional outcomes following EVT. Objective: In this study, we aimed to determine whether pc-ASPECTS on CT can predict post-EVT functional outcomes among patients with AIS due to LVO. Methods: A total of 247 consecutive patients aged 18 and over receiving EVT for LVO-related AIS were recruited into a prospective database. The data were retrospectively analyzed between March 2019 to February 2022 from two comprehensive tertiary care stroke centers: Foshan Sanshui District People’s Hospital and First People's Hospital of Foshan in China. Patient parameters included EVT within 24hrs of symptom onset, premorbid modified Rankin Scale (mRS) ≤ 2, presence of distal and terminal cerebral blood vessel occlusion, and subsequent 24–72-hour post-stroke onset CT scan. Univariate comparisons were performed using the Fisher exact test or χ2 test for categorical variables and the Mann–Whitney U test for continuous variables. A p-value of ≤ 0.05 was statistically significant. Results: A total of 247 patients met the inclusion criteria; however, 3 were excluded due to the absence of post-CTs and 8 for pre-EVT ASPECTS < 7. Overall, 236 individuals were examined: 196 anterior circulation ischemic strokes and 40 posterior strokes of basilar artery occlusion. We found that both baseline post- and pc-ASPECTS ≥ 7 serve as strong positive markers of favorable outcomes at 90 days post-EVT. Moreover, lower rates of inpatient mortality/hospice discharge, 90-day mortality, and 90-day poor outcome were observed. Moreover, patients in the post-ASPECTS ≥ 7 anterior circulation group had shorter door-to-recanalization time (DRT), puncture-to-recanalization time (PRT), and last known normal-to-puncture-time (LKNPT). Conclusion: Patients of anterior and posterior circulation ischemic strokes with baseline post- and pc-ASPECTS ≥ 7 may benefit from EVT.

Keywords: endovascular therapy, thrombectomy, large vessel occlusion, cerebral ischemic stroke, ASPECTS

Procedia PDF Downloads 77
257 Detection of Atrial Fibrillation Using Wearables via Attentional Two-Stream Heterogeneous Networks

Authors: Huawei Bai, Jianguo Yao, Fellow, IEEE

Abstract:

Atrial fibrillation (AF) is the most common form of heart arrhythmia and is closely associated with mortality and morbidity in heart failure, stroke, and coronary artery disease. The development of single spot optical sensors enables widespread photoplethysmography (PPG) screening, especially for AF, since it represents a more convenient and noninvasive approach. To our knowledge, most existing studies based on public and unbalanced datasets can barely handle the multiple noises sources in the real world and, also, lack interpretability. In this paper, we construct a large- scale PPG dataset using measurements collected from PPG wrist- watch devices worn by volunteers and propose an attention-based two-stream heterogeneous neural network (TSHNN). The first stream is a hybrid neural network consisting of a three-layer one-dimensional convolutional neural network (1D-CNN) and two-layer attention- based bidirectional long short-term memory (Bi-LSTM) network to learn representations from temporally sampled signals. The second stream extracts latent representations from the PPG time-frequency spectrogram using a five-layer CNN. The outputs from both streams are fed into a fusion layer for the outcome. Visualization of the attention weights learned demonstrates the effectiveness of the attention mechanism against noise. The experimental results show that the TSHNN outperforms all the competitive baseline approaches and with 98.09% accuracy, achieves state-of-the-art performance.

Keywords: PPG wearables, atrial fibrillation, feature fusion, attention mechanism, hyber network

Procedia PDF Downloads 87
256 Predicting the Compressive Strength of Geopolymer Concrete Using Machine Learning Algorithms: Impact of Chemical Composition and Curing Conditions

Authors: Aya Belal, Ahmed Maher Eltair, Maggie Ahmed Mashaly

Abstract:

Geopolymer concrete is gaining recognition as a sustainable alternative to conventional Portland Cement concrete due to its environmentally friendly nature, which is a key goal for Smart City initiatives. It has demonstrated its potential as a reliable material for the design of structural elements. However, the production of Geopolymer concrete is hindered by batch-to-batch variations, which presents a significant challenge to the widespread adoption of Geopolymer concrete. To date, Machine learning has had a profound impact on various fields by enabling models to learn from large datasets and predict outputs accurately. This paper proposes an integration between the current drift to Artificial Intelligence and the composition of Geopolymer mixtures to predict their mechanical properties. This study employs Python software to develop machine learning model in specific Decision Trees. The research uses the percentage oxides and the chemical composition of the Alkali Solution along with the curing conditions as the input independent parameters, irrespective of the waste products used in the mixture yielding the compressive strength of the mix as the output parameter. The results showed 90 % agreement of the predicted values to the actual values having the ratio of the Sodium Silicate to the Sodium Hydroxide solution being the dominant parameter in the mixture.

Keywords: decision trees, geopolymer concrete, machine learning, smart cities, sustainability

Procedia PDF Downloads 56
255 Decision Making System for Clinical Datasets

Authors: P. Bharathiraja

Abstract:

Computer Aided decision making system is used to enhance diagnosis and prognosis of diseases and also to assist clinicians and junior doctors in clinical decision making. Medical Data used for decision making should be definite and consistent. Data Mining and soft computing techniques are used for cleaning the data and for incorporating human reasoning in decision making systems. Fuzzy rule based inference technique can be used for classification in order to incorporate human reasoning in the decision making process. In this work, missing values are imputed using the mean or mode of the attribute. The data are normalized using min-ma normalization to improve the design and efficiency of the fuzzy inference system. The fuzzy inference system is used to handle the uncertainties that exist in the medical data. Equal-width-partitioning is used to partition the attribute values into appropriate fuzzy intervals. Fuzzy rules are generated using Class Based Associative rule mining algorithm. The system is trained and tested using heart disease data set from the University of California at Irvine (UCI) Machine Learning Repository. The data was split using a hold out approach into training and testing data. From the experimental results it can be inferred that classification using fuzzy inference system performs better than trivial IF-THEN rule based classification approaches. Furthermore it is observed that the use of fuzzy logic and fuzzy inference mechanism handles uncertainty and also resembles human decision making. The system can be used in the absence of a clinical expert to assist junior doctors and clinicians in clinical decision making.

Keywords: decision making, data mining, normalization, fuzzy rule, classification

Procedia PDF Downloads 492
254 Comparison Between Bispectral Index Guided Anesthesia and Standard Anesthesia Care in Middle Age Adult Patients Undergoing Modified Radical Mastectomy

Authors: Itee Chowdhury, Shikha Modi

Abstract:

Introduction: Cancer is beginning to outpace cardiovascular disease as a cause of death affecting every major organ system with profound implications for perioperative management. Breast cancer is the most common cancer in women in India, accounting for 27% of all cancers. The small changes in analgesic management of cancer patients can greatly improve prognosis and reduce the risk of postsurgical cancer recurrence as opioid-based analgesia has a deleterious effect on cancer outcomes. Shortened postsurgical recovery time facilitates earlier return to intended oncological therapy maximising the chance of successful treatment. Literature reveals that the role of BIS since FDA approval has been assessed in various types of surgeries, but clinical data on its use in oncosurgical patients are scanty. Our study focuses on the role of BIS-guided anaesthesia for breast cancer surgery patients. Methods: A prospective randomized controlled study in patients aged 36-55years scheduled for modified radical mastectomy was conducted in 51 patients in each group who met the inclusion and exclusion criteria, and randomization was done by sealed envelope technique. In BIS guided anaesthesia group (B), sevoflurane was titrated to keep the BIS value 45-60, and thereafter if the patient showed hypertension/tachycardia, an opioid was given. In standard anaesthesia care (group C), sevoflurane was titrated to keep MAC in the range of 0.8-1, and fentanyl was given if the patient showed hypertension/tachycardia. Intraoperative opioid consumption was calculated. Postsurgery recovery characteristics, including Aldrete score, were assessed. Patients were questioned for pain, PONV, and recall of the intraoperative event. A comparison of age, BMI, ASA, recovery characteristics, opioid, and VAS score was made using the non-parametric Mann-Whitney U test. Categorical data like intraoperative awareness of surgery and PONV was studied using the Chi-square test. A comparison of heart rate and MAP was made by an independent sample t-test. #ggplot2 package was used to show the trend of the BIS index for all intraoperative time points for each patient. For a statistical test of significance, the cut-off p-value was set as <0.05. Conclusions: BIS monitoring led to reduced opioid consumption and early recovery from anaesthesia in breast cancer patients undergoing MRM resulting in less postoperative nausea and vomiting and less pain intensity in the immediate postoperative period without any recall of the intraoperative event. Thus, the use of a Bispectral index monitor allows for tailoring of anaesthesia administration with a good outcome.

Keywords: bispectral index, depth of anaesthesia, recovery, opioid consumption

Procedia PDF Downloads 101
253 Alpha: A Groundbreaking Avatar Merging User Dialogue with OpenAI's GPT-3.5 for Enhanced Reflective Thinking

Authors: Jonas Colin

Abstract:

Standing at the vanguard of AI development, Alpha represents an unprecedented synthesis of logical rigor and human abstraction, meticulously crafted to mirror the user's unique persona and personality, a feat previously unattainable in AI development. Alpha, an avant-garde artefact in the realm of artificial intelligence, epitomizes a paradigmatic shift in personalized digital interaction, amalgamating user-specific dialogic patterns with the sophisticated algorithmic prowess of OpenAI's GPT-3.5 to engender a platform for enhanced metacognitive engagement and individualized user experience. Underpinned by a sophisticated algorithmic framework, Alpha integrates vast datasets through a complex interplay of neural network models and symbolic AI, facilitating a dynamic, adaptive learning process. This integration enables the system to construct a detailed user profile, encompassing linguistic preferences, emotional tendencies, and cognitive styles, tailoring interactions to align with individual characteristics and conversational contexts. Furthermore, Alpha incorporates advanced metacognitive elements, enabling real-time reflection and adaptation in communication strategies. This self-reflective capability ensures continuous refinement of its interaction model, positioning Alpha not just as a technological marvel but as a harbinger of a new era in human-computer interaction, where machines engage with us on a deeply personal and cognitive level, transforming our interaction with the digital world.

Keywords: chatbot, GPT 3.5, metacognition, symbiose

Procedia PDF Downloads 35
252 Prevalence of Work-Related Musculoskeletal Disorder among Dental Personnel in Perak

Authors: Nursyafiq Ali Shibramulisi, Nor Farah Fauzi, Nur Azniza Zawin Anuar, Nurul Atikah Azmi, Janice Hew Pei Fang

Abstract:

Background: Work related musculoskeletal disorders (WRMD) among dental personnel have been underestimated and under-reported worldwide and specifically in Malaysia. The problem will arise and progress slowly over time, as it results from accumulated injury throughout the period of work. Several risk factors, such as repetitive movement, static posture, vibration, and adapting poor working postures, have been identified to be contributing to WRMSD in dental practices. Dental personnel is at higher risk of getting this problem as it is their working nature and core business. This would cause pain and dysfunction syndrome among them and result in absence from work and substandard services to their patients. Methodology: A cross-sectional study involving 19 government dental clinics in Perak was done over the period of 3 months. Those who met the criteria were selected to participate in this study. Malay version of the Self-Reported Nordic Musculoskeletal Discomfort Form was used to identify the prevalence of WRMSD, while the intensity of pain in the respective regions was evaluated using a 10-point scale according to ‘Pain as The 5ᵗʰ Vital Sign’ by MOH Malaysia and later on were analyzed using SPSS version 25. Descriptive statistics, including mean and SD and median and IQR, were used for numerical data. Categorical data were described by percentage. Pearson’s Chi-Square Test and Spearman’s Correlation were used to find the association between the prevalence of WRMSD and other socio-demographic data. Results: 159 dentists, 73 dental therapists, 26 dental lab technicians, 81 dental surgery assistants, and 23 dental attendants participated in this study. The mean age for the participants was 34.9±7.4 and their mean years of service was 9.97±7.5. Most of them were female (78.5%), Malay (71.3%), married (69.6%) and right-handed (90.1%). The highest prevalence of WRMSD was neck (58.0%), followed by shoulder (48.1%), upper back (42.0%), lower back (40.6%), hand/wrist (31.5%), feet (21.3%), knee (12.2%), thigh 7.7%) and lastly elbow (6.9%). Most of those who reported having neck pain scaled their pain experiences at 2 out of 10 (19.5%), while for those who suffered upper back discomfort, most of them scaled their pain experience at 6 out of 10 (17.8%). It was found that there was a significant relationship between age and pain at neck (p=0.007), elbow (p=0.027), lower back (p=0.032), thigh (p=0.039), knee (p=0.001) and feet (p=0.000) regions. Job position also had been found to be having a significant relationship with pain experienced at the lower back (p=0.018), thigh (p=0.011), knee, and feet (p=0.000). Conclusion: The prevalence of WRMSD among dental personnel in Perak was found to be high. Age and job position were found to be having a significant relationship with pain experienced in several regions. Intervention programs should be planned and conducted to prevent and reduce the occurrence of WRMSD, as all harmful or unergonomic practices should be avoided at all costs.

Keywords: WRMSD, ergonomic, dentistry, dental

Procedia PDF Downloads 69
251 Web Proxy Detection via Bipartite Graphs and One-Mode Projections

Authors: Zhipeng Chen, Peng Zhang, Qingyun Liu, Li Guo

Abstract:

With the Internet becoming the dominant channel for business and life, many IPs are increasingly masked using web proxies for illegal purposes such as propagating malware, impersonate phishing pages to steal sensitive data or redirect victims to other malicious targets. Moreover, as Internet traffic continues to grow in size and complexity, it has become an increasingly challenging task to detect the proxy service due to their dynamic update and high anonymity. In this paper, we present an approach based on behavioral graph analysis to study the behavior similarity of web proxy users. Specifically, we use bipartite graphs to model host communications from network traffic and build one-mode projections of bipartite graphs for discovering social-behavior similarity of web proxy users. Based on the similarity matrices of end-users from the derived one-mode projection graphs, we apply a simple yet effective spectral clustering algorithm to discover the inherent web proxy users behavior clusters. The web proxy URL may vary from time to time. Still, the inherent interest would not. So, based on the intuition, by dint of our private tools implemented by WebDriver, we examine whether the top URLs visited by the web proxy users are web proxies. Our experiment results based on real datasets show that the behavior clusters not only reduce the number of URLs analysis but also provide an effective way to detect the web proxies, especially for the unknown web proxies.

Keywords: bipartite graph, one-mode projection, clustering, web proxy detection

Procedia PDF Downloads 223
250 Using Risk Management Indicators in Decision Tree Analysis

Authors: Adel Ali Elshaibani

Abstract:

Risk management indicators augment the reporting infrastructure, particularly for the board and senior management, to identify, monitor, and manage risks. This enhancement facilitates improved decision-making throughout the banking organization. Decision tree analysis is a tool that visually outlines potential outcomes, costs, and consequences of complex decisions. It is particularly beneficial for analyzing quantitative data and making decisions based on numerical values. By calculating the expected value of each outcome, decision tree analysis can help assess the best course of action. In the context of banking, decision tree analysis can assist lenders in evaluating a customer’s creditworthiness, thereby preventing losses. However, applying these tools in developing countries may face several limitations, such as data availability, lack of technological infrastructure and resources, lack of skilled professionals, cultural factors, and cost. Moreover, decision trees can create overly complex models that do not generalize well to new data, known as overfitting. They can also be sensitive to small changes in the data, which can result in different tree structures and can become computationally expensive when dealing with large datasets. In conclusion, while risk management indicators and decision tree analysis are beneficial for decision-making in banks, their effectiveness is contingent upon how they are implemented and utilized by the board of directors, especially in the context of developing countries. It’s important to consider these limitations when planning to implement these tools in developing countries.

Keywords: risk management indicators, decision tree analysis, developing countries, board of directors, bank performance, risk management strategy, banking institutions

Procedia PDF Downloads 35
249 Heterogeneity of Soil Moisture and Its Impacts on the Mountainous Watershed Hydrology in Northwest China

Authors: Chansheng He, Zhongfu Wang, Xiao Bai, Jie Tian, Xin Jin

Abstract:

Heterogeneity of soil hydraulic properties directly affects hydrological processes at different scales. Understanding heterogeneity of soil hydraulic properties such as soil moisture is therefore essential for modeling watershed ecohydrological processes, particularly in hard to access, topographically complex mountainous watersheds. This study maps spatial variations of soil moisture by in situ observation network that consists of sampling points, zones, and tributaries, and monitors corresponding hydrological variables of air and soil temperatures, evapotranspiration, infiltration, and runoff in the Upper Reach of the Heihe River Watershed, a second largest inland river (terminal lake) with a drainage area of over 128,000 km² in Northwest China. Subsequently, the study uses a hydrological model, SWAT (Soil and Water Assessment Tool) to simulate the effects of heterogeneity of soil moisture on watershed hydrological processes. The spatial clustering method, Full-Order-CLK was employed to derive five soil heterogeneous zones (Configuration 97, 80, 65, 40, and 20) for soil input to SWAT. Results show the simulations by the SWAT model with the spatially clustered soil hydraulic information from the field sampling data had much better representation of the soil heterogeneity and more accurate performance than the model using the average soil property values for each soil type derived from the coarse soil datasets. Thus, incorporating detailed field sampling soil heterogeneity data greatly improves performance in hydrologic modeling.

Keywords: heterogeneity, soil moisture, SWAT, up-scaling

Procedia PDF Downloads 321
248 The Effect of Fish and Krill Oil on Warfarin Control

Authors: Rebecca Pryce, Nijole Bernaitis, Andrew K. Davey, Shailendra Anoopkumar-Dukie

Abstract:

Background: Warfarin is an oral anticoagulant widely used in the prevention of strokes in patients with atrial fibrillation (AF) and in the treatment and prevention of deep vein thrombosis (DVT). Regular monitoring of Internationalised Normalised Ratio (INR) is required to ensure therapeutic benefit with time in therapeutic range (TTR) used to measure warfarin control. A number of factors influence TTR including diet, concurrent illness, and drug interactions. Extensive literature exists regarding the effect of conventional medicines on warfarin control, but documented interactions relating to complementary medicines are limited. It has been postulated that fish oil and krill oil supplementation may affect warfarin due to their association with bleeding events. However, to date little is known as to whether fish and krill oil significantly alter the incidence of bleeding with warfarin or impact on warfarin control. Aim:To assess the influence of fish oil and krill oil supplementation on warfarin control in AF and DVT patients by determining the influence of these supplements on TTR and bleeding events. Methods:A retrospective cohort analysis was conducted utilising patient information from a large private pathology practice in Queensland. AF and DVT patients receiving warfarin management by the pathology practice were identified and their TTR calculated using the Rosendaal method. Concurrent medications were analysed and patients taking no other interacting medicines were identified and divided according to users of fish oil and krill oil supplements and those taking no supplements. Study variables included TTR and the incidence of bleeding with exclusion criteria being less than 30 days of treatment with warfarin. Subject characteristics were reported as the mean and standard deviation for continuous data and number and percentages for nominal or categorical data. Data was analysed using GraphPad InStat Version 3 with a p value of <0.05 considered to be statistically significant. Results:Of the 2081 patients assessed for inclusion into this study, a total of 573 warfarin users met the inclusion criteria. Of these, 416 (72.6%) patients were AF patients and 157 (27.4%) DVT patients and overall there were 316 (55.1%) male and 257 (44.9%) female patients. 145 patients were included in the fish oil/krill oil group (supplement) and 428 were included in the control group. The mean TTR of supplement users was 86.9% and for the control group 84.7% with no significant difference between these groups. Control patients experienced 1.6 times the number of minor bleeds per person compared to supplement patients and 1.2 times the number of major bleeds per person. However, this was not statistically significant nor was the comparison between thrombotic events. Conclusion: No significant difference was found between supplement and control patients in terms of mean TTR, the number of bleeds and thrombotic events. Fish oil and krill oil supplements when used concurrently with warfarin do not significantly affect warfarin control as measured by TTR and bleeding incidence.

Keywords: atrial fibrillation, deep vein thormbosis, fish oil, krill oil, warfarin

Procedia PDF Downloads 271
247 Extracellular Enzymes as Promising Soil Health Indicators: Assessing Response to Different Land Uses Using Long-Term Experiments

Authors: Munisath Khandoker, Stephan Haefele, Andy Gregory

Abstract:

Extracellular enzymes play a key role in soil organic carbon (SOC) decomposition and nutrient cycling and are known indicators for soil health; however, it is not understood how these enzymes respond to different land uses and their relationships to other soil properties have not been extensively reviewed. The relationships among the activities of three soil enzymes: β-glucosaminidase (NAG), phosphomonoesterase (PHO) and β-glucosidase (GLU), were examined. The impact of soil organic amendments, soil types and land management on soil enzyme activities were reviewed, and it was hypothesized that soils with increased SOC have increased enzyme activity. Long-term experiments at Rothamsted Research Woburn and Harpenden sites in the UK were used to evaluate how different management practices affect enzyme activity involved in carbon (C) and nitrogen (N) cycling in the soil. Samples were collected from soils with different organic treatments such as straw, farmyard manure (FYM), compost additions, cover crops and permanent grass cover to assess whether SOC can be linked with increased levels of enzymatic activity and what influence, if any, enzymatic activity has on total C and N in the soil. Investigating the interactions of important enzymes with soil characteristics and SOC can help to better understand the health of soils. Studies on long-term experiments with known histories and large datasets can better help with this. SOC tends to decrease during land use changes from natural ecosystems to agricultural systems; therefore, it is imperative that agricultural lands find ways to increase and/or maintain SOC in the soil.

Keywords: biological soil health indicators, extracellular enzymes, soil health, soil, microbiology

Procedia PDF Downloads 48
246 A Convolutional Neural Network-Based Model for Lassa fever Virus Prediction Using Patient Blood Smear Image

Authors: A. M. John-Otumu, M. M. Rahman, M. C. Onuoha, E. P. Ojonugwa

Abstract:

A Convolutional Neural Network (CNN) model for predicting Lassa fever was built using Python 3.8.0 programming language, alongside Keras 2.2.4 and TensorFlow 2.6.1 libraries as the development environment in order to reduce the current high risk of Lassa fever in West Africa, particularly in Nigeria. The study was prompted by some major flaws in existing conventional laboratory equipment for diagnosing Lassa fever (RT-PCR), as well as flaws in AI-based techniques that have been used for probing and prognosis of Lassa fever based on literature. There were 15,679 blood smear microscopic image datasets collected in total. The proposed model was trained on 70% of the dataset and tested on 30% of the microscopic images in avoid overfitting. A 3x3x3 convolution filter was also used in the proposed system to extract features from microscopic images. The proposed CNN-based model had a recall value of 96%, a precision value of 93%, an F1 score of 95%, and an accuracy of 94% in predicting and accurately classifying the images into clean or infected samples. Based on empirical evidence from the results of the literature consulted, the proposed model outperformed other existing AI-based techniques evaluated. If properly deployed, the model will assist physicians, medical laboratory scientists, and patients in making accurate diagnoses for Lassa fever cases, allowing the mortality rate due to the Lassa fever virus to be reduced through sound decision-making.

Keywords: artificial intelligence, ANN, blood smear, CNN, deep learning, Lassa fever

Procedia PDF Downloads 88
245 Progress in Combining Image Captioning and Visual Question Answering Tasks

Authors: Prathiksha Kamath, Pratibha Jamkhandi, Prateek Ghanti, Priyanshu Gupta, M. Lakshmi Neelima

Abstract:

Combining Image Captioning and Visual Question Answering (VQA) tasks have emerged as a new and exciting research area. The image captioning task involves generating a textual description that summarizes the content of the image. VQA aims to answer a natural language question about the image. Both these tasks include computer vision and natural language processing (NLP) and require a deep understanding of the content of the image and semantic relationship within the image and the ability to generate a response in natural language. There has been remarkable growth in both these tasks with rapid advancement in deep learning. In this paper, we present a comprehensive review of recent progress in combining image captioning and visual question-answering (VQA) tasks. We first discuss both image captioning and VQA tasks individually and then the various ways in which both these tasks can be integrated. We also analyze the challenges associated with these tasks and ways to overcome them. We finally discuss the various datasets and evaluation metrics used in these tasks. This paper concludes with the need for generating captions based on the context and captions that are able to answer the most likely asked questions about the image so as to aid the VQA task. Overall, this review highlights the significant progress made in combining image captioning and VQA, as well as the ongoing challenges and opportunities for further research in this exciting and rapidly evolving field, which has the potential to improve the performance of real-world applications such as autonomous vehicles, robotics, and image search.

Keywords: image captioning, visual question answering, deep learning, natural language processing

Procedia PDF Downloads 53