Search results for: imbalanced datasets
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 750

Search results for: imbalanced datasets

600 Data Augmentation for Early-Stage Lung Nodules Using Deep Image Prior and Pix2pix

Authors: Qasim Munye, Juned Islam, Haseeb Qureshi, Syed Jung

Abstract:

Lung nodules are commonly identified in computed tomography (CT) scans by experienced radiologists at a relatively late stage. Early diagnosis can greatly increase survival. We propose using a pix2pix conditional generative adversarial network to generate realistic images simulating early-stage lung nodule growth. We have applied deep images prior to 2341 slices from 895 computed tomography (CT) scans from the Lung Image Database Consortium (LIDC) dataset to generate pseudo-healthy medical images. From these images, 819 were chosen to train a pix2pix network. We observed that for most of the images, the pix2pix network was able to generate images where the nodule increased in size and intensity across epochs. To evaluate the images, 400 generated images were chosen at random and shown to a medical student beside their corresponding original image. Of these 400 generated images, 384 were defined as satisfactory - meaning they resembled a nodule and were visually similar to the corresponding image. We believe that this generated dataset could be used as training data for neural networks to detect lung nodules at an early stage or to improve the accuracy of such networks. This is particularly significant as datasets containing the growth of early-stage nodules are scarce. This project shows that the combination of deep image prior and generative models could potentially open the door to creating larger datasets than currently possible and has the potential to increase the accuracy of medical classification tasks.

Keywords: medical technology, artificial intelligence, radiology, lung cancer

Procedia PDF Downloads 46
599 Lost Maritime Culture in the Netherlands: Linking Material and Immaterial Datasets for a Modern Day Perception of the Late Medieval Maritime Cultural Landscape of the Zuiderzee Region

Authors: Y. T. van Popta

Abstract:

This paper focuses on the never thoroughly examined yet in native relevant late medieval maritime cultural landscape of the former Zuiderzee (A.D. 1170-1932) in the center part of the Netherlands. Especially the northeastern part of the region, nowadays known as the Noordoostpolder, testifies of the dynamic battle of the Dutch against the water. This highly dynamic maritime region developed from a lake district into a sea and eventually into a polder. By linking physical and cognitive datasets from the Noordoostpol-der region in a spatial environment, new information on a late medieval maritime culture is brought to light, giving the opportunity to: (i) create a modern day perception on the late medieval maritime cultural landscape of the region and (ii) to underline the value of interdisciplinary and spatial research in maritime archaeology in general. Since the large scale reclamations of the region (A.D. 1932-1968), many remains have been discovered of a drowned and eroded late medieval maritime culture, represented by lost islands, drowned settlements, cultivated lands, shipwrecks and socio-economic networks. Recent archaeological research has proved the existence of this late medieval maritime culture by the discovery of the remains of the drowned settlement Fenehuysen (Veenhuizen) and its surroundings. The fact that this settlement and its cultivated surroundings remained hidden for so long proves that a large part of the maritime cultural landscape is ‘invisible’ and can only be found by extensive interdisciplinary research.

Keywords: drowned settlements, late middle ages, lost islands, maritime cultural landscape, the Netherlands

Procedia PDF Downloads 181
598 Using Machine Learning to Build a Real-Time COVID-19 Mask Safety Monitor

Authors: Yash Jain

Abstract:

The US Center for Disease Control has recommended wearing masks to slow the spread of the virus. The research uses a video feed from a camera to conduct real-time classifications of whether or not a human is correctly wearing a mask, incorrectly wearing a mask, or not wearing a mask at all. Utilizing two distinct datasets from the open-source website Kaggle, a mask detection network had been trained. The first dataset that was used to train the model was titled 'Face Mask Detection' on Kaggle, where the dataset was retrieved from and the second dataset was titled 'Face Mask Dataset, which provided the data in a (YOLO Format)' so that the TinyYoloV3 model could be trained. Based on the data from Kaggle, two machine learning models were implemented and trained: a Tiny YoloV3 Real-time model and a two-stage neural network classifier. The two-stage neural network classifier had a first step of identifying distinct faces within the image, and the second step was a classifier to detect the state of the mask on the face and whether it was worn correctly, incorrectly, or no mask at all. The TinyYoloV3 was used for the live feed as well as for a comparison standpoint against the previous two-stage classifier and was trained using the darknet neural network framework. The two-stage classifier attained a mean average precision (MAP) of 80%, while the model trained using TinyYoloV3 real-time detection had a mean average precision (MAP) of 59%. Overall, both models were able to correctly classify stages/scenarios of no mask, mask, and incorrectly worn masks.

Keywords: datasets, classifier, mask-detection, real-time, TinyYoloV3, two-stage neural network classifier

Procedia PDF Downloads 132
597 Incorporating Lexical-Semantic Knowledge into Convolutional Neural Network Framework for Pediatric Disease Diagnosis

Authors: Xiaocong Liu, Huazhen Wang, Ting He, Xiaozheng Li, Weihan Zhang, Jian Chen

Abstract:

The utilization of electronic medical record (EMR) data to establish the disease diagnosis model has become an important research content of biomedical informatics. Deep learning can automatically extract features from the massive data, which brings about breakthroughs in the study of EMR data. The challenge is that deep learning lacks semantic knowledge, which leads to impracticability in medical science. This research proposes a method of incorporating lexical-semantic knowledge from abundant entities into a convolutional neural network (CNN) framework for pediatric disease diagnosis. Firstly, medical terms are vectorized into Lexical Semantic Vectors (LSV), which are concatenated with the embedded word vectors of word2vec to enrich the feature representation. Secondly, the semantic distribution of medical terms serves as Semantic Decision Guide (SDG) for the optimization of deep learning models. The study evaluate the performance of LSV-SDG-CNN model on four kinds of Chinese EMR datasets. Additionally, CNN, LSV-CNN, and SDG-CNN are designed as baseline models for comparison. The experimental results show that LSV-SDG-CNN model outperforms baseline models on four kinds of Chinese EMR datasets. The best configuration of the model yielded an F1 score of 86.20%. The results clearly demonstrate that CNN has been effectively guided and optimized by lexical-semantic knowledge, and LSV-SDG-CNN model improves the disease classification accuracy with a clear margin.

Keywords: convolutional neural network, electronic medical record, feature representation, lexical semantics, semantic decision

Procedia PDF Downloads 111
596 Enhanced Multi-Scale Feature Extraction Using a DCNN by Proposing Dynamic Soft Margin SoftMax for Face Emotion Detection

Authors: Armin Nabaei, M. Omair Ahmad, M. N. S. Swamy

Abstract:

Many facial expression and emotion recognition methods in the traditional approaches of using LDA, PCA, and EBGM have been proposed. In recent years deep learning models have provided a unique platform addressing by automatically extracting the features for the detection of facial expression and emotions. However, deep networks require large training datasets to extract automatic features effectively. In this work, we propose an efficient emotion detection algorithm using face images when only small datasets are available for training. We design a deep network whose feature extraction capability is enhanced by utilizing several parallel modules between the input and output of the network, each focusing on the extraction of different types of coarse features with fined grained details to break the symmetry of produced information. In fact, we leverage long range dependencies, which is one of the main drawback of CNNs. We develop this work by introducing a Dynamic Soft-Margin SoftMax.The conventional SoftMax suffers from reaching to gold labels very soon, which take the model to over-fitting. Because it’s not able to determine adequately discriminant feature vectors for some variant class labels. We reduced the risk of over-fitting by using a dynamic shape of input tensor instead of static in SoftMax layer with specifying a desired Soft- Margin. In fact, it acts as a controller to how hard the model should work to push dissimilar embedding vectors apart. For the proposed Categorical Loss, by the objective of compacting the same class labels and separating different class labels in the normalized log domain.We select penalty for those predictions with high divergence from ground-truth labels.So, we shorten correct feature vectors and enlarge false prediction tensors, it means we assign more weights for those classes with conjunction to each other (namely, “hard labels to learn”). By doing this work, we constrain the model to generate more discriminate feature vectors for variant class labels. Finally, for the proposed optimizer, our focus is on solving weak convergence of Adam optimizer for a non-convex problem. Our noteworthy optimizer is working by an alternative updating gradient procedure with an exponential weighted moving average function for faster convergence and exploiting a weight decay method to help drastically reducing the learning rate near optima to reach the dominant local minimum. We demonstrate the superiority of our proposed work by surpassing the first rank of three widely used Facial Expression Recognition datasets with 93.30% on FER-2013, and 16% improvement compare to the first rank after 10 years, reaching to 90.73% on RAF-DB, and 100% k-fold average accuracy for CK+ dataset, and shown to provide a top performance to that provided by other networks, which require much larger training datasets.

Keywords: computer vision, facial expression recognition, machine learning, algorithms, depp learning, neural networks

Procedia PDF Downloads 54
595 Tractography Analysis of the Evolutionary Origin of Schizophrenia

Authors: Asmaa Tahiri, Mouktafi Amine

Abstract:

A substantial number of traditional medical research has been put forward to managing and treating mental disorders. At the present time, to our best knowledge, it is believed that fundamental understanding of the underlying causes of the majority psychological disorders needs to be explored further to inform early diagnosis, managing symptoms and treatment. The emerging field of evolutionary psychology is a promising prospect to address the origin of mental disorders, potentially leading to more effective treatments. Schizophrenia as a topical mental disorder has been linked to the evolutionary adaptation of the human brain represented in the brain connectivity and asymmetry directly linked to humans higher brain cognition in contrast to other primates being our direct living representation of the structure and connectivity of our earliest common African ancestors. As proposed in the evolutionary psychology scientific literature the pathophysiology of schizophrenia is expressed and directly linked to altered connectivity between the Hippocampal Formation (HF) and Dorsolateral Prefrontal Cortex (DLPFC). This research paper presents the results of the use of tractography analysis using multiple open access Diffusion Weighted Imaging (DWI) datasets of healthy subjects, schizophrenia-affected subjects and primates to illustrate the relevance of the aforementioned brain regions connectivity and the underlying evolutionary changes in the human brain. Deterministic fiber tracking and streamline analysis were used to generate connectivity matrices from the DWI datasets overlaid to compute distances and highlight disconnectivity patterns in conjunction with other fiber tracking metrics; Fractional Anisotropy (FA), Mean Diffusivity (MD) and Radial Diffusivity (RD).

Keywords: tractography, evolutionary psychology, schizophrenia, brain connectivity

Procedia PDF Downloads 40
594 Differentially Expressed Genes in Atopic Dermatitis: Bioinformatics Analysis Of Pooled Microarray Gene Expression Datasets In Gene Expression Omnibus

Authors: Danna Jia, Bin Li

Abstract:

Background: Atopic dermatitis (AD) is a chronic and refractory inflammatory skin disease characterized by relapsing eczematous and pruritic skin lesions. The global prevalence of AD ranges from 1~ 20%, and its incidence rates are increasing. It affects individuals from infancy to adulthood, significantly impacting their daily lives and social activities. Despite its major health burden, the precise mechanisms underlying AD remain unknown. Understanding the genetic differences associated with AD is crucial for advancing diagnosis and targeted treatment development. This study aims to identify candidate genes of AD by using bioinformatics analysis. Methods: We conducted a comprehensive analysis of four pooled transcriptomic datasets (GSE16161, GSE32924, GSE130588, and GSE120721) obtained from the Gene Expression Omnibus (GEO) database. Differential gene expression analysis was performed using the R statistical language. The differentially expressed genes (DEGs) between AD patients and normal individuals were functionally analyzed using Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment. Furthermore, a protein-protein interaction (PPI) network was constructed to identify candidate genes. Results: Among the patient-level gene expression datasets, we identified 114 shared DEGs, consisting of 53 upregulated genes and 61 downregulated genes. Functional analysis using GO and KEGG revealed that the DEGs were mainly associated with the negative regulation of transcription from RNA polymerase II promoter, membrane-related functions, protein binding, and the Human papillomavirus infection pathway. Through the PPI network analysis, we identified eight core genes: CD44, STAT1, HMMR, AURKA, MKI67, and SMARCA4. Conclusion: This study elucidates key genes associated with AD, providing potential targets for diagnosis and treatment. The identified genes have the potential to contribute to the understanding and management of AD. The bioinformatics analysis conducted in this study offers new insights and directions for further research on AD. Future studies can focus on validating the functional roles of these genes and exploring their therapeutic potential in AD. While these findings will require further verification as achieved with experiments involving in vivo and in vitro models, these results provided some initial insights into dysfunctional inflammatory and immune responses associated with AD. Such information offers the potential to develop novel therapeutic targets for use in preventing and treating AD.

Keywords: atopic dermatitis, bioinformatics, biomarkers, genes

Procedia PDF Downloads 55
593 Shark Detection and Classification with Deep Learning

Authors: Jeremy Jenrette, Z. Y. C. Liu, Pranav Chimote, Edward Fox, Trevor Hastie, Francesco Ferretti

Abstract:

Suitable shark conservation depends on well-informed population assessments. Direct methods such as scientific surveys and fisheries monitoring are adequate for defining population statuses, but species-specific indices of abundance and distribution coming from these sources are rare for most shark species. We can rapidly fill these information gaps by boosting media-based remote monitoring efforts with machine learning and automation. We created a database of shark images by sourcing 24,546 images covering 219 species of sharks from the web application spark pulse and the social network Instagram. We used object detection to extract shark features and inflate this database to 53,345 images. We packaged object-detection and image classification models into a Shark Detector bundle. We developed the Shark Detector to recognize and classify sharks from videos and images using transfer learning and convolutional neural networks (CNNs). We applied these models to common data-generation approaches of sharks: boosting training datasets, processing baited remote camera footage and online videos, and data-mining Instagram. We examined the accuracy of each model and tested genus and species prediction correctness as a result of training data quantity. The Shark Detector located sharks in baited remote footage and YouTube videos with an average accuracy of 89\%, and classified located subjects to the species level with 69\% accuracy (n =\ eight species). The Shark Detector sorted heterogeneous datasets of images sourced from Instagram with 91\% accuracy and classified species with 70\% accuracy (n =\ 17 species). Data-mining Instagram can inflate training datasets and increase the Shark Detector’s accuracy as well as facilitate archiving of historical and novel shark observations. Base accuracy of genus prediction was 68\% across 25 genera. The average base accuracy of species prediction within each genus class was 85\%. The Shark Detector can classify 45 species. All data-generation methods were processed without manual interaction. As media-based remote monitoring strives to dominate methods for observing sharks in nature, we developed an open-source Shark Detector to facilitate common identification applications. Prediction accuracy of the software pipeline increases as more images are added to the training dataset. We provide public access to the software on our GitHub page.

Keywords: classification, data mining, Instagram, remote monitoring, sharks

Procedia PDF Downloads 95
592 Fake News Detection Based on Fusion of Domain Knowledge and Expert Knowledge

Authors: Yulan Wu

Abstract:

The spread of fake news on social media has posed significant societal harm to the public and the nation, with its threats spanning various domains, including politics, economics, health, and more. News on social media often covers multiple domains, and existing models studied by researchers and relevant organizations often perform well on datasets from a single domain. However, when these methods are applied to social platforms with news spanning multiple domains, their performance significantly deteriorates. Existing research has attempted to enhance the detection performance of multi-domain datasets by adding single-domain labels to the data. However, these methods overlook the fact that a news article typically belongs to multiple domains, leading to the loss of domain knowledge information contained within the news text. To address this issue, research has found that news records in different domains often use different vocabularies to describe their content. In this paper, we propose a fake news detection framework that combines domain knowledge and expert knowledge. Firstly, it utilizes an unsupervised domain discovery module to generate a low-dimensional vector for each news article, representing domain embeddings, which can retain multi-domain knowledge of the news content. Then, a feature extraction module uses the domain embeddings discovered through unsupervised domain knowledge to guide multiple experts in extracting news knowledge for the total feature representation. Finally, a classifier is used to determine whether the news is fake or not. Experiments show that this approach can improve multi-domain fake news detection performance while reducing the cost of manually labeling domain labels.

Keywords: fake news, deep learning, natural language processing, multiple domains

Procedia PDF Downloads 44
591 Tractography Analysis and the Evolutionary Origin of Schizophrenia

Authors: Mouktafi Amine, Tahiri Asmaa

Abstract:

A substantial number of traditional medical research has been put forward to managing and treating mental disorders. At the present time, to our best knowledge, it is believed that a fundamental understanding of the underlying causes of the majority of psychological disorders needs to be explored further to inform early diagnosis, managing symptoms and treatment. The emerging field of evolutionary psychology is a promising prospect to address the origin of mental disorders, potentially leading to more effective treatments. Schizophrenia as a topical mental disorder has been linked to the evolutionary adaptation of the human brain represented in the brain connectivity and asymmetry directly linked to humans' higher brain cognition in contrast to other primates being our direct living representation of the structure and connectivity of our earliest common African ancestors. As proposed in the evolutionary psychology scientific literature, the pathophysiology of schizophrenia is expressed and directly linked to altered connectivity between the Hippocampal Formation (HF) and Dorsolateral Prefrontal Cortex (DLPFC). This research paper presents the results of the use of tractography analysis using multiple open access Diffusion Weighted Imaging (DWI) datasets of healthy subjects, schizophrenia-affected subjects and primates to illustrate the relevance of the aforementioned brain regions' connectivity and the underlying evolutionary changes in the human brain. Deterministic fiber tracking and streamline analysis were used to generate connectivity matrices from the DWI datasets overlaid to compute distances and highlight disconnectivity patterns in conjunction with other fiber tracking metrics: Fractional Anisotropy (FA), Mean Diffusivity (MD) and Radial Diffusivity (RD).

Keywords: tractography, diffusion weighted imaging, schizophrenia, evolutionary psychology

Procedia PDF Downloads 24
590 Using Visualization Techniques to Support Common Clinical Tasks in Clinical Documentation

Authors: Jonah Kenei, Elisha Opiyo

Abstract:

Electronic health records, as a repository of patient information, is nowadays the most commonly used technology to record, store and review patient clinical records and perform other clinical tasks. However, the accurate identification and retrieval of relevant information from clinical records is a difficult task due to the unstructured nature of clinical documents, characterized in particular by a lack of clear structure. Therefore, medical practice is facing a challenge thanks to the rapid growth of health information in electronic health records (EHRs), mostly in narrative text form. As a result, it's becoming important to effectively manage the growing amount of data for a single patient. As a result, there is currently a requirement to visualize electronic health records (EHRs) in a way that aids physicians in clinical tasks and medical decision-making. Leveraging text visualization techniques to unstructured clinical narrative texts is a new area of research that aims to provide better information extraction and retrieval to support clinical decision support in scenarios where data generated continues to grow. Clinical datasets in electronic health records (EHR) offer a lot of potential for training accurate statistical models to classify facets of information which can then be used to improve patient care and outcomes. However, in many clinical note datasets, the unstructured nature of clinical texts is a common problem. This paper examines the very issue of getting raw clinical texts and mapping them into meaningful structures that can support healthcare professionals utilizing narrative texts. Our work is the result of a collaborative design process that was aided by empirical data collected through formal usability testing.

Keywords: classification, electronic health records, narrative texts, visualization

Procedia PDF Downloads 94
589 Factors Affecting Air Surface Temperature Variations in the Philippines

Authors: John Christian Lequiron, Gerry Bagtasa, Olivia Cabrera, Leoncio Amadore, Tolentino Moya

Abstract:

Changes in air surface temperature play an important role in the Philippine’s economy, industry, health, and food production. While increasing global mean temperature in the recent several decades has prompted a number of climate change and variability studies in the Philippines, most studies still focus on rainfall and tropical cyclones. This study aims to investigate the trend and variability of observed air surface temperature and determine its major influencing factor/s in the Philippines. A non-parametric Mann-Kendall trend test was applied to monthly mean temperature of 17 synoptic stations covering 56 years from 1960 to 2015 and a mean change of 0.58 °C or a positive trend of 0.0105 °C/year (p < 0.05) was found. In addition, wavelet decomposition was used to determine the frequency of temperature variability show a 12-month, 30-80-month and more than 120-month cycles. This indicates strong annual variations, interannual variations that coincide with ENSO events, and interdecadal variations that are attributed to PDO and CO2 concentrations. Air surface temperature was also correlated with smoothed sunspot number and galactic cosmic rays, the results show a low to no effect. The influence of ENSO teleconnection on temperature, wind pattern, cloud cover, and outgoing longwave radiation on different ENSO phases had significant effects on regional temperature variability. Particularly, an anomalous anticyclonic (cyclonic) flow east of the Philippines during the peak and decay phase of El Niño (La Niña) events leads to the advection of warm southeasterly (cold northeasterly) air mass over the country. Furthermore, an apparent increasing cloud cover trend is observed over the West Philippine Sea including portions of the Philippines, and this is believed to lessen the effect of the increasing air surface temperature. However, relative humidity was also found to be increasing especially on the central part of the country, which results in a high positive trend of heat index, exacerbating the effects on human discomfort. Finally, an assessment of gridded temperature datasets was done to look at the viability of using three high-resolution datasets in future climate analysis and model calibration and verification. Several error statistics (i.e. Pearson correlation, Bias, MAE, and RMSE) were used for this validation. Results show that gridded temperature datasets generally follows the observed surface temperature change and anomalies. In addition, it is more representative of regional temperature rather than a substitute to station-observed air temperature.

Keywords: air surface temperature, carbon dioxide, ENSO, galactic cosmic rays, smoothed sunspot number

Procedia PDF Downloads 292
588 Similar Script Character Recognition on Kannada and Telugu

Authors: Gurukiran Veerapur, Nytik Birudavolu, Seetharam U. N., Chandravva Hebbi, R. Praneeth Reddy

Abstract:

This work presents a robust approach for the recognition of characters in Telugu and Kannada, two South Indian scripts with structural similarities in characters. To recognize the characters exhaustive datasets are required, but there are only a few publicly available datasets. As a result, we decided to create a dataset for one language (source language),train the model with it, and then test it with the target language.Telugu is the target language in this work, whereas Kannada is the source language. The suggested method makes use of Canny edge features to increase character identification accuracy on pictures with noise and different lighting. A dataset of 45,150 images containing printed Kannada characters was created. The Nudi software was used to automatically generate printed Kannada characters with different writing styles and variations. Manual labelling was employed to ensure the accuracy of the character labels. The deep learning models like CNN (Convolutional Neural Network) and Visual Attention neural network (VAN) are used to experiment with the dataset. A Visual Attention neural network (VAN) architecture was adopted, incorporating additional channels for Canny edge features as the results obtained were good with this approach. The model's accuracy on the combined Telugu and Kannada test dataset was an outstanding 97.3%. Performance was better with Canny edge characteristics applied than with a model that solely used the original grayscale images. The accuracy of the model was found to be 80.11% for Telugu characters and 98.01% for Kannada words when it was tested with these languages. This model, which makes use of cutting-edge machine learning techniques, shows excellent accuracy when identifying and categorizing characters from these scripts.

Keywords: base characters, modifiers, guninthalu, aksharas, vattakshara, VAN

Procedia PDF Downloads 30
587 Energy Efficiency Analysis of Discharge Modes of an Adiabatic Compressed Air Energy Storage System

Authors: Shane D. Inder, Mehrdad Khamooshi

Abstract:

Efficient energy storage is a crucial factor in facilitating the uptake of renewable energy resources. Among the many options available for energy storage systems required to balance imbalanced supply and demand cycles, compressed air energy storage (CAES) is a proven technology in grid-scale applications. This paper reviews the current state of micro scale CAES technology and describes a micro-scale advanced adiabatic CAES (A-CAES) system, where heat generated during compression is stored for use in the discharge phase. It will also describe a thermodynamic model, developed in EES (Engineering Equation Solver) to evaluate the performance and critical parameters of the discharge phase of the proposed system. Three configurations are explained including: single turbine without preheater, two turbines with preheaters, and three turbines with preheaters. It is shown that the micro-scale A-CAES is highly dependent upon key parameters including; regulator pressure, air pressure and volume, thermal energy storage temperature and flow rate and the number of turbines. It was found that a micro-scale AA-CAES, when optimized with an appropriate configuration, could deliver energy input to output efficiency of up to 70%.

Keywords: CAES, adiabatic compressed air energy storage, expansion phase, micro generation, thermodynamic

Procedia PDF Downloads 291
586 Audit of TPS photon beam dataset for small field output factors using OSLDs against RPC standard dataset

Authors: Asad Yousuf

Abstract:

Purpose: The aim of the present study was to audit treatment planning system beam dataset for small field output factors against standard dataset produced by radiological physics center (RPC) from a multicenter study. Such data are crucial for validity of special techniques, i.e., IMRT or stereotactic radiosurgery. Materials/Method: In this study, multiple small field size output factor datasets were measured and calculated for 6 to 18 MV x-ray beams using the RPC recommend methods. These beam datasets were measured at 10 cm depth for 10 × 10 cm2 to 2 × 2 cm2 field sizes, defined by collimator jaws at 100 cm. The measurements were made with a Landauer’s nanoDot OSLDs whose volume is small enough to gather a full ionization reading even for the 1×1 cm2 field size. At our institute the beam data including output factors have been commissioned at 5 cm depth with an SAD setup. For comparison with the RPC data, the output factors were converted to an SSD setup using tissue phantom ratios. SSD setup also enables coverage of the ion chamber in 2×2 cm2 field size. The measured output factors were also compared with those calculated by Eclipse™ treatment planning software. Result: The measured and calculated output factors are in agreement with RPC dataset within 1% and 4% respectively. The large discrepancies in TPS reflect the increased challenge in converting measured data into a commissioned beam model for very small fields. Conclusion: OSLDs are simple, durable, and accurate tool to verify doses that delivered using small photon beam fields down to a 1x1 cm2 field sizes. The study emphasizes that the treatment planning system should always be evaluated for small field out factors for the accurate dose delivery in clinical setting.

Keywords: small field dosimetry, optically stimulated luminescence, audit treatment, radiological physics center

Procedia PDF Downloads 303
585 Observed Changes in Constructed Precipitation at High Resolution in Southern Vietnam

Authors: Nguyen Tien Thanh, Günter Meon

Abstract:

Precipitation plays a key role in water cycle, defining the local climatic conditions and in ecosystem. It is also an important input parameter for water resources management and hydrologic models. With spatial continuous data, a certainty of discharge predictions or other environmental factors is unquestionably better than without. This is, however, not always willingly available to acquire for a small basin, especially for coastal region in Vietnam due to a low network of meteorological stations (30 stations) on long coast of 3260 km2. Furthermore, available gridded precipitation datasets are not fine enough when applying to hydrologic models. Under conditions of global warming, an application of spatial interpolation methods is a crucial for the climate change impact studies to obtain the spatial continuous data. In recent research projects, although some methods can perform better than others do, no methods draw the best results for all cases. The objective of this paper therefore, is to investigate different spatial interpolation methods for daily precipitation over a small basin (approximately 400 km2) located in coastal region, Southern Vietnam and find out the most efficient interpolation method on this catchment. The five different interpolation methods consisting of cressman, ordinary kriging, regression kriging, dual kriging and inverse distance weighting have been applied to identify the best method for the area of study on the spatio-temporal scale (daily, 10 km x 10 km). A 30-year precipitation database was created and merged into available gridded datasets. Finally, observed changes in constructed precipitation were performed. The results demonstrate that the method of ordinary kriging interpolation is an effective approach to analyze the daily precipitation. The mixed trends of increasing and decreasing monthly, seasonal and annual precipitation have documented at significant levels.

Keywords: interpolation, precipitation, trend, vietnam

Procedia PDF Downloads 259
584 Automatic Near-Infrared Image Colorization Using Synthetic Images

Authors: Yoganathan Karthik, Guhanathan Poravi

Abstract:

Colorizing near-infrared (NIR) images poses unique challenges due to the absence of color information and the nuances in light absorption. In this paper, we present an approach to NIR image colorization utilizing a synthetic dataset generated from visible light images. Our method addresses two major challenges encountered in NIR image colorization: accurately colorizing objects with color variations and avoiding over/under saturation in dimly lit scenes. To tackle these challenges, we propose a Generative Adversarial Network (GAN)-based framework that learns to map NIR images to their corresponding colorized versions. The synthetic dataset ensures diverse color representations, enabling the model to effectively handle objects with varying hues and shades. Furthermore, the GAN architecture facilitates the generation of realistic colorizations while preserving the integrity of dimly lit scenes, thus mitigating issues related to over/under saturation. Experimental results on benchmark NIR image datasets demonstrate the efficacy of our approach in producing high-quality colorizations with improved color accuracy and naturalness. Quantitative evaluations and comparative studies validate the superiority of our method over existing techniques, showcasing its robustness and generalization capability across diverse NIR image scenarios. Our research not only contributes to advancing NIR image colorization but also underscores the importance of synthetic datasets and GANs in addressing domain-specific challenges in image processing tasks. The proposed framework holds promise for various applications in remote sensing, medical imaging, and surveillance where accurate color representation of NIR imagery is crucial for analysis and interpretation.

Keywords: computer vision, near-infrared images, automatic image colorization, generative adversarial networks, synthetic data

Procedia PDF Downloads 20
583 Transcriptomine: The Nuclear Receptor Signaling Transcriptome Database

Authors: Scott A. Ochsner, Christopher M. Watkins, Apollo McOwiti, David L. Steffen Lauren B. Becnel, Neil J. McKenna

Abstract:

Understanding signaling by nuclear receptors (NRs) requires an appreciation of their cognate ligand- and tissue-specific transcriptomes. While target gene regulation data are abundant in this field, they reside in hundreds of discrete publications in formats refractory to routine query and analysis and, accordingly, their full value to the NR signaling community has not been realized. One of the mandates of the Nuclear Receptor Signaling Atlas (NURSA) is to facilitate access of the community to existing public datasets. Pursuant to this mandate we are developing a freely-accessible community web resource, Transcriptomine, to bring together the sum total of available expression array and RNA-Seq data points generated by the field in a single location. Transcriptomine currently contains over 25,000,000 gene fold change datapoints from over 1200 contrasts relevant to over 100 NRs, ligands and coregulators in over 200 tissues and cell lines. Transcriptomine is designed to accommodate a spectrum of end users ranging from the bench researcher to those with advanced bioinformatic training. Visualization tools allow users to build custom charts to compare and contrast patterns of gene regulation across different tissues and in response to different ligands. Our resource affords an entirely new paradigm for leveraging gene expression data in the NR signaling field, empowering users to query gene fold changes across diverse regulatory molecules, tissues and cell lines, target genes, biological functions and disease associations, and that would otherwise be prohibitive in terms of time and effort. Transcriptomine will be regularly updated with gene lists from future genome-wide expression array and expression-sequencing datasets in the NR signaling field.

Keywords: target gene database, informatics, gene expression, transcriptomics

Procedia PDF Downloads 255
582 World on the Edge: Migration and Cross Border Crimes in West Africa

Authors: Adeyemi Kamil Hamzah

Abstract:

The contiguity of nations in international system suggests that world is a composite of socio-economic unit with people exploring and exploiting the potentials in the world via migrations. Thus, cross border migration has made positive contributions to social and economic development of individuals and nations by increasing the household incomes of the host countries. However, the cross border migrations in West Africa are becoming part of a dynamic and unstable world migration system. This is due to the nature and consequences of trans-border crimes in West Africa, with both short and long term effects on the socio-economic viability of developing countries like West African States. The paper identified that migration influenced cross-border crimes as well as the high spate of insurgencies in the sub-region. Furthermore, the consequential effect of a global village has imbalanced population flows, making some countries host and parasites to others. Also, stern and deft cross-border rules and regulations, as well as territorial security and protections, ameliorate cross border crimes and migration in West African sub-regions. Therefore, the study concluded that cross border migration is the linchpin of all kinds of criminal activities which affect the security of states in the sub-region.

Keywords: cross-border migration, border crimes, security, West Africa, development, globalisation

Procedia PDF Downloads 203
581 Gene Expression Meta-Analysis of Potential Shared and Unique Pathways Between Autoimmune Diseases Under anti-TNFα Therapy

Authors: Charalabos Antonatos, Mariza Panoutsopoulou, Georgios K. Georgakilas, Evangelos Evangelou, Yiannis Vasilopoulos

Abstract:

The extended tissue damage and severe clinical outcomes of autoimmune diseases, accompanied by the high annual costs to the overall health care system, highlight the need for an efficient therapy. Increasing knowledge over the pathophysiology of specific chronic inflammatory diseases, namely Psoriasis (PsO), Inflammatory Bowel Diseases (IBD) consisting of Crohn’s disease (CD) and Ulcerative colitis (UC), and Rheumatoid Arthritis (RA), has provided insights into the underlying mechanisms that lead to the maintenance of the inflammation, such as Tumor Necrosis Factor alpha (TNF-α). Hence, the anti-TNFα biological agents pose as an ideal therapeutic approach. Despite the efficacy of anti-TNFα agents, several clinical trials have shown that 20-40% of patients do not respond to treatment. Nowadays, high-throughput technologies have been recruited in order to elucidate the complex interactions in multifactorial phenotypes, with the most ubiquitous ones referring to transcriptome quantification analyses. In this context, a random effects meta-analysis of available gene expression cDNA microarray datasets was performed between responders and non-responders to anti-TNFα therapy in patients with IBD, PsO, and RA. Publicly available datasets were systematically searched from inception to 10th of November 2020 and selected for further analysis if they assessed the response to anti-TNFα therapy with clinical score indexes from inflamed biopsies. Specifically, 4 IBD (79 responders/72 non-responders), 3 PsO (40 responders/11 non-responders) and 2 RA (16 responders/6 non-responders) datasetswere selected. After the separate pre-processing of each dataset, 4 separate meta-analyses were conducted; three disease-specific and a single combined meta-analysis on the disease-specific results. The MetaVolcano R package (v.1.8.0) was utilized for a random-effects meta-analysis through theRestricted Maximum Likelihood (RELM) method. The top 1% of the most consistently perturbed genes in the included datasets was highlighted through the TopConfects approach while maintaining a 5% False Discovery Rate (FDR). Genes were considered as Differentialy Expressed (DEGs) as those with P ≤ 0.05, |log2(FC)| ≥ log2(1.25) and perturbed in at least 75% of the included datasets. Over-representation analysis was performed using Gene Ontology and Reactome Pathways for both up- and down-regulated genes in all 4 performed meta-analyses. Protein-Protein interaction networks were also incorporated in the subsequentanalyses with STRING v11.5 and Cytoscape v3.9. Disease-specific meta-analyses detected multiple distinct pro-inflammatory and immune-related down-regulated genes for each disease, such asNFKBIA, IL36, and IRAK1, respectively. Pathway analyses revealed unique and shared pathways between each disease, such as Neutrophil Degranulation and Signaling by Interleukins. The combined meta-analysis unveiled 436 DEGs, 86 out of which were up- and 350 down-regulated, confirming the aforementioned shared pathways and genes, as well as uncovering genes that participate in anti-inflammatory pathways, namely IL-10 signaling. The identification of key biological pathways and regulatory elements is imperative for the accurate prediction of the patient’s response to biological drugs. Meta-analysis of such gene expression data could aid the challenging approach to unravel the complex interactions implicated in the response to anti-TNFα therapy in patients with PsO, IBD, and RA, as well as distinguish gene clusters and pathways that are altered through this heterogeneous phenotype.

Keywords: anti-TNFα, autoimmune, meta-analysis, microarrays

Procedia PDF Downloads 152
580 Corpus-Assisted Study of Gender Related Tiger Metaphors in the Chinese Context

Authors: Na Xiao

Abstract:

Animal metaphors have many different connotations, ranging from loving emotions to derogatory epithets, but gender expressions using animal metaphors are often imbalanced. Generally, animal metaphors related to females tend to be negative. Little known about the reasons for the negative expressions of animal female metaphors in Chinese contexts still have not been quantified. The Modern Chinese Corpus at the Center for Chinese Linguistics at Peking University (CCL Corpus) provided the data for this research, which aims to identify the influencing variables of gender differences in the description of animal metaphors mapping humans in Chinese by observing the percentage of "tiger" metaphor, which is based on the conceptual metaphor theory. A quantitative research method was used in this study to statistically examine the gender attitude percentage of the "tiger" metaphor using corpus data. This study has proved that the tiger metaphors associated with humans in the Chinese context tend to be negative. Importantly, this study has also shown that the high proportion of tiger metaphorical idioms is what causes the high proportion of negative tiger metaphors that are related to women. This finding can be used as crucial information for future studies on other gender-related animal metaphorical idioms and can offer additional insights for understanding trends in other animal metaphors.

Keywords: Chinese, CCL corpus, gender differences, metaphorical idioms, tigers

Procedia PDF Downloads 88
579 Feature Evaluation Based on Random Subspace and Multiple-K Ensemble

Authors: Jaehong Yu, Seoung Bum Kim

Abstract:

Clustering analysis can facilitate the extraction of intrinsic patterns in a dataset and reveal its natural groupings without requiring class information. For effective clustering analysis in high dimensional datasets, unsupervised dimensionality reduction is an important task. Unsupervised dimensionality reduction can generally be achieved by feature extraction or feature selection. In many situations, feature selection methods are more appropriate than feature extraction methods because of their clear interpretation with respect to the original features. The unsupervised feature selection can be categorized as feature subset selection and feature ranking method, and we focused on unsupervised feature ranking methods which evaluate the features based on their importance scores. Recently, several unsupervised feature ranking methods were developed based on ensemble approaches to achieve their higher accuracy and stability. However, most of the ensemble-based feature ranking methods require the true number of clusters. Furthermore, these algorithms evaluate the feature importance depending on the ensemble clustering solution, and they produce undesirable evaluation results if the clustering solutions are inaccurate. To address these limitations, we proposed an ensemble-based feature ranking method with random subspace and multiple-k ensemble (FRRM). The proposed FRRM algorithm evaluates the importance of each feature with the random subspace ensemble, and all evaluation results are combined with the ensemble importance scores. Moreover, FRRM does not require the determination of the true number of clusters in advance through the use of the multiple-k ensemble idea. Experiments on various benchmark datasets were conducted to examine the properties of the proposed FRRM algorithm and to compare its performance with that of existing feature ranking methods. The experimental results demonstrated that the proposed FRRM outperformed the competitors.

Keywords: clustering analysis, multiple-k ensemble, random subspace-based feature evaluation, unsupervised feature ranking

Procedia PDF Downloads 310
578 Analysis of Urban Slum: Case Study of Korail Slum, Dhaka

Authors: Sanjida Ahmed Sinthia

Abstract:

Bangladesh is one of the poorest countries in the world. There are several reasons for this insufficiency and uncontrolled population growth is one of the prime reasons. Others include low economic progress, imbalanced resource management, unemployment and underemployment, urban migration and natural catastrophes etc. As a result, the rate of urban poor is increasing inevitably in every sphere of urban cities in Bangladesh and Dhaka is the most affected one. Besides there is scarcity of urban land, housing, urban infrastructure and amenities which create pressure on urban cities and mostly encroach the open space, wetlands that causes environmental degradation. Government has no or limited control over these due to poor government policy and management, political pressure and lack of resource management. Unfortunately, over centralization and bureaucracy creates unnecessary delay and interruptions in any government initiations. There is also no coordination between government and private sector developer to solve the problem of urban Poor. To understand the problem of these huge populations this paper analyzes one of the single largest slum areas in Dhaka, Korail Slum. The study focuses on socio demographic analysis, morphological pattern and role of different actors responsible for the improvements of the area and recommended some possible steps for determining the potential outcomes.

Keywords: demographic analysis, environmental degradation, government policy, housing and land management policy

Procedia PDF Downloads 146
577 Energy-Aware Scheduling in Real-Time Systems: An Analysis of Fair Share Scheduling and Priority-Driven Preemptive Scheduling

Authors: Su Xiaohan, Jin Chicheng, Liu Yijing, Burra Venkata Durga Kumar

Abstract:

Energy-aware scheduling in real-time systems aims to minimize energy consumption, but issues related to resource reservation and timing constraints remain challenges. This study focuses on analyzing two scheduling algorithms, Fair-Share Scheduling (FFS) and Priority-Driven Preemptive Scheduling (PDPS), for solving these issues and energy-aware scheduling in real-time systems. Based on research on both algorithms and the processes of solving two problems, it can be found that Fair-Share Scheduling ensures fair allocation of resources but needs to improve with an imbalanced system load, and Priority-Driven Preemptive Scheduling prioritizes tasks based on criticality to meet timing constraints through preemption but relies heavily on task prioritization and may not be energy efficient. Therefore, improvements to both algorithms with energy-aware features will be proposed. Future work should focus on developing hybrid scheduling techniques that minimize energy consumption through intelligent task prioritization, resource allocation, and meeting time constraints.

Keywords: energy-aware scheduling, fair-share scheduling, priority-driven preemptive scheduling, real-time systems, optimization, resource reservation, timing constraints

Procedia PDF Downloads 100
576 Rd-PLS Regression: From the Analysis of Two Blocks of Variables to Path Modeling

Authors: E. Tchandao Mangamana, V. Cariou, E. Vigneau, R. Glele Kakai, E. M. Qannari

Abstract:

A new definition of a latent variable associated with a dataset makes it possible to propose variants of the PLS2 regression and the multi-block PLS (MB-PLS). We shall refer to these variants as Rd-PLS regression and Rd-MB-PLS respectively because they are inspired by both Redundancy analysis and PLS regression. Usually, a latent variable t associated with a dataset Z is defined as a linear combination of the variables of Z with the constraint that the length of the loading weights vector equals 1. Formally, t=Zw with ‖w‖=1. Denoting by Z' the transpose of Z, we define herein, a latent variable by t=ZZ’q with the constraint that the auxiliary variable q has a norm equal to 1. This new definition of a latent variable entails that, as previously, t is a linear combination of the variables in Z and, in addition, the loading vector w=Z’q is constrained to be a linear combination of the rows of Z. More importantly, t could be interpreted as a kind of projection of the auxiliary variable q onto the space generated by the variables in Z, since it is collinear to the first PLS1 component of q onto Z. Consider the situation in which we aim to predict a dataset Y from another dataset X. These two datasets relate to the same individuals and are assumed to be centered. Let us consider a latent variable u=YY’q to which we associate the variable t= XX’YY’q. Rd-PLS consists in seeking q (and therefore u and t) so that the covariance between t and u is maximum. The solution to this problem is straightforward and consists in setting q to the eigenvector of YY’XX’YY’ associated with the largest eigenvalue. For the determination of higher order components, we deflate X and Y with respect to the latent variable t. Extending Rd-PLS to the context of multi-block data is relatively easy. Starting from a latent variable u=YY’q, we consider its ‘projection’ on the space generated by the variables of each block Xk (k=1, ..., K) namely, tk= XkXk'YY’q. Thereafter, Rd-MB-PLS seeks q in order to maximize the average of the covariances of u with tk (k=1, ..., K). The solution to this problem is given by q, eigenvector of YY’XX’YY’, where X is the dataset obtained by horizontally merging datasets Xk (k=1, ..., K). For the determination of latent variables of order higher than 1, we use a deflation of Y and Xk with respect to the variable t= XX’YY’q. In the same vein, extending Rd-MB-PLS to the path modeling setting is straightforward. Methods are illustrated on the basis of case studies and performance of Rd-PLS and Rd-MB-PLS in terms of prediction is compared to that of PLS2 and MB-PLS.

Keywords: multiblock data analysis, partial least squares regression, path modeling, redundancy analysis

Procedia PDF Downloads 115
575 Combinational Therapeutic Targeting of BRD4 and CDK7 Synergistically Induces Anticancer Effects in Hepatocellular Carcinoma

Authors: Xinxiu Li, Chuqian Zheng, Yanyan Qian, Hong Fan

Abstract:

Objectives: In hepatocellular carcinoma (HCC), oncogenes are continuously and robustly transcribed due to aberrant expression of essential components of the trans-acting super-enhancers (SE) complex. Preclinical and clinical trials are now being conducted on small-molecule inhibitors that target core-transcriptional components, including as transcriptional bromodomain protein 4 (BRD4) and cyclin-dependent kinase 7 (CDK7), in a number of malignant tumors. This study aims to explore whether co-overexpression of BRD4 and CDK7 is a potential marker of worse prognosis and a combined therapeutic target in HCC. Methods: The expression pattern of BRD4 and CDK7 and their correlation with prognosis in HCC were analyzed by RNA sequencing data and survival data of HCC patients from TCGA and GEO datasets. The protein levels of BRD4 and CDK7 were determined by immunohistochemistry (IHC), and survival data of patients were analyzed using the Kaplan-Meier method. The mRNA expression levels of genes in HCC cell lines were evaluated by quantitative PCR (q-PCR). CCK-8 and colony formation assays were conducted to assess cell proliferation of HCC upon treatment with BRD4 inhibitor JQ1 or/and CDK7 inhibitor THZ1. Results: It was shown that BRD4 and CDK7 were often overexpressed in HCCs and were associated with poor prognosis of HCC by analyzing the TCGA and GEO datasets. BRD4 or CDK7 overexpression was related to a lower survival rate. It's interesting to note that co-overexpression of CDK7 and BRD4 was a worse prognostic factor in HCC. Treatment with JQ1 or THZ1 alone had an inhibitory effect on cell proliferation; however, when JQ1 and THZ1 were combined, there was a more notable suppression of cell growth. At the same time, the combined use of JQ1 and THZ1 synergistically suppresses the expression of HCC driver genes. Conclusion: Our research revealed that BRD4 and CDK7 coupled can be a useful biomarker in HCC prognosis and the combination of JQ1 and THZ1 can be a promising therapeutic therapy against HCC.

Keywords: BRD4, CDK7, cell proliferation, combined inhibition

Procedia PDF Downloads 38
574 Machine Learning Approaches Based on Recency, Frequency, Monetary (RFM) and K-Means for Predicting Electrical Failures and Voltage Reliability in Smart Cities

Authors: Panaya Sudta, Wanchalerm Patanacharoenwong, Prachya Bumrungkun

Abstract:

As With the evolution of smart grids, ensuring the reliability and efficiency of electrical systems in smart cities has become crucial. This paper proposes a distinct approach that combines advanced machine learning techniques to accurately predict electrical failures and address voltage reliability issues. This approach aims to improve the accuracy and efficiency of reliability evaluations in smart cities. The aim of this research is to develop a comprehensive predictive model that accurately predicts electrical failures and voltage reliability in smart cities. This model integrates RFM analysis, K-means clustering, and LSTM networks to achieve this objective. The research utilizes RFM analysis, traditionally used in customer value assessment, to categorize and analyze electrical components based on their failure recency, frequency, and monetary impact. K-means clustering is employed to segment electrical components into distinct groups with similar characteristics and failure patterns. LSTM networks are used to capture the temporal dependencies and patterns in customer data. This integration of RFM, K-means, and LSTM results in a robust predictive tool for electrical failures and voltage reliability. The proposed model has been tested and validated on diverse electrical utility datasets. The results show a significant improvement in prediction accuracy and reliability compared to traditional methods, achieving an accuracy of 92.78% and an F1-score of 0.83. This research contributes to the proactive maintenance and optimization of electrical infrastructures in smart cities. It also enhances overall energy management and sustainability. The integration of advanced machine learning techniques in the predictive model demonstrates the potential for transforming the landscape of electrical system management within smart cities. The research utilizes diverse electrical utility datasets to develop and validate the predictive model. RFM analysis, K-means clustering, and LSTM networks are applied to these datasets to analyze and predict electrical failures and voltage reliability. The research addresses the question of how accurately electrical failures and voltage reliability can be predicted in smart cities. It also investigates the effectiveness of integrating RFM analysis, K-means clustering, and LSTM networks in achieving this goal. The proposed approach presents a distinct, efficient, and effective solution for predicting and mitigating electrical failures and voltage issues in smart cities. It significantly improves prediction accuracy and reliability compared to traditional methods. This advancement contributes to the proactive maintenance and optimization of electrical infrastructures, overall energy management, and sustainability in smart cities.

Keywords: electrical state prediction, smart grids, data-driven method, long short-term memory, RFM, k-means, machine learning

Procedia PDF Downloads 31
573 Arabic Lexicon Learning to Analyze Sentiment in Microblogs

Authors: Mahmoud B. Rokaya

Abstract:

The study of opinion mining and sentiment analysis includes analysis of opinions, sentiments, evaluations, attitudes, and emotions. The rapid growth of social media, social networks, reviews, forum discussions, microblogs, and Twitter, leads to a parallel growth in the field of sentiment analysis. The field of sentiment analysis tries to develop effective tools to make it possible to capture the trends of people. There are two approaches in the field, lexicon-based and corpus-based methods. A lexicon-based method uses a sentiment lexicon which includes sentiment words and phrases with assigned numeric scores. These scores reveal if sentiment phrases are positive or negative, their intensity, and/or their emotional orientations. Creation of manual lexicons is hard. This brings the need for adaptive automated methods for generating a lexicon. The proposed method generates dynamic lexicons based on the corpus and then classifies text using these lexicons. In the proposed method, different approaches are combined to generate lexicons from text. The proposed method classifies the tweets into 5 classes instead of +ve or –ve classes. The sentiment classification problem is written as an optimization problem, finding optimum sentiment lexicons are the goal of the optimization process. The solution was produced based on mathematical programming approaches to find the best lexicon to classify texts. A genetic algorithm was written to find the optimal lexicon. Then, extraction of a meta-level feature was done based on the optimal lexicon. The experiments were conducted on several datasets. Results, in terms of accuracy, recall and F measure, outperformed the state-of-the-art methods proposed in the literature in some of the datasets. A better understanding of the Arabic language and culture of Arab Twitter users and sentiment orientation of words in different contexts can be achieved based on the sentiment lexicons proposed by the algorithm.

Keywords: social media, Twitter sentiment, sentiment analysis, lexicon, genetic algorithm, evolutionary computation

Procedia PDF Downloads 158
572 Employing Remotely Sensed Soil and Vegetation Indices and Predicting ‎by Long ‎Short-Term Memory to Irrigation Scheduling Analysis

Authors: Elham Koohikerade, Silvio Jose Gumiere

Abstract:

In this research, irrigation is highlighted as crucial for improving both the yield and quality of ‎potatoes due to their high sensitivity to soil moisture changes. The study presents a hybrid Long ‎Short-Term Memory (LSTM) model aimed at optimizing irrigation scheduling in potato fields in ‎Quebec City, Canada. This model integrates model-based and satellite-derived datasets to simulate ‎soil moisture content, addressing the limitations of field data. Developed under the guidance of the ‎Food and Agriculture Organization (FAO), the simulation approach compensates for the lack of direct ‎soil sensor data, enhancing the LSTM model's predictions. The model was calibrated using indices ‎like Surface Soil Moisture (SSM), Normalized Vegetation Difference Index (NDVI), Enhanced ‎Vegetation Index (EVI), and Normalized Multi-band Drought Index (NMDI) to effectively forecast ‎soil moisture reductions. Understanding soil moisture and plant development is crucial for assessing ‎drought conditions and determining irrigation needs. This study validated the spectral characteristics ‎of vegetation and soil using ECMWF Reanalysis v5 (ERA5) and Moderate Resolution Imaging ‎Spectrometer (MODIS) data from 2019 to 2023, collected from agricultural areas in Dolbeau and ‎Peribonka, Quebec. Parameters such as surface volumetric soil moisture (0-7 cm), NDVI, EVI, and ‎NMDI were extracted from these images. A regional four-year dataset of soil and vegetation moisture ‎was developed using a machine learning approach combining model-based and satellite-based ‎datasets. The LSTM model predicts soil moisture dynamics hourly across different locations and ‎times, with its accuracy verified through cross-validation and comparison with existing soil moisture ‎datasets. The model effectively captures temporal dynamics, making it valuable for applications ‎requiring soil moisture monitoring over time, such as anomaly detection and memory analysis. By ‎identifying typical peak soil moisture values and observing distribution shapes, irrigation can be ‎scheduled to maintain soil moisture within Volumetric Soil Moisture (VSM) values of 0.25 to 0.30 ‎m²/m², avoiding under and over-watering. The strong correlations between parcels suggest that a ‎uniform irrigation strategy might be effective across multiple parcels, with adjustments based on ‎specific parcel characteristics and historical data trends. The application of the LSTM model to ‎predict soil moisture and vegetation indices yielded mixed results. While the model effectively ‎captures the central tendency and temporal dynamics of soil moisture, it struggles with accurately ‎predicting EVI, NDVI, and NMDI.‎

Keywords: irrigation scheduling, LSTM neural network, remotely sensed indices, soil and vegetation ‎monitoring

Procedia PDF Downloads 17
571 River Habitat Modeling for the Entire Macroinvertebrate Community

Authors: Pinna Beatrice., Laini Alex, Negro Giovanni, Burgazzi Gemma, Viaroli Pierluigi, Vezza Paolo

Abstract:

Habitat models rarely consider macroinvertebrates as ecological targets in rivers. Available approaches mainly focus on single macroinvertebrate species, not addressing the ecological needs and functionality of the entire community. This research aimed to provide an approach to model the habitat of the macroinvertebrate community. The approach is based on the recently developed Flow-T index, together with a Random Forest (RF) regression, which is employed to apply the Flow-T index at the meso-habitat scale. Using different datasets gathered from both field data collection and 2D hydrodynamic simulations, the model has been calibrated in the Trebbia river (2019 campaign), and then validated in the Trebbia, Taro, and Enza rivers (2020 campaign). The three rivers are characterized by a braiding morphology, gravel riverbeds, and summer low flows. The RF model selected 12 mesohabitat descriptors as important for the macroinvertebrate community. These descriptors belong to different frequency classes of water depth, flow velocity, substrate grain size, and connectivity to the main river channel. The cross-validation R² coefficient (R²𝒸ᵥ) of the training dataset is 0.71 for the Trebbia River (2019), whereas the R² coefficient for the validation datasets (Trebbia, Taro, and Enza Rivers 2020) is 0.63. The agreement between the simulated results and the experimental data shows sufficient accuracy and reliability. The outcomes of the study reveal that the model can identify the ecological response of the macroinvertebrate community to possible flow regime alterations and to possible river morphological modifications. Lastly, the proposed approach allows extending the MesoHABSIM methodology, widely used for the fish habitat assessment, to a different ecological target community. Further applications of the approach can be related to flow design in both perennial and non-perennial rivers, including river reaches in which fish fauna is absent.

Keywords: ecological flows, macroinvertebrate community, mesohabitat, river habitat modeling

Procedia PDF Downloads 66