Search results for: imbalanced datasets
113 Atmospheric Circulation Types Related to Dust Transport Episodes over Crete in the Eastern Mediterranean
Authors: K. Alafogiannis, E. E. Houssos, E. Anagnostou, G. Kouvarakis, N. Mihalopoulos, A. Fotiadi
Abstract:
The Mediterranean basin is an area where different aerosol types coexist, including urban/industrial, desert dust, biomass burning and marine particles. Particularly, mineral dust aerosols, mostly originated from North African deserts, significantly contribute to high aerosol loads above the Mediterranean. Dust transport, controlled by the variation of the atmospheric circulation throughout the year, results in a strong spatial and temporal variability of aerosol properties. In this study, the synoptic conditions which favor dust transport over the Eastern Mediterranean are thoroughly investigated. For this reason, three datasets are employed. Firstly, ground-based daily data of aerosol properties, namely Aerosol Optical Thickness (AOT), Ångström exponent (α440-870) and fine fraction from the FORTH-AERONET (Aerosol Robotic Network) station along with measurements of PM10 concentrations from Finokalia station, for the period 2003-2011, are used to identify days with high coarse aerosol load (episodes) over Crete. Then, geopotential height at 1000, 850 and 700 hPa levels obtained from the NCEP/NCAR Reanalysis Project, are utilized to depict the atmospheric circulation during the identified episodes. Additionally, air-mass back trajectories, calculated by HYSPLIT, are used to verify the origin of aerosols from neighbouring deserts. For the 227 identified dust episodes, the statistical methods of Factor and Cluster Analysis are applied on the corresponding atmospheric circulation data to reveal the main types of the synoptic conditions favouring dust transport towards Crete (Eastern Mediterranean). The 227 cases are classified into 11 distinct types (clusters). Dust episodes in Eastern Mediterranean, are found to be more frequent (52%) in spring with a secondary maximum in autumn. The main characteristic of the atmospheric circulation associated with dust episodes, is the presence of a low-pressure system at surface, either in southwestern Europe or western/central Mediterranean, which induces a southerly air flow favouring dust transport from African deserts. The exact position and the intensity of the low-pressure system vary notably among clusters. More rarely dust may originate from deserts of Arabian Peninsula.Keywords: aerosols, atmospheric circulation, dust particles, Eastern Mediterranean
Procedia PDF Downloads 231112 Enhanced CNN for Rice Leaf Disease Classification in Mobile Applications
Authors: Kayne Uriel K. Rodrigo, Jerriane Hillary Heart S. Marcial, Samuel C. Brillo
Abstract:
Rice leaf diseases significantly impact yield production in rice-dependent countries, affecting their agricultural sectors. As part of precision agriculture, early and accurate detection of these diseases is crucial for effective mitigation practices and minimizing crop losses. Hence, this study proposes an enhancement to the Convolutional Neural Network (CNN), a widely-used method for Rice Leaf Disease Image Classification, by incorporating MobileViTV2—a recently advanced architecture that combines CNN and Vision Transformer models while maintaining fewer parameters, making it suitable for broader deployment on edge devices. Our methodology utilizes a publicly available rice disease image dataset from Kaggle, which was validated by a university structural biologist following the guidelines provided by the Philippine Rice Institute (PhilRice). Modifications to the dataset include renaming certain disease categories and augmenting the rice leaf image data through rotation, scaling, and flipping. The enhanced dataset was then used to train the MobileViTV2 model using the Timm library. The results of our approach are as follows: the model achieved notable performance, with 98% accuracy in both training and validation, 6% training and validation loss, and a Receiver Operating Characteristic (ROC) curve ranging from 95% to 100% for each label. Additionally, the F1 score was 97%. These metrics demonstrate a significant improvement compared to a conventional CNN-based approach, which, in a previous 2022 study, achieved only 78% accuracy after using 5 convolutional layers and 2 dense layers. Thus, it can be concluded that MobileViTV2, with its fewer parameters, outperforms traditional CNN models, particularly when applied to Rice Leaf Disease Image Identification. For future work, we recommend extending this model to include datasets validated by international rice experts and broadening the scope to accommodate biotic factors such as rice pest classification, as well as abiotic stressors such as climate, soil quality, and geographic information, which could improve the accuracy of disease prediction.Keywords: convolutional neural network, MobileViTV2, rice leaf disease, precision agriculture, image classification, vision transformer
Procedia PDF Downloads 29111 Spatial Analysis as a Tool to Assess Risk Management in Peru
Authors: Josué Alfredo Tomas Machaca Fajardo, Jhon Elvis Chahua Janampa, Pedro Rau Lavado
Abstract:
A flood vulnerability index was developed for the Piura River watershed in northern Peru using Principal Component Analysis (PCA) to assess flood risk. The official methodology to assess risk from natural hazards in Peru was introduced in 1980 and proved effective for aiding complex decision-making. This method relies in part on decision-makers defining subjective correlations between variables to identify high-risk areas. While risk identification and ensuing response activities benefit from a qualitative understanding of influences, this method does not take advantage of the advent of national and international data collection efforts, which can supplement our understanding of risk. Furthermore, this method does not take advantage of broadly applied statistical methods such as PCA, which highlight central indicators of vulnerability. Nowadays, information processing is much faster and allows for more objective decision-making tools, such as PCA. The approach presented here develops a tool to improve the current flood risk assessment in the Peruvian basin. Hence, the spatial analysis of the census and other datasets provides a better understanding of the current land occupation and a basin-wide distribution of services and human populations, a necessary step toward ultimately reducing flood risk in Peru. PCA allows the simplification of a large number of variables into a few factors regarding social, economic, physical and environmental dimensions of vulnerability. There is a correlation between the location of people and the water availability mainly found in rivers. For this reason, a comprehensive vision of the population location around the river basin is necessary to establish flood prevention policies. The grouping of 5x5 km gridded areas allows the spatial analysis of flood risk rather than assessing political divisions of the territory. The index was applied to the Peruvian region of Piura, where several flood events occurred in recent past years, being one of the most affected regions during the ENSO events in Peru. The analysis evidenced inequalities for the access to basic services, such as water, electricity, internet and sewage, between rural and urban areas.Keywords: assess risk, flood risk, indicators of vulnerability, principal component analysis
Procedia PDF Downloads 187110 The Investigation of Work Stress and Burnout in Nurse Anesthetists: A Cross-Sectional Study
Authors: Yen Ling Liu, Shu-Fen Wu, Chen-Fuh Lam, I-Ling Tsai, Chia-Yu Chen
Abstract:
Purpose: Nurse anesthetists are confronting extraordinarily high job stress in their daily practice, deriving from the fast-track anesthesia care, risk of perioperative complications, routine rotating shifts, teaching programs and interactions with the surgical team in the operating room. This study investigated the influence of work stress on the burnout and turnover intention of nurse anesthetists in a regional general hospital in Southern Taiwan. Methods: This was a descriptive correlational study carried out in 66 full-time nurse anesthetists. Data was collected from March 2017 to June 2017 by in-person interview, and a self-administered structured questionnaire was completed by the interviewee. Outcome measurements included the Practice Environment Scale of the Nursing Work Index (PES-NWI), Maslach Burnout Inventory (MBI) and nursing staff turnover intention. Numerical data were analyzed by descriptive statistics, independent t test, or one-way ANOVA. Categorical data were compared using the chi-square test (x²). Datasets were computed with Pearson product-moment correlation and linear regression. Data were analyzed by using SPSS 20.0 software. Results: The average score for job burnout was 68.7916.67 (out of 100). The three major components of burnout, including emotional depletion (mean score of 26.32), depersonalization (mean score of 13.65), and personal(mean score of 24.48). These average scores suggested that these nurse anesthetists were at high risk of burnout and inversely correlated with turnover intention (t = -4.048, P < 0.05). Using linear regression model, emotional exhaustion and depersonalization were the two independent factors that predicted turnover intention in the nurse anesthetists (19.1% in total variance). Conclusion/Implications for Practice: The study identifies that the high risk of job burnout in the nurse anesthetists is not simply derived from physical overload, but most likely resulted from the additional emotional and psychological stress. The occurrence of job burnout may affect the quality of nursing work, and also influence family harmony, in turn, may increase the turnover rate. Multimodal approach is warranted to reduce work stress and job burnout in nurse anesthetists to enhance their willingness to contribute in anesthesia care.Keywords: anesthesia nurses, burnout, job, turnover intention
Procedia PDF Downloads 297109 Development of a 3D Model of Real Estate Properties in Fort Bonifacio, Taguig City, Philippines Using Geographic Information Systems
Authors: Lyka Selene Magnayi, Marcos Vinas, Roseanne Ramos
Abstract:
As the real estate industry continually grows in the Philippines, Geographic Information Systems (GIS) provide advantages in generating spatial databases for efficient delivery of information and services. The real estate sector is not only providing qualitative data about real estate properties but also utilizes various spatial aspects of these properties for different applications such as hazard mapping and assessment. In this study, a three-dimensional (3D) model and a spatial database of real estate properties in Fort Bonifacio, Taguig City are developed using GIS and SketchUp. Spatial datasets include political boundaries, buildings, road network, digital terrain model (DTM) derived from Interferometric Synthetic Aperture Radar (IFSAR) image, Google Earth satellite imageries, and hazard maps. Multiple model layers were created based on property listings by a partner real estate company, including existing and future property buildings. Actual building dimensions, building facade, and building floorplans are incorporated in these 3D models for geovisualization. Hazard model layers are determined through spatial overlays, and different scenarios of hazards are also presented in the models. Animated maps and walkthrough videos were created for company presentation and evaluation. Model evaluation is conducted through client surveys requiring scores in terms of the appropriateness, information content, and design of the 3D models. Survey results show very satisfactory ratings, with the highest average evaluation score equivalent to 9.21 out of 10. The output maps and videos obtained passing rates based on the criteria and standards set by the intended users of the partner real estate company. The methodologies presented in this study were found useful and have remarkable advantages in the real estate industry. This work may be extended to automated mapping and creation of online spatial databases for better storage, access of real property listings and interactive platform using web-based GIS.Keywords: geovisualization, geographic information systems, GIS, real estate, spatial database, three-dimensional model
Procedia PDF Downloads 159108 Application of Complete Ensemble Empirical Mode Decomposition with Adaptive Noise and Multipoint Optimal Minimum Entropy Deconvolution in Railway Bearings Fault Diagnosis
Authors: Yao Cheng, Weihua Zhang
Abstract:
Although the measured vibration signal contains rich information on machine health conditions, the white noise interferences and the discrete harmonic coming from blade, shaft and mash make the fault diagnosis of rolling element bearings difficult. In order to overcome the interferences of useless signals, a new fault diagnosis method combining Complete Ensemble Empirical Mode Decomposition with adaptive noise (CEEMDAN) and Multipoint Optimal Minimum Entropy Deconvolution (MOMED) is proposed for the fault diagnosis of high-speed train bearings. Firstly, the CEEMDAN technique is applied to adaptively decompose the raw vibration signal into a series of finite intrinsic mode functions (IMFs) and a residue. Compared with Ensemble Empirical Mode Decomposition (EEMD), the CEEMDAN can provide an exact reconstruction of the original signal and a better spectral separation of the modes, which improves the accuracy of fault diagnosis. An effective sensitivity index based on the Pearson's correlation coefficients between IMFs and raw signal is adopted to select sensitive IMFs that contain bearing fault information. The composite signal of the sensitive IMFs is applied to further analysis of fault identification. Next, for propose of identifying the fault information precisely, the MOMED is utilized to enhance the periodic impulses in composite signal. As a non-iterative method, the MOMED has better deconvolution performance than the classical deconvolution methods such Minimum Entropy Deconvolution (MED) and Maximum Correlated Kurtosis Deconvolution (MCKD). Third, the envelope spectrum analysis is applied to detect the existence of bearing fault. The simulated bearing fault signals with white noise and discrete harmonic interferences are used to validate the effectiveness of the proposed method. Finally, the superiorities of the proposed method are further demonstrated by high-speed train bearing fault datasets measured from test rig. The analysis results indicate that the proposed method has strong practicability.Keywords: bearing, complete ensemble empirical mode decomposition with adaptive noise, fault diagnosis, multipoint optimal minimum entropy deconvolution
Procedia PDF Downloads 375107 The Impact of Artificial Intelligence on Agricultural Machines and Plant Nutrition
Authors: Kirolos Gerges Yakoub Gerges
Abstract:
Self-sustaining agricultural machines act in stochastic surroundings and therefore, should be capable of perceive the surroundings in real time. This notion can be done using image sensors blended with superior device learning, mainly Deep mastering. Deep convolutional neural networks excel in labeling and perceiving colour pix and since the fee of RGB-cameras is low, the hardware cost of accurate notion relies upon heavily on memory and computation power. This paper investigates the opportunity of designing lightweight convolutional neural networks for semantic segmentation (pixel clever class) with reduced hardware requirements, to allow for embedded usage in self-reliant agricultural machines. The usage of compression techniques, a lightweight convolutional neural community is designed to carry out actual-time semantic segmentation on an embedded platform. The community is skilled on two big datasets, ImageNet and Pascal Context, to apprehend as much as four hundred man or woman instructions. The 400 training are remapped into agricultural superclasses (e.g. human, animal, sky, road, area, shelterbelt and impediment) and the capacity to provide correct actual-time perception of agricultural environment is studied. The network is carried out to the case of self-sufficient grass mowing the usage of the NVIDIA Tegra X1 embedded platform. Feeding case-unique pics to the community consequences in a fully segmented map of the superclasses within the picture. As the network remains being designed and optimized, handiest a qualitative analysis of the technique is entire on the abstract submission deadline. intending this cut-off date, the finalized layout is quantitatively evaluated on 20 annotated grass mowing pictures. Light-weight convolutional neural networks for semantic segmentation can be implemented on an embedded platform and show aggressive performance on the subject of accuracy and speed. It’s miles viable to offer value-efficient perceptive capabilities related to semantic segmentation for autonomous agricultural machines.Keywords: centrifuge pump, hydraulic energy, agricultural applications, irrigationaxial flux machines, axial flux applications, coreless machines, PM machinesautonomous agricultural machines, deep learning, safety, visual perception
Procedia PDF Downloads 28106 Land Cover Mapping Using Sentinel-2, Landsat-8 Satellite Images, and Google Earth Engine: A Study Case of the Beterou Catchment
Authors: Ella Sèdé Maforikan
Abstract:
Accurate land cover mapping is essential for effective environmental monitoring and natural resources management. This study focuses on assessing the classification performance of two satellite datasets and evaluating the impact of different input feature combinations on classification accuracy in the Beterou catchment, situated in the northern part of Benin. Landsat-8 and Sentinel-2 images from June 1, 2020, to March 31, 2021, were utilized. Employing the Random Forest (RF) algorithm on Google Earth Engine (GEE), a supervised classification categorized the land into five classes: forest, savannas, cropland, settlement, and water bodies. GEE was chosen due to its high-performance computing capabilities, mitigating computational burdens associated with traditional land cover classification methods. By eliminating the need for individual satellite image downloads and providing access to an extensive archive of remote sensing data, GEE facilitated efficient model training on remote sensing data. The study achieved commendable overall accuracy (OA), ranging from 84% to 85%, even without incorporating spectral indices and terrain metrics into the model. Notably, the inclusion of additional input sources, specifically terrain features like slope and elevation, enhanced classification accuracy. The highest accuracy was achieved with Sentinel-2 (OA = 91%, Kappa = 0.88), slightly surpassing Landsat-8 (OA = 90%, Kappa = 0.87). This underscores the significance of combining diverse input sources for optimal accuracy in land cover mapping. The methodology presented herein not only enables the creation of precise, expeditious land cover maps but also demonstrates the prowess of cloud computing through GEE for large-scale land cover mapping with remarkable accuracy. The study emphasizes the synergy of different input sources to achieve superior accuracy. As a future recommendation, the application of Light Detection and Ranging (LiDAR) technology is proposed to enhance vegetation type differentiation in the Beterou catchment. Additionally, a cross-comparison between Sentinel-2 and Landsat-8 for assessing long-term land cover changes is suggested.Keywords: land cover mapping, Google Earth Engine, random forest, Beterou catchment
Procedia PDF Downloads 63105 An Alternative Credit Scoring System in China’s Consumer Lendingmarket: A System Based on Digital Footprint Data
Authors: Minjuan Sun
Abstract:
Ever since the late 1990s, China has experienced explosive growth in consumer lending, especially in short-term consumer loans, among which, the growth rate of non-bank lending has surpassed bank lending due to the development in financial technology. On the other hand, China does not have a universal credit scoring and registration system that can guide lenders during the processes of credit evaluation and risk control, for example, an individual’s bank credit records are not available for online lenders to see and vice versa. Given this context, the purpose of this paper is three-fold. First, we explore if and how alternative digital footprint data can be utilized to assess borrower’s creditworthiness. Then, we perform a comparative analysis of machine learning methods for the canonical problem of credit default prediction. Finally, we analyze, from an institutional point of view, the necessity of establishing a viable and nationally universal credit registration and scoring system utilizing online digital footprints, so that more people in China can have better access to the consumption loan market. Two different types of digital footprint data are utilized to match with bank’s loan default records. Each separately captures distinct dimensions of a person’s characteristics, such as his shopping patterns and certain aspects of his personality or inferred demographics revealed by social media features like profile image and nickname. We find both datasets can generate either acceptable or excellent prediction results, and different types of data tend to complement each other to get better performances. Typically, the traditional types of data banks normally use like income, occupation, and credit history, update over longer cycles, hence they can’t reflect more immediate changes, like the financial status changes caused by the business crisis; whereas digital footprints can update daily, weekly, or monthly, thus capable of providing a more comprehensive profile of the borrower’s credit capabilities and risks. From the empirical and quantitative examination, we believe digital footprints can become an alternative information source for creditworthiness assessment, because of their near-universal data coverage, and because they can by and large resolve the "thin-file" issue, due to the fact that digital footprints come in much larger volume and higher frequency.Keywords: credit score, digital footprint, Fintech, machine learning
Procedia PDF Downloads 164104 An Improvement of ComiR Algorithm for MicroRNA Target Prediction by Exploiting Coding Region Sequences of mRNAs
Authors: Giorgio Bertolazzi, Panayiotis Benos, Michele Tumminello, Claudia Coronnello
Abstract:
MicroRNAs are small non-coding RNAs that post-transcriptionally regulate the expression levels of messenger RNAs. MicroRNA regulation activity depends on the recognition of binding sites located on mRNA molecules. ComiR (Combinatorial miRNA targeting) is a user friendly web tool realized to predict the targets of a set of microRNAs, starting from their expression profile. ComiR incorporates miRNA expression in a thermodynamic binding model, and it associates each gene with the probability of being a target of a set of miRNAs. ComiR algorithms were trained with the information regarding binding sites in the 3’UTR region, by using a reliable dataset containing the targets of endogenously expressed microRNA in D. melanogaster S2 cells. This dataset was obtained by comparing the results from two different experimental approaches, i.e., inhibition, and immunoprecipitation of the AGO1 protein; this protein is a component of the microRNA induced silencing complex. In this work, we tested whether including coding region binding sites in the ComiR algorithm improves the performance of the tool in predicting microRNA targets. We focused the analysis on the D. melanogaster species and updated the ComiR underlying database with the currently available releases of mRNA and microRNA sequences. As a result, we find that the ComiR algorithm trained with the information related to the coding regions is more efficient in predicting the microRNA targets, with respect to the algorithm trained with 3’utr information. On the other hand, we show that 3’utr based predictions can be seen as complementary to the coding region based predictions, which suggests that both predictions, from 3'UTR and coding regions, should be considered in a comprehensive analysis. Furthermore, we observed that the lists of targets obtained by analyzing data from one experimental approach only, that is, inhibition or immunoprecipitation of AGO1, are not reliable enough to test the performance of our microRNA target prediction algorithm. Further analysis will be conducted to investigate the effectiveness of the tool with data from other species, provided that validated datasets, as obtained from the comparison of RISC proteins inhibition and immunoprecipitation experiments, will be available for the same samples. Finally, we propose to upgrade the existing ComiR web-tool by including the coding region based trained model, available together with the 3’UTR based one.Keywords: AGO1, coding region, Drosophila melanogaster, microRNA target prediction
Procedia PDF Downloads 452103 Characterisation of Meteorological Drought at Sub-Catchment Scale in Afghanistan Using Time-Series Climate Data
Authors: Yun Chen, David Penton, Fazlul Karim, Santosh Aryal, Shahriar Wahid, Peter Taylor, Susan M. Cuddy
Abstract:
Droughts have severely affected Afghanistan over the last four decades, leading to critical food shortages where two-thirds of the country’s population are in a food crisis. Long years of conflict have lowered the country’s ability to deal with hazards such as drought, which can rapidly escalate into disasters. Understanding the spatial and temporal distribution of droughts is needed to be able to respond effectively to disasters and plan for future occurrences. This study used Standardized Precipitation Evapotranspiration Index (SPEI) at monthly, seasonal, and annual temporal scales to map the spatiotemporal change dynamics of drought characteristics (distribution, frequency, duration, and severity) in Afghanistan. SPEI indices were mapped for river basins, disaggregated into 189 sub-catchments, using monthly precipitation and potential evapotranspiration derived from temperature station observations from 1980 to 2017. The results show these multi-dimensional drought characteristics vary along different years, change among sub-catchments, and differ across temporal scales. During the 38 years, the driest decade and period are the 2000s and 1999–2022, respectively. The 2000–01 water year is the driest, with the whole country experiencing ‘severe’ to ‘extreme’ drought, more than 53% (87 sub-catchments) suffering the worst drought in history, and about 58% (94 sub-catchments) having ‘very frequent’ drought (7 to 8 months) or ‘extremely frequent’ drought (9 to 10 months). The estimated seasonal duration and severity present significant variations across the study area and throughout the study period. The nation also suffered from recurring droughts with varying length and intensity in 2004, 2006, 2008, and, most recently, 2011. There is a trend towards increasing drought with longer duration and higher severity extending all over sub-catchments from southeast to north and central regions. These datasets and maps help to fill the knowledge gap on detailed sub-catchment scale meteorological drought characteristics in Afghanistan. The study findings improve our understanding of the influences of climate change on drought dynamics and can guide catchment planning for reliable adaptation to and mitigation against future droughts.Keywords: SPEI, precipitation, evapotranspiration, climate extremes
Procedia PDF Downloads 92102 Do the Health Benefits of Oil-Led Economic Development Outweigh the Potential Health Harms from Environmental Pollution in Nigeria?
Authors: Marian Emmanuel Okon
Abstract:
Introduction: The Niger Delta region of Nigeria has a vast reserve of oil and gas, which has globally positioned the nation as the sixth largest exporter of crude oil. Production rapidly rose following oil discovery. In most oil producing nations of the world, the wealth generated from oil production and export has propelled economic advancement, enabling the development of industries and other relevant infrastructures. Therefore, it can be assumed that majority of the oil resource such as Nigeria’s, has the potential to improve the health of the population via job creation and derived revenues. However, the health benefits of this economic development might be offset by the environmental consequences of oil exploitation and production. Objective: This research aims to evaluate the balance between the health benefits of oil-led economic development and harmful environmental consequences of crude oil exploitation in Nigeria. Study Design: A pathway has been designed to guide data search and this study. The model created will assess the relationship between oil-led economic development and population health development via job creation, improvement of education, development of infrastructure and other forms of development as well as through harmful environmental consequences from oil activities. Data/Emerging Findings: Diverse potentially suitable datasets which are at different geographical scales have been identified, obtained or applied for and the dataset from the World Bank has been the most thoroughly explored. This large dataset contains information that would enable the longitudinal assessment of both the health benefits and harms from oil exploitation in Nigeria as well as identify the disparities that exist between the communities, states and regions. However, these data do not extend far back enough in time to capture the start of crude oil production. Thus, it is possible that the maximum economic benefits and health harms could be missed. To deal with this shortcoming, the potential for a comparative study with countries like United Kingdom, Morocco and Cote D’ivoire has also been taken into consideration, so as to evaluate the differences between these countries as well as identify the areas of improvement in Nigeria’s environmental and health policies. Notwithstanding, these data have shown some differences in each country’s economic, environmental and health state over time as well as a corresponding summary statistics. Conclusion: In theory, the beneficial effects of oil exploitation to the health of the population may be substantial as large swaths of the ‘wider determinants’ of population heath are influenced by the wealth of a nation. However, if uncontrolled, the consequences from environmental pollution and degradation may outweigh these benefits. Thus, there is a need to address this, in order to improve environmental and population health in Nigeria.Keywords: environmental pollution, health benefits, oil-led economic development, petroleum exploitation
Procedia PDF Downloads 340101 The Use of Artificial Intelligence in Digital Forensics and Incident Response in a Constrained Environment
Authors: Dipo Dunsin, Mohamed C. Ghanem, Karim Ouazzane
Abstract:
Digital investigators often have a hard time spotting evidence in digital information. It has become hard to determine which source of proof relates to a specific investigation. A growing concern is that the various processes, technology, and specific procedures used in the digital investigation are not keeping up with criminal developments. Therefore, criminals are taking advantage of these weaknesses to commit further crimes. In digital forensics investigations, artificial intelligence is invaluable in identifying crime. It has been observed that an algorithm based on artificial intelligence (AI) is highly effective in detecting risks, preventing criminal activity, and forecasting illegal activity. Providing objective data and conducting an assessment is the goal of digital forensics and digital investigation, which will assist in developing a plausible theory that can be presented as evidence in court. Researchers and other authorities have used the available data as evidence in court to convict a person. This research paper aims at developing a multiagent framework for digital investigations using specific intelligent software agents (ISA). The agents communicate to address particular tasks jointly and keep the same objectives in mind during each task. The rules and knowledge contained within each agent are dependent on the investigation type. A criminal investigation is classified quickly and efficiently using the case-based reasoning (CBR) technique. The MADIK is implemented using the Java Agent Development Framework and implemented using Eclipse, Postgres repository, and a rule engine for agent reasoning. The proposed framework was tested using the Lone Wolf image files and datasets. Experiments were conducted using various sets of ISA and VMs. There was a significant reduction in the time taken for the Hash Set Agent to execute. As a result of loading the agents, 5 percent of the time was lost, as the File Path Agent prescribed deleting 1,510, while the Timeline Agent found multiple executable files. In comparison, the integrity check carried out on the Lone Wolf image file using a digital forensic tool kit took approximately 48 minutes (2,880 ms), whereas the MADIK framework accomplished this in 16 minutes (960 ms). The framework is integrated with Python, allowing for further integration of other digital forensic tools, such as AccessData Forensic Toolkit (FTK), Wireshark, Volatility, and Scapy.Keywords: artificial intelligence, computer science, criminal investigation, digital forensics
Procedia PDF Downloads 213100 Ribotaxa: Combined Approaches for Taxonomic Resolution Down to the Species Level from Metagenomics Data Revealing Novelties
Authors: Oshma Chakoory, Sophie Comtet-Marre, Pierre Peyret
Abstract:
Metagenomic classifiers are widely used for the taxonomic profiling of metagenomic data and estimation of taxa relative abundance. Small subunit rRNA genes are nowadays a gold standard for the phylogenetic resolution of complex microbial communities, although the power of this marker comes down to its use as full-length. We benchmarked the performance and accuracy of rRNA-specialized versus general-purpose read mappers, reference-targeted assemblers and taxonomic classifiers. We then built a pipeline called RiboTaxa to generate a highly sensitive and specific metataxonomic approach. Using metagenomics data, RiboTaxa gave the best results compared to other tools (Kraken2, Centrifuge (1), METAXA2 (2), PhyloFlash (3)) with precise taxonomic identification and relative abundance description, giving no false positive detection. Using real datasets from various environments (ocean, soil, human gut) and from different approaches (metagenomics and gene capture by hybridization), RiboTaxa revealed microbial novelties not seen by current bioinformatics analysis opening new biological perspectives in human and environmental health. In a study focused on corals’ health involving 20 metagenomic samples (4), an affiliation of prokaryotes was limited to the family level with Endozoicomonadaceae characterising healthy octocoral tissue. RiboTaxa highlighted 2 species of uncultured Endozoicomonas which were dominant in the healthy tissue. Both species belonged to a genus not yet described, opening new research perspectives on corals’ health. Applied to metagenomics data from a study on human gut and extreme longevity (5), RiboTaxa detected the presence of an uncultured archaeon in semi-supercentenarians (aged 105 to 109 years) highlighting an archaeal genus, not yet described, and 3 uncultured species belonging to the Enorma genus that could be species of interest participating in the longevity process. RiboTaxa is user-friendly, rapid, allowing microbiota structure description from any environment and the results can be easily interpreted. This software is freely available at https://github.com/oschakoory/RiboTaxa under the GNU Affero General Public License 3.0.Keywords: metagenomics profiling, microbial diversity, SSU rRNA genes, full-length phylogenetic marker
Procedia PDF Downloads 12399 Groundwater Potential Mapping using Frequency Ratio and Shannon’s Entropy Models in Lesser Himalaya Zone, Nepal
Authors: Yagya Murti Aryal, Bipin Adhikari, Pradeep Gyawali
Abstract:
The Lesser Himalaya zone of Nepal consists of thrusting and folding belts, which play an important role in the sustainable management of groundwater in the Himalayan regions. The study area is located in the Dolakha and Ramechhap Districts of Bagmati Province, Nepal. Geologically, these districts are situated in the Lesser Himalayas and partly encompass the Higher Himalayan rock sequence, which includes low-grade to high-grade metamorphic rocks. Following the Gorkha Earthquake in 2015, numerous springs dried up, and many others are currently experiencing depletion due to the distortion of the natural groundwater flow. The primary objective of this study is to identify potential groundwater areas and determine suitable sites for artificial groundwater recharge. Two distinct statistical approaches were used to develop models: The Frequency Ratio (FR) and Shannon Entropy (SE) methods. The study utilized both primary and secondary datasets and incorporated significant role and controlling factors derived from field works and literature reviews. Field data collection involved spring inventory, soil analysis, lithology assessment, and hydro-geomorphology study. Additionally, slope, aspect, drainage density, and lineament density were extracted from a Digital Elevation Model (DEM) using GIS and transformed into thematic layers. For training and validation, 114 springs were divided into a 70/30 ratio, with an equal number of non-spring pixels. After assigning weights to each class based on the two proposed models, a groundwater potential map was generated using GIS, classifying the area into five levels: very low, low, moderate, high, and very high. The model's outcome reveals that over 41% of the area falls into the low and very low potential categories, while only 30% of the area demonstrates a high probability of groundwater potential. To evaluate model performance, accuracy was assessed using the Area under the Curve (AUC). The success rate AUC values for the FR and SE methods were determined to be 78.73% and 77.09%, respectively. Additionally, the prediction rate AUC values for the FR and SE methods were calculated as 76.31% and 74.08%. The results indicate that the FR model exhibits greater prediction capability compared to the SE model in this case study.Keywords: groundwater potential mapping, frequency ratio, Shannon’s Entropy, Lesser Himalaya Zone, sustainable groundwater management
Procedia PDF Downloads 8198 Application of Deep Learning and Ensemble Methods for Biomarker Discovery in Diabetic Nephropathy through Fibrosis and Propionate Metabolism Pathways
Authors: Oluwafunmibi Omotayo Fasanya, Augustine Kena Adjei
Abstract:
Diabetic nephropathy (DN) is a major complication of diabetes, with fibrosis and propionate metabolism playing critical roles in its progression. Identifying biomarkers linked to these pathways may provide novel insights into DN diagnosis and treatment. This study aims to identify biomarkers associated with fibrosis and propionate metabolism in DN. Analyze the biological pathways and regulatory mechanisms of these biomarkers. Develop a machine learning model to predict DN-related biomarkers and validate their functional roles. Publicly available transcriptome datasets related to DN (GSE96804 and GSE104948) were obtained from the GEO database (https://www.ncbi.nlm.nih.gov/gds), and 924 propionate metabolism-related genes (PMRGs) and 656 fibrosis-related genes (FRGs) were identified. The analysis began with the extraction of DN-differentially expressed genes (DN-DEGs) and propionate metabolism-related DEGs (PM-DEGs), followed by the intersection of these with fibrosis-related genes to identify key intersected genes. Instead of relying on traditional models, we employed a combination of deep neural networks (DNNs) and ensemble methods such as Gradient Boosting Machines (GBM) and XGBoost to enhance feature selection and biomarker discovery. Recursive feature elimination (RFE) was coupled with these advanced algorithms to refine the selection of the most critical biomarkers. Functional validation was conducted using convolutional neural networks (CNN) for gene set enrichment and immunoinfiltration analysis, revealing seven significant biomarkers—SLC37A4, ACOX2, GPD1, ACE2, SLC9A3, AGT, and PLG. These biomarkers are involved in critical biological processes such as fatty acid metabolism and glomerular development, providing a mechanistic link to DN progression. Furthermore, a TF–miRNA–mRNA regulatory network was constructed using natural language processing models to identify 8 transcription factors and 60 miRNAs that regulate these biomarkers, while a drug–gene interaction network revealed potential therapeutic targets such as UROKINASE–PLG and ATENOLOL–AGT. This integrative approach, leveraging deep learning and ensemble models, not only enhances the accuracy of biomarker discovery but also offers new perspectives on DN diagnosis and treatment, specifically targeting fibrosis and propionate metabolism pathways.Keywords: diabetic nephropathy, deep neural networks, gradient boosting machines (GBM), XGBoost
Procedia PDF Downloads 1297 Private Coded Computation of Matrix Multiplication
Authors: Malihe Aliasgari, Yousef Nejatbakhsh
Abstract:
The era of Big Data and the immensity of real-life datasets compels computation tasks to be performed in a distributed fashion, where the data is dispersed among many servers that operate in parallel. However, massive parallelization leads to computational bottlenecks due to faulty servers and stragglers. Stragglers refer to a few slow or delay-prone processors that can bottleneck the entire computation because one has to wait for all the parallel nodes to finish. The problem of straggling processors, has been well studied in the context of distributed computing. Recently, it has been pointed out that, for the important case of linear functions, it is possible to improve over repetition strategies in terms of the tradeoff between performance and latency by carrying out linear precoding of the data prior to processing. The key idea is that, by employing suitable linear codes operating over fractions of the original data, a function may be completed as soon as enough number of processors, depending on the minimum distance of the code, have completed their operations. The problem of matrix-matrix multiplication in the presence of practically big sized of data sets faced with computational and memory related difficulties, which makes such operations are carried out using distributed computing platforms. In this work, we study the problem of distributed matrix-matrix multiplication W = XY under storage constraints, i.e., when each server is allowed to store a fixed fraction of each of the matrices X and Y, which is a fundamental building of many science and engineering fields such as machine learning, image and signal processing, wireless communication, optimization. Non-secure and secure matrix multiplication are studied. We want to study the setup, in which the identity of the matrix of interest should be kept private from the workers and then obtain the recovery threshold of the colluding model, that is, the number of workers that need to complete their task before the master server can recover the product W. The problem of secure and private distributed matrix multiplication W = XY which the matrix X is confidential, while matrix Y is selected in a private manner from a library of public matrices. We present the best currently known trade-off between communication load and recovery threshold. On the other words, we design an achievable PSGPD scheme for any arbitrary privacy level by trivially concatenating a robust PIR scheme for arbitrary colluding workers and private databases and the proposed SGPD code that provides a smaller computational complexity at the workers.Keywords: coded distributed computation, private information retrieval, secret sharing, stragglers
Procedia PDF Downloads 12596 Cosmetic Recommendation Approach Using Machine Learning
Authors: Shakila N. Senarath, Dinesh Asanka, Janaka Wijayanayake
Abstract:
The necessity of cosmetic products is arising to fulfill consumer needs of personality appearance and hygiene. A cosmetic product consists of various chemical ingredients which may help to keep the skin healthy or may lead to damages. Every chemical ingredient in a cosmetic product does not perform on every human. The most appropriate way to select a healthy cosmetic product is to identify the texture of the body first and select the most suitable product with safe ingredients. Therefore, the selection process of cosmetic products is complicated. Consumer surveys have shown most of the time, the selection process of cosmetic products is done in an improper way by consumers. From this study, a content-based system is suggested that recommends cosmetic products for the human factors. To such an extent, the skin type, gender and price range will be considered as human factors. The proposed system will be implemented by using Machine Learning. Consumer skin type, gender and price range will be taken as inputs to the system. The skin type of consumer will be derived by using the Baumann Skin Type Questionnaire, which is a value-based approach that includes several numbers of questions to derive the user’s skin type to one of the 16 skin types according to the Bauman Skin Type indicator (BSTI). Two datasets are collected for further research proceedings. The user data set was collected using a questionnaire given to the public. Those are the user dataset and the cosmetic dataset. Product details are included in the cosmetic dataset, which belongs to 5 different kinds of product categories (Moisturizer, Cleanser, Sun protector, Face Mask, Eye Cream). An alternate approach of TF-IDF (Term Frequency – Inverse Document Frequency) is applied to vectorize cosmetic ingredients in the generic cosmetic products dataset and user-preferred dataset. Using the IF-IPF vectors, each user-preferred products dataset and generic cosmetic products dataset can be represented as sparse vectors. The similarity between each user-preferred product and generic cosmetic product will be calculated using the cosine similarity method. For the recommendation process, a similarity matrix can be used. Higher the similarity, higher the match for consumer. Sorting a user column from similarity matrix in a descending order, the recommended products can be retrieved in ascending order. Even though results return a list of similar products, and since the user information has been gathered, such as gender and the price ranges for product purchasing, further optimization can be done by considering and giving weights for those parameters once after a set of recommended products for a user has been retrieved.Keywords: content-based filtering, cosmetics, machine learning, recommendation system
Procedia PDF Downloads 13595 Parallel Fuzzy Rough Support Vector Machine for Data Classification in Cloud Environment
Authors: Arindam Chaudhuri
Abstract:
Classification of data has been actively used for most effective and efficient means of conveying knowledge and information to users. The prima face has always been upon techniques for extracting useful knowledge from data such that returns are maximized. With emergence of huge datasets the existing classification techniques often fail to produce desirable results. The challenge lies in analyzing and understanding characteristics of massive data sets by retrieving useful geometric and statistical patterns. We propose a supervised parallel fuzzy rough support vector machine (PFRSVM) for data classification in cloud environment. The classification is performed by PFRSVM using hyperbolic tangent kernel. The fuzzy rough set model takes care of sensitiveness of noisy samples and handles impreciseness in training samples bringing robustness to results. The membership function is function of center and radius of each class in feature space and is represented with kernel. It plays an important role towards sampling the decision surface. The success of PFRSVM is governed by choosing appropriate parameter values. The training samples are either linear or nonlinear separable. The different input points make unique contributions to decision surface. The algorithm is parallelized with a view to reduce training times. The system is built on support vector machine library using Hadoop implementation of MapReduce. The algorithm is tested on large data sets to check its feasibility and convergence. The performance of classifier is also assessed in terms of number of support vectors. The challenges encountered towards implementing big data classification in machine learning frameworks are also discussed. The experiments are done on the cloud environment available at University of Technology and Management, India. The results are illustrated for Gaussian RBF and Bayesian kernels. The effect of variability in prediction and generalization of PFRSVM is examined with respect to values of parameter C. It effectively resolves outliers’ effects, imbalance and overlapping class problems, normalizes to unseen data and relaxes dependency between features and labels. The average classification accuracy for PFRSVM is better than other classifiers for both Gaussian RBF and Bayesian kernels. The experimental results on both synthetic and real data sets clearly demonstrate the superiority of the proposed technique.Keywords: FRSVM, Hadoop, MapReduce, PFRSVM
Procedia PDF Downloads 49194 Biofilm Text Classifiers Developed Using Natural Language Processing and Unsupervised Learning Approach
Authors: Kanika Gupta, Ashok Kumar
Abstract:
Biofilms are dense, highly hydrated cell clusters that are irreversibly attached to a substratum, to an interface or to each other, and are embedded in a self-produced gelatinous matrix composed of extracellular polymeric substances. Research in biofilm field has become very significant, as biofilm has shown high mechanical resilience and resistance to antibiotic treatment and constituted as a significant problem in both healthcare and other industry related to microorganisms. The massive information both stated and hidden in the biofilm literature are growing exponentially therefore it is not possible for researchers and practitioners to automatically extract and relate information from different written resources. So, the current work proposes and discusses the use of text mining techniques for the extraction of information from biofilm literature corpora containing 34306 documents. It is very difficult and expensive to obtain annotated material for biomedical literature as the literature is unstructured i.e. free-text. Therefore, we considered unsupervised approach, where no annotated training is necessary and using this approach we developed a system that will classify the text on the basis of growth and development, drug effects, radiation effects, classification and physiology of biofilms. For this, a two-step structure was used where the first step is to extract keywords from the biofilm literature using a metathesaurus and standard natural language processing tools like Rapid Miner_v5.3 and the second step is to discover relations between the genes extracted from the whole set of biofilm literature using pubmed.mineR_v1.0.11. We used unsupervised approach, which is the machine learning task of inferring a function to describe hidden structure from 'unlabeled' data, in the above-extracted datasets to develop classifiers using WinPython-64 bit_v3.5.4.0Qt5 and R studio_v0.99.467 packages which will automatically classify the text by using the mentioned sets. The developed classifiers were tested on a large data set of biofilm literature which showed that the unsupervised approach proposed is promising as well as suited for a semi-automatic labeling of the extracted relations. The entire information was stored in the relational database which was hosted locally on the server. The generated biofilm vocabulary and genes relations will be significant for researchers dealing with biofilm research, making their search easy and efficient as the keywords and genes could be directly mapped with the documents used for database development.Keywords: biofilms literature, classifiers development, text mining, unsupervised learning approach, unstructured data, relational database
Procedia PDF Downloads 17293 Comparing Xbar Charts: Conventional versus Reweighted Robust Estimation Methods for Univariate Data Sets
Authors: Ece Cigdem Mutlu, Burak Alakent
Abstract:
Maintaining the quality of manufactured products at a desired level depends on the stability of process dispersion and location parameters and detection of perturbations in these parameters as promptly as possible. Shewhart control chart is the most widely used technique in statistical process monitoring to monitor the quality of products and control process mean and variability. In the application of Xbar control charts, sample standard deviation and sample mean are known to be the most efficient conventional estimators in determining process dispersion and location parameters, respectively, based on the assumption of independent and normally distributed datasets. On the other hand, there is no guarantee that the real-world data would be normally distributed. In the cases of estimated process parameters from Phase I data clouded with outliers, efficiency of traditional estimators is significantly reduced, and performance of Xbar charts are undesirably low, e.g. occasional outliers in the rational subgroups in Phase I data set may considerably affect the sample mean and standard deviation, resulting a serious delay in detection of inferior products in Phase II. For more efficient application of control charts, it is required to use robust estimators against contaminations, which may exist in Phase I. In the current study, we present a simple approach to construct robust Xbar control charts using average distance to the median, Qn-estimator of scale, M-estimator of scale with logistic psi-function in the estimation of process dispersion parameter, and Harrell-Davis qth quantile estimator, Hodge-Lehmann estimator and M-estimator of location with Huber psi-function and logistic psi-function in the estimation of process location parameter. Phase I efficiency of proposed estimators and Phase II performance of Xbar charts constructed from these estimators are compared with the conventional mean and standard deviation statistics both under normality and against diffuse-localized and symmetric-asymmetric contaminations using 50,000 Monte Carlo simulations on MATLAB. Consequently, it is found that robust estimators yield parameter estimates with higher efficiency against all types of contaminations, and Xbar charts constructed using robust estimators have higher power in detecting disturbances, compared to conventional methods. Additionally, utilizing individuals charts to screen outlier subgroups and employing different combination of dispersion and location estimators on subgroups and individual observations are found to improve the performance of Xbar charts.Keywords: average run length, M-estimators, quality control, robust estimators
Procedia PDF Downloads 19192 Unveiling Comorbidities in Irritable Bowel Syndrome: A UK BioBank Study utilizing Supervised Machine Learning
Authors: Uswah Ahmad Khan, Muhammad Moazam Fraz, Humayoon Shafique Satti, Qasim Aziz
Abstract:
Approximately 10-14% of the global population experiences a functional disorder known as irritable bowel syndrome (IBS). The disorder is defined by persistent abdominal pain and an irregular bowel pattern. IBS significantly impairs work productivity and disrupts patients' daily lives and activities. Although IBS is widespread, there is still an incomplete understanding of its underlying pathophysiology. This study aims to help characterize the phenotype of IBS patients by differentiating the comorbidities found in IBS patients from those in non-IBS patients using machine learning algorithms. In this study, we extracted samples coding for IBS from the UK BioBank cohort and randomly selected patients without a code for IBS to create a total sample size of 18,000. We selected the codes for comorbidities of these cases from 2 years before and after their IBS diagnosis and compared them to the comorbidities in the non-IBS cohort. Machine learning models, including Decision Trees, Gradient Boosting, Support Vector Machine (SVM), AdaBoost, Logistic Regression, and XGBoost, were employed to assess their accuracy in predicting IBS. The most accurate model was then chosen to identify the features associated with IBS. In our case, we used XGBoost feature importance as a feature selection method. We applied different models to the top 10% of features, which numbered 50. Gradient Boosting, Logistic Regression and XGBoost algorithms yielded a diagnosis of IBS with an optimal accuracy of 71.08%, 71.427%, and 71.53%, respectively. Among the comorbidities most closely associated with IBS included gut diseases (Haemorrhoids, diverticular diseases), atopic conditions(asthma), and psychiatric comorbidities (depressive episodes or disorder, anxiety). This finding emphasizes the need for a comprehensive approach when evaluating the phenotype of IBS, suggesting the possibility of identifying new subsets of IBS rather than relying solely on the conventional classification based on stool type. Additionally, our study demonstrates the potential of machine learning algorithms in predicting the development of IBS based on comorbidities, which may enhance diagnosis and facilitate better management of modifiable risk factors for IBS. Further research is necessary to confirm our findings and establish cause and effect. Alternative feature selection methods and even larger and more diverse datasets may lead to more accurate classification models. Despite these limitations, our findings highlight the effectiveness of Logistic Regression and XGBoost in predicting IBS diagnosis.Keywords: comorbidities, disease association, irritable bowel syndrome (IBS), predictive analytics
Procedia PDF Downloads 11991 Real Estate Trend Prediction with Artificial Intelligence Techniques
Authors: Sophia Liang Zhou
Abstract:
For investors, businesses, consumers, and governments, an accurate assessment of future housing prices is crucial to critical decisions in resource allocation, policy formation, and investment strategies. Previous studies are contradictory about macroeconomic determinants of housing price and largely focused on one or two areas using point prediction. This study aims to develop data-driven models to accurately predict future housing market trends in different markets. This work studied five different metropolitan areas representing different market trends and compared three-time lagging situations: no lag, 6-month lag, and 12-month lag. Linear regression (LR), random forest (RF), and artificial neural network (ANN) were employed to model the real estate price using datasets with S&P/Case-Shiller home price index and 12 demographic and macroeconomic features, such as gross domestic product (GDP), resident population, personal income, etc. in five metropolitan areas: Boston, Dallas, New York, Chicago, and San Francisco. The data from March 2005 to December 2018 were collected from the Federal Reserve Bank, FBI, and Freddie Mac. In the original data, some factors are monthly, some quarterly, and some yearly. Thus, two methods to compensate missing values, backfill or interpolation, were compared. The models were evaluated by accuracy, mean absolute error, and root mean square error. The LR and ANN models outperformed the RF model due to RF’s inherent limitations. Both ANN and LR methods generated predictive models with high accuracy ( > 95%). It was found that personal income, GDP, population, and measures of debt consistently appeared as the most important factors. It also showed that technique to compensate missing values in the dataset and implementation of time lag can have a significant influence on the model performance and require further investigation. The best performing models varied for each area, but the backfilled 12-month lag LR models and the interpolated no lag ANN models showed the best stable performance overall, with accuracies > 95% for each city. This study reveals the influence of input variables in different markets. It also provides evidence to support future studies to identify the optimal time lag and data imputing methods for establishing accurate predictive models.Keywords: linear regression, random forest, artificial neural network, real estate price prediction
Procedia PDF Downloads 10390 Undernutrition Among Children Below Five Years of Age in Uganda: A Deep Dive into Space and Time
Authors: Vallence Ngabo Maniragaba
Abstract:
This study aimed at examining the variations of undernutrition among children below 5 years of age in Uganda. The approach of spatial and spatiotemporal analysis helped in identifying cluster patterns, hot spots and emerging hot spots. Data from the 6 Uganda Demographic and Health Surveys spanning from 1990 to 2016 were used with the main outcome variable being undernutrition among children <5 years of age. All data that were relevant to this study were retrieved from the survey datasets and combined with the 214 shape files for the districts of Uganda to enable spatial and spatiotemporal analysis. Spatial maps with the spatial distribution of the prevalence of undernutrition, both in space and time, were generated using ArcGIS Pro version 2.8. Moran’s I, an index of spatial autocorrelation, rules out doubts of spatial randomness in order to identify spatially clustered patterns of hot or cold spot areas. Furthermore, space-time cubes were generated to establish the trend in undernutrition as well as to mirror its variations over time and across Uganda. Moreover, emerging hot spot analysis was done to help identify the patterns of undernutrition over time. The results indicate a heterogeneous distribution of undernutrition across Uganda and the same variations were also evident over time. Moran’s I index confirmed spatial clustered patterns as opposed to random distributions of undernutrition prevalence. Four hot spot areas, namely; the Karamoja, the Sebei, the West Nile and the Toro regions were significantly evident, most of the central parts of Uganda were identified as cold spot clusters, while most of Western Uganda, the Acholi and the Lango regions had no statistically significant spatial patterns by the year 2016. The spatio-temporal analysis identified the Karamoja and Sebei regions as clusters of persistent, consecutive and intensifying hot spots, West Nile region was identified as a sporadic hot spot area while the Toro region was identified with both sporadic and emerging hotspots. In conclusion, undernutrition is a silent pandemic that needs to be handled with both hands. At 31.2 percent, the prevalence is still very high and unpleasant. The distribution across the country is nonuniform with some areas such as the Karamoja, the West Nile, the Sebei and the Toro regions being epicenters of undernutrition in Uganda. Over time, the same areas have experienced and exhibited high undernutrition prevalence. Policymakers, as well as the implementers, should bear in mind the spatial variations across the country and prioritize hot spot areas in order to have efficient, timely and region-specific interventions.Keywords: undernutrition, spatial autocorrelation, hotspots analysis, geographically weighted regressions, emerging hotspots analysis, under-fives, Uganda
Procedia PDF Downloads 9089 AS-Geo: Arbitrary-Sized Image Geolocalization with Learnable Geometric Enhancement Resizer
Authors: Huayuan Lu, Chunfang Yang, Ma Zhu, Baojun Qi, Yaqiong Qiao, Jiangqian Xu
Abstract:
Image geolocalization has great application prospects in fields such as autonomous driving and virtual/augmented reality. In practical application scenarios, the size of the image to be located is not fixed; it is impractical to train different networks for all possible sizes. When its size does not match the size of the input of the descriptor extraction model, existing image geolocalization methods usually directly scale or crop the image in some common ways. This will result in the loss of some information important to the geolocalization task, thus affecting the performance of the image geolocalization method. For example, excessive down-sampling can lead to blurred building contour, and inappropriate cropping can lead to the loss of key semantic elements, resulting in incorrect geolocation results. To address this problem, this paper designs a learnable image resizer and proposes an arbitrary-sized image geolocation method. (1) The designed learnable image resizer employs the self-attention mechanism to enhance the geometric features of the resized image. Firstly, it applies bilinear interpolation to the input image and its feature maps to obtain the initial resized image and the resized feature maps. Then, SKNet (selective kernel net) is used to approximate the best receptive field, thus keeping the geometric shapes as the original image. And SENet (squeeze and extraction net) is used to automatically select the feature maps with strong contour information, enhancing the geometric features. Finally, the enhanced geometric features are fused with the initial resized image, to obtain the final resized images. (2) The proposed image geolocalization method embeds the above image resizer as a fronting layer of the descriptor extraction network. It not only enables the network to be compatible with arbitrary-sized input images but also enhances the geometric features that are crucial to the image geolocalization task. Moreover, the triplet attention mechanism is added after the first convolutional layer of the backbone network to optimize the utilization of geometric elements extracted by the first convolutional layer. Finally, the local features extracted by the backbone network are aggregated to form image descriptors for image geolocalization. The proposed method was evaluated on several mainstream datasets, such as Pittsburgh30K, Tokyo24/7, and Places365. The results show that the proposed method has excellent size compatibility and compares favorably to recently mainstream geolocalization methods.Keywords: image geolocalization, self-attention mechanism, image resizer, geometric feature
Procedia PDF Downloads 21688 AI Applications in Accounting: Transforming Finance with Technology
Authors: Alireza Karimi
Abstract:
Artificial Intelligence (AI) is reshaping various industries, and accounting is no exception. With the ability to process vast amounts of data quickly and accurately, AI is revolutionizing how financial professionals manage, analyze, and report financial information. In this article, we will explore the diverse applications of AI in accounting and its profound impact on the field. Automation of Repetitive Tasks: One of the most significant contributions of AI in accounting is automating repetitive tasks. AI-powered software can handle data entry, invoice processing, and reconciliation with minimal human intervention. This not only saves time but also reduces the risk of errors, leading to more accurate financial records. Pattern Recognition and Anomaly Detection: AI algorithms excel at pattern recognition. In accounting, this capability is leveraged to identify unusual patterns in financial data that might indicate fraud or errors. AI can swiftly detect discrepancies, enabling auditors and accountants to focus on resolving issues rather than hunting for them. Real-Time Financial Insights: AI-driven tools, using natural language processing and computer vision, can process documents faster than ever. This enables organizations to have real-time insights into their financial status, empowering decision-makers with up-to-date information for strategic planning. Fraud Detection and Prevention: AI is a powerful tool in the fight against financial fraud. It can analyze vast transaction datasets, flagging suspicious activities and reducing the likelihood of financial misconduct going unnoticed. This proactive approach safeguards a company's financial integrity. Enhanced Data Analysis and Forecasting: Machine learning, a subset of AI, is used for data analysis and forecasting. By examining historical financial data, AI models can provide forecasts and insights, aiding businesses in making informed financial decisions and optimizing their financial strategies. Artificial Intelligence is fundamentally transforming the accounting profession. From automating mundane tasks to enhancing data analysis and fraud detection, AI is making financial processes more efficient, accurate, and insightful. As AI continues to evolve, its role in accounting will only become more significant, offering accountants and finance professionals powerful tools to navigate the complexities of modern finance. Embracing AI in accounting is not just a trend; it's a necessity for staying competitive in the evolving financial landscape.Keywords: artificial intelligence, accounting automation, financial analysis, fraud detection, machine learning in finance
Procedia PDF Downloads 6387 The Use of Empirical Models to Estimate Soil Erosion in Arid Ecosystems and the Importance of Native Vegetation
Authors: Meshal M. Abdullah, Rusty A. Feagin, Layla Musawi
Abstract:
When humans mismanage arid landscapes, soil erosion can become a primary mechanism that leads to desertification. This study focuses on applying soil erosion models to a disturbed landscape in Umm Nigga, Kuwait, and identifying its predicted change under restoration plans, The northern portion of Umm Nigga, containing both coastal and desert ecosystems, falls within the boundaries of the Demilitarized Zone (DMZ) adjacent to Iraq, and has been fenced off to restrict public access since 1994. The central objective of this project was to utilize GIS and remote sensing to compare the MPSIAC (Modified Pacific South West Inter Agency Committee), EMP (Erosion Potential Method), and USLE (Universal Soil Loss Equation) soil erosion models and determine their applicability for arid regions such as Kuwait. Spatial analysis was used to develop the necessary datasets for factors such as soil characteristics, vegetation cover, runoff, climate, and topography. Results showed that the MPSIAC and EMP models produced a similar spatial distribution of erosion, though the MPSIAC had more variability. For the MPSIAC model, approximately 45% of the land surface ranged from moderate to high soil loss, while 35% ranged from moderate to high for the EMP model. The USLE model had contrasting results and a different spatial distribution of the soil loss, with 25% of area ranging from moderate to high erosion, and 75% ranging from low to very low. We concluded that MPSIAC and EMP were the most suitable models for arid regions in general, with the MPSIAC model best. We then applied the MPSIAC model to identify the amount of soil loss between coastal and desert areas, and fenced and unfenced sites. In the desert area, soil loss was different between fenced and unfenced sites. In these desert fenced sites, 88% of the surface was covered with vegetation and soil loss was very low, while at the desert unfenced sites it was 3% and correspondingly higher. In the coastal areas, the amount of soil loss was nearly similar between fenced and unfenced sites. These results implied that vegetation cover played an important role in reducing soil erosion, and that fencing is much more important in the desert ecosystems to protect against overgrazing. When applying the MPSIAC model predictively, we found that vegetation cover could be increased from 3% to 37% in unfenced areas, and soil erosion could then decrease by 39%. We conclude that the MPSIAC model is best to predict soil erosion for arid regions such as Kuwait.Keywords: soil erosion, GIS, modified pacific South west inter agency committee model (MPSIAC), erosion potential method (EMP), Universal soil loss equation (USLE)
Procedia PDF Downloads 29886 Colored Image Classification Using Quantum Convolutional Neural Networks Approach
Authors: Farina Riaz, Shahab Abdulla, Srinjoy Ganguly, Hajime Suzuki, Ravinesh C. Deo, Susan Hopkins
Abstract:
Recently, quantum machine learning has received significant attention. For various types of data, including text and images, numerous quantum machine learning (QML) models have been created and are being tested. Images are exceedingly complex data components that demand more processing power. Despite being mature, classical machine learning still has difficulties with big data applications. Furthermore, quantum technology has revolutionized how machine learning is thought of, by employing quantum features to address optimization issues. Since quantum hardware is currently extremely noisy, it is not practicable to run machine learning algorithms on it without risking the production of inaccurate results. To discover the advantages of quantum versus classical approaches, this research has concentrated on colored image data. Deep learning classification models are currently being created on Quantum platforms, but they are still in a very early stage. Black and white benchmark image datasets like MNIST and Fashion MINIST have been used in recent research. MNIST and CIFAR-10 were compared for binary classification, but the comparison showed that MNIST performed more accurately than colored CIFAR-10. This research will evaluate the performance of the QML algorithm on the colored benchmark dataset CIFAR-10 to advance QML's real-time applicability. However, deep learning classification models have not been developed to compare colored images like Quantum Convolutional Neural Network (QCNN) to determine how much it is better to classical. Only a few models, such as quantum variational circuits, take colored images. The methodology adopted in this research is a hybrid approach by using penny lane as a simulator. To process the 10 classes of CIFAR-10, the image data has been translated into grey scale and the 28 × 28-pixel image containing 10,000 test and 50,000 training images were used. The objective of this work is to determine how much the quantum approach can outperform a classical approach for a comprehensive dataset of color images. After pre-processing 50,000 images from a classical computer, the QCNN model adopted a hybrid method and encoded the images into a quantum simulator for feature extraction using quantum gate rotations. The measurements were carried out on the classical computer after the rotations were applied. According to the results, we note that the QCNN approach is ~12% more effective than the traditional classical CNN approaches and it is possible that applying data augmentation may increase the accuracy. This study has demonstrated that quantum machine and deep learning models can be relatively superior to the classical machine learning approaches in terms of their processing speed and accuracy when used to perform classification on colored classes.Keywords: CIFAR-10, quantum convolutional neural networks, quantum deep learning, quantum machine learning
Procedia PDF Downloads 13085 The Association of Work Stress with Job Satisfaction and Occupational Burnout in Nurse Anesthetists
Authors: I. Ling Tsai, Shu Fen Wu, Chen-Fuh Lam, Chia Yu Chen, Shu Jiuan Chen, Yen Lin Liu
Abstract:
Purpose: Following the conduction of the National Health Insurance (NHI) system in Taiwan since 1995, the demand for anesthesia services continues to increase in the operating rooms and other medical units. It has been well recognized that increased work stress not only affects the clinical performance of the medical staff, long-term work load may also result in occupational burnout. Our study aimed to determine the influence of working environment, work stress and job satisfaction on the occupational burnout in nurse anesthetists. The ultimate goal of this research project is to develop a strategy in establishing a friendly, less stressful workplace for the nurse anesthetists to enhance their job satisfaction, thereby reducing occupational burnout and increasing the career life for nurse anesthetists. Methods: This was a cross-sectional, descriptive study performed in a metropolitan teaching hospital in southern Taiwan between May 2017 to July 2017. A structured self-administered questionnaire, modified from the Practice Environment Scale of the Nursing Work Index (PES-NWI), Occupational Stress Indicator 2 (OSI-2) and Maslach Burnout Inventory (MBI) manual was collected from the nurse anesthetists. The relationships between two numeric datasets were analyzed by the Pearson correlation test (SPSS 20.0). Results: A total of 66 completed questionnaires were collected from 75 nurses (response rate 88%). The average scores for the working environment, job satisfaction, and work stress were 69.6%, 61.5%, and 63.9%, respectively. The three perspectives used to assess the occupational burnout, namely emotional exhaustion, depersonalization and sense of personal accomplishment were 26.3, 13.0 and 24.5, suggesting the presence of moderate to high degrees of burnout in our nurse anesthetists. The presence of occupational burnout was closely correlated with the unsatisfactory working environment (r=-0.385, P=0.001) and reduced job satisfaction (r=-0.430, P=0.000). Junior nurse anesthetists (<1-year clinical experience) reported having higher satisfaction in working environment than the seniors (5 to 10-year clinical experience) (P=0.02). Although the average scores for work stress, job satisfaction, and occupational burnout were lower in junior nurses, the differences were not statistically different. The linear regression model, the working environment was the independent factor that predicted occupational burnout in nurse anesthetists up to 19.8%. Conclusions: High occupational burnout is more likely to develop in senior nurse anesthetists who experienced the dissatisfied working environment, work stress and lower job satisfaction. In addition to the regulation of clinical duties, the increased workload in the supervision of the junior nurse anesthetists may result in emotional stress and burnout in senior nurse anesthetists. Therefore, appropriate adjustment of clinical and teaching loading in the senior nurse anesthetists could be helpful to improve the occupational burnout and enhance the retention rate.Keywords: nurse anesthetists, working environment, work stress, job satisfaction, occupational burnout
Procedia PDF Downloads 27884 Model-Driven and Data-Driven Approaches for Crop Yield Prediction: Analysis and Comparison
Authors: Xiangtuo Chen, Paul-Henry Cournéde
Abstract:
Crop yield prediction is a paramount issue in agriculture. The main idea of this paper is to find out efficient way to predict the yield of corn based meteorological records. The prediction models used in this paper can be classified into model-driven approaches and data-driven approaches, according to the different modeling methodologies. The model-driven approaches are based on crop mechanistic modeling. They describe crop growth in interaction with their environment as dynamical systems. But the calibration process of the dynamic system comes up with much difficulty, because it turns out to be a multidimensional non-convex optimization problem. An original contribution of this paper is to propose a statistical methodology, Multi-Scenarios Parameters Estimation (MSPE), for the parametrization of potentially complex mechanistic models from a new type of datasets (climatic data, final yield in many situations). It is tested with CORNFLO, a crop model for maize growth. On the other hand, the data-driven approach for yield prediction is free of the complex biophysical process. But it has some strict requirements about the dataset. A second contribution of the paper is the comparison of these model-driven methods with classical data-driven methods. For this purpose, we consider two classes of regression methods, methods derived from linear regression (Ridge and Lasso Regression, Principal Components Regression or Partial Least Squares Regression) and machine learning methods (Random Forest, k-Nearest Neighbor, Artificial Neural Network and SVM regression). The dataset consists of 720 records of corn yield at county scale provided by the United States Department of Agriculture (USDA) and the associated climatic data. A 5-folds cross-validation process and two accuracy metrics: root mean square error of prediction(RMSEP), mean absolute error of prediction(MAEP) were used to evaluate the crop prediction capacity. The results show that among the data-driven approaches, Random Forest is the most robust and generally achieves the best prediction error (MAEP 4.27%). It also outperforms our model-driven approach (MAEP 6.11%). However, the method to calibrate the mechanistic model from dataset easy to access offers several side-perspectives. The mechanistic model can potentially help to underline the stresses suffered by the crop or to identify the biological parameters of interest for breeding purposes. For this reason, an interesting perspective is to combine these two types of approaches.Keywords: crop yield prediction, crop model, sensitivity analysis, paramater estimation, particle swarm optimization, random forest
Procedia PDF Downloads 232