Search results for: imbalanced datasets
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 746

Search results for: imbalanced datasets

116 Development of an Automatic Calibration Framework for Hydrologic Modelling Using Approximate Bayesian Computation

Authors: A. Chowdhury, P. Egodawatta, J. M. McGree, A. Goonetilleke

Abstract:

Hydrologic models are increasingly used as tools to predict stormwater quantity and quality from urban catchments. However, due to a range of practical issues, most models produce gross errors in simulating complex hydraulic and hydrologic systems. Difficulty in finding a robust approach for model calibration is one of the main issues. Though automatic calibration techniques are available, they are rarely used in common commercial hydraulic and hydrologic modelling software e.g. MIKE URBAN. This is partly due to the need for a large number of parameters and large datasets in the calibration process. To overcome this practical issue, a framework for automatic calibration of a hydrologic model was developed in R platform and presented in this paper. The model was developed based on the time-area conceptualization. Four calibration parameters, including initial loss, reduction factor, time of concentration and time-lag were considered as the primary set of parameters. Using these parameters, automatic calibration was performed using Approximate Bayesian Computation (ABC). ABC is a simulation-based technique for performing Bayesian inference when the likelihood is intractable or computationally expensive to compute. To test the performance and usefulness, the technique was used to simulate three small catchments in Gold Coast. For comparison, simulation outcomes from the same three catchments using commercial modelling software, MIKE URBAN were used. The graphical comparison shows strong agreement of MIKE URBAN result within the upper and lower 95% credible intervals of posterior predictions as obtained via ABC. Statistical validation for posterior predictions of runoff result using coefficient of determination (CD), root mean square error (RMSE) and maximum error (ME) was found reasonable for three study catchments. The main benefit of using ABC over MIKE URBAN is that ABC provides a posterior distribution for runoff flow prediction, and therefore associated uncertainty in predictions can be obtained. In contrast, MIKE URBAN just provides a point estimate. Based on the results of the analysis, it appears as though ABC the developed framework performs well for automatic calibration.

Keywords: automatic calibration framework, approximate bayesian computation, hydrologic and hydraulic modelling, MIKE URBAN software, R platform

Procedia PDF Downloads 277
115 Smart Defect Detection in XLPE Cables Using Convolutional Neural Networks

Authors: Tesfaye Mengistu

Abstract:

Power cables play a crucial role in the transmission and distribution of electrical energy. As the electricity generation, transmission, distribution, and storage systems become smarter, there is a growing emphasis on incorporating intelligent approaches to ensure the reliability of power cables. Various types of electrical cables are employed for transmitting and distributing electrical energy, with cross-linked polyethylene (XLPE) cables being widely utilized due to their exceptional electrical and mechanical properties. However, insulation defects can occur in XLPE cables due to subpar manufacturing techniques during production and cable joint installation. To address this issue, experts have proposed different methods for monitoring XLPE cables. Some suggest the use of interdigital capacitive (IDC) technology for online monitoring, while others propose employing continuous wave (CW) terahertz (THz) imaging systems to detect internal defects in XLPE plates used for power cable insulation. In this study, we have developed models that employ a custom dataset collected locally to classify the physical safety status of individual power cables. Our models aim to replace physical inspections with computer vision and image processing techniques to classify defective power cables from non-defective ones. The implementation of our project utilized the Python programming language along with the TensorFlow package and a convolutional neural network (CNN). The CNN-based algorithm was specifically chosen for power cable defect classification. The results of our project demonstrate the effectiveness of CNNs in accurately classifying power cable defects. We recommend the utilization of similar or additional datasets to further enhance and refine our models. Additionally, we believe that our models could be used to develop methodologies for detecting power cable defects from live video feeds. We firmly believe that our work makes a significant contribution to the field of power cable inspection and maintenance. Our models offer a more efficient and cost-effective approach to detecting power cable defects, thereby improving the reliability and safety of power grids.

Keywords: artificial intelligence, computer vision, defect detection, convolutional neural net

Procedia PDF Downloads 72
114 Hydrodynamic and Water Quality Modelling to Support Alternative Fuels Maritime Operations Incident Planning & Impact Assessments

Authors: Chow Jeng Hei, Pavel Tkalich, Low Kai Sheng Bryan

Abstract:

Due to the growing demand for sustainability in the maritime industry, there has been a significant increase in focus on alternative fuels such as biofuels, liquefied natural gas (LNG), hydrogen, methanol and ammonia to reduce the carbon footprint of vessels. Alternative fuels offer efficient transportability and significantly reduce carbon dioxide emissions, a critical factor in combating global warming. In an era where the world is determined to tackle climate change, the utilization of methanol is projected to witness a consistent rise in demand, even during downturns in the oil and gas industry. Since 2022, there has been an increase in methanol loading and discharging operations for industrial use in Singapore. These operations were conducted across various storage tank terminals at Jurong Island of varying capacities, which are also used to store alternative fuels for bunkering requirements. The key objective of this research is to support the green shipping industries in the transformation to new fuels such as methanol and ammonia, especially in evolving the capability to inform risk assessment and management of spills. In the unlikely event of accidental spills, a highly reliable forecasting system must be in place to provide mitigation measures and ahead planning. The outcomes of this research would lead to an enhanced metocean prediction capability and, together with advanced sensing, will continuously build up a robust digital twin of the bunkering operating environment. Outputs from the developments will contribute to management strategies for alternative marine fuel spills, including best practices, safety challenges and crisis management. The outputs can also benefit key port operators and the various bunkering, petrochemicals, shipping, protection and indemnity, and emergency response sectors. The forecasted datasets provide a forecast of the expected atmosphere and hydrodynamic conditions prior to bunkering exercises, enabling a better understanding of the metocean conditions ahead and allowing for more refined spill incident management planning

Keywords: clean fuels, hydrodynamics, coastal engineering, impact assessments

Procedia PDF Downloads 41
113 An Analysis System for Integrating High-Throughput Transcript Abundance Data with Metabolic Pathways in Green Algae

Authors: Han-Qin Zheng, Yi-Fan Chiang-Hsieh, Chia-Hung Chien, Wen-Chi Chang

Abstract:

As the most important non-vascular plants, algae have many research applications, including high species diversity, biofuel sources, adsorption of heavy metals and, following processing, health supplements. With the increasing availability of next-generation sequencing (NGS) data for algae genomes and transcriptomes, an integrated resource for retrieving gene expression data and metabolic pathway is essential for functional analysis and systems biology in algae. However, gene expression profiles and biological pathways are displayed separately in current resources, and making it impossible to search current databases directly to identify the cellular response mechanisms. Therefore, this work develops a novel AlgaePath database to retrieve gene expression profiles efficiently under various conditions in numerous metabolic pathways. AlgaePath, a web-based database, integrates gene information, biological pathways, and next-generation sequencing (NGS) datasets in Chlamydomonasreinhardtii and Neodesmus sp. UTEX 2219-4. Users can identify gene expression profiles and pathway information by using five query pages (i.e. Gene Search, Pathway Search, Differentially Expressed Genes (DEGs) Search, Gene Group Analysis, and Co-Expression Analysis). The gene expression data of 45 and 4 samples can be obtained directly on pathway maps in C. reinhardtii and Neodesmus sp. UTEX 2219-4, respectively. Genes that are differentially expressed between two conditions can be identified in Folds Search. Furthermore, the Gene Group Analysis of AlgaePath includes pathway enrichment analysis, and can easily compare the gene expression profiles of functionally related genes in a map. Finally, Co-Expression Analysis provides co-expressed transcripts of a target gene. The analysis results provide a valuable reference for designing further experiments and elucidating critical mechanisms from high-throughput data. More than an effective interface to clarify the transcript response mechanisms in different metabolic pathways under various conditions, AlgaePath is also a data mining system to identify critical mechanisms based on high-throughput sequencing.

Keywords: next-generation sequencing (NGS), algae, transcriptome, metabolic pathway, co-expression

Procedia PDF Downloads 380
112 The Association of Excessive Work Stress with Job Satisfaction and Turnover Intention in Operating Room Nurses: A Cross-Sectional Study in a Metropolitan Teaching Hospital in Southern Taiwan

Authors: Chia Yu Chen, Shu Fen Wu, Chen-Fuh Lam, I-Ling Tsai, Shu Jiuan Chen, Yen Ling Liu

Abstract:

Aim: It remains undetermined that whether increased work stress may affect the job satisfaction and career loyalty among nursing staffs in the operating room. The long-term goal of this study is to lengthen the professional life of operating room nurses by attenuating the work stress and enhancing their contentment in work. Method: This was a cross-sectional, descriptive study performed in a metropolitan teaching hospital in the southern Taiwan between May 2017 to July 2017. A structured self-administered questionnaire, modified from the Occupational Stress Indicator-2 (OSI-2) and Maslach Burnout Inventory (MBI) manual was collected from the operating room nurses. Chi-square test was used to analyze the categorical data and Pearson correlation was used to analyze the association between two numerical datasets (SPSS version 20.0). Results: The response rate was 80% (80/100) and a total of 73 (73%) completed forms were eventually proceeded for analysis. The average scores for work stress and job satisfaction of the operating room nurses were 145.96±32.91 and 47.38±6.07, respectively. The correlation coefficients of work stress versus job satisfaction and organizational identity were (r=-0.338, p=0.003 and r=-0.354, p=0.002), respectively. There were more nurses who took rotating shift quitted works from the operating room than those who took only dayshift (2=5.176, p<0.05). Nurses who reported of having lower job satisfaction were associated with significantly higher turnover intention (t=3.714, p< 0.01). Following multivariate regression analysis, rotating shift and low job satisfaction were identified as the two independent predictors of intention to quit from working in the operating room. Conclusion: Our study clearly demonstrates that increased work stress significantly attenuates job satisfaction and organizational identity. Rotating shift is associated with higher work stress, lower job satisfaction, and higher turnover intention, which is consistent with the previous surveys carried out in the department of medical technology. Therefore, improvement of working quality in the operating rooms is essential to increase the retain intention of the well-trained nursing staffs. Further investigation into types of work shifts and other strategies of attenuating stress in workplace is currently undertaken in order to improve the job satisfaction and to decrease turnover intention in the operating room.

Keywords: rotating shift, work stress, job satisfaction, turnover intention

Procedia PDF Downloads 163
111 DEMs: A Multivariate Comparison Approach

Authors: Juan Francisco Reinoso Gordo, Francisco Javier Ariza-López, José Rodríguez Avi, Domingo Barrera Rosillo

Abstract:

The evaluation of the quality of a data product is based on the comparison of the product with a reference of greater accuracy. In the case of MDE data products, quality assessment usually focuses on positional accuracy and few studies consider other terrain characteristics, such as slope and orientation. The proposal that is made consists of evaluating the similarity of two DEMs (a product and a reference), through the joint analysis of the distribution functions of the variables of interest, for example, elevations, slopes and orientations. This is a multivariable approach that focuses on distribution functions, not on single parameters such as mean values or dispersions (e.g. root mean squared error or variance). This is considered to be a more holistic approach. The use of the Kolmogorov-Smirnov test is proposed due to its non-parametric nature, since the distributions of the variables of interest cannot always be adequately modeled by parametric models (e.g. the Normal distribution model). In addition, its application to the multivariate case is carried out jointly by means of a single test on the convolution of the distribution functions of the variables considered, which avoids the use of corrections such as Bonferroni when several statistics hypothesis tests are carried out together. In this work, two DEM products have been considered, DEM02 with a resolution of 2x2 meters and DEM05 with a resolution of 5x5 meters, both generated by the National Geographic Institute of Spain. DEM02 is considered as the reference and DEM05 as the product to be evaluated. In addition, the slope and aspect derived models have been calculated by GIS operations on the two DEM datasets. Through sample simulation processes, the adequate behavior of the Kolmogorov-Smirnov statistical test has been verified when the null hypothesis is true, which allows calibrating the value of the statistic for the desired significance value (e.g. 5%). Once the process has been calibrated, the same process can be applied to compare the similarity of different DEM data sets (e.g. the DEM05 versus the DEM02). In summary, an innovative alternative for the comparison of DEM data sets based on a multinomial non-parametric perspective has been proposed by means of a single Kolmogorov-Smirnov test. This new approach could be extended to other DEM features of interest (e.g. curvature, etc.) and to more than three variables

Keywords: data quality, DEM, kolmogorov-smirnov test, multivariate DEM comparison

Procedia PDF Downloads 90
110 Research on Quality Assurance in African Higher Education: A Bibliometric Mapping from 1999 to 2019

Authors: Luís M. João, Patrício Langa

Abstract:

The article reviews the literature on quality assurance (QA) in African higher education studies (HES) conducted through a bibliometric mapping of published papers between 1999 and 2019. Specifically, the article highlights the nuances of knowledge production in four scientific databases: Scopus, Web of Science (WoS), African Journal Online (AJOL), and Google Scholar. The analysis included 531 papers, of which 127 are from Scopus, 30 are from Web of Science, 85 are from African Journal Online, and 259 are from Google Scholar. In essence, 284 authors wrote these papers from 231 institutions and 69 different countries (i.e., Africa=54 and outside Africa=15). Results indicate the existing knowledge. This analysis allows the readers to understand the growth and development of the field during the two-decade period, identify key contributors, and observe potential trends or gaps in the research. The paper employs bibliometric mapping as its primary analytical lens. By utilizing this method, the study quantitatively assesses the publications related to QA in African HES, helping to identify patterns, collaboration networks, and disparities in research output. The bibliometric approach allows for a systematic and objective analysis of large datasets, offering a comprehensive view of the knowledge production in the field. Furthermore, the study highlights the lack of shared resources available to enhance quality in higher education institutions (HEIs) in Africa. This finding underscores the importance of promoting collaborative research efforts, knowledge exchange, and capacity building within the region to improve the overall quality of higher education. The paper argues that despite the growing quantity of QA research in African higher education, there are challenges related to citation impact and access to high-impact publication avenues for African researchers. It emphasises the need to promote collaborative research and resource-sharing to enhance the quality of HEIs in Africa. The analytical lenses of bibliometric mapping and the examination of publication players' scenarios contribute to a comprehensive understanding of the field and its implications for African higher education.

Keywords: Africa, bibliometric research, higher education studies, quality assurance, scientific database, systematic review

Procedia PDF Downloads 24
109 Predicting Success and Failure in Drug Development Using Text Analysis

Authors: Zhi Hao Chow, Cian Mulligan, Jack Walsh, Antonio Garzon Vico, Dimitar Krastev

Abstract:

Drug development is resource-intensive, time-consuming, and increasingly expensive with each developmental stage. The success rates of drug development are also relatively low, and the resources committed are wasted with each failed candidate. As such, a reliable method of predicting the success of drug development is in demand. The hypothesis was that some examples of failed drug candidates are pushed through developmental pipelines based on false confidence and may possess common linguistic features identifiable through sentiment analysis. Here, the concept of using text analysis to discover such features in research publications and investor reports as predictors of success was explored. R studios were used to perform text mining and lexicon-based sentiment analysis to identify affective phrases and determine their frequency in each document, then using SPSS to determine the relationship between our defined variables and the accuracy of predicting outcomes. A total of 161 publications were collected and categorised into 4 groups: (i) Cancer treatment, (ii) Neurodegenerative disease treatment, (iii) Vaccines, and (iv) Others (containing all other drugs that do not fit into the 3 categories). Text analysis was then performed on each document using 2 separate datasets (BING and AFINN) in R within the category of drugs to determine the frequency of positive or negative phrases in each document. A relative positivity and negativity value were then calculated by dividing the frequency of phrases with the word count of each document. Regression analysis was then performed with SPSS statistical software on each dataset (values from using BING or AFINN dataset during text analysis) using a random selection of 61 documents to construct a model. The remaining documents were then used to determine the predictive power of the models. Model constructed from BING predicts the outcome of drug performance in clinical trials with an overall percentage of 65.3%. AFINN model had a lower accuracy at predicting outcomes compared to the BING model at 62.5% but was not effective at predicting the failure of drugs in clinical trials. Overall, the study did not show significant efficacy of the model at predicting outcomes of drugs in development. Many improvements may need to be made to later iterations of the model to sufficiently increase the accuracy.

Keywords: data analysis, drug development, sentiment analysis, text-mining

Procedia PDF Downloads 125
108 Effects of Cash Transfers Mitigation Impacts in the Face of Socioeconomic External Shocks: Evidence from Egypt

Authors: Basma Yassa

Abstract:

Evidence on cash transfers’ effectiveness in mitigating macro and idiosyncratic shocks’ impacts has been mixed and is mostly concentrated in Latin America, Sub-Saharan Africa, and South Asia with very limited evidence from the MENA region. Yet conditional cash transfers schemes have been continually used, especially in Egypt, as the main social protection tool in response to the recent socioeconomic crises and macro shocks. We use 2 panel datasets and 1 cross-sectional dataset to estimate the effectiveness of cash transfers as a shock-mitigative mechanism in the Egyptian context. In this paper, the results from the different models (Panel Fixed Effects model and the Regression Discontinuity Design (RDD) model) confirm that micro and macro shocks lead to significant decline in several household-level welfare outcomes and that Takaful cash transfers have a significant positive impact in mitigating the negative shock impacts, especially on households’ debt incidence, debt levels, and asset ownership, but not necessarily on food, and non-food expenditure levels. The results indicate large positive significant effects on decreasing household incidence of debt by up to 12.4 percent and lowered the debt size by approximately 18 percent among Takaful beneficiaries compared to non-beneficiaries’. Similar evidence is found on asset ownership levels, as the RDD model shows significant positive effects on total asset ownership and productive asset ownership, but the model failed to detect positive impacts on per capita food and non-food expenditures. Further extensions are still in progress to compare the models’ results with the DID model results when using a nationally representative ELMPS panel data (2018/2024) rounds. Finally, our initial analysis suggests that conditional cash transfers are effective in buffering the negative shock impacts on certain welfare indicators even after successive macro-economic shocks in 2022 and 2023 in the Egyptian Context.

Keywords: cash transfers, fixed effects, household welfare, household debt, micro shocks, regression discontinuity design

Procedia PDF Downloads 21
107 Object-Scene: Deep Convolutional Representation for Scene Classification

Authors: Yanjun Chen, Chuanping Hu, Jie Shao, Lin Mei, Chongyang Zhang

Abstract:

Traditional image classification is based on encoding scheme (e.g. Fisher Vector, Vector of Locally Aggregated Descriptor) with low-level image features (e.g. SIFT, HoG). Compared to these low-level local features, deep convolutional features obtained at the mid-level layer of convolutional neural networks (CNN) have richer information but lack of geometric invariance. For scene classification, there are scattered objects with different size, category, layout, number and so on. It is crucial to find the distinctive objects in scene as well as their co-occurrence relationship. In this paper, we propose a method to take advantage of both deep convolutional features and the traditional encoding scheme while taking object-centric and scene-centric information into consideration. First, to exploit the object-centric and scene-centric information, two CNNs that trained on ImageNet and Places dataset separately are used as the pre-trained models to extract deep convolutional features at multiple scales. This produces dense local activations. By analyzing the performance of different CNNs at multiple scales, it is found that each CNN works better in different scale ranges. A scale-wise CNN adaption is reasonable since objects in scene are at its own specific scale. Second, a fisher kernel is applied to aggregate a global representation at each scale and then to merge into a single vector by using a post-processing method called scale-wise normalization. The essence of Fisher Vector lies on the accumulation of the first and second order differences. Hence, the scale-wise normalization followed by average pooling would balance the influence of each scale since different amount of features are extracted. Third, the Fisher vector representation based on the deep convolutional features is followed by a linear Supported Vector Machine, which is a simple yet efficient way to classify the scene categories. Experimental results show that the scale-specific feature extraction and normalization with CNNs trained on object-centric and scene-centric datasets can boost the results from 74.03% up to 79.43% on MIT Indoor67 when only two scales are used (compared to results at single scale). The result is comparable to state-of-art performance which proves that the representation can be applied to other visual recognition tasks.

Keywords: deep convolutional features, Fisher Vector, multiple scales, scale-specific normalization

Procedia PDF Downloads 305
106 Development of a Turbulent Boundary Layer Wall-pressure Fluctuations Power Spectrum Model Using a Stepwise Regression Algorithm

Authors: Zachary Huffman, Joana Rocha

Abstract:

Wall-pressure fluctuations induced by the turbulent boundary layer (TBL) developed over aircraft are a significant source of aircraft cabin noise. Since the power spectral density (PSD) of these pressure fluctuations is directly correlated with the amount of sound radiated into the cabin, the development of accurate empirical models that predict the PSD has been an important ongoing research topic. The sound emitted can be represented from the pressure fluctuations term in the Reynoldsaveraged Navier-Stokes equations (RANS). Therefore, early TBL empirical models (including those from Lowson, Robertson, Chase, and Howe) were primarily derived by simplifying and solving the RANS for pressure fluctuation and adding appropriate scales. Most subsequent models (including Goody, Efimtsov, Laganelli, Smol’yakov, and Rackl and Weston models) were derived by making modifications to these early models or by physical principles. Overall, these models have had varying levels of accuracy, but, in general, they are most accurate under the specific Reynolds and Mach numbers they were developed for, while being less accurate under other flow conditions. Despite this, recent research into the possibility of using alternative methods for deriving the models has been rather limited. More recent studies have demonstrated that an artificial neural network model was more accurate than traditional models and could be applied more generally, but the accuracy of other machine learning techniques has not been explored. In the current study, an original model is derived using a stepwise regression algorithm in the statistical programming language R, and TBL wall-pressure fluctuations PSD data gathered at the Carleton University wind tunnel. The theoretical advantage of a stepwise regression approach is that it will automatically filter out redundant or uncorrelated input variables (through the process of feature selection), and it is computationally faster than machine learning. The main disadvantage is the potential risk of overfitting. The accuracy of the developed model is assessed by comparing it to independently sourced datasets.

Keywords: aircraft noise, machine learning, power spectral density models, regression models, turbulent boundary layer wall-pressure fluctuations

Procedia PDF Downloads 113
105 Assessment of Seeding and Weeding Field Robot Performance

Authors: Victor Bloch, Eerikki Kaila, Reetta Palva

Abstract:

Field robots are an important tool for enhancing efficiency and decreasing the climatic impact of food production. There exists a number of commercial field robots; however, since this technology is still new, the robot advantages and limitations, as well as methods for optimal using of robots, are still unclear. In this study, the performance of a commercial field robot for seeding and weeding was assessed. A research 2-ha sugar beet field with 0.5m row width was used for testing, which included robotic sowing of sugar beet and weeding five times during the first two months of the growing. About three and five percent of the field were used as untreated and chemically weeded control areas, respectively. The plant detection was based on the exact plant location without image processing. The robot was equipped with six seeding and weeding tools, including passive between-rows harrow hoes and active hoes cutting inside rows between the plants, and it moved with a maximal speed of 0.9 km/h. The robot's performance was assessed by image processing. The field images were collected by an action camera with a height of 2 m and a resolution 27M pixels installed on the robot and by a drone with a 16M pixel camera flying at 4 m height. To detect plants and weeds, the YOLO model was trained with transfer learning from two available datasets. A preliminary analysis of the entire field showed that in the areas treated by the robot, the weed average density varied across the field from 6.8 to 9.1 weeds/m² (compared with 0.8 in the chemically treated area and 24.3 in the untreated area), the weed average density inside rows was 2.0-2.9 weeds / m (compared with 0 on the chemically treated area), and the emergence rate was 90-95%. The information about the robot's performance has high importance for the application of robotics for field tasks. With the help of the developed method, the performance can be assessed several times during the growth according to the robotic weeding frequency. When it’s used by farmers, they can know the field condition and efficiency of the robotic treatment all over the field. Farmers and researchers could develop optimal strategies for using the robot, such as seeding and weeding timing, robot settings, and plant and field parameters and geometry. The robot producers can have quantitative information from an actual working environment and improve the robots accordingly.

Keywords: agricultural robot, field robot, plant detection, robot performance

Procedia PDF Downloads 42
104 Gender Perspective in Peace Operations: An Analysis of 14 UN Peace Operations

Authors: Maressa Aires de Proenca

Abstract:

The inclusion of a gender perspective in peace operations is based on a series of conventions, treaties, and resolutions designed to protect and include women addressing gender mainstreaming. The UN Security Council recognizes that women's participation and gender equality within peace operations are indispensable for achieving sustainable development and peace. However, the participation of women in the field of peace and security is still embryonic. There are gaps when we think about female participation in conflict resolution and peace promotion spaces, and it does not seem clear how women are present in these spaces. This absence may correspond to silence about representation and the guarantee of the female perspective within the context of peace promotion. Thus, the present research aimed to describe the panorama of the participation of women who are currently active in the 14 active UN peace operations, which are: 1) MINUJUSTH, Haiti, 2) MINURSO, Western Sahara, 3) MINUSCA, Central African Republic, 4) MINUSMA, Mali, 5) MONUSCO, the Democratic Republic of the Congo, 6) UNAMID, Darfur, 7) UNDOF, Golan, 8) UNFICYP, Cyprus, 9) UNIFIL, Lebanon, 10) UNISFA, Abyei, 11) UNMIK, Kosovo, 12) UNMISS, South Sudan, 13) UNMOGIP, India, and Pakistan, and 14) UNTSO, Middle East. A database was constructed that reported: (1) position held by the woman in the peace operation, (2) her profession, (3) educational level, (4) marital status, (5) religion, (6) nationality, (8) number of years working with peace operations, (9) whether the operation in which it operates has provided training on gender issues. For the construction of this database, official reports and statistics accessed through the UN Peacekeeping Resource Hub were used; The United Nations Statistical Commission, Peacekeeping Master Open Datasets, The Armed Conflict Database (ACD), The International Institute for Strategic Studies (IISS) database; Armed Conflict Location & Event Data Project (ACLED) database; from the Evidence and Data for Gender Equality (EDGE) database. In addition to access to databases, peacekeeping operations will be contacted directly, and data requested individually. The database showed that the presence of women in these peace operations is still incipient, but growing. There are few women in command positions, and most of them occupy administrative or human-care positions.

Keywords: women, peace and security, peacekeeping operations, peace studies

Procedia PDF Downloads 117
103 From Theory to Practice: Harnessing Mathematical and Statistical Sciences in Data Analytics

Authors: Zahid Ullah, Atlas Khan

Abstract:

The rapid growth of data in diverse domains has created an urgent need for effective utilization of mathematical and statistical sciences in data analytics. This abstract explores the journey from theory to practice, emphasizing the importance of harnessing mathematical and statistical innovations to unlock the full potential of data analytics. Drawing on a comprehensive review of existing literature and research, this study investigates the fundamental theories and principles underpinning mathematical and statistical sciences in the context of data analytics. It delves into key mathematical concepts such as optimization, probability theory, statistical modeling, and machine learning algorithms, highlighting their significance in analyzing and extracting insights from complex datasets. Moreover, this abstract sheds light on the practical applications of mathematical and statistical sciences in real-world data analytics scenarios. Through case studies and examples, it showcases how mathematical and statistical innovations are being applied to tackle challenges in various fields such as finance, healthcare, marketing, and social sciences. These applications demonstrate the transformative power of mathematical and statistical sciences in data-driven decision-making. The abstract also emphasizes the importance of interdisciplinary collaboration, as it recognizes the synergy between mathematical and statistical sciences and other domains such as computer science, information technology, and domain-specific knowledge. Collaborative efforts enable the development of innovative methodologies and tools that bridge the gap between theory and practice, ultimately enhancing the effectiveness of data analytics. Furthermore, ethical considerations surrounding data analytics, including privacy, bias, and fairness, are addressed within the abstract. It underscores the need for responsible and transparent practices in data analytics, and highlights the role of mathematical and statistical sciences in ensuring ethical data handling and analysis. In conclusion, this abstract highlights the journey from theory to practice in harnessing mathematical and statistical sciences in data analytics. It showcases the practical applications of these sciences, the importance of interdisciplinary collaboration, and the need for ethical considerations. By bridging the gap between theory and practice, mathematical and statistical sciences contribute to unlocking the full potential of data analytics, empowering organizations and decision-makers with valuable insights for informed decision-making.

Keywords: data analytics, mathematical sciences, optimization, machine learning, interdisciplinary collaboration, practical applications

Procedia PDF Downloads 66
102 Prediction of Alzheimer's Disease Based on Blood Biomarkers and Machine Learning Algorithms

Authors: Man-Yun Liu, Emily Chia-Yu Su

Abstract:

Alzheimer's disease (AD) is the public health crisis of the 21st century. AD is a degenerative brain disease and the most common cause of dementia, a costly disease on the healthcare system. Unfortunately, the cause of AD is poorly understood, furthermore; the treatments of AD so far can only alleviate symptoms rather cure or stop the progress of the disease. Currently, there are several ways to diagnose AD; medical imaging can be used to distinguish between AD, other dementias, and early onset AD, and cerebrospinal fluid (CSF). Compared with other diagnostic tools, blood (plasma) test has advantages as an approach to population-based disease screening because it is simpler, less invasive also cost effective. In our study, we used blood biomarkers dataset of The Alzheimer’s disease Neuroimaging Initiative (ADNI) which was funded by National Institutes of Health (NIH) to do data analysis and develop a prediction model. We used independent analysis of datasets to identify plasma protein biomarkers predicting early onset AD. Firstly, to compare the basic demographic statistics between the cohorts, we used SAS Enterprise Guide to do data preprocessing and statistical analysis. Secondly, we used logistic regression, neural network, decision tree to validate biomarkers by SAS Enterprise Miner. This study generated data from ADNI, contained 146 blood biomarkers from 566 participants. Participants include cognitive normal (healthy), mild cognitive impairment (MCI), and patient suffered Alzheimer’s disease (AD). Participants’ samples were separated into two groups, healthy and MCI, healthy and AD, respectively. We used the two groups to compare important biomarkers of AD and MCI. In preprocessing, we used a t-test to filter 41/47 features between the two groups (healthy and AD, healthy and MCI) before using machine learning algorithms. Then we have built model with 4 machine learning methods, the best AUC of two groups separately are 0.991/0.709. We want to stress the importance that the simple, less invasive, common blood (plasma) test may also early diagnose AD. As our opinion, the result will provide evidence that blood-based biomarkers might be an alternative diagnostics tool before further examination with CSF and medical imaging. A comprehensive study on the differences in blood-based biomarkers between AD patients and healthy subjects is warranted. Early detection of AD progression will allow physicians the opportunity for early intervention and treatment.

Keywords: Alzheimer's disease, blood-based biomarkers, diagnostics, early detection, machine learning

Procedia PDF Downloads 297
101 Atmospheric Circulation Types Related to Dust Transport Episodes over Crete in the Eastern Mediterranean

Authors: K. Alafogiannis, E. E. Houssos, E. Anagnostou, G. Kouvarakis, N. Mihalopoulos, A. Fotiadi

Abstract:

The Mediterranean basin is an area where different aerosol types coexist, including urban/industrial, desert dust, biomass burning and marine particles. Particularly, mineral dust aerosols, mostly originated from North African deserts, significantly contribute to high aerosol loads above the Mediterranean. Dust transport, controlled by the variation of the atmospheric circulation throughout the year, results in a strong spatial and temporal variability of aerosol properties. In this study, the synoptic conditions which favor dust transport over the Eastern Mediterranean are thoroughly investigated. For this reason, three datasets are employed. Firstly, ground-based daily data of aerosol properties, namely Aerosol Optical Thickness (AOT), Ångström exponent (α440-870) and fine fraction from the FORTH-AERONET (Aerosol Robotic Network) station along with measurements of PM10 concentrations from Finokalia station, for the period 2003-2011, are used to identify days with high coarse aerosol load (episodes) over Crete. Then, geopotential height at 1000, 850 and 700 hPa levels obtained from the NCEP/NCAR Reanalysis Project, are utilized to depict the atmospheric circulation during the identified episodes. Additionally, air-mass back trajectories, calculated by HYSPLIT, are used to verify the origin of aerosols from neighbouring deserts. For the 227 identified dust episodes, the statistical methods of Factor and Cluster Analysis are applied on the corresponding atmospheric circulation data to reveal the main types of the synoptic conditions favouring dust transport towards Crete (Eastern Mediterranean). The 227 cases are classified into 11 distinct types (clusters). Dust episodes in Eastern Mediterranean, are found to be more frequent (52%) in spring with a secondary maximum in autumn. The main characteristic of the atmospheric circulation associated with dust episodes, is the presence of a low-pressure system at surface, either in southwestern Europe or western/central Mediterranean, which induces a southerly air flow favouring dust transport from African deserts. The exact position and the intensity of the low-pressure system vary notably among clusters. More rarely dust may originate from deserts of Arabian Peninsula.

Keywords: aerosols, atmospheric circulation, dust particles, Eastern Mediterranean

Procedia PDF Downloads 206
100 Spatial Analysis as a Tool to Assess Risk Management in Peru

Authors: Josué Alfredo Tomas Machaca Fajardo, Jhon Elvis Chahua Janampa, Pedro Rau Lavado

Abstract:

A flood vulnerability index was developed for the Piura River watershed in northern Peru using Principal Component Analysis (PCA) to assess flood risk. The official methodology to assess risk from natural hazards in Peru was introduced in 1980 and proved effective for aiding complex decision-making. This method relies in part on decision-makers defining subjective correlations between variables to identify high-risk areas. While risk identification and ensuing response activities benefit from a qualitative understanding of influences, this method does not take advantage of the advent of national and international data collection efforts, which can supplement our understanding of risk. Furthermore, this method does not take advantage of broadly applied statistical methods such as PCA, which highlight central indicators of vulnerability. Nowadays, information processing is much faster and allows for more objective decision-making tools, such as PCA. The approach presented here develops a tool to improve the current flood risk assessment in the Peruvian basin. Hence, the spatial analysis of the census and other datasets provides a better understanding of the current land occupation and a basin-wide distribution of services and human populations, a necessary step toward ultimately reducing flood risk in Peru. PCA allows the simplification of a large number of variables into a few factors regarding social, economic, physical and environmental dimensions of vulnerability. There is a correlation between the location of people and the water availability mainly found in rivers. For this reason, a comprehensive vision of the population location around the river basin is necessary to establish flood prevention policies. The grouping of 5x5 km gridded areas allows the spatial analysis of flood risk rather than assessing political divisions of the territory. The index was applied to the Peruvian region of Piura, where several flood events occurred in recent past years, being one of the most affected regions during the ENSO events in Peru. The analysis evidenced inequalities for the access to basic services, such as water, electricity, internet and sewage, between rural and urban areas.

Keywords: assess risk, flood risk, indicators of vulnerability, principal component analysis

Procedia PDF Downloads 164
99 The Investigation of Work Stress and Burnout in Nurse Anesthetists: A Cross-Sectional Study

Authors: Yen Ling Liu, Shu-Fen Wu, Chen-Fuh Lam, I-Ling Tsai, Chia-Yu Chen

Abstract:

Purpose: Nurse anesthetists are confronting extraordinarily high job stress in their daily practice, deriving from the fast-track anesthesia care, risk of perioperative complications, routine rotating shifts, teaching programs and interactions with the surgical team in the operating room. This study investigated the influence of work stress on the burnout and turnover intention of nurse anesthetists in a regional general hospital in Southern Taiwan. Methods: This was a descriptive correlational study carried out in 66 full-time nurse anesthetists. Data was collected from March 2017 to June 2017 by in-person interview, and a self-administered structured questionnaire was completed by the interviewee. Outcome measurements included the Practice Environment Scale of the Nursing Work Index (PES-NWI), Maslach Burnout Inventory (MBI) and nursing staff turnover intention. Numerical data were analyzed by descriptive statistics, independent t test, or one-way ANOVA. Categorical data were compared using the chi-square test (x²). Datasets were computed with Pearson product-moment correlation and linear regression. Data were analyzed by using SPSS 20.0 software. Results: The average score for job burnout was 68.7916.67 (out of 100). The three major components of burnout, including emotional depletion (mean score of 26.32), depersonalization (mean score of 13.65), and personal(mean score of 24.48). These average scores suggested that these nurse anesthetists were at high risk of burnout and inversely correlated with turnover intention (t = -4.048, P < 0.05). Using linear regression model, emotional exhaustion and depersonalization were the two independent factors that predicted turnover intention in the nurse anesthetists (19.1% in total variance). Conclusion/Implications for Practice: The study identifies that the high risk of job burnout in the nurse anesthetists is not simply derived from physical overload, but most likely resulted from the additional emotional and psychological stress. The occurrence of job burnout may affect the quality of nursing work, and also influence family harmony, in turn, may increase the turnover rate. Multimodal approach is warranted to reduce work stress and job burnout in nurse anesthetists to enhance their willingness to contribute in anesthesia care.

Keywords: anesthesia nurses, burnout, job, turnover intention

Procedia PDF Downloads 267
98 Development of a 3D Model of Real Estate Properties in Fort Bonifacio, Taguig City, Philippines Using Geographic Information Systems

Authors: Lyka Selene Magnayi, Marcos Vinas, Roseanne Ramos

Abstract:

As the real estate industry continually grows in the Philippines, Geographic Information Systems (GIS) provide advantages in generating spatial databases for efficient delivery of information and services. The real estate sector is not only providing qualitative data about real estate properties but also utilizes various spatial aspects of these properties for different applications such as hazard mapping and assessment. In this study, a three-dimensional (3D) model and a spatial database of real estate properties in Fort Bonifacio, Taguig City are developed using GIS and SketchUp. Spatial datasets include political boundaries, buildings, road network, digital terrain model (DTM) derived from Interferometric Synthetic Aperture Radar (IFSAR) image, Google Earth satellite imageries, and hazard maps. Multiple model layers were created based on property listings by a partner real estate company, including existing and future property buildings. Actual building dimensions, building facade, and building floorplans are incorporated in these 3D models for geovisualization. Hazard model layers are determined through spatial overlays, and different scenarios of hazards are also presented in the models. Animated maps and walkthrough videos were created for company presentation and evaluation. Model evaluation is conducted through client surveys requiring scores in terms of the appropriateness, information content, and design of the 3D models. Survey results show very satisfactory ratings, with the highest average evaluation score equivalent to 9.21 out of 10. The output maps and videos obtained passing rates based on the criteria and standards set by the intended users of the partner real estate company. The methodologies presented in this study were found useful and have remarkable advantages in the real estate industry. This work may be extended to automated mapping and creation of online spatial databases for better storage, access of real property listings and interactive platform using web-based GIS.

Keywords: geovisualization, geographic information systems, GIS, real estate, spatial database, three-dimensional model

Procedia PDF Downloads 131
97 Application of Complete Ensemble Empirical Mode Decomposition with Adaptive Noise and Multipoint Optimal Minimum Entropy Deconvolution in Railway Bearings Fault Diagnosis

Authors: Yao Cheng, Weihua Zhang

Abstract:

Although the measured vibration signal contains rich information on machine health conditions, the white noise interferences and the discrete harmonic coming from blade, shaft and mash make the fault diagnosis of rolling element bearings difficult. In order to overcome the interferences of useless signals, a new fault diagnosis method combining Complete Ensemble Empirical Mode Decomposition with adaptive noise (CEEMDAN) and Multipoint Optimal Minimum Entropy Deconvolution (MOMED) is proposed for the fault diagnosis of high-speed train bearings. Firstly, the CEEMDAN technique is applied to adaptively decompose the raw vibration signal into a series of finite intrinsic mode functions (IMFs) and a residue. Compared with Ensemble Empirical Mode Decomposition (EEMD), the CEEMDAN can provide an exact reconstruction of the original signal and a better spectral separation of the modes, which improves the accuracy of fault diagnosis. An effective sensitivity index based on the Pearson's correlation coefficients between IMFs and raw signal is adopted to select sensitive IMFs that contain bearing fault information. The composite signal of the sensitive IMFs is applied to further analysis of fault identification. Next, for propose of identifying the fault information precisely, the MOMED is utilized to enhance the periodic impulses in composite signal. As a non-iterative method, the MOMED has better deconvolution performance than the classical deconvolution methods such Minimum Entropy Deconvolution (MED) and Maximum Correlated Kurtosis Deconvolution (MCKD). Third, the envelope spectrum analysis is applied to detect the existence of bearing fault. The simulated bearing fault signals with white noise and discrete harmonic interferences are used to validate the effectiveness of the proposed method. Finally, the superiorities of the proposed method are further demonstrated by high-speed train bearing fault datasets measured from test rig. The analysis results indicate that the proposed method has strong practicability.

Keywords: bearing, complete ensemble empirical mode decomposition with adaptive noise, fault diagnosis, multipoint optimal minimum entropy deconvolution

Procedia PDF Downloads 341
96 Land Cover Mapping Using Sentinel-2, Landsat-8 Satellite Images, and Google Earth Engine: A Study Case of the Beterou Catchment

Authors: Ella Sèdé Maforikan

Abstract:

Accurate land cover mapping is essential for effective environmental monitoring and natural resources management. This study focuses on assessing the classification performance of two satellite datasets and evaluating the impact of different input feature combinations on classification accuracy in the Beterou catchment, situated in the northern part of Benin. Landsat-8 and Sentinel-2 images from June 1, 2020, to March 31, 2021, were utilized. Employing the Random Forest (RF) algorithm on Google Earth Engine (GEE), a supervised classification categorized the land into five classes: forest, savannas, cropland, settlement, and water bodies. GEE was chosen due to its high-performance computing capabilities, mitigating computational burdens associated with traditional land cover classification methods. By eliminating the need for individual satellite image downloads and providing access to an extensive archive of remote sensing data, GEE facilitated efficient model training on remote sensing data. The study achieved commendable overall accuracy (OA), ranging from 84% to 85%, even without incorporating spectral indices and terrain metrics into the model. Notably, the inclusion of additional input sources, specifically terrain features like slope and elevation, enhanced classification accuracy. The highest accuracy was achieved with Sentinel-2 (OA = 91%, Kappa = 0.88), slightly surpassing Landsat-8 (OA = 90%, Kappa = 0.87). This underscores the significance of combining diverse input sources for optimal accuracy in land cover mapping. The methodology presented herein not only enables the creation of precise, expeditious land cover maps but also demonstrates the prowess of cloud computing through GEE for large-scale land cover mapping with remarkable accuracy. The study emphasizes the synergy of different input sources to achieve superior accuracy. As a future recommendation, the application of Light Detection and Ranging (LiDAR) technology is proposed to enhance vegetation type differentiation in the Beterou catchment. Additionally, a cross-comparison between Sentinel-2 and Landsat-8 for assessing long-term land cover changes is suggested.

Keywords: land cover mapping, Google Earth Engine, random forest, Beterou catchment

Procedia PDF Downloads 31
95 An Alternative Credit Scoring System in China’s Consumer Lendingmarket: A System Based on Digital Footprint Data

Authors: Minjuan Sun

Abstract:

Ever since the late 1990s, China has experienced explosive growth in consumer lending, especially in short-term consumer loans, among which, the growth rate of non-bank lending has surpassed bank lending due to the development in financial technology. On the other hand, China does not have a universal credit scoring and registration system that can guide lenders during the processes of credit evaluation and risk control, for example, an individual’s bank credit records are not available for online lenders to see and vice versa. Given this context, the purpose of this paper is three-fold. First, we explore if and how alternative digital footprint data can be utilized to assess borrower’s creditworthiness. Then, we perform a comparative analysis of machine learning methods for the canonical problem of credit default prediction. Finally, we analyze, from an institutional point of view, the necessity of establishing a viable and nationally universal credit registration and scoring system utilizing online digital footprints, so that more people in China can have better access to the consumption loan market. Two different types of digital footprint data are utilized to match with bank’s loan default records. Each separately captures distinct dimensions of a person’s characteristics, such as his shopping patterns and certain aspects of his personality or inferred demographics revealed by social media features like profile image and nickname. We find both datasets can generate either acceptable or excellent prediction results, and different types of data tend to complement each other to get better performances. Typically, the traditional types of data banks normally use like income, occupation, and credit history, update over longer cycles, hence they can’t reflect more immediate changes, like the financial status changes caused by the business crisis; whereas digital footprints can update daily, weekly, or monthly, thus capable of providing a more comprehensive profile of the borrower’s credit capabilities and risks. From the empirical and quantitative examination, we believe digital footprints can become an alternative information source for creditworthiness assessment, because of their near-universal data coverage, and because they can by and large resolve the "thin-file" issue, due to the fact that digital footprints come in much larger volume and higher frequency.

Keywords: credit score, digital footprint, Fintech, machine learning

Procedia PDF Downloads 132
94 An Improvement of ComiR Algorithm for MicroRNA Target Prediction by Exploiting Coding Region Sequences of mRNAs

Authors: Giorgio Bertolazzi, Panayiotis Benos, Michele Tumminello, Claudia Coronnello

Abstract:

MicroRNAs are small non-coding RNAs that post-transcriptionally regulate the expression levels of messenger RNAs. MicroRNA regulation activity depends on the recognition of binding sites located on mRNA molecules. ComiR (Combinatorial miRNA targeting) is a user friendly web tool realized to predict the targets of a set of microRNAs, starting from their expression profile. ComiR incorporates miRNA expression in a thermodynamic binding model, and it associates each gene with the probability of being a target of a set of miRNAs. ComiR algorithms were trained with the information regarding binding sites in the 3’UTR region, by using a reliable dataset containing the targets of endogenously expressed microRNA in D. melanogaster S2 cells. This dataset was obtained by comparing the results from two different experimental approaches, i.e., inhibition, and immunoprecipitation of the AGO1 protein; this protein is a component of the microRNA induced silencing complex. In this work, we tested whether including coding region binding sites in the ComiR algorithm improves the performance of the tool in predicting microRNA targets. We focused the analysis on the D. melanogaster species and updated the ComiR underlying database with the currently available releases of mRNA and microRNA sequences. As a result, we find that the ComiR algorithm trained with the information related to the coding regions is more efficient in predicting the microRNA targets, with respect to the algorithm trained with 3’utr information. On the other hand, we show that 3’utr based predictions can be seen as complementary to the coding region based predictions, which suggests that both predictions, from 3'UTR and coding regions, should be considered in a comprehensive analysis. Furthermore, we observed that the lists of targets obtained by analyzing data from one experimental approach only, that is, inhibition or immunoprecipitation of AGO1, are not reliable enough to test the performance of our microRNA target prediction algorithm. Further analysis will be conducted to investigate the effectiveness of the tool with data from other species, provided that validated datasets, as obtained from the comparison of RISC proteins inhibition and immunoprecipitation experiments, will be available for the same samples. Finally, we propose to upgrade the existing ComiR web-tool by including the coding region based trained model, available together with the 3’UTR based one.

Keywords: AGO1, coding region, Drosophila melanogaster, microRNA target prediction

Procedia PDF Downloads 412
93 Characterisation of Meteorological Drought at Sub-Catchment Scale in Afghanistan Using Time-Series Climate Data

Authors: Yun Chen, David Penton, Fazlul Karim, Santosh Aryal, Shahriar Wahid, Peter Taylor, Susan M. Cuddy

Abstract:

Droughts have severely affected Afghanistan over the last four decades, leading to critical food shortages where two-thirds of the country’s population are in a food crisis. Long years of conflict have lowered the country’s ability to deal with hazards such as drought, which can rapidly escalate into disasters. Understanding the spatial and temporal distribution of droughts is needed to be able to respond effectively to disasters and plan for future occurrences. This study used Standardized Precipitation Evapotranspiration Index (SPEI) at monthly, seasonal, and annual temporal scales to map the spatiotemporal change dynamics of drought characteristics (distribution, frequency, duration, and severity) in Afghanistan. SPEI indices were mapped for river basins, disaggregated into 189 sub-catchments, using monthly precipitation and potential evapotranspiration derived from temperature station observations from 1980 to 2017. The results show these multi-dimensional drought characteristics vary along different years, change among sub-catchments, and differ across temporal scales. During the 38 years, the driest decade and period are the 2000s and 1999–2022, respectively. The 2000–01 water year is the driest, with the whole country experiencing ‘severe’ to ‘extreme’ drought, more than 53% (87 sub-catchments) suffering the worst drought in history, and about 58% (94 sub-catchments) having ‘very frequent’ drought (7 to 8 months) or ‘extremely frequent’ drought (9 to 10 months). The estimated seasonal duration and severity present significant variations across the study area and throughout the study period. The nation also suffered from recurring droughts with varying length and intensity in 2004, 2006, 2008, and, most recently, 2011. There is a trend towards increasing drought with longer duration and higher severity extending all over sub-catchments from southeast to north and central regions. These datasets and maps help to fill the knowledge gap on detailed sub-catchment scale meteorological drought characteristics in Afghanistan. The study findings improve our understanding of the influences of climate change on drought dynamics and can guide catchment planning for reliable adaptation to and mitigation against future droughts.

Keywords: SPEI, precipitation, evapotranspiration, climate extremes

Procedia PDF Downloads 64
92 Do the Health Benefits of Oil-Led Economic Development Outweigh the Potential Health Harms from Environmental Pollution in Nigeria?

Authors: Marian Emmanuel Okon

Abstract:

Introduction: The Niger Delta region of Nigeria has a vast reserve of oil and gas, which has globally positioned the nation as the sixth largest exporter of crude oil. Production rapidly rose following oil discovery. In most oil producing nations of the world, the wealth generated from oil production and export has propelled economic advancement, enabling the development of industries and other relevant infrastructures. Therefore, it can be assumed that majority of the oil resource such as Nigeria’s, has the potential to improve the health of the population via job creation and derived revenues. However, the health benefits of this economic development might be offset by the environmental consequences of oil exploitation and production. Objective: This research aims to evaluate the balance between the health benefits of oil-led economic development and harmful environmental consequences of crude oil exploitation in Nigeria. Study Design: A pathway has been designed to guide data search and this study. The model created will assess the relationship between oil-led economic development and population health development via job creation, improvement of education, development of infrastructure and other forms of development as well as through harmful environmental consequences from oil activities. Data/Emerging Findings: Diverse potentially suitable datasets which are at different geographical scales have been identified, obtained or applied for and the dataset from the World Bank has been the most thoroughly explored. This large dataset contains information that would enable the longitudinal assessment of both the health benefits and harms from oil exploitation in Nigeria as well as identify the disparities that exist between the communities, states and regions. However, these data do not extend far back enough in time to capture the start of crude oil production. Thus, it is possible that the maximum economic benefits and health harms could be missed. To deal with this shortcoming, the potential for a comparative study with countries like United Kingdom, Morocco and Cote D’ivoire has also been taken into consideration, so as to evaluate the differences between these countries as well as identify the areas of improvement in Nigeria’s environmental and health policies. Notwithstanding, these data have shown some differences in each country’s economic, environmental and health state over time as well as a corresponding summary statistics. Conclusion: In theory, the beneficial effects of oil exploitation to the health of the population may be substantial as large swaths of the ‘wider determinants’ of population heath are influenced by the wealth of a nation. However, if uncontrolled, the consequences from environmental pollution and degradation may outweigh these benefits. Thus, there is a need to address this, in order to improve environmental and population health in Nigeria.

Keywords: environmental pollution, health benefits, oil-led economic development, petroleum exploitation

Procedia PDF Downloads 303
91 The Use of Artificial Intelligence in Digital Forensics and Incident Response in a Constrained Environment

Authors: Dipo Dunsin, Mohamed C. Ghanem, Karim Ouazzane

Abstract:

Digital investigators often have a hard time spotting evidence in digital information. It has become hard to determine which source of proof relates to a specific investigation. A growing concern is that the various processes, technology, and specific procedures used in the digital investigation are not keeping up with criminal developments. Therefore, criminals are taking advantage of these weaknesses to commit further crimes. In digital forensics investigations, artificial intelligence is invaluable in identifying crime. It has been observed that an algorithm based on artificial intelligence (AI) is highly effective in detecting risks, preventing criminal activity, and forecasting illegal activity. Providing objective data and conducting an assessment is the goal of digital forensics and digital investigation, which will assist in developing a plausible theory that can be presented as evidence in court. Researchers and other authorities have used the available data as evidence in court to convict a person. This research paper aims at developing a multiagent framework for digital investigations using specific intelligent software agents (ISA). The agents communicate to address particular tasks jointly and keep the same objectives in mind during each task. The rules and knowledge contained within each agent are dependent on the investigation type. A criminal investigation is classified quickly and efficiently using the case-based reasoning (CBR) technique. The MADIK is implemented using the Java Agent Development Framework and implemented using Eclipse, Postgres repository, and a rule engine for agent reasoning. The proposed framework was tested using the Lone Wolf image files and datasets. Experiments were conducted using various sets of ISA and VMs. There was a significant reduction in the time taken for the Hash Set Agent to execute. As a result of loading the agents, 5 percent of the time was lost, as the File Path Agent prescribed deleting 1,510, while the Timeline Agent found multiple executable files. In comparison, the integrity check carried out on the Lone Wolf image file using a digital forensic tool kit took approximately 48 minutes (2,880 ms), whereas the MADIK framework accomplished this in 16 minutes (960 ms). The framework is integrated with Python, allowing for further integration of other digital forensic tools, such as AccessData Forensic Toolkit (FTK), Wireshark, Volatility, and Scapy.

Keywords: artificial intelligence, computer science, criminal investigation, digital forensics

Procedia PDF Downloads 182
90 Ribotaxa: Combined Approaches for Taxonomic Resolution Down to the Species Level from Metagenomics Data Revealing Novelties

Authors: Oshma Chakoory, Sophie Comtet-Marre, Pierre Peyret

Abstract:

Metagenomic classifiers are widely used for the taxonomic profiling of metagenomic data and estimation of taxa relative abundance. Small subunit rRNA genes are nowadays a gold standard for the phylogenetic resolution of complex microbial communities, although the power of this marker comes down to its use as full-length. We benchmarked the performance and accuracy of rRNA-specialized versus general-purpose read mappers, reference-targeted assemblers and taxonomic classifiers. We then built a pipeline called RiboTaxa to generate a highly sensitive and specific metataxonomic approach. Using metagenomics data, RiboTaxa gave the best results compared to other tools (Kraken2, Centrifuge (1), METAXA2 (2), PhyloFlash (3)) with precise taxonomic identification and relative abundance description, giving no false positive detection. Using real datasets from various environments (ocean, soil, human gut) and from different approaches (metagenomics and gene capture by hybridization), RiboTaxa revealed microbial novelties not seen by current bioinformatics analysis opening new biological perspectives in human and environmental health. In a study focused on corals’ health involving 20 metagenomic samples (4), an affiliation of prokaryotes was limited to the family level with Endozoicomonadaceae characterising healthy octocoral tissue. RiboTaxa highlighted 2 species of uncultured Endozoicomonas which were dominant in the healthy tissue. Both species belonged to a genus not yet described, opening new research perspectives on corals’ health. Applied to metagenomics data from a study on human gut and extreme longevity (5), RiboTaxa detected the presence of an uncultured archaeon in semi-supercentenarians (aged 105 to 109 years) highlighting an archaeal genus, not yet described, and 3 uncultured species belonging to the Enorma genus that could be species of interest participating in the longevity process. RiboTaxa is user-friendly, rapid, allowing microbiota structure description from any environment and the results can be easily interpreted. This software is freely available at https://github.com/oschakoory/RiboTaxa under the GNU Affero General Public License 3.0.

Keywords: metagenomics profiling, microbial diversity, SSU rRNA genes, full-length phylogenetic marker

Procedia PDF Downloads 91
89 Groundwater Potential Mapping using Frequency Ratio and Shannon’s Entropy Models in Lesser Himalaya Zone, Nepal

Authors: Yagya Murti Aryal, Bipin Adhikari, Pradeep Gyawali

Abstract:

The Lesser Himalaya zone of Nepal consists of thrusting and folding belts, which play an important role in the sustainable management of groundwater in the Himalayan regions. The study area is located in the Dolakha and Ramechhap Districts of Bagmati Province, Nepal. Geologically, these districts are situated in the Lesser Himalayas and partly encompass the Higher Himalayan rock sequence, which includes low-grade to high-grade metamorphic rocks. Following the Gorkha Earthquake in 2015, numerous springs dried up, and many others are currently experiencing depletion due to the distortion of the natural groundwater flow. The primary objective of this study is to identify potential groundwater areas and determine suitable sites for artificial groundwater recharge. Two distinct statistical approaches were used to develop models: The Frequency Ratio (FR) and Shannon Entropy (SE) methods. The study utilized both primary and secondary datasets and incorporated significant role and controlling factors derived from field works and literature reviews. Field data collection involved spring inventory, soil analysis, lithology assessment, and hydro-geomorphology study. Additionally, slope, aspect, drainage density, and lineament density were extracted from a Digital Elevation Model (DEM) using GIS and transformed into thematic layers. For training and validation, 114 springs were divided into a 70/30 ratio, with an equal number of non-spring pixels. After assigning weights to each class based on the two proposed models, a groundwater potential map was generated using GIS, classifying the area into five levels: very low, low, moderate, high, and very high. The model's outcome reveals that over 41% of the area falls into the low and very low potential categories, while only 30% of the area demonstrates a high probability of groundwater potential. To evaluate model performance, accuracy was assessed using the Area under the Curve (AUC). The success rate AUC values for the FR and SE methods were determined to be 78.73% and 77.09%, respectively. Additionally, the prediction rate AUC values for the FR and SE methods were calculated as 76.31% and 74.08%. The results indicate that the FR model exhibits greater prediction capability compared to the SE model in this case study.

Keywords: groundwater potential mapping, frequency ratio, Shannon’s Entropy, Lesser Himalaya Zone, sustainable groundwater management

Procedia PDF Downloads 46
88 Private Coded Computation of Matrix Multiplication

Authors: Malihe Aliasgari, Yousef Nejatbakhsh

Abstract:

The era of Big Data and the immensity of real-life datasets compels computation tasks to be performed in a distributed fashion, where the data is dispersed among many servers that operate in parallel. However, massive parallelization leads to computational bottlenecks due to faulty servers and stragglers. Stragglers refer to a few slow or delay-prone processors that can bottleneck the entire computation because one has to wait for all the parallel nodes to finish. The problem of straggling processors, has been well studied in the context of distributed computing. Recently, it has been pointed out that, for the important case of linear functions, it is possible to improve over repetition strategies in terms of the tradeoff between performance and latency by carrying out linear precoding of the data prior to processing. The key idea is that, by employing suitable linear codes operating over fractions of the original data, a function may be completed as soon as enough number of processors, depending on the minimum distance of the code, have completed their operations. The problem of matrix-matrix multiplication in the presence of practically big sized of data sets faced with computational and memory related difficulties, which makes such operations are carried out using distributed computing platforms. In this work, we study the problem of distributed matrix-matrix multiplication W = XY under storage constraints, i.e., when each server is allowed to store a fixed fraction of each of the matrices X and Y, which is a fundamental building of many science and engineering fields such as machine learning, image and signal processing, wireless communication, optimization. Non-secure and secure matrix multiplication are studied. We want to study the setup, in which the identity of the matrix of interest should be kept private from the workers and then obtain the recovery threshold of the colluding model, that is, the number of workers that need to complete their task before the master server can recover the product W. The problem of secure and private distributed matrix multiplication W = XY which the matrix X is confidential, while matrix Y is selected in a private manner from a library of public matrices. We present the best currently known trade-off between communication load and recovery threshold. On the other words, we design an achievable PSGPD scheme for any arbitrary privacy level by trivially concatenating a robust PIR scheme for arbitrary colluding workers and private databases and the proposed SGPD code that provides a smaller computational complexity at the workers.

Keywords: coded distributed computation, private information retrieval, secret sharing, stragglers

Procedia PDF Downloads 92
87 Cosmetic Recommendation Approach Using Machine Learning

Authors: Shakila N. Senarath, Dinesh Asanka, Janaka Wijayanayake

Abstract:

The necessity of cosmetic products is arising to fulfill consumer needs of personality appearance and hygiene. A cosmetic product consists of various chemical ingredients which may help to keep the skin healthy or may lead to damages. Every chemical ingredient in a cosmetic product does not perform on every human. The most appropriate way to select a healthy cosmetic product is to identify the texture of the body first and select the most suitable product with safe ingredients. Therefore, the selection process of cosmetic products is complicated. Consumer surveys have shown most of the time, the selection process of cosmetic products is done in an improper way by consumers. From this study, a content-based system is suggested that recommends cosmetic products for the human factors. To such an extent, the skin type, gender and price range will be considered as human factors. The proposed system will be implemented by using Machine Learning. Consumer skin type, gender and price range will be taken as inputs to the system. The skin type of consumer will be derived by using the Baumann Skin Type Questionnaire, which is a value-based approach that includes several numbers of questions to derive the user’s skin type to one of the 16 skin types according to the Bauman Skin Type indicator (BSTI). Two datasets are collected for further research proceedings. The user data set was collected using a questionnaire given to the public. Those are the user dataset and the cosmetic dataset. Product details are included in the cosmetic dataset, which belongs to 5 different kinds of product categories (Moisturizer, Cleanser, Sun protector, Face Mask, Eye Cream). An alternate approach of TF-IDF (Term Frequency – Inverse Document Frequency) is applied to vectorize cosmetic ingredients in the generic cosmetic products dataset and user-preferred dataset. Using the IF-IPF vectors, each user-preferred products dataset and generic cosmetic products dataset can be represented as sparse vectors. The similarity between each user-preferred product and generic cosmetic product will be calculated using the cosine similarity method. For the recommendation process, a similarity matrix can be used. Higher the similarity, higher the match for consumer. Sorting a user column from similarity matrix in a descending order, the recommended products can be retrieved in ascending order. Even though results return a list of similar products, and since the user information has been gathered, such as gender and the price ranges for product purchasing, further optimization can be done by considering and giving weights for those parameters once after a set of recommended products for a user has been retrieved.

Keywords: content-based filtering, cosmetics, machine learning, recommendation system

Procedia PDF Downloads 110