Search results for: KaraAgroAI cocoa dataset
297 Human-Centred Data Analysis Method for Future Design of Residential Spaces: Coliving Case Study
Authors: Alicia Regodon Puyalto, Alfonso Garcia-Santos
Abstract:
This article presents a method for analyzing the use of indoor spaces based on data obtained from inbuilt digital devices. The study uses the data generated by in-place devices, such as smart locks, Wi-Fi routers, and electrical sensors, to gain additional insights into space occupancy, user behaviour, and comfort. Those devices, originally installed to facilitate remote operations, report data through the internet that the research uses to analyze how people use spaces in real time. Using an in-place Internet of Things (IoT) network enables a faster, more affordable, seamless, and scalable solution for analyzing building interior spaces without incorporating external data collection systems such as sensors. The methodology is applied to a real case study of coliving, a residential building of 3000 m², 7 floors, and 80 users in the centre of Madrid. The case study applies the method to classify IoT devices and to assess, clean, and analyze the collected data based on the analysis framework. The information is collected remotely through the devices' respective platforms. The first step is to curate the data and understand what insights each device can provide according to the objectives of the study; this generates an analysis framework that can be scaled to future building assessments, even beyond the residential sector. The method adjusts the parameters to be analyzed to the dataset available in the IoT network of each building. The research demonstrates how human-centered data analytics can improve the future spatial design of indoor spaces.
Keywords: in-place devices, IoT, human-centred data-analytics, spatial design
Procedia PDF Downloads 197
296 Predictive Modeling of Bridge Conditions Using Random Forest
Authors: Miral Selim, May Haggag, Ibrahim Abotaleb
Abstract:
The aging of transportation infrastructure presents significant challenges, particularly concerning the monitoring and maintenance of bridges. This study investigates the application of Random Forest algorithms for predictive modeling of bridge conditions, utilizing data from the US National Bridge Inventory (NBI). The research is significant as it aims to improve bridge management through data-driven insights that can enhance maintenance strategies and contribute to overall safety. Random Forest is chosen for its robustness, ability to handle complex, non-linear relationships among variables, and its effectiveness in feature importance evaluation. The study begins with comprehensive data collection and cleaning, followed by the identification of key variables influencing bridge condition ratings, including age, construction materials, environmental factors, and maintenance history. Random Forest is utilized to examine the relationships between these variables and the predicted bridge conditions. The dataset is divided into training and testing subsets to evaluate the model's performance. The findings demonstrate that the Random Forest model effectively enhances the understanding of factors affecting bridge conditions. By identifying bridges at greater risk of deterioration, the model facilitates proactive maintenance strategies, which can help avoid costly repairs and minimize service disruptions. Additionally, this research underscores the value of data-driven decision-making, enabling better resource allocation to prioritize maintenance efforts where they are most necessary. In summary, this study highlights the efficiency and applicability of Random Forest in predictive modeling for bridge management. Ultimately, these findings pave the way for more resilient and proactive management of bridge systems, ensuring their longevity and reliability for future use.
Keywords: data analysis, random forest, predictive modeling, bridge management
Procedia PDF Downloads 22
295 Adjusting Electricity Demand Data to Account for the Impact of Loadshedding in Forecasting Models
Authors: Migael van Zyl, Stefanie Visser, Awelani Phaswana
Abstract:
The electricity landscape in South Africa is characterized by frequent occurrences of loadshedding, a measure implemented by Eskom to manage electricity generation shortages by curtailing demand. Loadshedding, classified into stages ranging from 1 to 8 based on severity, involves the systematic rotation of power cuts across municipalities according to predefined schedules. However, this practice introduces distortions in recorded electricity demand, posing challenges to accurate forecasting essential for budgeting, network planning, and generation scheduling. Addressing this challenge requires the development of a methodology to quantify the impact of loadshedding and integrate it back into metered electricity demand data. Fortunately, comprehensive records of loadshedding impacts are maintained in a database, enabling the alignment of loadshedding effects with hourly demand data. This adjustment ensures that forecasts accurately reflect true demand patterns, independent of loadshedding's influence, thereby enhancing the reliability of electricity supply management in South Africa. This paper presents a methodology for determining the hourly impact of loadshedding and subsequently adjusting historical demand data to account for it. Furthermore, two forecasting models are developed: one utilizing the original dataset and the other using the adjusted data. A comparative analysis is conducted to evaluate forecast accuracy improvements resulting from the adjustment process. By implementing this methodology, stakeholders can make more informed decisions regarding electricity infrastructure investments, resource allocation, and operational planning, contributing to the overall stability and efficiency of South Africa's electricity supply system.
Keywords: electricity demand forecasting, load shedding, demand side management, data science
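The adjustment step described in this abstract can be illustrated with a minimal sketch. It assumes the adjustment is purely additive (metered demand plus the load shed in the same hour), which the abstract suggests but does not state outright. The function name, the dictionary layout, and all numbers are hypothetical and chosen only for illustration.

```python
def adjust_demand(metered_mw, shed_mw):
    """Return hourly demand with the curtailed load added back.

    metered_mw: dict mapping an hour label to metered demand (MW).
    shed_mw:    dict mapping an hour label to load shed in that hour (MW);
                hours without loadshedding may simply be absent.
    Assumption: the adjustment is additive, hour by hour.
    """
    adjusted = {}
    for hour, metered in metered_mw.items():
        adjusted[hour] = metered + shed_mw.get(hour, 0.0)
    return adjusted


# Made-up example values, not taken from the paper.
metered = {"17:00": 27000.0, "18:00": 26500.0, "19:00": 28000.0}
shed = {"18:00": 2000.0}  # loadshedding recorded only at 18:00
adjusted = adjust_demand(metered, shed)
```

A forecasting model could then be fitted once on `metered` and once on `adjusted` to compare accuracy, as the abstract outlines.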
Procedia PDF Downloads 61
294 A Comparative Analysis of (De)legitimation Strategies in Selected African Inaugural Speeches
Authors: Lily Chimuanya, Ehioghae Esther
Abstract:
Language, a versatile and sophisticated tool, is fundamentally sacrosanct to mankind, especially within the realm of politics. In this dynamic world, political leaders adroitly use language to engage in a strategic show aimed at manipulating or mechanising the opinion of discerning people. This nuanced synergy is marked by different rhetorical strategies, meticulously synced with contextual factors ranging from the cultural and ideological to the political, to achieve multifaceted persuasive objectives. This study investigates the (de)legitimation strategies inherent in African presidential inaugural speeches, as African leaders not only state their policy agenda through inaugural speeches but also subtly indulge in a dance of legitimation and delegitimation, performing a twofold objective of strengthening the credibility of their administration and, at times, undermining the performance of the past administration. Drawing insights from two different legitimation models and a dataset of 4 African presidential inaugural speeches obtained from authentic websites, the study describes the roles of authorisation, rationalisation, moral evaluation, altruism, and mythopoesis in unmasking the structure of political discourse. The analysis takes a mixed-method approach to unpack the (de)legitimation strategy embedded in the carefully chosen speeches. The focus extends beyond a superficial exploration and delves into the linguistic elements that form the basis of presidential discourse. In conclusion, this examination goes beyond the nuanced landscape of language as a potent tool in politics, with each strategy contributing to the overall rhetorical impact and shaping the narrative. From this perspective, the study argues that presidential inaugural speeches are not only linguistic exercises but also viable weapons that influence perceptions and legitimise authority.
Keywords: CDA, legitimation, inaugural speeches, delegitimation
Procedia PDF Downloads 69
293 Scalable and Accurate Detection of Pathogens from Whole-Genome Shotgun Sequencing
Authors: Janos Juhasz, Sandor Pongor, Balazs Ligeti
Abstract:
Next-generation sequencing, especially whole genome shotgun sequencing, is becoming a common approach to gain insight into microbiomes in a culture-independent way, even in clinical practice. It not only gives us information about the species composition of an environmental sample but also opens the possibility of detecting antimicrobial resistance and novel, or currently unknown, pathogens. Accurately and reliably detecting the microbial strains is a challenging task. Here we present a sensitive approach for detecting pathogens in metagenomics samples, with special regard to detecting novel variants of known pathogens. We have developed a pipeline that uses fast, short read aligner programs (i.e., Bowtie2/BWA) and comprehensive nucleotide databases. Taxonomic binning is based on the lowest common ancestor (LCA) principle; each read is assigned to a taxon, covering the most significantly hit taxa. This approach helps in balancing sensitivity against running time. The program was tested on both experimental and synthetic data. The results indicate that our method performs as well as the state-of-the-art BLAST-based ones; furthermore, in some cases, it even proves to be better, while running two orders of magnitude faster. It is sensitive and capable of identifying taxa present only in small abundance. Moreover, it needs two orders of magnitude fewer reads to complete the identification than MetaPhlAn2 does. We analyzed an experimental anthrax dataset (B. anthracis strain BA104). The majority of the reads (96.50%) were classified as Bacillus anthracis; a small portion, 1.2%, was classified as other species from the Bacillus genus. We demonstrate that the evaluation of high-throughput sequencing data is feasible in a reasonable time with good classification accuracy.
Keywords: metagenomics, taxonomy binning, pathogens, microbiome, B. anthracis
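The lowest common ancestor (LCA) principle mentioned in this abstract can be sketched as follows. This is only a generic illustration of LCA-based read assignment, not the authors' pipeline: the child-to-parent taxonomy below is a simplified toy tree, and the function names are hypothetical.

```python
def lineage(taxon, parent):
    """Return the path from a taxon up to the root, starting at the taxon."""
    path = [taxon]
    while taxon in parent:
        taxon = parent[taxon]
        path.append(taxon)
    return path


def lowest_common_ancestor(taxa, parent):
    """Return the deepest node shared by the lineages of all given taxa.

    Assumes a single-rooted tree and at least one taxon in `taxa`.
    """
    taxa = list(taxa)
    common = lineage(taxa[0], parent)  # ordered from taxon to root
    for taxon in taxa[1:]:
        other = set(lineage(taxon, parent))
        common = [node for node in common if node in other]
    return common[0]


# Toy, heavily simplified taxonomy (child -> parent); assumption for illustration.
parent = {
    "B. anthracis": "Bacillus",
    "B. cereus": "Bacillus",
    "Bacillus": "Bacillaceae",
    "Bacillaceae": "Bacteria",
    "E. coli": "Escherichia",
    "Escherichia": "Bacteria",
}

# A read whose significant hits fall in two Bacillus species is binned at genus level.
read_bin = lowest_common_ancestor(["B. anthracis", "B. cereus"], parent)
```

Under this scheme a read that hits only one species stays at species level, while ambiguous reads move up the tree, which is one way to trade specificity for robustness.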
Procedia PDF Downloads 137
292 Adaptive Energy-Aware Routing (AEAR) for Optimized Performance in Resource-Constrained Wireless Sensor Networks
Authors: Innocent Uzougbo Onwuegbuzie
Abstract:
Wireless Sensor Networks (WSNs) are crucial for numerous applications, yet they face significant challenges due to resource constraints such as limited power and memory. Traditional routing algorithms like Dijkstra, Ad hoc On-Demand Distance Vector (AODV), and Bellman-Ford, while effective in path establishment and discovery, are not optimized for the unique demands of WSNs due to their large memory footprint and power consumption. This paper introduces the Adaptive Energy-Aware Routing (AEAR) model, a solution designed to address these limitations. AEAR integrates reactive route discovery, localized decision-making using geographic information, energy-aware metrics, and dynamic adaptation to provide a robust and efficient routing strategy. We present a detailed comparative analysis using a dataset of 50 sensor nodes, evaluating power consumption, memory footprint, and path cost across AEAR, Dijkstra, AODV, and Bellman-Ford algorithms. Our results demonstrate that AEAR significantly reduces power consumption and memory usage while optimizing path weight. This improvement is achieved through adaptive mechanisms that balance energy efficiency and link quality, ensuring prolonged network lifespan and reliable communication. The AEAR model's superior performance underlines its potential as a viable routing solution for energy-constrained WSN environments, paving the way for more sustainable and resilient sensor network deployments.
Keywords: wireless sensor networks (WSNs), adaptive energy-aware routing (AEAR), routing algorithms, energy, efficiency, network lifespan
Procedia PDF Downloads 37
291 Generalized Additive Model for Estimating Propensity Score
Authors: Tahmidul Islam
Abstract:
The Propensity Score Matching (PSM) technique has been widely used for estimating the causal effect of treatment in observational studies. One major step in implementing PSM is estimating the propensity score (PS). A logistic regression model with additive linear terms of covariates is the most used technique in many studies. Logistic regression is also used with cubic splines to retain flexibility in the model. However, choosing the functional form of the logistic regression model has been an open question, since the effectiveness of PSM depends on how accurately the PS has been estimated. In many situations, the linearity assumption of linear logistic regression may not hold, and a non-linear relation between the logit and the covariates may be appropriate. One can estimate the PS using machine learning techniques such as random forests or neural networks for more accuracy in non-linear situations. In this study, an attempt has been made to assess the efficacy of the Generalized Additive Model (GAM) in various linear and non-linear settings and to compare its performance with the usual logistic regression. GAM is a non-parametric technique in which the functional form of the covariates can be left unspecified and a flexible regression model can be fitted. In this study, various simple and complex models have been considered for treatment under several situations (small/large sample, low/high number of treatment units), and it was examined which method leads to more covariate balance in the matched dataset. It is found that the logistic regression model is impressively robust against the inclusion of quadratic and interaction terms and reduces the mean difference between the treatment and control sets as efficiently as GAM does. GAM provided no significantly better covariate balance than logistic regression in either simple or complex models. The analysis also suggests that a larger proportion of controls than treatment units leads to better balance for both methods.
Keywords: accuracy, covariate balances, generalized additive model, logistic regression, non-linearity, propensity score matching
Procedia PDF Downloads 367
290 In-Depth Analysis on Sequence Evolution and Molecular Interaction of Influenza Receptors (Hemagglutinin and Neuraminidase)
Authors: Dong Tran, Thanh Dac Van, Ly Le
Abstract:
Hemagglutinin (HA) and Neuraminidase (NA) play an important role in host immune evasion across the influenza virus evolution process. The correlation between HA and NA evolution with respect to epitopic evolution and drug interaction has yet to be investigated. In this study, by combining sequence-to-structure evolution with statistical analysis of epitopic/binding site specificity, we identified potential therapeutic features of HA and NA that show the specific antibody binding site of HA and the specific binding distribution of current inhibitors within the NA active site. Our approach introduces the use of sequence variation and molecular interaction to provide an effective strategy for establishing experimentally based distributed representations of protein-protein/ligand complexes. The most important advantage of our method is that it does not require a complete dataset of complexes but instead infers feature interactions directly from sequence variation and molecular interaction. Using correlated sequence analysis, we additionally identified co-evolved mutations associated with maintaining HA/NA structural and functional variability toward immunity and therapeutic treatment. Our investigation of HA binding specificity revealed that a unique conserved stalk domain interacts with a unique loop domain of universal antibodies (CR9114, CT149, CR8043, CR8020, FI6v3, CR6261, F10). On the other hand, NA inhibitors (Oseltamivir, Zanamivir, Laninamivir) showed a specific conserved residue contribution similar to that of the NA substrate (sialic acid), which can be exploited for drug design. Our study provides important insight into the rational design and identification of novel therapeutics targeting universally recognized features of influenza HA/NA.
Keywords: influenza virus, hemagglutinin (HA), neuraminidase (NA), sequence evolution
Procedia PDF Downloads 164
289 Hands-off Parking: Deep Learning Gesture-based System for Individuals with Mobility Needs
Authors: Javier Romera, Alberto Justo, Ignacio Fidalgo, Joshue Perez, Javier Araluce
Abstract:
Nowadays, individuals with mobility needs face a significant challenge when docking vehicles. In many cases, after parking, they encounter insufficient space to exit, leading to two undesired outcomes: either avoiding parking in that spot or settling for improperly placed vehicles. To address this issue, this paper presents a parking control system employing gestural teleoperation. The system comprises three main phases: capturing body markers, interpreting gestures, and transmitting orders to the vehicle. The initial phase is centered around the MediaPipe framework, a versatile tool optimized for real-time gesture recognition. MediaPipe excels at detecting and tracing body markers, with a special emphasis on hand gestures. Hand detection is done by generating 21 reference points for each hand. Subsequently, after data capture, the project employs the MultiPerceptron Layer (MPL) for in-depth gesture classification. This tandem of MediaPipe's extraction prowess and MPL's analytical capability ensures that human gestures are translated into actionable commands with high precision. Furthermore, the system has been trained and validated within a built-in dataset. To prove the domain adaptation, a framework based on the Robot Operating System (ROS) as a communication backbone, alongside the CARLA Simulator, is used. Following successful simulations, the system is transitioned to a real-world platform, marking a significant milestone in the project. This real vehicle implementation verifies the practicality and efficiency of the system beyond theoretical constructs.
Keywords: gesture detection, mediapipe, multiperceptron layer, robot operating system
Procedia PDF Downloads 100
288 Comparison of Deep Learning and Machine Learning Algorithms to Diagnose and Predict Breast Cancer
Authors: F. Ghazalnaz Sharifonnasabi, Iman Makhdoom
Abstract:
Breast cancer is a serious health concern that affects many people around the world. According to a study published in the Breast journal, the global burden of breast cancer is expected to increase significantly over the next few decades. The number of deaths from breast cancer has been increasing over the years, but the age-standardized mortality rate has decreased in some countries. It’s important to be aware of the risk factors for breast cancer and to get regular check-ups to catch it early if it does occur. Machine learning techniques have been used to aid in the early detection and diagnosis of breast cancer. These techniques, which have been shown to be effective in predicting and diagnosing the disease, have become a research hotspot. In this study, we consider two deep learning approaches: Multi-Layer Perceptron (MLP) and Convolutional Neural Network (CNN). We also consider five machine learning algorithms: Decision Tree (C4.5), Naïve Bayesian (NB), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and XGBoost (eXtreme Gradient Boosting), applied to the Breast Cancer Wisconsin Diagnostic dataset. Evaluating and comparing the classifiers involved selecting appropriate metrics to evaluate classifier performance and selecting an appropriate tool to quantify this performance. The main purpose of the study is to predict and diagnose breast cancer by applying the mentioned algorithms and to discover the most effective one with respect to confusion matrix, accuracy, and precision. CNN outperformed all other classifiers and achieved the highest accuracy (0.982456). The work is implemented in the Anaconda environment using the Python programming language.
Keywords: breast cancer, multi-layer perceptron, Naïve Bayesian, SVM, decision tree, convolutional neural network, XGBoost, KNN
Procedia PDF Downloads 75
287 Review of the Road Crash Data Availability in Iraq
Authors: Abeer K. Jameel, Harry Evdorides
Abstract:
Iraq is a middle-income country where road traffic crashes are considered one of the leading causes of death. To control road risk, the General Statistical Organization of the Iraqi Ministry of Planning started to organise a system for collecting traffic accident data, with details related to causes and severity. These data are published as an annual report. This paper presents a review of the available crash data in Iraq. The available data represent accident rates at an aggregated level, classified according to accident type, road users’ details, crash severity, type of vehicle, causes, and number of casualties. The review is organised according to the types of models used in road safety studies and research, and according to the road safety data required in road construction tasks. The available data are also compared with the road safety dataset published in the United Kingdom, as an example of a developed country. It is concluded that the data in Iraq are suitable for descriptive and exploratory models, aggregated-level comparison analysis, and evaluating and monitoring the progress of overall traffic safety performance. However, important traffic safety studies require disaggregated data and details related to the factors affecting the likelihood of traffic crashes. Some studies require spatial details such as the location of accidents, which is essential for ranking roads according to their level of safety and for naming the most dangerous roads in Iraq, which require a tactical plan to control this issue. Global road safety agencies interested in solving this problem in low- and middle-income countries have designed road safety assessment methodologies that are based on road attribute data only. Therefore, this research recommends using one of these methodologies.
Keywords: road safety, Iraq, crash data, road risk assessment, The International Road Assessment Program (iRAP)
Procedia PDF Downloads 256
286 Analysis of Pangasinan State University: Bayambang Students’ Concerns Through Social Media Analytics and Latent Dirichlet Allocation Topic Modelling Approach
Authors: Matthew John F. Sino Cruz, Sarah Jane M. Ferrer, Janice C. Francisco
Abstract:
The COVID-19 pandemic has affected more than 114 countries all over the world since it was declared a global health concern in 2020. Different sectors, including education, have shifted to remote/distance setups to follow the guidelines set to prevent the spread of the disease. One of the higher education institutions that shifted to a remote setup is Pangasinan State University (PSU). In order to continue providing quality instruction to its students, PSU designed the Flexible Learning Model to keep providing services to its stakeholders amidst the pandemic. The model covers the redesign of instruction delivery in a remote setup and the technology needed to support these adjustments. The primary goal of this study is to determine the insights of the PSU – Bayambang students towards the remote setup implemented during the pandemic and how they perceived the initiatives employed in relation to their experiences in flexible learning. In this study, the topic modelling approach was implemented using Latent Dirichlet Allocation. The dataset used in the study. The results show that the most common concerns of the students include time and resource management, poor internet connection, and difficulty coping with the flexible learning modality. Furthermore, the findings of the study can be used as one of the bases for the administration to review and improve the policies and initiatives implemented during the pandemic in relation to remote service delivery. In addition, further studies can be conducted to determine the overall sentiment of the other stakeholders towards the policies implemented at the University.
Keywords: COVID-19, topic modelling, students’ sentiment, flexible learning, Latent Dirichlet allocation
Procedia PDF Downloads 122
285 Hsa-miR-192-5p, and Hsa-miR-129-5p Prominent Biomarkers in Regulation Glioblastoma Cancer Stem Cells Genes Microenvironment
Authors: Rasha Ahmadi
Abstract:
Glioblastoma is one of the most frequent brain malignancies, with a high mortality rate and limited survival in affected individuals. Despite different treatments and surgery, glioblastoma cancer stem cells may recur as a subsequent tumor. For this reason, it is crucial to research the markers associated with glioblastoma stem cells and specifically their microenvironment. In this study, using bioinformatics analysis, we analyzed and nominated genes in the microenvironment pathways of glioblastoma stem cells. An appropriate dataset was selected for analysis by referring to the GEO database. This dataset comprised gene expression patterns in stem cells derived from glioblastoma patients. Gene clusters were divided into high and low expression. Enrichment databases such as Enrichr, STRING, and GEPIA were utilized to analyze the data appropriately. Finally, we extracted the candidate genes: 2700 high-expression and 1100 low-expression genes implicated in the metabolic pathways of glioblastoma cancer progression. Cellular senescence, MAPK, TNF, hypoxia, zymosterol biosynthesis, and phosphatidylinositol metabolism pathways were substantially expressed, and the metabolic pathways were downregulated. After assessing the association between protein networks, the MSMP, SOX2, FGD4, and CNTNAP3 genes with high expression and the DMKN and SBSN genes with low expression were selected. All of these genes were observed in the survival curve, with a survival of fewer than 10 percent over around 15 months. hsa-mir-192-5p, hsa-mir-129-5p, hsa-mir-215-5p, hsa-mir-335-5p, and hsa-mir-340-5p played a key function in glioblastoma cancer stem cell microenvironments. Through integrated bioinformatics studies of gene expression profile data, we introduced critical genes that can play an important role in targeting genes involved in the energy and microenvironment of glioblastoma cancer stem cells. This study indicated that hsa-mir-192-5p and hsa-mir-129-5p are appropriate candidates for this.
Keywords: Glioblastoma, Cancer Stem Cells, Biomarker Discovery, Gene Expression Profiles, Bioinformatics Analysis, Tumor Microenvironment
Procedia PDF Downloads 145
284 Modeling Average Paths Traveled by Ferry Vessels Using AIS Data
Authors: Devin Simmons
Abstract:
At the USDOT’s Bureau of Transportation Statistics, a biannual census of ferry operators in the U.S. is conducted, with results such as route mileage used to determine federal funding levels for operators. AIS data allows for the possibility of using GIS software and geographical methods to confirm operator-reported mileage for individual ferry routes. As part of the USDOT’s work on the ferry census, an algorithm was developed that uses AIS data for ferry vessels in conjunction with known ferry terminal locations to model the average route travelled for use as both a cartographic product and confirmation of operator-reported mileage. AIS data from each vessel is first analyzed to determine individual journeys based on the vessel’s velocity and changes in velocity over time. These trips are then converted to geographic linestring objects. Using the terminal locations, the algorithm then determines whether the trip represented a known ferry route. Given a large enough dataset, routes will be represented by multiple trip linestrings, which are then filtered by DBSCAN spatial clustering to remove outliers. Finally, these remaining trips are ready to be averaged into one route. The algorithm interpolates the point on each trip linestring that represents the start point. From these start points, a centroid is calculated, and the first point of the average route is determined. Each trip is interpolated again to find the point that represents one percent of the journey’s completion, and the centroid of those points is used as the next point in the average route, and so on until 100 points have been calculated. Routes created using this algorithm have shown demonstrable improvement over previous methods, which included the implementation of a LOESS model. Additionally, the algorithm greatly reduces the amount of manual digitizing needed to visualize ferry activity.
Keywords: ferry vessels, transportation, modeling, AIS data
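The averaging step described in this abstract (interpolate each trip at equal fractions of its length, then take the centroid of those points) can be sketched as below. This is a simplified illustration under stated assumptions, not the USDOT implementation: trips are lists of planar `(x, y)` tuples rather than geographic linestrings, outlier filtering is assumed to have happened already, and the function names are hypothetical. The sketch samples 0% through 100% inclusive, so `steps=100` yields 101 points; the abstract's count of 100 may follow a different convention.

```python
import math


def point_at_fraction(line, fraction):
    """Interpolate the point lying `fraction` (0..1) of the way along a polyline."""
    segments = list(zip(line, line[1:]))
    lengths = [math.dist(a, b) for a, b in segments]
    target = fraction * sum(lengths)  # distance to travel from the start
    for (a, b), length in zip(segments, lengths):
        if length > 0 and target <= length:
            t = target / length
            return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
        target -= length
    return line[-1]  # reached only at the very end (or through rounding)


def average_route(trips, steps=100):
    """Average several trips into one route by taking centroids at equal fractions."""
    route = []
    for i in range(steps + 1):
        points = [point_at_fraction(trip, i / steps) for trip in trips]
        cx = sum(p[0] for p in points) / len(points)
        cy = sum(p[1] for p in points) / len(points)
        route.append((cx, cy))
    return route


# Two made-up parallel trips; their average should run midway between them.
trips = [[(0.0, 0.0), (10.0, 0.0)], [(0.0, 2.0), (10.0, 2.0)]]
route = average_route(trips, steps=10)
```

For real AIS tracks, distances would need to be computed in a projected coordinate system or with a geodesic formula instead of the planar `math.dist` used here.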
Procedia PDF Downloads 176
283 Influence of Geologic and Geotechnical Dataset Resolution on Regional Liquefaction Assessment of the Lower Wairau Plains
Authors: Omer Altaf, Liam Wotherspoon, Rolando Orense
Abstract:
The Wairau Plains are located in the northeast of the South Island of New Zealand, with alluvial deposits of fine-grained silts and sands combined with low-lying topography suggesting the presence of liquefiable deposits over significant portions of the region. Liquefaction manifestations were observed in past earthquakes, including the 1848 Marlborough and 1855 Wairarapa earthquakes, and more recently during the 2013 Lake Grassmere and 2016 Kaikōura earthquakes. Therefore, a good understanding of the deposits that may be susceptible to liquefaction is important for land use planning in the region and to allow developers and asset owners to appropriately address their risk. For this purpose, multiple approaches have been employed to develop regional-scale maps showing the liquefaction vulnerability categories for the region. After applying semi-qualitative criteria linked to geologic age and deposit type, the higher resolution surface mapping of geomorphologic characteristics encompassing the Wairau River and the Opaoa River was used for screening. A detailed basin geologic model developed for groundwater modelling was analysed to provide a higher level of resolution than the surface-geology based classification. This is used to identify the thickness of near-surface gravel deposits, providing an improved understanding of the presence or lack of potentially non-liquefiable crust deposits. This paper describes the methodology adopted for this project and focuses on the influence of geomorphic characteristics and analysis of the detailed geologic basin model on the liquefaction classification of the Lower Wairau Plains.
Keywords: liquefaction, earthquake, cone penetration test, mapping, liquefaction-induced damage
Procedia PDF Downloads 176
282 A Radiomics Approach to Predict the Evolution of Prostate Imaging Reporting and Data System Score 3/5 Prostate Areas in Multiparametric Magnetic Resonance
Authors: Natascha C. D'Amico, Enzo Grossi, Giovanni Valbusa, Ala Malasevschi, Gianpiero Cardone, Sergio Papa
Abstract:
Purpose: To characterize, through a radiomic approach, the nature of areas classified PI-RADS (Prostate Imaging Reporting and Data System) 3/5, recognized in multiparametric prostate magnetic resonance with T2-weighted (T2w), diffusion and perfusion sequences with paramagnetic contrast. Methods and Materials: 24 cases undergoing multiparametric prostate MR and biopsy were admitted to this pilot study. The clinical outcome of the PI-RADS 3/5 areas was established through biopsy, which found 8 malignant tumours. The analysed images were acquired with a Philips Achieva 1.5T machine with a CE-T2-weighted sequence in the axial plane. Semi-automatic tumour segmentation was carried out on MR images using 3DSlicer image analysis software. 45 shape-based, intensity-based and texture-based features were extracted and represented the input for preprocessing. An evolutionary algorithm (a TWIST system based on the KNN algorithm) was used to subdivide the dataset into training and testing sets and to select the features yielding the maximal amount of information. After this pre-processing, 20 input variables were selected, and different machine learning systems were used to develop a predictive model based on a training-testing crossover procedure. Results: The best machine learning system (a three-layer feed-forward neural network) obtained a global accuracy of 90% (80% sensitivity and 100% specificity) with a ROC of 0.82. Conclusion: Machine learning systems coupled with radiomics show promising potential in distinguishing benign from malignant tumours in PI-RADS 3/5 areas.
Keywords: machine learning, MR prostate, PI-Rads 3, radiomics
Procedia PDF Downloads 188
281 Sea Level Characteristics Referenced to Specific Geodetic Datum in Alexandria, Egypt
Authors: Ahmed M. Khedr, Saad M. Abdelrahman, Kareem M. Tonbol
Abstract:
Two geo-referenced sea level datasets (September 2008 – November 2010 and April 2012 – January 2014) were recorded at Alexandria Western Harbour (AWH). An accurate re-definition of the tidal datum, referred to the latest International Terrestrial Reference Frame (ITRF-2014), was discussed and updated to improve our understanding of the old predefined tidal datum at Alexandria. Tidal and non-tidal components of sea level were separated using the Delft-3D hydrodynamic model tide suite (Delft-3D, 2015). Tidal characteristics at AWH were investigated, and harmonic analysis showed the 34 most significant constituents with their amplitudes and phases. The tide was identified as semi-diurnal, as indicated by a “Form Factor” of 0.24 and 0.25 for the two datasets, respectively. Principal tidal datums related to major tidal phenomena were recalculated with reference to a meaningful geodetic height datum. The portion of residual energy (surge) out of the total sea level energy was computed for each dataset and found to be 77% and 72%, respectively. Power spectral density (PSD) showed accurate resolvability in the high band (1–6 cycles/day) for the nominated independent constituents, except for some neighbouring constituents that are too close in frequency. Wind and atmospheric pressure data for the sea level recording periods were analysed and cross-correlated with the surge signals. A moderate association between surge and the wind and atmospheric pressure data was obtained. In addition, the long-term sea level rise trend at AWH was computed and showed good agreement with earlier estimated rates.
Keywords: Alexandria, Delft-3D, Egypt, geodetic reference, harmonic analysis, sea level
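For readers unfamiliar with the form factor, it is conventionally the ratio of the main diurnal to the main semi-diurnal constituent amplitudes, F = (K1 + O1) / (M2 + S2). The sketch below is illustrative only: the amplitudes are made-up values, not data from the study, and the class boundaries are the commonly cited ones. Texts differ on whether F = 0.25 itself counts as semi-diurnal; this sketch treats the boundary as inclusive, which is an assumption chosen to match the classification in the abstract.

```python
def form_factor(k1, o1, m2, s2):
    """Ratio of main diurnal (K1, O1) to main semi-diurnal (M2, S2) amplitudes."""
    return (k1 + o1) / (m2 + s2)


def classify_tide(f):
    """Map a form factor to the usual four tidal regimes (boundaries assumed inclusive)."""
    if f <= 0.25:
        return "semi-diurnal"
    if f <= 1.5:
        return "mixed, mainly semi-diurnal"
    if f <= 3.0:
        return "mixed, mainly diurnal"
    return "diurnal"


# Hypothetical amplitudes in centimetres, chosen only to give F = 0.24.
f = form_factor(k1=2.0, o1=1.0, m2=10.0, s2=2.5)
print(round(f, 2), classify_tide(f))
```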
Procedia PDF Downloads 165
280 A Multi-Objective Decision Making Model for Biodiversity Conservation and Planning: Exploring the Concept of Interdependency
Authors: M. Mohan, J. P. Roise, G. P. Catts
Abstract:
Despite living in an era where conservation zones are the de facto central element in any sustainable wildlife management strategy, we still find ourselves grappling with several Pareto-optimal situations regarding resource allocation and area distribution for these zones. In this paper, a multi-objective decision making (MODM) model is presented to answer the question of whether or not we can establish mutual relationships between these contradicting objectives. For our study, we considered a red-cockaded woodpecker (Picoides borealis) habitat conservation scenario in the coastal plain of North Carolina, USA. The red-cockaded woodpecker (RCW) is a non-migratory territorial bird that excavates cavities in living pine trees for roosting and nesting. RCW groups nest in an aggregation of cavity trees called a ‘cluster’, and in our model we use the number of clusters to be established as a measure of the size of the conservation zone required. The case study is formulated as a linear programming problem, and the objective function optimises the RCW clusters, carbon retention rate, biofuel, public safety and Net Present Value (NPV) of the forest. We studied the variation of individual objectives with respect to the amount of area available and plotted a two-dimensional dynamic graph after establishing interrelations between the objectives. We further explore the concept of interdependency by integrating the MODM model with GIS, and derive a raster file representing carbon distribution from the existing forest dataset. Model results demonstrate the applicability of interdependency from both linear and spatial perspectives, and suggest that this approach holds immense potential for enhancing environmental investment decision making in the future.
Keywords: conservation, interdependency, multi-objective decision making, red-cockaded woodpecker
Procedia PDF Downloads 337
279 Internal Auditing and the Performance of State-Owned Enterprises in Emerging Markets
Authors: Jobo Dubihlela, Kofi Boamah
Abstract:
The inimitable role of internal auditing, and the challenges and predicament of state-owned enterprises (SOEs) in emerging markets, are acknowledged. The study sought to address two inter-related questions: how does the internal auditing function (IAF) complement the performance and sustainability of SOEs, and how can effective internal audit (IA) control systems be implemented to improve the performance results and culture of SOEs in Namibia? The weaknesses inherent in the SOE sector unfortunately affect the IAF’s ability to support SOEs effectively. Despite these challenges, the study has unearthed the IAF’s potential to contribute to SOE survival in Namibia by complementing the governance practices of the sector. Using a quantitative research approach, a dataset was collected from SOEs and analysed to confirm the role of the IAF as an indispensable concomitant of SOE performance. The study adopted a data-based approach supported by evidence from the literature, which enabled generalisation and connectedness of the issues being addressed. The outcome of the data analysis contributed to achieving the results, which are discussed and support the conclusions reached. Results show that the intractable task of internal auditing depends on the leadership of the board of directors of the SOEs. The study also revealed critical priorities needed to influence policymakers and oversight bodies to overcome the iniquities influencing SOE operations, and to understand and embrace the IAF in order to salvage a sector that has a lot to offer and yet is severely mismanaged. The results support the literature on IA’s contribution to SOE development from a developing country’s point of view, and the study is the first of its kind in Namibia. The findings suggest ways to enhance the knowledge development of future researchers and ‘whet their appetite’ for further research in emerging markets and on a global scale.
Keywords: internal auditing activity, state-owned enterprises, emerging markets, auditing function
Procedia PDF Downloads 103
278 A Novel Heuristic for Analysis of Large Datasets by Selecting Wrapper-Based Features
Authors: Bushra Zafar, Usman Qamar
Abstract:
Large sample sizes and high dimensionality undermine the effectiveness of conventional data mining methodologies. Data mining techniques are important tools for extracting useful knowledge from a variety of databases; they provide supervised learning in the form of classification to build models that describe important data classes, where the structure of the classifier is based on the class attribute. Classification efficiency and accuracy are often influenced to a great extent by noisy and undesirable features in real application datasets. The inherent nature of a dataset greatly masks its quality analysis and leaves us with quite few practical approaches to use. To our knowledge for the first time, we present a new approach for investigating the structure and quality of datasets by providing a targeted analysis that localizes their noisy and irrelevant features. Machine learning relies heavily on feature selection as a pre-processing step, which allows us to select a small subset of the available features by reducing the feature space according to a certain evaluation criterion. The primary objective of this study is to trim down the scope of the given data sample by searching for a small set of important features that may result in good classification performance. For this purpose, a heuristic for wrapper-based feature selection using a genetic algorithm is used, together with an external classifier for discriminative feature selection. A feature is selected based on its number of occurrences in the chosen chromosomes. A sample dataset has been used to demonstrate the proposed idea effectively. With the proposed method, the improved average accuracy across different datasets is about 95%. Experimental results illustrate that the proposed algorithm increases the accuracy of prediction of different diseases.
Keywords: data mining, genetic algorithm, KNN algorithms, wrapper based feature selection
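The occurrence-based selection step mentioned in the abstract can be pictured with a small sketch. It assumes each chosen chromosome is a 0/1 mask over the features, and the `min_count` threshold is a hypothetical parameter introduced here for illustration; the abstract does not specify how the authors encode chromosomes or set the cut-off.

```python
def select_by_occurrence(chromosomes, min_count):
    """Count how often each feature is switched on across the chosen
    chromosomes and keep the features that reach min_count."""
    n_features = len(chromosomes[0])
    counts = [sum(chrom[i] for chrom in chromosomes) for i in range(n_features)]
    selected = [i for i, count in enumerate(counts) if count >= min_count]
    return counts, selected


# Three hypothetical chromosomes over four features (1 = feature included).
chosen = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
]
counts, selected = select_by_occurrence(chosen, min_count=2)
print(counts, selected)
```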
Procedia PDF Downloads 316
277 A Machine Learning Approach for Detecting and Locating Hardware Trojans
Authors: Kaiwen Zheng, Wanting Zhou, Nan Tang, Lei Li, Yuanhang He
Abstract:
The integrated circuit industry has become a cornerstone of the information society, finding widespread application in areas such as industry, communication, medicine, and aerospace. However, with the increasing complexity of integrated circuits, Hardware Trojans (HTs) implanted by attackers have become a significant threat to their security. In this paper, we propose a hardware trojan detection method for large-scale circuits. Because HTs, as additional redundant circuits, introduce changes in physical characteristics such as structure, area, and power consumption, we propose a machine-learning-based hardware trojan detection method based on the physical characteristics of gate-level netlists. This method transforms the hardware trojan detection problem into a machine-learning binary classification problem based on physical characteristics, greatly improving detection speed. To address the problem of imbalanced data, where the number of pure circuit samples is far less than that of HT circuit samples, we used the SMOTETomek algorithm to expand the dataset and further improve the performance of the classifier. We used three machine learning algorithms, K-Nearest Neighbors, Random Forest, and Support Vector Machine, to train and validate on benchmark circuits from Trust-Hub, and all achieved good results. In our case studies based on AES encryption circuits provided by Trust-Hub, the test results showed the effectiveness of the proposed method. To further validate the method’s effectiveness for detecting variant HTs, we designed variant HTs using open-source HTs. The proposed method can guarantee robust detection accuracy with millisecond-level detection time for IC and FPGA design flows, and has good detection performance for library variant HTs.
Keywords: hardware trojans, physical properties, machine learning, hardware security
Procedia PDF Downloads 147
276 Multivariate Analysis on Water Quality Attributes Using Master-Slave Neural Network Model
Authors: A. Clementking, C. Jothi Venkateswaran
Abstract:
Mathematical and computational functionalities such as descriptive mining, optimization, and prediction are used to address natural resource planning. Water quality prediction and the determination of attribute influence have adopted optimization techniques. Water properties are tainted when one water resource is merged with another. This work aims to predict the factors influencing water resource distribution connectivity with respect to water quality and sediment, using a proposed master-slave back-propagation neural network model. The experimental results are obtained by collecting water quality attributes, computing a water quality index, designing and developing a neural network model to determine water quality and sediment, applying the master-slave back-propagation neural network model to determine variations in water quality and sediment attributes between the water resources, and making recommendations for connectivity. Homogeneous and parallel biochemical reactions influence water quality and sediment when water is distributed from one location to another. Therefore, an innovative master-slave neural network model [M(9:9:2)::S(9:9:2)] was designed and developed to predict the attribute variations. The training dataset is given as input to the master model, and its maximum weights are assigned as input to the slave model to predict the water quality. The developed master-slave model predicted physicochemical attribute weight variations for target water quality values of 85% to 90%. Sediment level variations were also predicted, from 0.01% to 0.05% for each water quality percentage. The model produced significant variations in the physicochemical attribute weights. Based on the predicted weight variations on the training dataset, effective recommendations are made for connecting different resources.
Keywords: master-slave back propagation neural network model (MSBPNNM), water quality analysis, multivariate analysis, environmental mining
Procedia PDF Downloads 477
275 Feature Evaluation Based on Random Subspace and Multiple-K Ensemble
Authors: Jaehong Yu, Seoung Bum Kim
Abstract:
Clustering analysis can facilitate the extraction of intrinsic patterns in a dataset and reveal its natural groupings without requiring class information. For effective clustering analysis in high dimensional datasets, unsupervised dimensionality reduction is an important task. Unsupervised dimensionality reduction can generally be achieved by feature extraction or feature selection. In many situations, feature selection methods are more appropriate than feature extraction methods because of their clear interpretation with respect to the original features. Unsupervised feature selection can be categorized into feature subset selection and feature ranking methods, and we focus on unsupervised feature ranking methods, which evaluate the features based on their importance scores. Recently, several unsupervised feature ranking methods have been developed based on ensemble approaches to achieve higher accuracy and stability. However, most of the ensemble-based feature ranking methods require the true number of clusters. Furthermore, these algorithms evaluate the feature importance depending on the ensemble clustering solution, and they produce undesirable evaluation results if the clustering solutions are inaccurate. To address these limitations, we propose an ensemble-based feature ranking method with random subspace and multiple-k ensemble (FRRM). The proposed FRRM algorithm evaluates the importance of each feature with the random subspace ensemble, and all evaluation results are combined into the ensemble importance scores. Moreover, through the use of the multiple-k ensemble idea, FRRM does not require the true number of clusters to be determined in advance. Experiments on various benchmark datasets were conducted to examine the properties of the proposed FRRM algorithm and to compare its performance with that of existing feature ranking methods. The experimental results demonstrated that the proposed FRRM outperformed the competitors.
Keywords: clustering analysis, multiple-k ensemble, random subspace-based feature evaluation, unsupervised feature ranking
Procedia PDF Downloads 339
274 Hyper Parameter Optimization of Deep Convolutional Neural Networks for Pavement Distress Classification
Authors: Oumaima Khlifati, Khadija Baba
Abstract:
Pavement distress is the main factor responsible for the deterioration of road structure durability, vehicle damage, and reduced driver comfort. Transportation agencies spend a high proportion of their funds on pavement monitoring and maintenance. The auscultation of pavement distress has been based on manual surveys, which are extremely time consuming, labor intensive, and require domain expertise. Therefore, automatic distress detection is needed to reduce the cost of manual inspection and avoid more serious damage by implementing the appropriate remediation actions at the right time. Inspired by recent deep learning applications, this paper proposes an algorithm for automatic road distress detection and classification based on the Deep Convolutional Neural Network (DCNN). In this study, the types of pavement distress are classified as transverse or longitudinal cracking, alligator, pothole, and intact pavement. The dataset used in this work is composed of public asphalt pavement images. In order to learn the structure of the different types of distress, the DCNN models are trained and tested as a multi-label classification task. In addition, to get the highest accuracy for our model, we adjust the structural hyper parameters, such as the number of convolution and max pooling layers, the number and size of filters, loss functions, activation functions, and optimizer, as well as the fine-tuning hyper parameters, which include batch size and learning rate. The optimization of the model is executed by checking all feasible combinations and selecting the best performing one. After the model is optimized, performance metrics are calculated, namely training and validation accuracy, precision, recall, and F1 score.
Keywords: distress pavement, hyperparameters, automatic classification, deep learning
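Checking all feasible combinations and keeping the best one is an exhaustive grid search. The sketch below shows the idea in isolation; the search space values and the `toy_score` function are hypothetical stand-ins (in practice the scoring step would train a DCNN and return its validation accuracy), and none of the names come from the paper.

```python
from itertools import product


def grid_search(space, evaluate):
    """Try every combination in `space` and return the best-scoring one."""
    names = list(space)
    best_config, best_score = None, float("-inf")
    for values in product(*(space[name] for name in names)):
        config = dict(zip(names, values))
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


# Hypothetical search space; real studies would use their own candidate values.
space = {
    "batch_size": [16, 32, 64],
    "learning_rate": [0.01, 0.001, 0.0001],
}


def toy_score(config):
    """Stand-in for 'train the model and return validation accuracy'."""
    return -abs(config["learning_rate"] - 0.001) - abs(config["batch_size"] - 32) / 1000


best_config, best_score = grid_search(space, toy_score)
print(best_config, best_score)
```

The number of evaluations is the product of the list lengths (nine here), which is why exhaustive search becomes expensive as more hyper parameters are added.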
Procedia PDF Downloads 93
273 Alternating Expectation-Maximization Algorithm for a Bilinear Model in Isoform Quantification from RNA-Seq Data
Authors: Wenjiang Deng, Tian Mou, Yudi Pawitan, Trung Nghia Vu
Abstract:
Estimation of isoform-level gene expression from RNA-seq data depends on simplifying assumptions, such as uniform read distribution, that are easily violated in real data. Such violations typically lead to biased estimates. Most existing methods provide one or more bias correction steps, which are based on biological considerations, such as GC content, and applied to single samples separately. The main problem is that not all biases are known. For example, new technologies such as single-cell RNA-seq (scRNA-seq) may introduce new sources of bias not seen in bulk-cell data. This study introduces a method called XAEM based on a more flexible and robust statistical model. Existing methods are essentially based on a linear model Xβ, where the design matrix X is known and derived from the simplifying assumptions. In contrast, XAEM considers Xβ as a bilinear model with both X and β unknown. Joint estimation of X and β is made possible by simultaneous analysis of multi-sample RNA-seq data. Compared to existing methods, XAEM automatically performs empirical correction of potentially unknown biases. XAEM implements an alternating expectation-maximization (AEM) algorithm, alternating between estimation of X and β. For speed, XAEM utilizes quasi-mapping for read alignment, thus leading to a fast algorithm. Overall, XAEM performs favorably compared to other recent advanced methods. For simulated datasets, XAEM obtains higher accuracy for multiple-isoform genes, particularly for paralogs. In a differential-expression analysis of a real scRNA-seq dataset, XAEM achieves substantially greater rediscovery rates in an independent validation set.
Keywords: alternating EM algorithm, bias correction, bilinear model, gene expression, RNA-seq
Procedia PDF Downloads 142
272 A Multi-Output Network with U-Net Enhanced Class Activation Map and Robust Classification Performance for Medical Imaging Analysis
Authors: Jaiden Xuan Schraut, Leon Liu, Yiqiao Yin
Abstract:
Computer vision in medical diagnosis has achieved a high level of success in diagnosing diseases with high accuracy. However, conventional classifiers that produce an image-to-label result provide insufficient information for medical professionals to judge, and raise concerns over the trust and reliability of a model whose results cannot be explained. In order to gain local insight into cancerous regions, separate tasks such as image segmentation need to be implemented to aid doctors in treating patients, which doubles the training time and costs, rendering the diagnosis system inefficient and difficult for the public to accept. To tackle this issue and drive AI-first medical solutions further, this paper proposes a multi-output network that follows a U-Net architecture for image segmentation output and features an additional convolutional neural network (CNN) module for auxiliary classification output. Class activation maps are a method of providing insight into the feature maps of a convolutional neural network that lead to its classification, but in the case of lung diseases, the region of interest is enhanced by U-Net-assisted Class Activation Map (CAM) visualization. Therefore, our proposed model combines image segmentation models and classifiers to crop out only the lung region of a chest X-ray’s class activation map, providing a visualization that improves explainability while generating classification results simultaneously, which builds trust in AI-led diagnosis systems. The proposed U-Net model achieves 97.61% accuracy and a dice coefficient of 0.97 on testing data from the COVID-QU-Ex Dataset, which includes both diseased and healthy lungs.
Keywords: multi-output network model, U-net, class activation map, image classification, medical imaging analysis
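The dice coefficient reported above measures the overlap between a predicted and a reference segmentation mask: twice the number of shared foreground pixels divided by the total foreground pixels in both masks. The sketch below is a minimal illustration on tiny made-up binary masks, not the authors' evaluation code; returning 1.0 when both masks are empty is a convention assumed here.

```python
def dice_coefficient(predicted, reference):
    """Dice overlap between two binary masks given as lists of rows."""
    pred = [bool(v) for row in predicted for v in row]
    ref = [bool(v) for row in reference for v in row]
    if len(pred) != len(ref):
        raise ValueError("masks must have the same number of pixels")
    overlap = sum(p and r for p, r in zip(pred, ref))
    total = sum(pred) + sum(ref)
    # Two empty masks agree perfectly (assumed convention).
    return 1.0 if total == 0 else 2.0 * overlap / total


# Hypothetical 2x3 masks: 1 marks a lung pixel, 0 marks background.
predicted_mask = [[0, 1, 1],
                  [0, 1, 0]]
reference_mask = [[0, 1, 1],
                  [0, 0, 0]]
score = dice_coefficient(predicted_mask, reference_mask)
print(score)
```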
Procedia PDF Downloads 203
271 Recent Climate Variability and Crop Production in the Central Highlands of Ethiopia
Authors: Arragaw Alemayehu, Woldeamlak Bewket
Abstract:
The aim of this study was to understand the influence of current climate variability on crop production in the central highlands of Ethiopia. We used monthly rainfall and temperature data from 132 points, each representing a pixel of 10×10 km. The data are reconstructions based on station records and meteorological satellite observations. Production data for the five major crops in the area were collected from the Central Statistical Agency for the period 2004-2013 and for the main cropping season, locally known as Meher. The production data are at the Enumeration Area (EA) level and hence are the best available dataset on crop production. The results show statistically significant decreasing trends in March–May (Belg) rainfall in the area. However, June–September (Kiremt) rainfall showed increasing trends in Efratana Gidim and Menz Gera Meder, of which the latter is statistically significant. Annual rainfall also showed positive trends in the area, except in Basona Werana, where significant negative trends were observed. On the other hand, maximum and minimum temperatures showed warming trends in the study area. Correlation results have shown that crop production and area of cultivation are positively correlated with rainfall and negatively correlated with temperature. When the trends in crop production were investigated, most crops showed negative trends, and below-average production was observed. Regression results have shown that rainfall was the most important determinant of crop production in the area. It is concluded that current climate variability has a significant influence on crop production in the area, and any unfavorable change in the local climate in the future will have serious implications for household-level food security. Efforts to adapt to the ongoing climate change should begin by tackling the current climate variability and take a climate risk management approach.
Keywords: central highlands, climate variability, crop production, Ethiopia, regression, trend
Procedia PDF Downloads 438
270 The Nexus Between the Rise of Autocratisation and the Deeper Level of BRI Engagement
Authors: Dishari Rakshit, Mitchell Gallagher
Abstract:
The global landscape is witnessing a disconcerting surge in democratic backsliding, engendering concerns over the rise of autocratisation. This research demonstrates the intricate relationship between a nation's domestic propensity for autocratic governance and its trade relations with China. Giving prominence to Belt and Road Initiative (BRI) investments, this study adopts a rigorous neorealist framework to discern the complexities of nations' economic interests amidst an anarchic milieu and how these interests may transcend steadfast adherence to democratic principles. The burgeoning bipolarity in the international political setting serves as a backdrop to our inquiry. To operationalise our hypothesis, we conduct a large-N study, encompassing a comprehensive global dataset comprising countries' democracy indicators, total trade volume with China, and cumulative Chinese BRI investments over a substantial temporal expanse. By meticulously examining BRI signatories, we aim to ascertain the potential accentuation of democratic backsliding among these nations. To test our empirical underpinning, we will validate our findings through cogent case studies. Our analysis adds to the scholarship on the multifaceted interactions between trade dynamics and democratic governance within the fabric of the international political landscape. In its culmination, the paper addresses the question: has the erstwhile grandeur of bipolarity resurfaced in the contemporary global panorama? Concurrently, we explore whether the ascendant wave of autocratisation is a by-product of the Beijing Consensus. Pertinent to policymakers, our discoveries stand poised to furnish a comprehensive grasp of the manifold implications arising from the deepening entanglements with China under the auspices of the BRI.
Keywords: democracy, autocracy, China, Belt and Road Initiative, international political economy
Procedia PDF Downloads 71
269 Web Data Scraping Technology Using Term Frequency Inverse Document Frequency to Enhance the Big Data Quality on Sentiment Analysis
Authors: Sangita Pokhrel, Nalinda Somasiri, Rebecca Jeyavadhanam, Swathi Ganesan
Abstract:
Tourism is a booming industry with huge future potential for global wealth and employment. Countless data are generated on social media sites every day, creating numerous opportunities to bring more insights to decision-makers. The integration of Big Data technology into the tourism industry will allow companies to determine where their customers have been and what they like. This information can then be used by businesses, such as those in charge of managing visitor centers or hotels, and tourists can get a clear idea of places before visiting. From a technical perspective, natural language is processed by analysing the sentiment features of online reviews from tourists, and we then supply an enhanced long short-term memory (LSTM) framework for sentiment feature extraction of travel reviews. We constructed a web review database using a crawler and web scraping technique for experimental validation to evaluate the effectiveness of our methodology. The sentences were first classified using the VADER and RoBERTa models to obtain the polarity of the reviews. In this paper, we studied feature extraction methods, such as Count Vectorization and TF-IDF Vectorization, and implemented a Convolutional Neural Network (CNN) classifier for sentiment analysis to decide whether the tourist’s attitude towards a destination is positive, negative, or simply neutral, based on the review text posted online. The results demonstrated that the CNN algorithm, after pre-processing and cleaning of the dataset, achieved an accuracy of 96.12% for positive and negative sentiment analysis.
Keywords: count vectorization, convolutional neural network, crawler, data technology, long short-term memory, web scraping, sentiment analysis
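As a reminder of what TF-IDF vectorization computes, the sketch below weights each term by its frequency within a review multiplied by the log of its inverse document frequency across all reviews. It uses the plain textbook formula and whitespace tokenization, both assumptions made for illustration; library implementations often apply smoothed or normalised variants, and the abstract does not say which one the authors used. The two reviews are invented examples.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Two made-up travel reviews.
reviews = ["great beach great food", "terrible food"]
weights = tf_idf(reviews)
print(weights[0])
```

A term that appears in every review, such as "food" here, receives a weight of zero, which is the intended effect: it carries no information for telling the reviews apart.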
Procedia PDF Downloads 88
268 Classifying Affective States in Virtual Reality Environments Using Physiological Signals
Authors: Apostolos Kalatzis, Ashish Teotia, Vishnunarayan Girishan Prabhu, Laura Stanley
Abstract:
Emotions are functional behaviors influenced by thoughts, stimuli, and other factors that induce neurophysiological changes in the human body. Understanding and classifying emotions is challenging, as individuals have varying perceptions of their environments. Therefore, it is crucial that there are publicly available databases and virtual reality (VR) based environments that have been scientifically validated for assessing emotional classification. This study utilized two commercially available VR applications (Guided Meditation VR™ and Richie’s Plank Experience™) to induce acute stress and calm states among participants. Subjective and objective measures were collected to create a validated multimodal dataset and classification scheme for affective state classification. Participants’ subjective measures included the Self-Assessment Manikin, emotional cards and a 9-point Visual Analogue Scale for perceived stress, collected using a Virtual Reality Assessment Tool developed by our team. Participants’ objective measures included electrocardiogram and respiration data that were collected from 25 participants (15 M, 10 F, Mean = 22.28 ± 4.92). The features extracted from these data included heart rate variability components and respiration rate, both of which were used to train two machine learning models. Subjective responses validated the efficacy of the VR applications in eliciting the two desired affective states; for classifying the affective states, a logistic regression (LR) and a support vector machine (SVM) with a linear kernel were developed. The LR outperformed the SVM and achieved 93.8%, 96.2%, and 93.8% leave-one-subject-out cross-validation accuracy, precision and recall, respectively. The VR assessment tool and data collected in this study are publicly available for other researchers.
Keywords: affective computing, biosignals, machine learning, stress database
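Leave-one-subject-out cross-validation holds out all samples from one participant per fold, so the model is always tested on a person it has not seen. The sketch below shows only the splitting logic on invented subject labels; it is not the authors' pipeline. For real experiments, scikit-learn's `LeaveOneGroupOut` provides the same kind of split.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# Hypothetical labels: which participant each of five samples came from.
sample_subjects = ["s1", "s1", "s2", "s3", "s3"]
folds = list(leave_one_subject_out(sample_subjects))
for held_out, train, test in folds:
    print(held_out, train, test)
```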
Procedia PDF Downloads 142