Search results for: data envelopment analysis (DEA)

41320 Self-Organizing Maps for Credit Card Fraud Detection

Authors: ChunYi Peng, Wei Hsuan Cheng, Shyh Kuang Ueng

Abstract:

This study focuses on the application of self-organizing maps (SOM) technology in analyzing credit card transaction data, aiming to enhance the accuracy and efficiency of fraud detection. SOM, as an artificial neural network, is particularly suited for pattern recognition and data classification, making it highly effective for the complex and variable nature of credit card transaction data. By analyzing transaction characteristics with SOM, the research identifies abnormal transaction patterns that could indicate potentially fraudulent activities. Moreover, this study has developed a specialized visualization tool to intuitively present the relationships between SOM analysis outcomes and transaction data, aiding financial institution personnel in quickly identifying and responding to potential fraud, thereby reducing financial losses. Additionally, the research explores the integration of SOM technology with composite intelligent system technologies (including finite state machines, fuzzy logic, and decision trees) to further improve fraud detection accuracy. This multimodal approach provides a comprehensive perspective for identifying and understanding various types of fraud within credit card transactions. In summary, by integrating SOM technology with visualization tools and composite intelligent system technologies, this research offers a more effective method of fraud detection for the financial industry, not only enhancing detection accuracy but also deepening the overall understanding of fraudulent activities.

Keywords: self-organizing map technology, fraud detection, information visualization, data analysis, composite intelligent system technologies, decision support technologies

Procedia PDF Downloads 60

41319 Energy and Economic Analysis of Heat Recovery from Boiler Exhaust Flue Gas

Authors: Kemal Comakli, Meryem Terhan

Abstract:

In this study, the potential for heat recovery from waste flue gas was examined in the 60 MW district heating system of a university, with the aim of saving fuel by reusing the recovered heat as a source in the system. Various scenarios for making use of the waste heat were considered. For this purpose, actual operating data of the system were collected. In addition, heat recovery units consisting of heat exchangers such as flue gas condensers, economizers, or air pre-heaters were designed theoretically for each scenario. An energy analysis of the natural gas-fired boiler’s exhaust flue gas in the system and an economic analysis of the heat recovery units were carried out to predict payback periods. According to the calculation results, the waste heat loss ratio from boiler flue gas in the system averaged 16%. With the heat recovery units, the thermal efficiency of the system can be increased and fuel can be saved. At the same time, installing the heat recovery units can substantially reduce greenhouse gas emissions.
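The two quantities reported in the abstract above, a flue gas heat loss ratio and a payback period, can be illustrated with a minimal Python sketch. The abstract does not say how the payback periods were computed, so the simple (undiscounted) payback formula is an assumption here, as are the function names and all numbers, which are invented for illustration.

```python
def flue_gas_loss_ratio(flue_gas_heat_loss, fuel_heat_input):
    """Share of the fuel heat input that leaves with the flue gas (same units for both)."""
    return flue_gas_heat_loss / fuel_heat_input


def simple_payback_years(investment, annual_saving):
    """Undiscounted payback: years until cumulative savings equal the investment."""
    if annual_saving <= 0:
        raise ValueError("annual_saving must be positive")
    return investment / annual_saving


# Hypothetical figures, not taken from the study.
loss_ratio = flue_gas_loss_ratio(9.6, 60.0)          # e.g. 9.6 MW lost of 60 MW input
payback = simple_payback_years(120000.0, 40000.0)    # cost and yearly fuel saving
print(f"loss ratio: {loss_ratio:.0%}, payback: {payback:.1f} years")
```

A discounted payback or net present value calculation would need an interest rate and a fuel price forecast, which the abstract does not provide.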

Keywords: heat recovery from flue gas, energy analysis of flue gas, economical analysis, payback period

Procedia PDF Downloads 288

41318 Sustainability in Hospitality: An Inevitable Necessity in New Age with Big Environmental Challenges

Authors: Majid Alizadeh, Sina Nematizadeh, Hassan Esmailpour

Abstract:

Hospitality and the environment undeniably affect each other, and the tourism industry has major harmful effects on the environment. Hotels, as one of the most important pillars of the hospitality industry, have significant effects on the environment. Green marketing is a promising strategy in response to the growing concerns about the environment. A green hotel marketing model was proposed using a grounded theory approach in the hotel industry. The study was carried out as a mixed-method study. Data gathering in the qualitative phase was done through a literature review and in-depth, semi-structured interviews with 10 experts in green marketing, selected using the snowball technique. Following primary analysis, open, axial, and selective coding was done on the data, which yielded 69 concepts, 18 categories, and six dimensions. Green hotel (green product) was adopted as the core phenomenon. In the quantitative phase, data were gathered using 384 questionnaires filled out by hotel guests, and descriptive statistics and structural equation modeling (SEM) were used for data analysis. The results indicated that the mediating role of behavioral response between ecological literacy, trust, marketing mix, and performance was significant. The green marketing mix, as a strategy, had a significant and positive effect on guests’ behavioral response, corporate green image, and the financial and environmental performance of hotels.

Keywords: green marketing, sustainable development, hospitality, grounded theory, structural equations model

Procedia PDF Downloads 82

41317 Trend Analysis of Annual Total Precipitation Data in Konya

Authors: Naci Büyükkaracığan

Abstract:

Hydroclimatic observation values are used in the planning of water resources projects. Climate variables are the first of the values used in planning such projects. At the same time, the climate system is a complex and interactive system involving the atmosphere, land surfaces, snow and ice, the oceans, and other bodies of water. The amount and distribution of precipitation, which is an important climate parameter, is a limiting environmental factor for the distribution of living things. Trend analysis is applied to detect the presence of a pattern or trend in a data set. Many trend studies in different parts of the world have been carried out to identify climate change. The detection and attribution of past trends and variability in climatic variables is essential for explaining potential future alteration resulting from anthropogenic activities. Parametric and non-parametric tests are used for determining the trends in climatic variables. In this study, trend tests were applied to annual total precipitation data for the period 1972 to 2012 in the Konya Basin. Non-parametric trend tests (Sen’s T, Spearman’s Rho, Mann-Kendall, Wald-Wolfowitz) and a parametric test (mean square) were applied to the annual total precipitation of 15 stations. The linear slopes (change per unit time) of the trends were calculated using a non-parametric estimator developed by Sen. The beginning of each trend was determined using the Mann-Kendall rank correlation test. In addition, the homogeneity of precipitation trends was tested using a method developed by Van Belle and Hughes. The tests found negative linear slopes in annual total precipitation in Konya.
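To make two of the methods named in the abstract above concrete, the following Python sketch computes the Mann-Kendall S and Z statistics and Sen's slope estimator for a short series. It is a simplified illustration, not the authors' implementation: the tie correction to the variance is omitted, and the years and precipitation totals are invented.

```python
import math


def mann_kendall(values):
    """Return (S, Z) for the Mann-Kendall trend test, without tie correction."""
    n = len(values)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)   # sign of each later-minus-earlier difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z


def sen_slope(times, values):
    """Sen's estimator: the median of the slopes between all pairs of points."""
    slopes = sorted(
        (values[j] - values[i]) / (times[j] - times[i])
        for i in range(len(values) - 1)
        for j in range(i + 1, len(values))
    )
    mid = len(slopes) // 2
    if len(slopes) % 2:
        return slopes[mid]
    return 0.5 * (slopes[mid - 1] + slopes[mid])


# Invented annual totals in mm, for illustration only.
years = [2000, 2001, 2002, 2003, 2004, 2005]
precip = [340.0, 332.0, 335.0, 320.0, 318.0, 305.0]

s, z = mann_kendall(precip)
slope = sen_slope(years, precip)
print(f"S = {s}, Z = {z:.2f}, Sen slope = {slope} mm/year")
```

A negative Z together with a negative slope indicates a decreasing trend; whether it is significant depends on the chosen significance level.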

Keywords: trend analysis, precipitation, hydroclimatology, Konya

Procedia PDF Downloads 220

41316 Relative Entropy Used to Determine the Divergence of Cells in Single Cell RNA Sequence Data Analysis

Authors: An Chengrui, Yin Zi, Wu Bingbing, Ma Yuanzhu, Jin Kaixiu, Chen Xiao, Ouyang Hongwei

Abstract:

Single-cell RNA sequencing (scRNA-seq) is an effective tool for studying the transcriptomics of biological processes. Currently, similarity between cells is usually measured with Euclidean distance or its derivatives. However, the scRNA-seq process is a multivariate Bernoulli event model, so we hypothesize that valuing the divergence between cells with relative entropy would be more efficient than using Euclidean distance. In this study, we compared the performance of Euclidean distance, Spearman correlation distance, and relative entropy using scRNA-seq data from the early, medial, and late stages of limb development generated in our lab. Relative entropy performed better than the other methods according to the cluster potential test. Furthermore, we developed KL-SNE, an algorithm that modifies t-SNE by changing its definition of divergence between cells from Euclidean distance to Kullback–Leibler divergence. Results showed that KL-SNE was more effective than t-SNE at dissecting cell heterogeneity, indicating that relative entropy performs better than Euclidean distance. Specifically, chondrocytes expressing Comp were clustered together by KL-SNE but not by t-SNE. Surprisingly, early-stage cells were surrounded by medial-stage cells in the KL-SNE result, whereas medial-stage cells neighbored late-stage cells in the t-SNE result. This result parallels the heatmap, which showed that medial-stage cells were more heterogeneous than cells in the other stages. In addition, we found that KL-SNE results tend to follow a Gaussian distribution more closely than those of t-SNE, which could also be verified by analyzing scRNA-seq data from another study on human embryo development. KL-SNE is therefore also an effective way to convert a non-Gaussian distribution to a Gaussian distribution and to facilitate subsequent statistical processing. Thus, relative entropy is potentially a better way to determine the divergence of cells in scRNA-seq data analysis.
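The contrast between Euclidean distance and relative entropy drawn in the abstract above can be sketched in a few lines of Python. The abstract does not describe how expression profiles are turned into probability distributions, so the pseudocount normalization below is an assumption, and the three four-gene "cells" are invented.

```python
import math


def to_distribution(counts, pseudocount=1.0):
    """Normalize counts to probabilities; the pseudocount avoids log(0). Assumed scheme."""
    total = sum(counts) + pseudocount * len(counts)
    return [(c + pseudocount) / total for c in counts]


def kl_divergence(p, q):
    """Relative entropy D(p || q) in nats; note that it is not symmetric."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Invented read counts for four genes in three cells.
cell_a = [10, 0, 5, 85]
cell_b = [20, 0, 10, 170]   # same composition as cell_a at twice the depth
cell_c = [85, 5, 0, 10]     # different composition

p_a, p_b, p_c = (to_distribution(c) for c in (cell_a, cell_b, cell_c))
kl_ab = kl_divergence(p_a, p_b)
kl_ac = kl_divergence(p_a, p_c)

print("Euclidean a-b:", round(euclidean(cell_a, cell_b), 1))
print("Euclidean a-c:", round(euclidean(cell_a, cell_c), 1))
print("KL a||b:", round(kl_ab, 4), " KL a||c:", round(kl_ac, 4))
```

On raw counts, the Euclidean distance between `cell_a` and `cell_b` is large simply because of sequencing depth, while the relative entropy between their normalized profiles stays close to zero.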

Keywords: Single cell RNA sequence, Similarity measurement, Relative Entropy, KL-SNE, t-SNE

Procedia PDF Downloads 340

41315 Sensitivity Analysis during the Optimization Process Using Genetic Algorithms

Authors: M. A. Rubio, A. Urquia

Abstract:

Genetic algorithms (GA) are applied to the solution of high-dimensional optimization problems. Additionally, sensitivity analysis (SA) is usually carried out to determine the effect on optimal solutions of changes in parameter values of the objective function. These two analyses (i.e., optimization and sensitivity analysis) are computationally intensive when applied to high-dimensional functions. The approach presented in this paper consists of performing the SA during the GA execution by statistically analyzing the data obtained from running the GA. The advantage is that, in this case, SA does not involve additional evaluations of the objective function and, consequently, the proposed approach requires less computational effort than conducting optimization and SA in two consecutive steps.

Keywords: optimization, sensitivity, genetic algorithms, model calibration

Procedia PDF Downloads 437

41314 Hydrochemical Contamination Profiling and Spatial-Temporal Mapping with the Support of Multivariate and Cluster Statistical Analysis

Authors: Sofia Barbosa, Mariana Pinto, José António Almeida, Edgar Carvalho, Catarina Diamantino

Abstract:

The aim of this work was to test a methodology able to generate spatial-temporal maps that simultaneously synthesize the trends of distinct hydrochemical indicators in an old radium-uranium tailings dam deposit. Dimensionality reduction derived from principal component analysis and subsequent data aggregation derived from clustering analysis make it possible to identify distinct hydrochemical behavioural profiles and to generate synthetic evolutionary hydrochemical maps.

Keywords: Contamination plume migration, K-means of PCA scores, groundwater and mine water monitoring, spatial-temporal hydrochemical trends

Procedia PDF Downloads 236

41313 Analysis of Dynamics Underlying the Observation Time Series by Using a Singular Spectrum Approach

Authors: O. Delage, H. Bencherif, T. Portafaix, A. Bourdier

Abstract:

The main purpose of time series analysis is to learn about the dynamics behind time-ordered measurement data. Two approaches are used in the literature to gain a better knowledge of the dynamics contained in observation data sequences. The first concerns time series decomposition, an important analysis step that allows patterns and behaviors to be extracted as components providing insight into the mechanisms producing the time series. In many cases, time series are short, noisy, and non-stationary. To provide physically meaningful components for such series, methods such as Empirical Mode Decomposition (EMD), Empirical Wavelet Transform (EWT) or, more recently, Empirical Adaptive Wavelet Decomposition (EAWD) have been proposed. The second approach is to reconstruct the dynamics underlying the time series as a trajectory in state space by mapping the time series into a set of Rᵐ lag vectors using the method of delays (MOD). Takens proved that the trajectory obtained with the MOD technique is equivalent to the trajectory representing the dynamics behind the original time series. This work introduces singular spectrum decomposition (SSD), a new adaptive method for decomposing non-linear and non-stationary time series into narrow-banded components. The method originates from singular spectrum analysis (SSA), a nonparametric spectral estimation method used for the analysis and prediction of time series. Because the first step of SSD is to build a trajectory matrix by embedding a one-dimensional time series into a set of lagged vectors, SSD can also be seen as a reconstruction method like MOD. We first give a brief overview of the existing decomposition methods (EMD, EWT, EAWD). The SSD method is then described in detail and applied to experimental time series of observations from total column ozone measurements. The results are compared with those provided by the previously mentioned decomposition methods. We also compare the quality of the reconstructions of the observed dynamics obtained from the SSD and MOD methods.
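The embedding step shared by the method of delays and SSA-type methods, described in the abstract above, can be sketched as follows. The function name, the `lag` parameter, and the toy series are illustrative choices; the later SSD steps (singular value decomposition and component selection) are not shown.

```python
def trajectory_matrix(series, m, lag=1):
    """Embed a 1-D series into rows of m lagged values (one lag vector per row)."""
    n_vectors = len(series) - (m - 1) * lag
    if n_vectors <= 0:
        raise ValueError("series is too short for this embedding dimension and lag")
    return [[series[i + k * lag] for k in range(m)] for i in range(n_vectors)]


series = [1, 2, 3, 4, 5, 6]          # toy data
X = trajectory_matrix(series, m=3)   # four lag vectors in R^3
for row in X:
    print(row)
```

With `lag=1` the result is the Hankel-structured trajectory matrix used as the starting point of SSA; a larger `lag` gives the delay vectors typically used for phase space reconstruction.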

Keywords: time series analysis, adaptive time series decomposition, wavelet, phase space reconstruction, singular spectrum analysis

Procedia PDF Downloads 106

41312 Advanced Data Visualization Techniques for Effective Decision-making in Oil and Gas Exploration and Production

Authors: Deepak Singh, Rail Kuliev

Abstract:

This research article explores the significance of advanced data visualization techniques in enhancing decision-making processes within the oil and gas exploration and production domain. With the oil and gas industry facing numerous challenges, effective interpretation and analysis of vast and diverse datasets are crucial for optimizing exploration strategies, production operations, and risk assessment. The article highlights the importance of data visualization in managing big data, aiding the decision-making process, and facilitating communication with stakeholders. Various advanced data visualization techniques, including 3D visualization, augmented reality (AR), virtual reality (VR), interactive dashboards, and geospatial visualization, are discussed in detail, showcasing their applications and benefits in the oil and gas sector. The article presents case studies demonstrating the successful use of these techniques in optimizing well placement, real-time operations monitoring, and virtual reality training. Additionally, the article addresses the challenges of data integration and scalability, emphasizing the need for future developments in AI-driven visualization. In conclusion, this research emphasizes the immense potential of advanced data visualization in revolutionizing decision-making processes, fostering data-driven strategies, and promoting sustainable growth and improved operational efficiency within the oil and gas exploration and production industry.

Keywords: augmented reality (AR), virtual reality (VR), interactive dashboards, real-time operations monitoring

Procedia PDF Downloads 87

41311 Enhanced Analysis of Spatial Morphological Cognitive Traits in Lidukou Village through the Application of Space Syntax

Authors: Man Guo

Abstract:

This paper delves into the intricate interplay between spatial morphology and spatial cognition in Lidukou Village, utilizing a combined approach of spatial syntax and field data. Through a comparative analysis of the gathered data, it emerges that the spatial integration level of Lidukou Village exhibits a direct positive correlation with the spatial cognitive preferences of its inhabitants. Specifically, the areas within the village that exhibit a higher degree of spatial cognition are predominantly distributed along the axis primarily defined by Shuxiang Road. However, the accessibility to historical relics remains limited, lacking a coherent systemic relationship. To address the morphological challenges faced by Lidukou Village, this study proposes optimization strategies that encompass diverse perspectives, including the refinement of spatial mechanisms and the shaping of strategic spatial nodes.

Keywords: traditional villages, spatial syntax, spatial integration degree, morphological problem

Procedia PDF Downloads 45

41310 Combined Proteomic and Metabolomic Analysis Approaches to Investigate the Modification in the Proteome and Metabolome of in vitro Models Treated with Gold Nanoparticles (AuNPs)

Authors: H. Chassaigne, S. Gioria, J. Lobo Vicente, D. Carpi, P. Barboro, G. Tomasi, A. Kinsner-Ovaskainen, F. Rossi

Abstract:

Emerging approaches in the area of exposure to nanomaterials and assessment of human health effects combine the use of in vitro systems and analytical techniques to study the perturbation of the proteome and/or the metabolome. We investigated the modification in the cytoplasmic compartment of the Balb/3T3 cell line exposed to gold nanoparticles. On the one hand, the proteomic approach is quite standardized, even if it requires precautions when dealing with in vitro systems. On the other hand, metabolomic analysis is challenging because the chemical diversity of cellular metabolites complicates data elaboration and interpretation. Differentially expressed proteins were found to cover a range of functions including stress response, cell metabolism, cell growth, and cytoskeleton organization. In addition, de-regulated metabolites were annotated using the HMDB database. The "omics" fields hold great promise for studying the interaction of nanoparticles with biological systems. Combining proteomics and metabolomics data is possible but challenging.

Keywords: data processing, gold nanoparticles, in vitro systems, metabolomics, proteomics

Procedia PDF Downloads 503

41309 Exploring Gaming-Learning Interaction in MMOG Using Data Mining Methods

Authors: Meng-Tzu Cheng, Louisa Rosenheck, Chen-Yen Lin, Eric Klopfer

Abstract:

The purpose of the research is to explore some of the ways in which gameplay data can be analyzed to yield results that feed back into the learning ecosystem. Back-end data for all users as they played an MMOG, The Radix Endeavor, were collected, and this study reports analyses of a specific genetics quest using data mining techniques, including the decision tree method. The study revealed different reasons for quest failure between participants who eventually succeeded and those who never succeeded. Regarding in-game tool use, the trait examiner was a key tool in the quest completion process. Subsequently, the decision tree results showed that a lack of trait examiner usage can be made up for with additional Punnett square uses, displaying multiple pathways to success in this quest. The methods of analysis used in this study and the resulting usage patterns indicate some useful ways that gameplay data can provide insights in two main areas. The first is for game designers to know how players are interacting with and learning from their game. The second is for players themselves as well as their teachers to get information on how they are progressing through the game, and to provide help they may need based on strategies and misconceptions identified in the data.

Keywords: MMOG, decision tree, genetics, gaming-learning interaction

Procedia PDF Downloads 358

41308 Network Analysis to Reveal Microbial Community Dynamics in the Coral Reef Ocean

Authors: Keigo Ide, Toru Maruyama, Michihiro Ito, Hiroyuki Fujimura, Yoshikatu Nakano, Shoichiro Suda, Sachiyo Aburatani, Haruko Takeyama

Abstract:

Understanding environmental systems is an important task. In recent years, the conservation of coral environments has received attention as a biodiversity issue. Damage to coral reefs under environmental impacts has been observed worldwide. However, the causal relationship between coral damage and environmental impacts is not clearly understood. On the other hand, the structure and diversity of the marine bacterial community may be relatively robust under a certain strength of environmental impact. To evaluate coral environment conditions, it is necessary to investigate the relationship between marine bacterial composition in coral reefs and environmental factors. In this study, Time Scale Network Analysis was developed and applied to marine environmental data to investigate the relationship among coral, bacterial community composition, and environmental factors. Seawater samples were collected fifteen times from November 2014 to May 2016 at two locations, Ishikawabaru and South of Sesoko, on Sesoko Island, Okinawa. Physicochemical factors such as temperature, photosynthetically active radiation, dissolved oxygen, turbidity, pH, salinity, chlorophyll, dissolved organic matter, and depth were measured in the coral reef area. The metagenome and metatranscriptome of coral reef seawater were analyzed as the biological factors. Metagenome data were used to clarify the marine bacterial community composition, and functional gene composition was estimated from the metatranscriptome. To infer the relationships between physicochemical and biological factors, cross-correlation analysis was applied to the time scale data. Although cross-correlation coefficients carry time-precedence information, they also reflect indirect interactions between variables. To elucidate direct regulation between factors, partial correlation coefficients were combined with cross-correlation. This analysis was performed on all parameters: the bacterial composition, the functional gene composition, and the physicochemical factors. As a result, time scale network analysis revealed the direct regulation of seawater temperature by photosynthetically active radiation. In addition, the concentration of dissolved oxygen regulated the chlorophyll value. Such reasonable regulatory relationships between environmental factors point to part of the mechanisms at work in the coral reef area.
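The two building blocks mentioned in the abstract above, lagged cross-correlation and partial correlation, can be sketched in Python as follows. How the study actually combines them into its Time Scale Network Analysis is not described, so this only shows the individual pieces; the series are invented, and the first-order partial correlation formula (one control variable) is used for simplicity.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def lagged_correlation(x, y, lag):
    """Correlation between x[t] and y[t + lag] for lag >= 0 (x leads y)."""
    if lag == 0:
        return pearson(x, y)
    return pearson(x[:-lag], y[lag:])


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x and y after removing the linear effect of one variable z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Invented series: "temperature" repeats "light" one sampling step later.
light = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 5.0, 8.0]
temperature = [0.0] + light[:-1]

r_lag1 = lagged_correlation(light, temperature, lag=1)
print("lag-1 cross-correlation:", round(r_lag1, 3))

# If x and y are each correlated 0.9 with z and 0.81 with each other,
# their association is fully explained by z.
print("partial correlation:", round(partial_correlation(0.81, 0.9, 0.9), 3))
```

A high lagged correlation suggests that one variable precedes another, and a partial correlation near zero indicates that the link is indirect, which is the distinction the abstract draws.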

Keywords: coral environment, marine microbiology, network analysis, omics data analysis

Procedia PDF Downloads 254

41307 An Empirical Investigation of Big Data Analytics: The Financial Performance of Users versus Vendors

Authors: Evisa Mitrou, Nicholas Tsitsianis, Supriya Shinde

Abstract:

In the age of digitisation and globalisation, businesses have shifted online and are investing in big data analytics (BDA) to respond to changing market conditions and sustain their performance. Our study shifts the focus from the adoption of BDA to the impact of BDA on financial performance. We explore the financial performance of both BDA-vendors (business-to-business) and BDA-clients (business-to-customer). We distinguish between the five BDA-technologies (big-data-as-a-service (BDaaS), descriptive, diagnostic, predictive, and prescriptive analytics) and discuss them individually. Further, we use four perspectives (internal business process, learning and growth, customer, and finance) and discuss the significance of how each of the five BDA-technologies affects the performance measures of these four perspectives. We also present the analysis of employee engagement, average turnover, average net income, and average net assets for BDA-clients and BDA-vendors. Our study also explores the effect of the COVID-19 pandemic on business continuity for both BDA-vendors and BDA-clients.

Keywords: BDA-clients, BDA-vendors, big data analytics, financial performance

Procedia PDF Downloads 125

41306 Self-Organizing Maps for Credit Card Fraud Detection and Visualization

Authors: Peng Chun-Yi, Chen Wei-Hsuan, Ueng Shyh-Kuang

Abstract:

This study focuses on the application of self-organizing maps (SOM) technology in analyzing credit card transaction data, aiming to enhance the accuracy and efficiency of fraud detection. SOM, as an artificial neural network, is particularly suited for pattern recognition and data classification, making it highly effective for the complex and variable nature of credit card transaction data. By analyzing transaction characteristics with SOM, the research identifies abnormal transaction patterns that could indicate potentially fraudulent activities. Moreover, this study has developed a specialized visualization tool to intuitively present the relationships between SOM analysis outcomes and transaction data, aiding financial institution personnel in quickly identifying and responding to potential fraud, thereby reducing financial losses. Additionally, the research explores the integration of SOM technology with composite intelligent system technologies (including finite state machines, fuzzy logic, and decision trees) to further improve fraud detection accuracy. This multimodal approach provides a comprehensive perspective for identifying and understanding various types of fraud within credit card transactions. In summary, by integrating SOM technology with visualization tools and composite intelligent system technologies, this research offers a more effective method of fraud detection for the financial industry, not only enhancing detection accuracy but also deepening the overall understanding of fraudulent activities.

Keywords: self-organizing map technology, fraud detection, information visualization, data analysis, composite intelligent system technologies, decision support technologies

Procedia PDF Downloads 60

41305 Empirical Roughness Progression Models of Heavy Duty Rural Pavements

Authors: Nahla H. Alaswadko, Rayya A. Hassan, Bayar N. Mohammed

Abstract:

Empirical deterministic models have been developed to predict roughness progression of heavy duty spray sealed pavements for a dataset representing rural arterial roads. The dataset provides a good representation of the relevant network and covers a wide range of operating and environmental conditions. A large sample of historical time series data for many pavement sections has been collected and prepared for use in multilevel regression analysis. The modelling parameters include road roughness as the performance parameter and traffic loading, time, initial pavement strength, reactivity level of subgrade soil, climate condition, and condition of the drainage system as predictor parameters. The purpose of this paper is to report the approaches adopted for model development and validation. The study presents multilevel models that can account for the correlation among time series data of the same section and capture the effect of unobserved variables. Study results show that the models fit the data very well. The contribution and significance of relevant influencing factors in predicting roughness progression are presented and explained. The paper concludes that the analysis approach used for developing the models confirmed their accuracy and reliability, as the models fit the validation data well.

Keywords: roughness progression, empirical model, pavement performance, heavy duty pavement

Procedia PDF Downloads 168

41304 Detecting the Palaeochannels Based on Optical Data and High-Resolution Radar Data for Periyar River Basin

Authors: S. Jayalakshmi, Gayathri S., Subiksa V., Nithyasri P., Agasthiya

Abstract:

Paleochannels are the buried parts of an active river system that were separated from the active river channel by cutoff or abandonment during the dynamic evolution of the river. Over time, they are filled by young unconsolidated or semi-consolidated sediments. They are also affected by geomorphological influences, lineament alterations, and other factors. The primary goal of this study is to identify the paleochannels in the Periyar river basin for the year 2023. Such channels have a high probability of containing natural resources, including gold, platinum, tin, and uranium. Numerous techniques are used to map paleochannels. For the optical data, multispectral satellite images were collected from various sources, and indices such as the Normalized Difference Vegetation Index (NDVI), Normalized Difference Water Index (NDWI), and Soil Adjusted Vegetation Index (SAVI) were prepared from them, along with thematic layers such as lithology, stream network, and lineament. Weights were assigned to each layer based on its importance, and an overlay analysis was performed, which showed some paleochannel patterns in the northwest region of the area. The results were cross-verified against results obtained from microwave data. A Sentinel Synthetic Aperture Radar (SAR) image was obtained from the European Space Agency (ESA) portal and pre-processed using SNAP 6.0. In addition, a polarimetric decomposition technique was applied to detect the paleochannels based on their scattering properties. Principal component analysis was then performed to enhance the output imagery. The results obtained from the optical and microwave radar data were compared, and the locations of the paleochannels were detected. Six paleochannels were identified in the study area, of which three were validated against existing data published by the Department of Geology and Environmental Science, Kerala. The other three paleochannels were newly detected with the help of the SAR image.
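The spectral indices and the weighted overlay mentioned in the abstract above follow standard formulas, sketched here in Python for a single pixel. The abstract does not state which NDWI variant, which SAVI soil factor, or which layer weights were used, so the green/NIR form of NDWI, the common soil factor of 0.5, and the example weights and reflectances are all assumptions.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)


def ndwi(green, nir):
    """Normalized Difference Water Index, green/NIR form (assumed variant)."""
    return (green - nir) / (green + nir)


def savi(nir, red, soil_factor=0.5):
    """Soil Adjusted Vegetation Index with soil brightness factor L (0.5 assumed)."""
    return (1 + soil_factor) * (nir - red) / (nir + red + soil_factor)


def weighted_overlay(scores, weights):
    """Weighted mean of per-layer scores for one pixel; weights are normalized here."""
    total_weight = sum(weights.values())
    return sum(scores[name] * weights[name] for name in weights) / total_weight


# Invented surface reflectances for one pixel.
nir, red, green = 0.5, 0.1, 0.2
pixel_scores = {"ndvi": ndvi(nir, red), "ndwi": ndwi(green, nir), "savi": savi(nir, red)}

# Hypothetical importance weights for the layers.
layer_weights = {"ndvi": 2.0, "ndwi": 3.0, "savi": 1.0}
suitability = weighted_overlay(pixel_scores, layer_weights)
print({name: round(value, 3) for name, value in pixel_scores.items()}, round(suitability, 3))
```

In a real workflow the same arithmetic would be applied to whole raster bands, and categorical layers such as lithology would first be reclassified to numeric scores.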

Keywords: paleochannels, optical data, SAR image, SNAP

Procedia PDF Downloads 93

41303 ANOVA-Based Feature Selection and Machine Learning System for IoT Anomaly Detection

Authors: Muhammad Ali

Abstract:

Cyber-attacks and anomaly detection on Internet of Things (IoT) infrastructure are an emerging concern in the domain of data-driven intrusion. Rapidly increasing IoT risk is now making headlines around the world. Denial of service, malicious control, data type probing, malicious operation, DDoS, scan, spying, and wrong setup are attacks and anomalies that can cause an IoT system failure. Everyone talks about cyber security, connectivity, smart devices, and real-time data extraction. IoT devices expose a wide variety of new cyber security attack vectors in network traffic. For further IoT development, and mainly for smart and IoT applications, there is a need for intelligent processing and analysis of data. Our approach therefore aims to secure IoT systems. We train and compare several machine learning models for accurately predicting attacks and anomalies on IoT systems, considering IoT applications, using ANOVA-based feature selection with fewer prediction models to evaluate network traffic and help protect IoT devices. The machine learning (ML) algorithms used here are KNN, SVM, NB, D.T., and R.F., which gave the most satisfactory test accuracy with fast detection. The ML evaluation metrics include precision, recall, F1 score, FPR, NPV, G.M., MCC, and AUC & ROC. The Random Forest algorithm achieved the best results with less prediction time, with an accuracy of 99.98%.
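ANOVA-based feature selection, as named in the abstract above, scores each feature by the one-way ANOVA F statistic of its values grouped by class and keeps the highest-scoring features. The Python sketch below shows that scoring step only; the feature names and values are hypothetical, and nothing here reproduces the study's dataset or models.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for one feature; `groups` holds its values per class.

    Assumes at least two classes and some within-class variation
    (otherwise the denominator is zero).
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical features, each split into [normal traffic, attack traffic] values.
features = {
    "packet_rate": [[1.0, 2.0, 3.0], [7.0, 8.0, 9.0]],    # classes well separated
    "payload_size": [[1.0, 5.0, 9.0], [2.0, 5.0, 8.0]],   # classes overlap
}

scores = {name: anova_f(groups) for name, groups in features.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(scores, ranked)
```

The top-ranked features would then be passed to the classifiers. In scikit-learn, `f_classif` combined with `SelectKBest` performs the same kind of ranking on array data.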

Keywords: machine learning, analysis of variance, Internet of Things, network security, intrusion detection

Procedia PDF Downloads 126

41302 Data Analysis for Taxonomy Prediction and Annotation of 16S rRNA Gene Sequences from Metagenome Data

Authors: Suchithra V., Shreedhanya, Kavya Menon, Vidya Niranjan

Abstract:

Skin metagenomics has a wide range of applications with direct relevance to the health of the organism. It gives us insight into the diverse community of microorganisms (the microbiome) harbored on the skin. In recent years, it has become increasingly apparent that the interaction between the skin microbiome and the human body plays a prominent role in immune system development, cancer development, disease pathology, and many other biological implications. Next Generation Sequencing has led to a faster and better understanding of environmental organisms and their mutual interactions. This project studies the human skin microbiome of different individuals with varied skin conditions. Bacterial 16S rRNA data of the skin microbiome were downloaded using the SRA Toolkit provided by NCBI to perform metagenomic analysis. Twelve samples were selected with two controls and three categories, i.e., sex (male/female), skin type (moist/intermittently moist/sebaceous), and occlusion (occluded/intermittently occluded/exposed). The quality of the data was improved using Cutadapt and assessed using FastQC. USearch, a tool for analyzing NGS data, provides a suitable platform for obtaining the taxonomic classification and abundance of bacteria from the metagenome data. The statistical tool used for analyzing the USearch results was METAGENassist. The results revealed that the three most abundant organisms were Prevotella, Corynebacterium, and Anaerococcus. Prevotella is known to be an infectious bacterium found in wounds, tooth cavities, etc. Corynebacterium and Anaerococcus are opportunistic bacteria responsible for skin odor. This result suggests that Prevotella thrives easily in sebaceous skin conditions. It is therefore better to use intermittently occluded treatment, such as applying ointments or creams, to treat wounds on sebaceous skin. Exposing the wound should be avoided, as it leads to an increase in Prevotella abundance. Individuals with moist skin can opt for occluded or intermittently occluded treatment, as these have been shown to decrease the abundance of bacteria during treatment.

Keywords: bacterial 16S rRNA, next generation sequencing, skin metagenomics, skin microbiome, taxonomy

Procedia PDF Downloads 172

41301 Extracting Terrain Points from Airborne Laser Scanning Data in Densely Forested Areas

Authors: Ziad Abdeldayem, Jakub Markiewicz, Kunal Kansara, Laura Edwards

Abstract:

Airborne Laser Scanning (ALS) is one of the main technologies for generating high-resolution digital terrain models (DTMs). DTMs are crucial to several applications, such as topographic mapping, flood zone delineation, geographic information systems (GIS), hydrological modelling, and spatial analysis. A laser scanning system generates an irregularly spaced three-dimensional cloud of points. Raw ALS data consist mainly of ground points (which represent the bare earth) and non-ground points (which represent buildings, trees, cars, etc.). Removing all the non-ground points from the raw data is referred to as filtering. Filtering heavily forested areas is considered a challenging task because the canopy stops laser pulses from reaching the terrain surface. This research presents an approach for removing non-ground points from raw ALS data in densely forested areas. Smoothing splines are used to interpolate and fit the noisy ALS data. The presented filter uses a weight function to assign a weight to each data point. Furthermore, unlike most methods, the presented filtering algorithm is designed to be automatic. Three different forested areas in the United Kingdom are used to assess the performance of the algorithm. The results show that the DTMs generated from the filtered data are accurate (when compared against reference terrain data) and that the performance of the method is stable for all the heavily forested data samples. The average root mean square error (RMSE) value is 0.35 m.
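The RMSE used to compare a generated DTM against reference terrain data can be illustrated with a minimal Python sketch. The function name and the elevation values are hypothetical and are not taken from the study; the sketch assumes the two sequences hold elevations, in metres, at the same check points.

```python
import math


def rmse(predicted, reference):
    """Root mean square error between two equal-length sequences (metres)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical elevations at four check points, for illustration only.
dtm_heights = [101.2, 98.7, 103.4, 99.9]
reference_heights = [101.0, 99.0, 103.0, 100.0]
print(round(rmse(dtm_heights, reference_heights), 3))
```

A lower value means the filtered surface lies closer to the reference terrain.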

Keywords: airborne laser scanning, digital terrain models, filtering, forested areas

Procedia PDF Downloads 139

41300 Health Monitoring and Failure Detection of Electronic and Structural Components in Small Unmanned Aerial Vehicles

Authors: Gopi Kandaswamy, P. Balamuralidhar

Abstract:

Fully autonomous small Unmanned Aerial Vehicles (UAVs) are increasingly being used in many commercial applications. Although a lot of research has been done to develop safe, reliable and durable UAVs, accidents due to electronic and structural failures are not uncommon and pose a huge safety risk to UAV operators and the public. Hence there is a strong need for an automated health monitoring system for UAVs, with a view to minimizing mission failures and thereby increasing safety. This paper describes our approach to monitoring the electronic and structural components in a small UAV without the need for additional sensors. Our system monitors data from four sources: sensors, navigation algorithms, control inputs from the operator, and flight controller outputs. It then performs statistical analysis on the data and applies a rule-based engine to detect failures. This information can then be fed back into the UAV, and a decision to continue or abort the mission can be taken automatically by the UAV, independently of the operator. Our system has been verified using data obtained from real flights over the past year from UAVs of various sizes that we have designed and deployed for various applications.
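The combination of statistical analysis and a rule-based engine described above can be sketched roughly as follows in Python. This is an illustration only: the signal names, thresholds and rules are hypothetical assumptions, not the authors' system.

```python
from statistics import mean, stdev


def summarize(window):
    """Basic statistics over one window of samples for a single signal."""
    return {"mean": mean(window), "std": stdev(window)}


# Hypothetical rules: each pairs a failure label with a predicate over the stats.
RULES = [
    ("excessive vibration", lambda s: s["vibration"]["std"] > 2.0),
    ("battery voltage low", lambda s: s["battery_v"]["mean"] < 10.5),
]


def detect_failures(telemetry):
    """Return the labels of all rules triggered by the telemetry windows."""
    stats = {name: summarize(values) for name, values in telemetry.items()}
    return [label for label, rule in RULES if rule(stats)]


# Hypothetical telemetry windows, for illustration only.
telemetry = {
    "vibration": [0.1, 0.3, 5.2, -4.8, 0.2],
    "battery_v": [11.1, 11.0, 10.9, 11.0, 11.1],
}
failures = detect_failures(telemetry)
should_abort = bool(failures)  # a continue/abort decision could hinge on this
print(failures, should_abort)
```

Keeping the rules as data separates the thresholds from the statistics code, so rules can be added without touching the analysis step.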

Keywords: fault detection, health monitoring, unmanned aerial vehicles, vibration analysis

Procedia PDF Downloads 263

41299 Methodology of the Turkey’s National Geographic Information System Integration Project

Authors: Buse A. Ataç, Doğan K. Cenan, Arda Çetinkaya, Naz D. Şahin, Köksal Sanlı, Zeynep Koç, Akın Kısa

Abstract:

With their spatial data reliability, interpretation and querying capabilities, Geographic Information Systems make significant contributions to the work of scientists, planners and practitioners. Geographic information systems have received great attention in today's digital world, growing rapidly and being used ever more efficiently. Access to and use of current and accurate geographical data, the most important component of a Geographic Information System, has become a necessity for sustainable and economic development. This project aims to enable sharing of data collected by public institutions and organizations on a web-based platform. Within the scope of the project, the INSPIRE (Infrastructure for Spatial Information in the European Community) data specifications serve as a road map. In this context, Turkey's National Geographic Information System (TUCBS) Integration Project supports the sharing of spatial data among 61 pilot public institutions in compliance with defined national standards. This paper, prepared by members of the TUCBS Integration Project team, explains the technical process with a detailed methodology. The main technical processes of the project are Geographic Data Analysis, Geographic Data Harmonization (Standardization), Web Service Creation (WMS, WFS) and Metadata Creation-Publication. The paper describes the integration process through which the data produced by the 61 institutions are shared via the National Geographic Data Portal (GEOPORTAL).

Keywords: data specification, geoportal, GIS, INSPIRE, Turkish National Geographic Information System, TUCBS, Turkey's national geographic information system

Procedia PDF Downloads 146

41298 Valence and Arousal-Based Sentiment Analysis: A Comparative Study

Authors: Usama Shahid, Muhammad Zunnurain Hussain

Abstract:

This research paper presents a comprehensive analysis of a sentiment analysis approach that employs valence and arousal as its foundational pillars, in comparison to traditional techniques. Sentiment analysis is an indispensable task in natural language processing that involves the extraction of opinions and emotions from textual data. The valence and arousal dimensions, representing the positivity/negativity and the intensity of emotions, respectively, enable the creation of four quadrants, each representing a specific emotional state. The study seeks to determine the impact of using these quadrants to identify distinct emotional states on the accuracy and efficiency of sentiment analysis, in comparison to traditional techniques. The results reveal that the valence and arousal-based approach outperforms other approaches, particularly in identifying nuanced emotions that may be missed by conventional methods. The study's findings are crucial for applications such as social media monitoring and market research, where the accurate classification of emotions and opinions is paramount. Overall, this research highlights the potential of using valence and arousal as a framework for sentiment analysis and offers invaluable insights into the benefits of incorporating specific types of emotions into the analysis. These findings have significant implications for researchers and practitioners in the field of natural language processing, as they provide a basis for the development of more accurate and effective sentiment analysis tools.
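The four quadrants formed by the two dimensions can be sketched as a small Python function. The score range of [-1, 1], the zero thresholds and the quadrant labels are assumptions made for illustration; the abstract does not specify them.

```python
def quadrant(valence, arousal):
    """Map a (valence, arousal) pair, each assumed in [-1, 1], to a quadrant."""
    if valence >= 0 and arousal >= 0:
        return "high-arousal positive"  # e.g. excitement
    if valence < 0 and arousal >= 0:
        return "high-arousal negative"  # e.g. anger
    if valence < 0:
        return "low-arousal negative"   # e.g. sadness
    return "low-arousal positive"       # e.g. calm


# Hypothetical scores for two short texts, for illustration only.
print(quadrant(0.8, 0.6))
print(quadrant(-0.4, -0.7))
```

A plain positive/negative classifier would collapse each pair of quadrants that share a valence sign into a single class.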

Keywords: sentiment analysis, valence and arousal, emotional states, natural language processing, machine learning, text analysis, sentiment classification, opinion mining

Procedia PDF Downloads 102

41297 The First Transcriptome Assembly of Marama Bean: An African Orphan Crop

Authors: Ethel E. Phiri, Lionel Hartzenberg, Percy Chimwamuromba, Emmanuel Nepolo, Jens Kossmann, James R. Lloyd

Abstract:

Orphan crops are under-researched and underutilized food plant species that have not been categorized as major food crops but have the potential to be economically and agronomically significant. They have been documented to tolerate extreme environmental conditions. However, limited research has been conducted to uncover their potential as food crop species. The New Partnership for Africa’s Development (NEPAD) has classified Marama bean, Tylosema esculentum, as an orphan crop. The plant is one of the 101 African orphan crops that must have their genomes sequenced, assembled, and annotated in the foreseeable future. Marama bean is a perennial leguminous plant that primarily grows in poor, arid soils in southern Africa. The plants produce large tubers that can weigh as much as 200 kg. While the foliage provides fodder, the tuber is carbohydrate-rich and is a staple food source for rural communities in Namibia. The edible seeds are also protein- and oil-rich. Marama bean plants respond rapidly to increased temperatures and severe water scarcity without extreme consequences. Advances in molecular biology and biotechnology have made it possible to effectively transfer technologies from model and major crops to orphan crops. In this research, the aim was to produce the first transcriptome assembly from Marama bean RNA-sequence data. Many model plant species have had their genomes sequenced and their transcriptomes assembled. The availability of transcriptome data for a non-model crop plant species will therefore allow for gene identification and comparisons between various species. The data have been sequenced using the Illumina HiSeq 2500 sequencing platform. Data analysis is underway. In essence, this research will eventually evaluate the potential use of Marama bean as a crop species to improve its value in agronomy.

Keywords: 101 African orphan crops, RNA-Seq, Tylosema esculentum, underutilised crop plants

Procedia PDF Downloads 360

41296 Evaluating The Effects of Fundamental Analysis on Earnings Per Share Concept in Stock Valuation in the Zimbabwe Stock Exchange Market

Authors: Brian Basvi

Abstract:

A technique for analyzing a security's intrinsic value is called fundamental analysis. It involves looking at relevant financial, economic, and other qualitative and quantitative aspects. Earnings Per Share (EPS), a crucial metric in fundamental analysis, is calculated by dividing a company's net income by the total number of outstanding shares. With more than 70 listed businesses, the Zimbabwe Stock Exchange (ZSE) is the primary stock exchange in Zimbabwe. This study applies the EPS financial ratio and stock valuation techniques to historical stock data from 68 companies listed on the Zimbabwe Stock Exchange. According to a ZSE study, EPS significantly affects share prices that are listed on the market. The study's objective was to assess how fundamental analysis affected the idea of EPS in ZSE stock valuation. It concluded that EPS is an important consideration for investors when they make judgments about their investments. According to the study's findings, fundamental analysis is a useful tool for ZSE investors since it offers insightful information about a company's financial performance and aids in decision-making. Investors can have a better understanding of a company's underlying worth and prospects for future growth by looking into EPS and other basic aspects.
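The EPS definition given above (net income divided by the total number of outstanding shares) can be written as a short Python sketch. The function name and the figures are hypothetical and do not come from ZSE data.

```python
def earnings_per_share(net_income, shares_outstanding):
    """EPS as defined in the abstract: net income / outstanding shares."""
    if shares_outstanding <= 0:
        raise ValueError("shares_outstanding must be positive")
    return net_income / shares_outstanding


# Hypothetical company: 1,200,000 in net income and 400,000 shares.
print(earnings_per_share(1_200_000, 400_000))  # prints 3.0
```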

Keywords: fundamental analysis, stock valuation, EPS, share pricing

Procedia PDF Downloads 49

41295 Framework for Integrating Big Data and Thick Data: Understanding Customers Better

Authors: Nikita Valluri, Vatcharaporn Esichaikul

Abstract:

With the popularity of data-driven decision-making on the rise, this study focuses on providing an alternative outlook on the decision-making process. Combining quantitative and qualitative methods rooted in the social sciences, it presents an integrated framework aimed at a more robust and efficient approach to data-driven decision-making that draws not only on Big data but also on 'Thick data', a new form of qualitative data. In support of this, an example from the retail sector illustrates how the framework is put into action to yield insights and leverage business intelligence. An interpretive approach is used to analyze findings from both quantitative and qualitative data. Using traditional point-of-sale data as well as an understanding of customer psychographics and preferences, data mining techniques are applied along with qualitative methods (such as grounded theory, ethnomethodology, etc.). This study’s final goal is to establish the framework as a basis for providing a holistic solution encompassing both the Big and Thick aspects of any business need. The proposed framework is an enhancement of the traditional data-driven decision-making approach, which depends mainly on quantitative data.

Keywords: big data, customer behavior, customer experience, data mining, qualitative methods, quantitative methods, thick data

Procedia PDF Downloads 163

41294 Modeling the Demand for the Healthcare Services Using Data Analysis Techniques

Authors: Elizaveta S. Prokofyeva, Svetlana V. Maltseva, Roman D. Zaitsev

Abstract:

Rapidly evolving modern data analysis technologies in healthcare play a large role in understanding the operation of the system and its characteristics. Nowadays, one of the key tasks in urban healthcare is to optimize resource allocation. The application of data analysis in medical institutions to solve optimization problems therefore determines the significance of this study. The purpose of this research was to establish the dependence between indicators of the effectiveness of a medical institution and its resources. Hospital discharges by diagnosis, hospital days of in-patients, and in-patient average length of stay were selected as indicators of the performance of, and demand for, the medical facility. Hospital beds by type of care, medical technology (magnetic resonance tomography, gamma cameras, angiographic complexes and lithotripters) and physicians characterized the resource provision of medical institutions in the developed models. The data source for the research was an open database of the statistical service Eurostat. This source was chosen because its databases contain complete and open information necessary for research tasks in the field of public health. In addition, the statistical database has a user-friendly interface that makes it possible to build analytical reports quickly. The study covers 28 European countries for the period from 2007 to 2016. For all countries included in the study with the most accurate and complete data for the period under review, predictive models were developed based on historical panel data. An attempt to improve the quality and interpretability of the models was made through cluster analysis of the investigated set of countries. The main idea was to assess the similarity of the joint behavior of the variables throughout the time period under consideration in order to identify groups of similar countries and to construct separate regression models for them. Therefore, the original time series were used as the objects of clustering. The hierarchical agglomerative algorithm k-medoids was used. Sampled objects were used as the centers of the resulting clusters, since determining a centroid when working with time series involves additional difficulties. The number of clusters was chosen using the silhouette coefficient. After the cluster analysis, it was possible to significantly improve the predictive power of the models: for example, in one of the clusters the MAPE was only 0.82%, which makes it possible to conclude that this forecast is highly reliable in the short term. The predicted values of the developed models have a relatively low level of error and can be used to make decisions on the provision of medical personnel to the hospital. The research shows strong dependencies between the demand for medical services and the modern medical equipment variable, which highlights the importance of the technological component for the successful development of a medical facility. Data analysis currently has huge potential to significantly improve health services. Medical institutions that are the first to introduce these technologies will certainly have a competitive advantage.
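The MAPE (mean absolute percentage error) quoted for the forecasts can be computed as in the following minimal Python sketch. The series are hypothetical numbers, not Eurostat data or the study's results.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    ratios = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100 * sum(ratios) / len(ratios)


# Hypothetical yearly hospital discharges and model forecasts.
actual = [1200.0, 1250.0, 1300.0]
forecast = [1190.0, 1262.0, 1295.0]
print(round(mape(actual, forecast), 2))
```

Because MAPE is scale-free, it allows forecast quality to be compared across countries whose hospital volumes differ greatly.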

Keywords: data analysis, demand modeling, healthcare, medical facilities

Procedia PDF Downloads 145

41293 A Fuzzy TOPSIS Based Model for Safety Risk Assessment of Operational Flight Data

Authors: N. Borjalilu, P. Rabiei, A. Enjoo

Abstract:

A Flight Data Monitoring (FDM) program helps an aviation operator to identify, quantify, assess and address operational safety risks in order to improve the safety of flight operations. FDM is a powerful tool for an aircraft operator when integrated into the operator’s Safety Management System (SMS), making it possible to detect, confirm, and assess safety issues and to check the effectiveness of corrective actions associated with human errors. This article proposes a model for assessing the safety risk level of flight data across different event-focus aspects, based on fuzzy set values. It allows the operational safety level to be evaluated from the point of view of flight activities. The main advantage of this method is that it offers a qualitative safety analysis of flight data. This research applies the opinions of aviation experts, gathered through a number of questionnaires related to flight data, in four categories of occurrence that can take place during an accident or an incident: Runway Excursions (RE), Controlled Flight Into Terrain (CFIT), Mid-Air Collision (MAC), and Loss of Control in Flight (LOC-I). By weighting each category (using F-TOPSIS) and applying the weight to the number of risks of the event, the safety risk of each related event can be obtained.

Keywords: F-TOPSIS, fuzzy set, flight data monitoring (FDM), flight safety

Procedia PDF Downloads 168

41292 Review of the Road Crash Data Availability in Iraq

Authors: Abeer K. Jameel, Harry Evdorides

Abstract:

Iraq is a middle-income country where road traffic crashes are considered one of the leading causes of death. To control road risk, the Iraqi Ministry of Planning, General Statistical Organization, has started to organise a system for collecting traffic accident data, with details of causes and severity. These data are published as an annual report. This paper presents a review of the available crash data in Iraq. The available data give accident rates at an aggregated level, classified by accident type, road users’ details, crash severity, vehicle type, causes and number of casualties. The review is organised according to the types of models used in road safety studies and research, and according to the road safety data required in road construction tasks. The available data are also compared with the road safety dataset published in the United Kingdom, as an example of a developed country. It is concluded that the data in Iraq are suitable for descriptive and exploratory models, aggregated-level comparison analysis, and evaluating and monitoring the progress of overall traffic safety performance. However, important traffic safety studies require disaggregated data and details of the factors affecting the likelihood of traffic crashes. Some studies require spatial details such as the location of accidents, which is essential for ranking roads according to their level of safety and for identifying the most dangerous roads in Iraq, which require a tactical plan to control this issue. Global road safety agencies interested in solving this problem in low- and middle-income countries have designed road safety assessment methodologies that are based on road attribute data only. Therefore, this research recommends using one of these methodologies.

Keywords: road safety, Iraq, crash data, road risk assessment, The International Road Assessment Program (iRAP)

Procedia PDF Downloads 256

41291 Impact Assessment of Information Communication, Network Providers, Teledensity, and Consumer Complaints on Gross Domestic Products

Authors: Essang Anwana Onuntuei, Chinyere Blessing Azunwoke

Abstract:

The study used secondary data from foreign and local organizations to explore the major challenges and opportunities that abound in Information Communication. The study aimed at exploring the tie between tele density (network coverage area) and the number of network subscriptions, probing whether the degree of consumer complaints varies significantly among network providers, and assessing whether network subscriptions significantly influence the sector’s GDP contribution. Methods used for data analysis include Pearson product-moment correlation, regression analysis, and Analysis of Variance (ANOVA). In two-tailed tests at the 0.05 significance level, the findings established that about 85.6% of the variation in network subscriptions was explained by tele density (network coverage area); that the degree of consumer complaints varied significantly among network providers, as F calculated (80.158291) exceeded F critical (3.490295), with an associated p-value of 0.000000 (< 0.05); and finally, that 65% of the variation in the nation’s GDP was explained by network subscriptions, indicating a strong association.
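Statements of the form "x% explained by" in a simple linear regression correspond to the squared Pearson correlation coefficient. A minimal Python sketch follows; the variable names and data points are hypothetical and are not the study's figures.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical tele density (%) and subscriptions (millions) for five years.
tele_density = [70.0, 75.0, 82.0, 88.0, 91.0]
subscriptions = [130.0, 139.0, 155.0, 160.0, 172.0]
r = pearson_r(tele_density, subscriptions)
print(f"r = {r:.3f}, share of variance explained = {r * r:.1%}")
```

The sign of r gives the direction of the association, while r squared gives the share of variance one variable explains in the other.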

Keywords: tele density, subscription, network coverage, information communication, consumer

Procedia PDF Downloads 51