Search results for: data dependency graph
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25091

Search results for: data dependency graph

24461 Data Mining in Medicine Domain Using Decision Trees and Vector Support Machine

Authors: Djamila Benhaddouche, Abdelkader Benyettou

Abstract:

In this paper, we used data mining to extract biomedical knowledge. In general, complex biomedical data collected in studies of populations are treated by statistical methods, although they are robust, they are not sufficient in themselves to harness the potential wealth of data. For that you used in step two learning algorithms: the Decision Trees and Support Vector Machine (SVM). These supervised classification methods are used to make the diagnosis of thyroid disease. In this context, we propose to promote the study and use of symbolic data mining techniques.

Keywords: biomedical data, learning, classifier, algorithms decision tree, knowledge extraction

Procedia PDF Downloads 543
24460 Analysis of Different Classification Techniques Using WEKA for Diabetic Disease

Authors: Usama Ahmed

Abstract:

Data mining is the process of analyze data which are used to predict helpful information. It is the field of research which solve various type of problem. In data mining, classification is an important technique to classify different kind of data. Diabetes is most common disease. This paper implements different classification technique using Waikato Environment for Knowledge Analysis (WEKA) on diabetes dataset and find which algorithm is suitable for working. The best classification algorithm based on diabetic data is Naïve Bayes. The accuracy of Naïve Bayes is 76.31% and take 0.06 seconds to build the model.

Keywords: data mining, classification, diabetes, WEKA

Procedia PDF Downloads 135
24459 Illuminating the Policies Affecting Energy Security in Malaysia’s Electricity Sector

Authors: Hussain Ali Bekhet, Endang Jati Mat Sahid

Abstract:

For the past few decades, the Malaysian economy has expanded at an impressive pace, whilst, the Malaysian population has registered a relatively high growth rate. These factors had driven the growth of final energy demand. The ballooning energy demand coupled with the country’s limited indigenous energy resources have resulted in an increased of the country’s net import. Therefore, acknowledging the precarious position of the country’s energy self-sufficiency, this study has identified three main concerns regarding energy security, namely; over-dependence on fossil fuel, increasing energy import dependency, and increasing energy consumption per capita. This paper discusses the recent energy demand and supply trends, highlights the policies that are affecting energy security in Malaysia and suggests strategic options towards achieving energy security. The paper suggested that diversifying energy sources, reducing carbon content of energy, efficient utilization of energy and facilitating low-carbon industries could further enhance the effectiveness of the measures as the introduction of policies and initiatives will be more holistic.

Keywords: electricity, energy policy, energy security, Malaysia

Procedia PDF Downloads 293
24458 Scattering Operator and Spectral Clustering for Ultrasound Images: Application on Deep Venous Thrombi

Authors: Thibaud Berthomier, Ali Mansour, Luc Bressollette, Frédéric Le Roy, Dominique Mottier, Léo Fréchier, Barthélémy Hermenault

Abstract:

Deep Venous Thrombosis (DVT) occurs when a thrombus is formed within a deep vein (most often in the legs). This disease can be deadly if a part or the whole thrombus reaches the lung and causes a Pulmonary Embolism (PE). This disorder, often asymptomatic, has multifactorial causes: immobilization, surgery, pregnancy, age, cancers, and genetic variations. Our project aims to relate the thrombus epidemiology (origins, patient predispositions, PE) to its structure using ultrasound images. Ultrasonography and elastography were collected using Toshiba Aplio 500 at Brest Hospital. This manuscript compares two classification approaches: spectral clustering and scattering operator. The former is based on the graph and matrix theories while the latter cascades wavelet convolutions with nonlinear modulus and averaging operators.

Keywords: deep venous thrombosis, ultrasonography, elastography, scattering operator, wavelet, spectral clustering

Procedia PDF Downloads 469
24457 Semi-Supervised Learning for Spanish Speech Recognition Using Deep Neural Networks

Authors: B. R. Campomanes-Alvarez, P. Quiros, B. Fernandez

Abstract:

Automatic Speech Recognition (ASR) is a machine-based process of decoding and transcribing oral speech. A typical ASR system receives acoustic input from a speaker or an audio file, analyzes it using algorithms, and produces an output in the form of a text. Some speech recognition systems use Hidden Markov Models (HMMs) to deal with the temporal variability of speech and Gaussian Mixture Models (GMMs) to determine how well each state of each HMM fits a short window of frames of coefficients that represents the acoustic input. Another way to evaluate the fit is to use a feed-forward neural network that takes several frames of coefficients as input and produces posterior probabilities over HMM states as output. Deep neural networks (DNNs) that have many hidden layers and are trained using new methods have been shown to outperform GMMs on a variety of speech recognition systems. Acoustic models for state-of-the-art ASR systems are usually training on massive amounts of data. However, audio files with their corresponding transcriptions can be difficult to obtain, especially in the Spanish language. Hence, in the case of these low-resource scenarios, building an ASR model is considered as a complex task due to the lack of labeled data, resulting in an under-trained system. Semi-supervised learning approaches arise as necessary tasks given the high cost of transcribing audio data. The main goal of this proposal is to develop a procedure based on acoustic semi-supervised learning for Spanish ASR systems by using DNNs. This semi-supervised learning approach consists of: (a) Training a seed ASR model with a DNN using a set of audios and their respective transcriptions. A DNN with a one-hidden-layer network was initialized; increasing the number of hidden layers in training, to a five. A refinement, which consisted of the weight matrix plus bias term and a Stochastic Gradient Descent (SGD) training were also performed. The objective function was the cross-entropy criterion. (b) Decoding/testing a set of unlabeled data with the obtained seed model. (c) Selecting a suitable subset of the validated data to retrain the seed model, thereby improving its performance on the target test set. To choose the most precise transcriptions, three confidence scores or metrics, regarding the lattice concept (based on the graph cost, the acoustic cost and a combination of both), was performed as selection technique. The performance of the ASR system will be calculated by means of the Word Error Rate (WER). The test dataset was renewed in order to extract the new transcriptions added to the training dataset. Some experiments were carried out in order to select the best ASR results. A comparison between a GMM-based model without retraining and the DNN proposed system was also made under the same conditions. Results showed that the semi-supervised ASR-model based on DNNs outperformed the GMM-model, in terms of WER, in all tested cases. The best result obtained an improvement of 6% relative WER. Hence, these promising results suggest that the proposed technique could be suitable for building ASR models in low-resource environments.

Keywords: automatic speech recognition, deep neural networks, machine learning, semi-supervised learning

Procedia PDF Downloads 328
24456 Comprehensive Study of Data Science

Authors: Asifa Amara, Prachi Singh, Kanishka, Debargho Pathak, Akshat Kumar, Jayakumar Eravelly

Abstract:

Today's generation is totally dependent on technology that uses data as its fuel. The present study is all about innovations and developments in data science and gives an idea about how efficiently to use the data provided. This study will help to understand the core concepts of data science. The concept of artificial intelligence was introduced by Alan Turing in which the main principle was to create an artificial system that can run independently of human-given programs and can function with the help of analyzing data to understand the requirements of the users. Data science comprises business understanding, analyzing data, ethical concerns, understanding programming languages, various fields and sources of data, skills, etc. The usage of data science has evolved over the years. In this review article, we have covered a part of data science, i.e., machine learning. Machine learning uses data science for its work. Machines learn through their experience, which helps them to do any work more efficiently. This article includes a comparative study image between human understanding and machine understanding, advantages, applications, and real-time examples of machine learning. Data science is an important game changer in the life of human beings. Since the advent of data science, we have found its benefits and how it leads to a better understanding of people, and how it cherishes individual needs. It has improved business strategies, services provided by them, forecasting, the ability to attend sustainable developments, etc. This study also focuses on a better understanding of data science which will help us to create a better world.

Keywords: data science, machine learning, data analytics, artificial intelligence

Procedia PDF Downloads 70
24455 Methodology of Construction Equipment Optimization for Earthwork

Authors: Jaehyun Choi, Hyunjung Kim, Namho Kim

Abstract:

Earthwork is one of the critical civil construction operations that require large-quantities of resources due to its intensive dependency upon construction equipment. Therefore, efficient construction equipment management can highly contribute to productivity improvements and cost savings. Earthwork operation utilizes various combinations of construction equipment in order to meet project requirements such as time and cost. Identification of site condition and construction methods should be performed in advance in order to develop a proper execution plan. The factors to be considered include capacity of equipment assigned, the method of construction, the size of the site, and the surrounding condition. In addition, optimal combination of various construction equipment should be selected. However, in real world practice, equipment utilization plan is performed based on experience and intuition of management. The researchers evaluated the efficiency of various alternatives of construction equipment combinations by utilizing the process simulation model, validated the model from a case study project, and presented a methodology to find optimized plan among alternatives.

Keywords: earthwork operation, construction equipment, process simulation, optimization

Procedia PDF Downloads 414
24454 Application of Artificial Neural Network Technique for Diagnosing Asthma

Authors: Azadeh Bashiri

Abstract:

Introduction: Lack of proper diagnosis and inadequate treatment of asthma leads to physical and financial complications. This study aimed to use data mining techniques and creating a neural network intelligent system for diagnosis of asthma. Methods: The study population is the patients who had visited one of the Lung Clinics in Tehran. Data were analyzed using the SPSS statistical tool and the chi-square Pearson's coefficient was the basis of decision making for data ranking. The considered neural network is trained using back propagation learning technique. Results: According to the analysis performed by means of SPSS to select the top factors, 13 effective factors were selected, in different performances, data was mixed in various forms, so the different models were made for training the data and testing networks and in all different modes, the network was able to predict correctly 100% of all cases. Conclusion: Using data mining methods before the design structure of system, aimed to reduce the data dimension and the optimum choice of the data, will lead to a more accurate system. Therefore, considering the data mining approaches due to the nature of medical data is necessary.

Keywords: asthma, data mining, Artificial Neural Network, intelligent system

Procedia PDF Downloads 262
24453 Interpreting Privacy Harms from a Non-Economic Perspective

Authors: Christopher Muhawe, Masooda Bashir

Abstract:

With increased Internet Communication Technology(ICT), the virtual world has become the new normal. At the same time, there is an unprecedented collection of massive amounts of data by both private and public entities. Unfortunately, this increase in data collection has been in tandem with an increase in data misuse and data breach. Regrettably, the majority of data breach and data misuse claims have been unsuccessful in the United States courts for the failure of proof of direct injury to physical or economic interests. The requirement to express data privacy harms from an economic or physical stance negates the fact that not all data harms are physical or economic in nature. The challenge is compounded by the fact that data breach harms and risks do not attach immediately. This research will use a descriptive and normative approach to show that not all data harms can be expressed in economic or physical terms. Expressing privacy harms purely from an economic or physical harm perspective negates the fact that data insecurity may result into harms which run counter the functions of privacy in our lives. The promotion of liberty, selfhood, autonomy, promotion of human social relations and the furtherance of the existence of a free society. There is no economic value that can be placed on these functions of privacy. The proposed approach addresses data harms from a psychological and social perspective.

Keywords: data breach and misuse, economic harms, privacy harms, psychological harms

Procedia PDF Downloads 184
24452 Machine Learning Analysis of Student Success in Introductory Calculus Based Physics I Course

Authors: Chandra Prayaga, Aaron Wade, Lakshmi Prayaga, Gopi Shankar Mallu

Abstract:

This paper presents the use of machine learning algorithms to predict the success of students in an introductory physics course. Data having 140 rows pertaining to the performance of two batches of students was used. The lack of sufficient data to train robust machine learning models was compensated for by generating synthetic data similar to the real data. CTGAN and CTGAN with Gaussian Copula (Gaussian) were used to generate synthetic data, with the real data as input. To check the similarity between the real data and each synthetic dataset, pair plots were made. The synthetic data was used to train machine learning models using the PyCaret package. For the CTGAN data, the Ada Boost Classifier (ADA) was found to be the ML model with the best fit, whereas the CTGAN with Gaussian Copula yielded Logistic Regression (LR) as the best model. Both models were then tested for accuracy with the real data. ROC-AUC analysis was performed for all the ten classes of the target variable (Grades A, A-, B+, B, B-, C+, C, C-, D, F). The ADA model with CTGAN data showed a mean AUC score of 0.4377, but the LR model with the Gaussian data showed a mean AUC score of 0.6149. ROC-AUC plots were obtained for each Grade value separately. The LR model with Gaussian data showed consistently better AUC scores compared to the ADA model with CTGAN data, except in two cases of the Grade value, C- and A-.

Keywords: machine learning, student success, physics course, grades, synthetic data, CTGAN, gaussian copula CTGAN

Procedia PDF Downloads 33
24451 Environmental and Socioeconomic Determinants of Climate Change Resilience in Rural Nigeria: Empirical Evidence towards Resilience Building

Authors: Ignatius Madu

Abstract:

The study aims at assessing the environmental and socioeconomic determinants of climate change resilience in rural Nigeria. This is necessary because researches and development efforts on building climate change resilience of rural areas in developing countries are usually made without the knowledge of the impacts of the inherent rural characteristics that determine resilient capacities of the households. This has, in many cases, led to costly mistakes, delayed responses, inaccurate outcomes, and other difficulties. Consequently, this assessment becomes crucial not only to policymakers and people living in risk-prone environments in rural areas but also to fill the research gap. To achieve the aim, secondary data were obtained from the Annual Abstract of Statistics 2017, LSMS-Integrated Surveys on Agriculture and General Household Survey Panel 2015/2016, and National Agriculture Sample Survey (NASS), 2010/2011.Resilience was calculated by weighting and adding the adaptive, absorptive and anticipatory measures of households variables aggregated at state levels and then regressed against rural environmental and socioeconomic characteristics influencing it. From the regression, the coefficients of the variables were used to compute the impacts of the variables using the Stochastic Regression of Impacts on Population, Affluence and Technology (STIRPAT) Model. The results showed that the northern States are generally low in resilient indices and are impacted less by the development indicators. The major determining factors are percentage of non-poor, environmental protection, road transport development, landholding, agricultural input, population density, dependency ratio (inverse), household asserts, education and maternal care. The paper concludes that any effort to a successful resilient building in rural areas of the country should first address these key factors that enhance rural development and wellbeing since it is better to take action before shocks take place.

Keywords: climate change resilience; spatial impacts; STIRPAT model; Nigeria

Procedia PDF Downloads 140
24450 Understanding Health Behavior Using Social Network Analysis

Authors: Namrata Mishra

Abstract:

Health of a person plays a vital role in the collective health of his community and hence the well-being of the society as a whole. But, in today’s fast paced technology driven world, health issues are increasingly being associated with human behaviors – their lifestyle. Social networks have tremendous impact on the health behavior of individuals. Many researchers have used social network analysis to understand human behavior that implicates their social and economic environments. It would be interesting to use a similar analysis to understand human behaviors that have health implications. This paper focuses on concepts of those behavioural analyses that have health implications using social networks analysis and provides possible algorithmic approaches. The results of these approaches can be used by the governing authorities for rolling out health plans, benefits and take preventive measures, while the pharmaceutical companies can target specific markets, helping health insurance companies to better model their insurance plans.

Keywords: breadth first search, directed graph, health behaviors, social network analysis

Procedia PDF Downloads 462
24449 Electric Vehicle Fleet Operators in the Energy Market - Feasibility and Effects on the Electricity Grid

Authors: Benjamin Blat Belmonte, Stephan Rinderknecht

Abstract:

The transition to electric vehicles (EVs) stands at the forefront of innovative strategies designed to address environmental concerns and reduce fossil fuel dependency. As the number of EVs on the roads increases, so too does the potential for their integration into energy markets. This research dives deep into the transformative possibilities of using electric vehicle fleets, specifically electric bus fleets, not just as consumers but as active participants in the energy market. This paper investigates the feasibility and grid effects of electric vehicle fleet operators in the energy market. Our objective centers around a comprehensive exploration of the sector coupling domain, with an emphasis on the economic potential in both electricity and balancing markets. Methodologically, our approach combines data mining techniques with thorough pre-processing, pulling from a rich repository of electricity and balancing market data. Our findings are grounded in the actual operational realities of the bus fleet operator in Darmstadt, Germany. We employ a Mixed Integer Linear Programming (MILP) approach, with the bulk of the computations being processed on the High-Performance Computing (HPC) platform ‘Lichtenbergcluster’. Our findings underscore the compelling economic potential of EV fleets in the energy market. With electric buses becoming more prevalent, the considerable size of these fleets, paired with their substantial battery capacity, opens up new horizons for energy market participation. Notably, our research reveals that economic viability is not the sole advantage. Participating actively in the energy market also translates into pronounced positive effects on grid stabilization. Essentially, EV fleet operators can serve a dual purpose: facilitating transport while simultaneously playing an instrumental role in enhancing grid reliability and resilience. This research highlights the symbiotic relationship between the growth of EV fleets and the stabilization of the energy grid. Such systems could lead to both commercial and ecological advantages, reinforcing the value of electric bus fleets in the broader landscape of sustainable energy solutions. In conclusion, the electrification of transport offers more than just a means to reduce local greenhouse gas emissions. By positioning electric vehicle fleet operators as active participants in the energy market, there lies a powerful opportunity to drive forward the energy transition. This study serves as a testament to the synergistic potential of EV fleets in bolstering both economic viability and grid stabilization, signaling a promising trajectory for future sector coupling endeavors.

Keywords: electric vehicle fleet, sector coupling, optimization, electricity market, balancing market

Procedia PDF Downloads 62
24448 Data Access, AI Intensity, and Scale Advantages

Authors: Chuping Lo

Abstract:

This paper presents a simple model demonstrating that ceteris paribus countries with lower barriers to accessing global data tend to earn higher incomes than other countries. Therefore, large countries that inherently have greater data resources tend to have higher incomes than smaller countries, such that the former may be more hesitant than the latter to liberalize cross-border data flows to maintain this advantage. Furthermore, countries with higher artificial intelligence (AI) intensity in production technologies tend to benefit more from economies of scale in data aggregation, leading to higher income and more trade as they are better able to utilize global data.

Keywords: digital intensity, digital divide, international trade, scale of economics

Procedia PDF Downloads 55
24447 Secured Transmission and Reserving Space in Images Before Encryption to Embed Data

Authors: G. R. Navaneesh, E. Nagarajan, C. H. Rajam Raju

Abstract:

Nowadays the multimedia data are used to store some secure information. All previous methods allocate a space in image for data embedding purpose after encryption. In this paper, we propose a novel method by reserving space in image with a boundary surrounded before encryption with a traditional RDH algorithm, which makes it easy for the data hider to reversibly embed data in the encrypted images. The proposed method can achieve real time performance, that is, data extraction and image recovery are free of any error. A secure transmission process is also discussed in this paper, which improves the efficiency by ten times compared to other processes as discussed.

Keywords: secure communication, reserving room before encryption, least significant bits, image encryption, reversible data hiding

Procedia PDF Downloads 403
24446 Identity Verification Using k-NN Classifiers and Autistic Genetic Data

Authors: Fuad M. Alkoot

Abstract:

DNA data have been used in forensics for decades. However, current research looks at using the DNA as a biometric identity verification modality. The goal is to improve the speed of identification. We aim at using gene data that was initially used for autism detection to find if and how accurate is this data for identification applications. Mainly our goal is to find if our data preprocessing technique yields data useful as a biometric identification tool. We experiment with using the nearest neighbor classifier to identify subjects. Results show that optimal classification rate is achieved when the test set is corrupted by normally distributed noise with zero mean and standard deviation of 1. The classification rate is close to optimal at higher noise standard deviation reaching 3. This shows that the data can be used for identity verification with high accuracy using a simple classifier such as the k-nearest neighbor (k-NN). 

Keywords: biometrics, genetic data, identity verification, k nearest neighbor

Procedia PDF Downloads 243
24445 Comparison between Continuous Genetic Algorithms and Particle Swarm Optimization for Distribution Network Reconfiguration

Authors: Linh Nguyen Tung, Anh Truong Viet, Nghien Nguyen Ba, Chuong Trinh Trong

Abstract:

This paper proposes a reconfiguration methodology based on a continuous genetic algorithm (CGA) and particle swarm optimization (PSO) for minimizing active power loss and minimizing voltage deviation. Both algorithms are adapted using graph theory to generate feasible individuals, and the modified crossover is used for continuous variable of CGA. To demonstrate the performance and effectiveness of the proposed methods, a comparative analysis of CGA with PSO for network reconfiguration, on 33-node and 119-bus radial distribution system is presented. The simulation results have shown that both CGA and PSO can be used in the distribution network reconfiguration and CGA outperformed PSO with significant success rate in finding optimal distribution network configuration.

Keywords: distribution network reconfiguration, particle swarm optimization, continuous genetic algorithm, power loss reduction, voltage deviation

Procedia PDF Downloads 175
24444 On-Ice Force-Velocity Modeling Technical Considerations

Authors: Dan Geneau, Mary Claire Geneau, Seth Lenetsky, Ming -Chang Tsai, Marc Klimstra

Abstract:

Introduction— Horizontal force-velocity profiling (HFVP) involves modeling an athletes linear sprint kinematics to estimate valuable maximum force and velocity metrics. This approach to performance modeling has been used in field-based team sports and has recently been introduced to ice-hockey as a forward skating performance assessment. While preliminary data has been collected on ice, distance constraints of the on-ice test restrict the ability of the athletes to reach their maximal velocity which result in limits of the model to effectively estimate athlete performance. This is especially true of more elite athletes. This report explores whether athletes on-ice are able to reach a velocity plateau similar to what has been seen in overground trials. Fourteen male Major Junior ice-hockey players (BW= 83.87 +/- 7.30 kg, height = 188 ± 3.4cm cm, age = 18 ± 1.2 years n = 14) were recruited. For on-ice sprints, participants completed a standardized warm-up consisting of skating and dynamic stretching and a progression of three skating efforts from 50% to 95%. Following the warm-up, participants completed three on ice 45m sprints, with three minutes of rest in between each trial. For overground sprints, participants completed a similar dynamic warm-up to that of on-ice trials. Following the warm-up participants completed three 40m overground sprint trials. For each trial (on-ice and overground), radar was used to collect instantaneous velocity (Stalker ATS II, Texas, USA) aimed at the participant’s waist. Sprint velocities were modelled using custom Python (version 3.2) script using a mono-exponential function, similar to previous work. To determine if on-ice tirals were achieving a maximum velocity (plateau), minimum acceleration values of the modeled data at the end of the sprint were compared (using paired t-test) between on-ice and overground trials. Significant differences (P<0.001) between overground and on-ice minimum accelerations were observed. It was found that on-ice trials consistently reported higher final acceleration values, indicating a maximum maintained velocity (plateau) had not been reached. Based on these preliminary findings, it is suggested that reliable HFVP metrics cannot yet be collected from all ice-hockey populations using current methods. Elite male populations were not able to achieve a velocity plateau similar to what has been seen in overground trials, indicating the absence of a maximum velocity measure. With current velocity and acceleration modeling techniques, including a dependency of a velocity plateau, these results indicate the potential for error in on-ice HFVP measures. Therefore, these findings suggest that a greater on-ice sprint distance may be required or the need for other velocity modeling techniques, where maximal velocity is not required for a complete profile.   

Keywords: ice-hockey, sprint, skating, power

Procedia PDF Downloads 93
24443 The Influence of Carbamazepine on the Activity of CYP3A4 in Patients with Alcoholism

Authors: Mikhail S. Zastrozhin, Valery V. Smirnov, Dmitry A. Sychev, Ludmila M. Savchenko, Evgeny A. Bryun, Mark O. Nechaev

Abstract:

Cytochrome P-450 isoenzyme 3A4 takes part in the biotransformation of medical drugs. The activity of CYP isoenzymes depends on genetic (polymorphisms of genes which encoded it) and phenotypic factors (a kind of food, a concomitant drug therapy). The aim of the study was to evaluate a carbamazepine effect on the CYP3A4 activity in patients with alcohol addiction. The study included 25 men with alcohol dependence, who received haloperidol during the exacerbation of the addiction. CYP3A4 activity was assessed by urinary 6-beta-hydroxycortisol/cortisol ratios measured by high performance liquid chromatography with mass spectrometry. The study modeled a graph and an equation of the logarithmic regression, that reflects the dependence of CYP3A4 activity on a dose of carbamazepine: y = 5,5 * 9,1 * 10-5 * x2. The study statistically significant demonstrates the effect of carbamazepine on CYP2D6 isozyme activity in patients with alcohol addiction.

Keywords: CYP3A4, biotransformation, carbamazepine, alcohol abuse

Procedia PDF Downloads 265
24442 A Review on Intelligent Systems for Geoscience

Authors: R Palson Kennedy, P.Kiran Sai

Abstract:

This article introduces machine learning (ML) researchers to the hurdles that geoscience problems present, as well as the opportunities for improvement in both ML and geosciences. This article presents a review from the data life cycle perspective to meet that need. Numerous facets of geosciences present unique difficulties for the study of intelligent systems. Geosciences data is notoriously difficult to analyze since it is frequently unpredictable, intermittent, sparse, multi-resolution, and multi-scale. The first half addresses data science’s essential concepts and theoretical underpinnings, while the second section contains key themes and sharing experiences from current publications focused on each stage of the data life cycle. Finally, themes such as open science, smart data, and team science are considered.

Keywords: Data science, intelligent system, machine learning, big data, data life cycle, recent development, geo science

Procedia PDF Downloads 125
24441 Feature Engineering Based Detection of Buffer Overflow Vulnerability in Source Code Using Deep Neural Networks

Authors: Mst Shapna Akter, Hossain Shahriar

Abstract:

One of the most important challenges in the field of software code audit is the presence of vulnerabilities in software source code. Every year, more and more software flaws are found, either internally in proprietary code or revealed publicly. These flaws are highly likely exploited and lead to system compromise, data leakage, or denial of service. C and C++ open-source code are now available in order to create a largescale, machine-learning system for function-level vulnerability identification. We assembled a sizable dataset of millions of opensource functions that point to potential exploits. We developed an efficient and scalable vulnerability detection method based on deep neural network models that learn features extracted from the source codes. The source code is first converted into a minimal intermediate representation to remove the pointless components and shorten the dependency. Moreover, we keep the semantic and syntactic information using state-of-the-art word embedding algorithms such as glove and fastText. The embedded vectors are subsequently fed into deep learning networks such as LSTM, BilSTM, LSTM-Autoencoder, word2vec, BERT, and GPT-2 to classify the possible vulnerabilities. Furthermore, we proposed a neural network model which can overcome issues associated with traditional neural networks. Evaluation metrics such as f1 score, precision, recall, accuracy, and total execution time have been used to measure the performance. We made a comparative analysis between results derived from features containing a minimal text representation and semantic and syntactic information. We found that all of the deep learning models provide comparatively higher accuracy when we use semantic and syntactic information as the features but require higher execution time as the word embedding the algorithm puts on a bit of complexity to the overall system.

Keywords: cyber security, vulnerability detection, neural networks, feature extraction

Procedia PDF Downloads 75
24440 Livelihood and Willingness to Accept Reducing Emission from Deforestation and Degradation by Local People in the Southwestern Nigeria

Authors: Adebayo John Julius, Emmanuel Imoagene

Abstract:

Mitigating global warming through reducing emission from deforestation and degradation (REDD) has been given increasing attentions in government-to-government negotiations while discussions among decision-makers have been going on, it is important to learn about the perception of local people in relation to REDD because the implementation will affect their lives. A survey was conducted using questionnaires to examine the livelihood and forest dependency of the local people in the vicinity of Onigambari and Ido area. Respondents’ income from forest activities and forest resources are collected. Participation in tourism related activities among the household members was also investigated to measure the potential of this “eco-friendly” income generation activity in the local communities. There was a general indication of reducing slash-and-burn activities with distance from the park and involvement in tourism-related job. Most of the local people were willing to accept compensation as alternative for slash-and-burn activities. The compensation preferred is in various form of development and different level of forest and environmental activities

Keywords: livelihood, emission, deforestation, degradation, local people, southwest Nigeria

Procedia PDF Downloads 130
24439 Quantum Computing with Qudits on a Graph

Authors: Aleksey Fedorov

Abstract:

Building a scalable platform for quantum computing remains one of the most challenging tasks in quantum science and technologies. However, the implementation of most important quantum operations with qubits (quantum analogues of classical bits), such as multiqubit Toffoli gate, requires either a polynomial number of operation or a linear number of operations with the use of ancilla qubits. Therefore, the reduction of the number of operations in the presence of scalability is a crucial goal in quantum information processing. One of the most elegant ideas in this direction is to use qudits (multilevel systems) instead of qubits and rely on additional levels of qudits instead of ancillas. Although some of the already obtained results demonstrate a reduction of the number of operation, they suffer from high complexity and/or of the absence of scalability. We show a strong reduction of the number of operations for the realization of the Toffoli gate by using qudits for a scalable multi-qudit processor. This is done on the basis of a general relation between the dimensionality of qudits and their topology of connections, that we derived.

Keywords: quantum computing, qudits, Toffoli gates, gate decomposition

Procedia PDF Downloads 135
24438 Implementation of A Treatment Escalation Plan During The Covid 19 Outbreak in Aneurin Bevan University Health Board

Authors: Peter Collett, Mike Pynn, Haseeb Ur Rahman

Abstract:

For the last few years across the UK there has been a push towards implementing treatment escalation plans (TEP) for every patient admitted to hospital. This is a paper form which is completed by a junior doctor then countersigned by the consultant responsible for the patient's care. It is designed to address what level of care is appropriate for the patient in question at point of entry to hospital. It helps decide whether the patient would benefit for ward based, high dependency or intensive care. They are completed to ensure the patient's best interests are maintained and aim to facilitate difficult decisions which may be required at a later date. For example, a frail patient with significant co-morbidities, unlikely to survive a pathology requiring an intensive care admission is admitted to hospital the decision can be made early to state the patient would not benefit from an ICU admission. This decision can be reversed depending on the clinical course of the patient's admission. It promotes discussions with the patient regarding their wishes to receive certain levels of healthcare. This poster describes the steps taken in the Aneurin Bevan University Health Board (ABUHB) when implementing the TEP form. The team implementing the TEP form campaigned for it's use to the board of directors. The directors were eager to hear of experiences of other health boards who had implemented the TEP form. The team presented the data produced in a number of health boards and demonstrated the proposed form. Concern was raised regarding the legalities of the form and that it could upset patients and relatives if the form was not explained properly. This delayed the effectuation of the TEP form and further research and discussion would be required. When COVID 19 reached the UK the National Institute for Health and Clinical Excellence issued guidance stating every patient admitted to hospital should be issued a TEP form. The TEP form was accelerated through the vetting process and was approved with immediate effect. The TEP form in ABUHB has now been in circulation for a month. An audit investigating it's uptake and a survey gathering opinions have been conducted.

Keywords: acute medicine, clinical governance, intensive care, patient centered decision making

Procedia PDF Downloads 168
24437 Data Quality as a Pillar of Data-Driven Organizations: Exploring the Benefits of Data Mesh

Authors: Marc Bachelet, Abhijit Kumar Chatterjee, José Manuel Avila

Abstract:

Data quality is a key component of any data-driven organization. Without data quality, organizations cannot effectively make data-driven decisions, which often leads to poor business performance. Therefore, it is important for an organization to ensure that the data they use is of high quality. This is where the concept of data mesh comes in. Data mesh is an organizational and architectural decentralized approach to data management that can help organizations improve the quality of data. The concept of data mesh was first introduced in 2020. Its purpose is to decentralize data ownership, making it easier for domain experts to manage the data. This can help organizations improve data quality by reducing the reliance on centralized data teams and allowing domain experts to take charge of their data. This paper intends to discuss how a set of elements, including data mesh, are tools capable of increasing data quality. One of the key benefits of data mesh is improved metadata management. In a traditional data architecture, metadata management is typically centralized, which can lead to data silos and poor data quality. With data mesh, metadata is managed in a decentralized manner, ensuring accurate and up-to-date metadata, thereby improving data quality. Another benefit of data mesh is the clarification of roles and responsibilities. In a traditional data architecture, data teams are responsible for managing all aspects of data, which can lead to confusion and ambiguity in responsibilities. With data mesh, domain experts are responsible for managing their own data, which can help provide clarity in roles and responsibilities and improve data quality. Additionally, data mesh can also contribute to a new form of organization that is more agile and adaptable. By decentralizing data ownership, organizations can respond more quickly to changes in their business environment, which in turn can help improve overall performance by allowing better insights into business as an effect of better reports and visualization tools. Monitoring and analytics are also important aspects of data quality. With data mesh, monitoring, and analytics are decentralized, allowing domain experts to monitor and analyze their own data. This will help in identifying and addressing data quality problems in quick time, leading to improved data quality. Data culture is another major aspect of data quality. With data mesh, domain experts are encouraged to take ownership of their data, which can help create a data-driven culture within the organization. This can lead to improved data quality and better business outcomes. Finally, the paper explores the contribution of AI in the coming years. AI can help enhance data quality by automating many data-related tasks, like data cleaning and data validation. By integrating AI into data mesh, organizations can further enhance the quality of their data. The concepts mentioned above are illustrated by AEKIDEN experience feedback. AEKIDEN is an international data-driven consultancy that has successfully implemented a data mesh approach. By sharing their experience, AEKIDEN can help other organizations understand the benefits and challenges of implementing data mesh and improving data quality.

Keywords: data culture, data-driven organization, data mesh, data quality for business success

Procedia PDF Downloads 122
24436 Numerical Simulation of Flow Past Inline Tandem Cylinders in Uniform Shear Flow

Authors: Rajesh Bhatt, Dilip Kumar Maiti

Abstract:

The incompressible shear flow past a square cylinder placed parallel to a plane wall of side length A in presence of upstream rectangular cylinder of height 0.5A and width 0.25A in an inline tandem arrangement are numerically investigated using finite volume method. The discretized equations are solved by an implicit, time-marching, pressure correction based SIMPLE algorithm. This study provides the qualitative insight in to the dependency of basic structure (i.e. vortex shedding or suppression) of flow over the downstream square cylinder and the upstream rectangular cylinder (and hence the aerodynamic characteristics) on inter-cylinder spacing (S) and Reynolds number (Re). The spacing between the cylinders is varied systematically from S = 0.5A to S = 7.0A so the sensitivity of the flow structure between the cylinders can be inspected. A sudden jump in strouhal number is observed, which shows the transition of flow pattern in the wake of the cylinders. The results are presented at Re = 100 and 200 in term of Strouhal number, RMS and mean of lift and drag coefficients and contour plots for different spacing.

Keywords: square cylinder, vortex shedding, isolated, tandem arrangement, spacing distance

Procedia PDF Downloads 538
24435 Experimental Verification of On-Board Power Generation System for Vehicle Application

Authors: Manish Kumar, Krupa Shah

Abstract:

The usage of renewable energy sources is increased day by day to overcome the dependency on fossil fuels. The wind energy is considered as a prominent source of renewable energy. This paper presents an approach for utilizing wind energy obtained from moving the vehicle for cell-phone charging. The selection of wind turbine, blades, generator, etc. is done to have the most efficient system. The calculation procedure for power generated and drag force is shown to know the effectiveness of the proposal. The location of the turbine is selected such that the system remains symmetric, stable and has the maximum induced wind. The calculation of the generated power at different velocity is presented. The charging is achieved for the speed 30 km/h and the system works well till 60 km/h. The model proposed seems very useful for the people traveling long distances in the absence of mobile electricity. The model is very economical and easy to fabricate. It has very less weight and area that makes it portable and comfortable to carry along. The practical results are shown by implementing the portable wind turbine system on two-wheeler.

Keywords: cell-phone charging, on-board power generation, wind energy, vehicle

Procedia PDF Downloads 289
24434 Big Data Analysis with RHadoop

Authors: Ji Eun Shin, Byung Ho Jung, Dong Hoon Lim

Abstract:

It is almost impossible to store or analyze big data increasing exponentially with traditional technologies. Hadoop is a new technology to make that possible. R programming language is by far the most popular statistical tool for big data analysis based on distributed processing with Hadoop technology. With RHadoop that integrates R and Hadoop environment, we implemented parallel multiple regression analysis with different sizes of actual data. Experimental results showed our RHadoop system was much faster as the number of data nodes increases. We also compared the performance of our RHadoop with lm function and big lm packages available on big memory. The results showed that our RHadoop was faster than other packages owing to paralleling processing with increasing the number of map tasks as the size of data increases.

Keywords: big data, Hadoop, parallel regression analysis, R, RHadoop

Procedia PDF Downloads 424
24433 A Mutually Exclusive Task Generation Method Based on Data Augmentation

Authors: Haojie Wang, Xun Li, Rui Yin

Abstract:

In order to solve the memorization overfitting in the meta-learning MAML algorithm, a method of generating mutually exclusive tasks based on data augmentation is proposed. This method generates a mutex task by corresponding one feature of the data to multiple labels, so that the generated mutex task is inconsistent with the data distribution in the initial dataset. Because generating mutex tasks for all data will produce a large number of invalid data and, in the worst case, lead to exponential growth of computation, this paper also proposes a key data extraction method, that only extracts part of the data to generate the mutex task. The experiments show that the method of generating mutually exclusive tasks can effectively solve the memorization overfitting in the meta-learning MAML algorithm.

Keywords: data augmentation, mutex task generation, meta-learning, text classification.

Procedia PDF Downloads 82
24432 Efficient Positioning of Data Aggregation Point for Wireless Sensor Network

Authors: Sifat Rahman Ahona, Rifat Tasnim, Naima Hassan

Abstract:

Data aggregation is a helpful technique for reducing the data communication overhead in wireless sensor network. One of the important tasks of data aggregation is positioning of the aggregator points. There are a lot of works done on data aggregation. But, efficient positioning of the aggregators points is not focused so much. In this paper, authors are focusing on the positioning or the placement of the aggregation points in wireless sensor network. Authors proposed an algorithm to select the aggregators positions for a scenario where aggregator nodes are more powerful than sensor nodes.

Keywords: aggregation point, data communication, data aggregation, wireless sensor network

Procedia PDF Downloads 146