Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 24468

Search results for: big data computation

24258 Quantification of Magnetic Resonance Elastography for Tissue Shear Modulus using U-Net Trained with Finite-Differential Time-Domain Simulation

Authors: Jiaying Zhang, Xin Mu, Chang Ni, Jeff L. Zhang

Abstract:

Magnetic resonance elastography (MRE) non-invasively assesses tissue elastic properties, such as shear modulus, by measuring tissue’s displacement in response to mechanical waves. The estimated metrics on tissue elasticity or stiffness have been shown to be valuable for monitoring physiologic or pathophysiologic status of tissue, such as a tumor or fatty liver. To quantify tissue shear modulus from MRE-acquired displacements (essentially an inverse problem), multiple approaches have been proposed, including Local Frequency Estimation (LFE) and Direct Inversion (DI). However, one common problem with these methods is that the estimates are severely noise-sensitive due to either the inverse-problem nature or noise propagation in the pixel-by-pixel process. With the advent of deep learning (DL) and its promise in solving inverse problems, a few groups in the field of MRE have explored the feasibility of using DL methods for quantifying shear modulus from MRE data. Most of the groups chose to use real MRE data for DL model training and to cut training images into smaller patches, which enriches feature characteristics of training data but inevitably increases computation time and results in outcomes with patched patterns. In this study, simulated wave images generated by Finite Differential Time Domain (FDTD) simulation are used for network training, and U-Net is used to extract features from each training image without cutting it into patches. The use of simulated data for model training has the flexibility of customizing training datasets to match specific applications. The proposed method aimed to estimate tissue shear modulus from MRE data with high robustness to noise and high model-training efficiency. Specifically, a set of 3000 maps of shear modulus (with a range of 1 kPa to 15 kPa) containing randomly positioned objects were simulated, and their corresponding wave images were generated. The two types of data were fed into the training of a U-Net model as its output and input, respectively. For an independently simulated set of 1000 images, the performance of the proposed method against DI and LFE was compared by the relative errors (root mean square error or RMSE divided by averaged shear modulus) between the true shear modulus map and the estimated ones. The results showed that the estimated shear modulus by the proposed method achieved a relative error of 4.91%±0.66%, substantially lower than 78.20%±1.11% by LFE. Using simulated data, the proposed method significantly outperformed LFE and DI in resilience to increasing noise levels and in resolving fine changes of shear modulus. The feasibility of the proposed method was also tested on MRE data acquired from phantoms and from human calf muscles, resulting in maps of shear modulus with low noise. In future work, the method’s performance on phantom and its repeatability on human data will be tested in a more quantitative manner. In conclusion, the proposed method showed much promise in quantifying tissue shear modulus from MRE with high robustness and efficiency.

Keywords: deep learning, magnetic resonance elastography, magnetic resonance imaging, shear modulus estimation

Procedia PDF Downloads 32

24257 Algorithms used in Spatial Data Mining GIS

Authors: Vahid Bairami Rad

Abstract:

Extracting knowledge from spatial data like GIS data is important to reduce the data and extract information. Therefore, the development of new techniques and tools that support the human in transforming data into useful knowledge has been the focus of the relatively new and interdisciplinary research area ‘knowledge discovery in databases’. Thus, we introduce a set of database primitives or basic operations for spatial data mining which are sufficient to express most of the spatial data mining algorithms from the literature. This approach has several advantages. Similar to the relational standard language SQL, the use of standard primitives will speed-up the development of new data mining algorithms and will also make them more portable. We introduced a database-oriented framework for spatial data mining which is based on the concepts of neighborhood graphs and paths. A small set of basic operations on these graphs and paths were defined as database primitives for spatial data mining. Furthermore, techniques to efficiently support the database primitives by a commercial DBMS were presented.

Keywords: spatial data base, knowledge discovery database, data mining, spatial relationship, predictive data mining

Procedia PDF Downloads 424

24256 Quasi-Photon Monte Carlo on Radiative Heat Transfer: An Importance Sampling and Learning Approach

Authors: Utkarsh A. Mishra, Ankit Bansal

Abstract:

At high temperature, radiative heat transfer is the dominant mode of heat transfer. It is governed by various phenomena such as photon emission, absorption, and scattering. The solution of the governing integrodifferential equation of radiative transfer is a complex process, more when the effect of participating medium and wavelength properties are taken into consideration. Although a generic formulation of such radiative transport problem can be modeled for a wide variety of problems with non-gray, non-diffusive surfaces, there is always a trade-off between simplicity and accuracy of the problem. Recently, solutions of complicated mathematical problems with statistical methods based on randomization of naturally occurring phenomena have gained significant importance. Photon bundles with discrete energy can be replicated with random numbers describing the emission, absorption, and scattering processes. Photon Monte Carlo (PMC) is a simple, yet powerful technique, to solve radiative transfer problems in complicated geometries with arbitrary participating medium. The method, on the one hand, increases the accuracy of estimation, and on the other hand, increases the computational cost. The participating media -generally a gas, such as CO₂, CO, and H₂O- present complex emission and absorption spectra. To model the emission/absorption accurately with random numbers requires a weighted sampling as different sections of the spectrum carries different importance. Importance sampling (IS) was implemented to sample random photon of arbitrary wavelength, and the sampled data provided unbiased training of MC estimators for better results. A better replacement to uniform random numbers is using deterministic, quasi-random sequences. Halton, Sobol, and Faure Low-Discrepancy Sequences are used in this study. They possess better space-filling performance than the uniform random number generator and gives rise to a low variance, stable Quasi-Monte Carlo (QMC) estimators with faster convergence. An optimal supervised learning scheme was further considered to reduce the computation costs of the PMC simulation. A one-dimensional plane-parallel slab problem with participating media was formulated. The history of some randomly sampled photon bundles is recorded to train an Artificial Neural Network (ANN), back-propagation model. The flux was calculated using the standard quasi PMC and was considered to be the training target. Results obtained with the proposed model for the one-dimensional problem are compared with the exact analytical and PMC model with the Line by Line (LBL) spectral model. The approximate variance obtained was around 3.14%. Results were analyzed with respect to time and the total flux in both cases. A significant reduction in variance as well a faster rate of convergence was observed in the case of the QMC method over the standard PMC method. However, the results obtained with the ANN method resulted in greater variance (around 25-28%) as compared to the other cases. There is a great scope of machine learning models to help in further reduction of computation cost once trained successfully. Multiple ways of selecting the input data as well as various architectures will be tried such that the concerned environment can be fully addressed to the ANN model. Better results can be achieved in this unexplored domain.

Keywords: radiative heat transfer, Monte Carlo Method, pseudo-random numbers, low discrepancy sequences, artificial neural networks

Procedia PDF Downloads 185

24255 Data Stream Association Rule Mining with Cloud Computing

Authors: B. Suraj Aravind, M. H. M. Krishna Prasad

Abstract:

There exist emerging applications of data streams that require association rule mining, such as network traffic monitoring, web click streams analysis, sensor data, data from satellites etc. Data streams typically arrive continuously in high speed with huge amount and changing data distribution. This raises new issues that need to be considered when developing association rule mining techniques for stream data. This paper proposes to introduce an improved data stream association rule mining algorithm by eliminating the limitation of resources. For this, the concept of cloud computing is used. Inclusion of this may lead to additional unknown problems which needs further research.

Keywords: data stream, association rule mining, cloud computing, frequent itemsets

Procedia PDF Downloads 469

24254 Computation of Stress Intensity Factor Using Extended Finite Element Method

Authors: Mahmoudi Noureddine, Bouregba Rachid

Abstract:

In this paper the stress intensity factors of a slant-cracked plate of AISI 304 stainless steel, have been calculated using extended finite element method and finite element method (FEM) in ABAQUS software, the results were compared with theoretical values.

Keywords: stress intensity factors, extended finite element method, stainless steel, abaqus

Procedia PDF Downloads 583

24253 A Comprehensive Survey and Improvement to Existing Privacy Preserving Data Mining Techniques

Authors: Tosin Ige

Abstract:

Ethics must be a condition of the world, like logic. (Ludwig Wittgenstein, 1889-1951). As important as data mining is, it possess a significant threat to ethics, privacy, and legality, since data mining makes it difficult for an individual or consumer (in the case of a company) to control the accessibility and usage of his data. This research focuses on Current issues and the latest research and development on Privacy preserving data mining methods as at year 2022. It also discusses some advances in those techniques while at the same time highlighting and providing a new technique as a solution to an existing technique of privacy preserving data mining methods. This paper also bridges the wide gap between Data mining and the Web Application Programing Interface (web API), where research is urgently needed for an added layer of security in data mining while at the same time introducing a seamless and more efficient way of data mining.

Keywords: data, privacy, data mining, association rule, privacy preserving, mining technique

Procedia PDF Downloads 127

24252 Big Data: Concepts, Technologies and Applications in the Public Sector

Authors: A. Alexandru, C. A. Alexandru, D. Coardos, E. Tudora

Abstract:

Big Data (BD) is associated with a new generation of technologies and architectures which can harness the value of extremely large volumes of very varied data through real time processing and analysis. It involves changes in (1) data types, (2) accumulation speed, and (3) data volume. This paper presents the main concepts related to the BD paradigm, and introduces architectures and technologies for BD and BD sets. The integration of BD with the Hadoop Framework is also underlined. BD has attracted a lot of attention in the public sector due to the newly emerging technologies that allow the availability of network access. The volume of different types of data has exponentially increased. Some applications of BD in the public sector in Romania are briefly presented.

Keywords: big data, big data analytics, Hadoop, cloud

Procedia PDF Downloads 278

24251 Semantic Data Schema Recognition

Authors: Aïcha Ben Salem, Faouzi Boufares, Sebastiao Correia

Abstract:

The subject covered in this paper aims at assisting the user in its quality approach. The goal is to better extract, mix, interpret and reuse data. It deals with the semantic schema recognition of a data source. This enables the extraction of data semantics from all the available information, inculding the data and the metadata. Firstly, it consists of categorizing the data by assigning it to a category and possibly a sub-category, and secondly, of establishing relations between columns and possibly discovering the semantics of the manipulated data source. These links detected between columns offer a better understanding of the source and the alternatives for correcting data. This approach allows automatic detection of a large number of syntactic and semantic anomalies.

Keywords: schema recognition, semantic data profiling, meta-categorisation, semantic dependencies inter columns

Procedia PDF Downloads 392

24250 Applied Complement of Probability and Information Entropy for Prediction in Student Learning

Authors: Kennedy Efosa Ehimwenma, Sujatha Krishnamoorthy, Safiya Al‑Sharji

Abstract:

The probability computation of events is in the interval of [0, 1], which are values that are determined by the number of outcomes of events in a sample space S. The probability Pr(A) that an event A will never occur is 0. The probability Pr(B) that event B will certainly occur is 1. This makes both events A and B a certainty. Furthermore, the sum of probabilities Pr(E₁) + Pr(E₂) + … + Pr(Eₙ) of a finite set of events in a given sample space S equals 1. Conversely, the difference of the sum of two probabilities that will certainly occur is 0. This paper first discusses Bayes, the complement of probability, and the difference of probability for occurrences of learning-events before applying them in the prediction of learning objects in student learning. Given the sum of 1; to make a recommendation for student learning, this paper proposes that the difference of argMaxPr(S) and the probability of student-performance quantifies the weight of learning objects for students. Using a dataset of skill-set, the computational procedure demonstrates i) the probability of skill-set events that have occurred that would lead to higher-level learning; ii) the probability of the events that have not occurred that requires subject-matter relearning; iii) accuracy of the decision tree in the prediction of student performance into class labels and iv) information entropy about skill-set data and its implication on student cognitive performance and recommendation of learning.

Keywords: complement of probability, Bayes’ rule, prediction, pre-assessments, computational education, information theory

Procedia PDF Downloads 127

24249 Access Control System for Big Data Application

Authors: Winfred Okoe Addy, Jean Jacques Dominique Beraud

Abstract:

Access control systems (ACs) are some of the most important components in safety areas. Inaccuracies of regulatory frameworks make personal policies and remedies more appropriate than standard models or protocols. This problem is exacerbated by the increasing complexity of software, such as integrated Big Data (BD) software for controlling large volumes of encrypted data and resources embedded in a dedicated BD production system. This paper proposes a general access control strategy system for the diffusion of Big Data domains since it is crucial to secure the data provided to data consumers (DC). We presented a general access control circulation strategy for the Big Data domain by describing the benefit of using designated access control for BD units and performance and taking into consideration the need for BD and AC system. We then presented a generic of Big Data access control system to improve the dissemination of Big Data.

Keywords: access control, security, Big Data, domain

Procedia PDF Downloads 105

24248 A Data Envelopment Analysis Model in a Multi-Objective Optimization with Fuzzy Environment

Authors: Michael Gidey Gebru

Abstract:

Most of Data Envelopment Analysis models operate in a static environment with input and output parameters that are chosen by deterministic data. However, due to ambiguity brought on shifting market conditions, input and output data are not always precisely gathered in real-world scenarios. Fuzzy numbers can be used to address this kind of ambiguity in input and output data. Therefore, this work aims to expand crisp Data Envelopment Analysis into Data Envelopment Analysis with fuzzy environment. In this study, the input and output data are regarded as fuzzy triangular numbers. Then, the Data Envelopment Analysis model with fuzzy environment is solved using a multi-objective method to gauge the Decision Making Units' efficiency. Finally, the developed Data Envelopment Analysis model is illustrated with an application on real data 50 educational institutions.

Keywords: efficiency, Data Envelopment Analysis, fuzzy, higher education, input, output

Procedia PDF Downloads 10

24247 An Enhanced Distributed Weighted Clustering Algorithm for Intra and Inter Cluster Routing in MANET

Authors: K. Gomathi

Abstract:

Mobile Ad hoc Networks (MANET) is defined as collection of routable wireless mobile nodes with no centralized administration and communicate each other using radio signals. Especially MANETs deployed in hostile environments where hackers will try to disturb the secure data transfer and drain the valuable network resources. Since MANET is battery operated network, preserving the network resource is essential one. For resource constrained computation, efficient routing and to increase the network stability, the network is divided into smaller groups called clusters. The clustering architecture consists of Cluster Head(CH), ordinary node and gateway. The CH is responsible for inter and intra cluster routing. CH election is a prominent research area and many more algorithms are developed using many different metrics. The CH with longer life sustains network lifetime, for this purpose Secondary Cluster Head(SCH) also elected and it is more economical. To nominate efficient CH, a Enhanced Distributed Weighted Clustering Algorithm (EDWCA) has been proposed. This approach considers metrics like battery power, degree difference and speed of the node for CH election. The proficiency of proposed one is evaluated and compared with existing algorithm using Network Simulator(NS-2).

Keywords: MANET, EDWCA, clustering, cluster head

Procedia PDF Downloads 363

24246 Cellular Automata Using Fractional Integral Model

Authors: Yasser F. Hassan

Abstract:

In this paper, a proposed model of cellular automata is studied by means of fractional integral function. A cellular automaton is a decentralized computing model providing an excellent platform for performing complex computation with the help of only local information. The paper discusses how using fractional integral function for representing cellular automata memory or state. The architecture of computing and learning model will be given and the results of calibrating of approach are also given.

Keywords: fractional integral, cellular automata, memory, learning

Procedia PDF Downloads 379

24245 Bayesian Parameter Inference for Continuous Time Markov Chains with Intractable Likelihood

Authors: Randa Alharbi, Vladislav Vyshemirsky

Abstract:

Systems biology is an important field in science which focuses on studying behaviour of biological systems. Modelling is required to produce detailed description of the elements of a biological system, their function, and their interactions. A well-designed model requires selecting a suitable mechanism which can capture the main features of the system, define the essential components of the system and represent an appropriate law that can define the interactions between its components. Complex biological systems exhibit stochastic behaviour. Thus, using probabilistic models are suitable to describe and analyse biological systems. Continuous-Time Markov Chain (CTMC) is one of the probabilistic models that describe the system as a set of discrete states with continuous time transitions between them. The system is then characterised by a set of probability distributions that describe the transition from one state to another at a given time. The evolution of these probabilities through time can be obtained by chemical master equation which is analytically intractable but it can be simulated. Uncertain parameters of such a model can be inferred using methods of Bayesian inference. Yet, inference in such a complex system is challenging as it requires the evaluation of the likelihood which is intractable in most cases. There are different statistical methods that allow simulating from the model despite intractability of the likelihood. Approximate Bayesian computation is a common approach for tackling inference which relies on simulation of the model to approximate the intractable likelihood. Particle Markov chain Monte Carlo (PMCMC) is another approach which is based on using sequential Monte Carlo to estimate intractable likelihood. However, both methods are computationally expensive. In this paper we discuss the efficiency and possible practical issues for each method, taking into account the computational time for these methods. We demonstrate likelihood-free inference by performing analysing a model of the Repressilator using both methods. Detailed investigation is performed to quantify the difference between these methods in terms of efficiency and computational cost.

Keywords: Approximate Bayesian computation(ABC), Continuous-Time Markov Chains, Sequential Monte Carlo, Particle Markov chain Monte Carlo (PMCMC)

Procedia PDF Downloads 179

24244 Flow Transformation: An Investigation on Theoretical Aspects and Numerical Computation

Authors: Abhisek Sarkar, Abhimanyu Gaur

Abstract:

In this report we have discussed the theoretical aspects of the flow transformation, occurring through a series of bifurcations. The parameters and their continuous diversion, the intermittent bursts in the transition zone, variation of velocity and pressure with time, effect of roughness in turbulent zone, and changes in friction factor and head loss coefficient as a function of Reynolds number for a transverse flow across a cylinder have been discussed. An analysis of the variation in the wake length with Reynolds number was done in FORTRAN.

Keywords: bifurcation, attractor, intermittence, energy cascade, energy spectra, vortex stretching

Procedia PDF Downloads 369

24243 Optimization of the Numerical Fracture Mechanics

Authors: H. Hentati, R. Abdelmoula, Li Jia, A. Maalej

Abstract:

In this work, we present numerical simulations of the quasi-static crack propagation based on the variation approach. We perform numerical simulations of a piece of brittle material without initial crack. An alternate minimization algorithm is used. Based on these numerical results, we determine the influence of numerical parameters on the location of crack. We show the importance of trying to optimize the time of numerical computation and we present the first attempt to develop a simple numerical method to optimize this time.

Keywords: fracture mechanics, optimization, variation approach, mechanic

Procedia PDF Downloads 575

24242 Hardware Implementation of Local Binary Pattern Based Two-Bit Transform Motion Estimation

Authors: Seda Yavuz, Anıl Çelebi, Aysun Taşyapı Çelebi, Oğuzhan Urhan

Abstract:

Nowadays, demand for using real-time video transmission capable devices is ever-increasing. So, high resolution videos have made efficient video compression techniques an essential component for capturing and transmitting video data. Motion estimation has a critical role in encoding raw video. Hence, various motion estimation methods are introduced to efficiently compress the video. Low bit‑depth representation based motion estimation methods facilitate computation of matching criteria and thus, provide small hardware footprint. In this paper, a hardware implementation of a two-bit transformation based low-complexity motion estimation method using local binary pattern approach is proposed. Image frames are represented in two-bit depth instead of full-depth by making use of the local binary pattern as a binarization approach and the binarization part of the hardware architecture is explained in detail. Experimental results demonstrate the difference between the proposed hardware architecture and the architectures of well-known low-complexity motion estimation methods in terms of important aspects such as resource utilization, energy and power consumption.

Keywords: binarization, hardware architecture, local binary pattern, motion estimation, two-bit transform

Procedia PDF Downloads 274

24241 Graph Codes - 2D Projections of Multimedia Feature Graphs for Fast and Effective Retrieval

Authors: Stefan Wagenpfeil, Felix Engel, Paul McKevitt, Matthias Hemmje

Abstract:

Multimedia Indexing and Retrieval is generally designed and implemented by employing feature graphs. These graphs typically contain a significant number of nodes and edges to reflect the level of detail in feature detection. A higher level of detail increases the effectiveness of the results but also leads to more complex graph structures. However, graph-traversal-based algorithms for similarity are quite inefficient and computation intensive, especially for large data structures. To deliver fast and effective retrieval, an efficient similarity algorithm, particularly for large graphs, is mandatory. Hence, in this paper, we define a graph-projection into a 2D space (Graph Code) as well as the corresponding algorithms for indexing and retrieval. We show that calculations in this space can be performed more efficiently than graph-traversals due to a simpler processing model and a high level of parallelization. In consequence, we prove that the effectiveness of retrieval also increases substantially, as Graph Codes facilitate more levels of detail in feature fusion. Thus, Graph Codes provide a significant increase in efficiency and effectiveness (especially for Multimedia indexing and retrieval) and can be applied to images, videos, audio, and text information.

Keywords: indexing, retrieval, multimedia, graph algorithm, graph code

Procedia PDF Downloads 129

24240 Effects of Computer Aided Instructional Package on Performance and Retention of Genetic Concepts amongst Secondary School Students in Niger State, Nigeria

Authors: Muhammad R. Bello, Mamman A. Wasagu, Yahya M. Kamar

Abstract:

The study investigated the effects of computer-aided instructional package (CAIP) on performance and retention of genetic concepts among secondary school students in Niger State. Quasi-experimental research design i.e. pre-test-post-test experimental and control groups were adopted for the study. The population of the study was all senior secondary school three (SS3) students’ offering biology. A sample of 223 students was randomly drawn from six purposively selected secondary schools. The researchers’ developed computer aided instructional package (CAIP) on genetic concepts was used as treatment instrument for the experimental group while the control group was exposed to the conventional lecture method (CLM). The instrument for data collection was a Genetic Performance Test (GEPET) that had 50 multiple-choice questions which were validated by science educators. A Reliability coefficient of 0.92 was obtained for GEPET using Pearson Product Moment Correlation (PPMC). The data collected were analyzed using IBM SPSS Version 20 package for computation of Means, Standard deviation, t-test, and analysis of covariance (ANCOVA). The ANOVA analysis (Fcal (220) = 27.147, P < 0.05) shows that students who received instruction with CAIP outperformed the students who received instruction with CLM and also had higher retention. The findings also revealed no significant difference in performance and retention between male and female students (tcal (103) = -1.429, P > 0.05). It was recommended amongst others that teachers should use computer-aided instructional package in teaching genetic concepts in order to improve students’ performance and retention in biology subject. Keywords: Computer-aided Instructional Package, Performance, Retention and Genetic Concepts.

Keywords: computer aided instructional package, performance, retention, genetic concepts, senior secondary school students

Procedia PDF Downloads 334

24239 The Economic Limitations of Defining Data Ownership Rights

Authors: Kacper Tomasz Kröber-Mulawa

Abstract:

This paper will address the topic of data ownership from an economic perspective, and examples of economic limitations of data property rights will be provided, which have been identified using methods and approaches of economic analysis of law. To properly build a background for the economic focus, in the beginning a short perspective of data and data ownership in the EU’s legal system will be provided. It will include a short introduction to its political and social importance and highlight relevant viewpoints. This will stress the importance of a Single Market for data but also far-reaching regulations of data governance and privacy (including the distinction of personal and non-personal data, data held by public bodies and private businesses). The main discussion of this paper will build upon the briefly referred to legal basis as well as methods and approaches of economic analysis of law.

Keywords: antitrust, data, data ownership, digital economy, property rights

Procedia PDF Downloads 48

24238 Factorization of Computations in Bayesian Networks: Interpretation of Factors

Authors: Linda Smail, Zineb Azouz

Abstract:

Given a Bayesian network relative to a set I of discrete random variables, we are interested in computing the probability distribution P(S) where S is a subset of I. The general idea is to write the expression of P(S) in the form of a product of factors where each factor is easy to compute. More importantly, it will be very useful to give an interpretation of each of the factors in terms of conditional probabilities. This paper considers a semantic interpretation of the factors involved in computing marginal probabilities in Bayesian networks. Establishing such a semantic interpretations is indeed interesting and relevant in the case of large Bayesian networks.

Keywords: Bayesian networks, D-Separation, level two Bayesian networks, factorization of computation

Procedia PDF Downloads 490

24237 Protecting the Cloud Computing Data Through the Data Backups

Authors: Abdullah Alsaeed

Abstract:

Virtualized computing and cloud computing infrastructures are no longer fuzz or marketing term. They are a core reality in today’s corporate Information Technology (IT) organizations. Hence, developing an effective and efficient methodologies for data backup and data recovery is required more than any time. The purpose of data backup and recovery techniques are to assist the organizations to strategize the business continuity and disaster recovery approaches. In order to accomplish this strategic objective, a variety of mechanism were proposed in the recent years. This research paper will explore and examine the latest techniques and solutions to provide data backup and restoration for the cloud computing platforms.

Keywords: data backup, data recovery, cloud computing, business continuity, disaster recovery, cost-effective, data encryption.

Procedia PDF Downloads 55

24236 Missing Link Data Estimation with Recurrent Neural Network: An Application Using Speed Data of Daegu Metropolitan Area

Authors: JaeHwan Yang, Da-Woon Jeong, Seung-Young Kho, Dong-Kyu Kim

Abstract:

In terms of ITS, information on link characteristic is an essential factor for plan or operation. But in practical cases, not every link has installed sensors on it. The link that does not have data on it is called “Missing Link”. The purpose of this study is to impute data of these missing links. To get these data, this study applies the machine learning method. With the machine learning process, especially for the deep learning process, missing link data can be estimated from present link data. For deep learning process, this study uses “Recurrent Neural Network” to take time-series data of road. As input data, Dedicated Short-range Communications (DSRC) data of Dalgubul-daero of Daegu Metropolitan Area had been fed into the learning process. Neural Network structure has 17 links with present data as input, 2 hidden layers, for 1 missing link data. As a result, forecasted data of target link show about 94% of accuracy compared with actual data.

Keywords: data estimation, link data, machine learning, road network

Procedia PDF Downloads 484

24235 Customer Data Analysis Model Using Business Intelligence Tools in Telecommunication Companies

Authors: Monica Lia

Abstract:

This article presents a customer data analysis model using business intelligence tools for data modelling, transforming, data visualization and dynamic reports building. Economic organizational customer’s analysis is made based on the information from the transactional systems of the organization. The paper presents how to develop the data model starting for the data that companies have inside their own operational systems. The owned data can be transformed into useful information about customers using business intelligence tool. For a mature market, knowing the information inside the data and making forecast for strategic decision become more important. Business Intelligence tools are used in business organization as support for decision-making.

Keywords: customer analysis, business intelligence, data warehouse, data mining, decisions, self-service reports, interactive visual analysis, and dynamic dashboards, use cases diagram, process modelling, logical data model, data mart, ETL, star schema, OLAP, data universes

Procedia PDF Downloads 392

24234 Opening up Government Datasets for Big Data Analysis to Support Policy Decisions

Authors: K. Hardy, A. Maurushat

Abstract:

Policy makers are increasingly looking to make evidence-based decisions. Evidence-based decisions have historically used rigorous methodologies of empirical studies by research institutes, as well as less reliable immediate survey/polls often with limited sample sizes. As we move into the era of Big Data analytics, policy makers are looking to different methodologies to deliver reliable empirics in real-time. The question is not why did these people do this for the last 10 years, but why are these people doing this now, and if the this is undesirable, and how can we have an impact to promote change immediately. Big data analytics rely heavily on government data that has been released in to the public domain. The open data movement promises greater productivity and more efficient delivery of services; however, Australian government agencies remain reluctant to release their data to the general public. This paper considers the barriers to releasing government data as open data, and how these barriers might be overcome.

Keywords: big data, open data, productivity, data governance

Procedia PDF Downloads 341

24233 Carbon Nanotube Field Effect Transistor - a Review

Authors: P. Geetha, R. S. D. Wahida Banu

Abstract:

The crowning advances in Silicon based electronic technology have dominated the computation world for the past decades. The captivating performance of Si devices lies in sustainable scaling down of the physical dimensions, by that increasing device density and improved performance. But, the fundamental limitations due to physical, technological, economical, and manufacture features restrict further miniaturization of Si based devices. The pit falls are due to scaling down of the devices such as process variation, short channel effects, high leakage currents, and reliability concerns. To fix the above-said problems, it is needed either to follow a new concept that will manage the current hitches or to support the available concept with different materials. The new concept is to design spintronics, quantum computation or two terminal molecular devices. Otherwise, presently used well known three terminal devices can be modified with different materials that suits to address the scaling down difficulties. The first approach will occupy in the far future since it needs considerable effort; the second path is a bright light towards the travel. Modelling paves way to know not only the current-voltage characteristics but also the performance of new devices. So, it is desirable to model a new device of suitable gate control and project the its abilities towards capability of handling high current, high power, high frequency, short delay, and high velocity with excellent electronic and optical properties. Carbon nanotube became a thriving material to replace silicon in nano devices. A well-planned optimized utilization of the carbon material leads to many more advantages. The unique nature of this organic material allows the recent developments in almost all fields of applications from an automobile industry to medical science, especially in electronics field-on which the automation industry depends. More research works were being done in this area. This paper reviews the carbon nanotube field effect transistor with various gate configurations, number of channel element, CNT wall configurations and different modelling techniques.

Keywords: array of channels, carbon nanotube field effect transistor, double gate transistor, gate wrap around transistor, modelling, multi-walled CNT, single-walled CNT

Procedia PDF Downloads 287

24232 Analysis of a IncResU-Net Model for R-Peak Detection in ECG Signals

Authors: Beatriz Lafuente Alcázar, Yash Wani, Amit J. Nimunkar

Abstract:

Cardiovascular Diseases (CVDs) are the leading cause of death globally, and around 80% of sudden cardiac deaths are due to arrhythmias or irregular heartbeats. The majority of these pathologies are revealed by either short-term or long-term alterations in the electrocardiogram (ECG) morphology. The ECG is the main diagnostic tool in cardiology. It is a non-invasive, pain free procedure that measures the heart’s electrical activity and that allows the detecting of abnormal rhythms and underlying conditions. A cardiologist can diagnose a wide range of pathologies based on ECG’s form alterations, but the human interpretation is subjective and it is contingent to error. Moreover, ECG records can be quite prolonged in time, which can further complicate visual diagnosis, and deeply retard disease detection. In this context, deep learning methods have risen as a promising strategy to extract relevant features and eliminate individual subjectivity in ECG analysis. They facilitate the computation of large sets of data and can provide early and precise diagnoses. Therefore, the cardiology field is one of the areas that can most benefit from the implementation of deep learning algorithms. In the present study, a deep learning algorithm is trained following a novel approach, using a combination of different databases as the training set. The goal of the algorithm is to achieve the detection of R-peaks in ECG signals. Its performance is further evaluated in ECG signals with different origins and features to test the model’s ability to generalize its outcomes. Performance of the model for detection of R-peaks for clean and noisy ECGs is presented. The model is able to detect R-peaks in the presence of various types of noise, and when presented with data, it has not been trained. It is expected that this approach will increase the effectiveness and capacity of cardiologists to detect divergences in the normal cardiac activity of their patients.

Keywords: arrhythmia, deep learning, electrocardiogram, machine learning, R-peaks

Procedia PDF Downloads 143

24231 A Review on Existing Challenges of Data Mining and Future Research Perspectives

Authors: Hema Bhardwaj, D. Srinivasa Rao

Abstract:

Technology for analysing, processing, and extracting meaningful data from enormous and complicated datasets can be termed as "big data." The technique of big data mining and big data analysis is extremely helpful for business movements such as making decisions, building organisational plans, researching the market efficiently, improving sales, etc., because typical management tools cannot handle such complicated datasets. Special computational and statistical issues, such as measurement errors, noise accumulation, spurious correlation, and storage and scalability limitations, are brought on by big data. These unique problems call for new computational and statistical paradigms. This research paper offers an overview of the literature on big data mining, its process, along with problems and difficulties, with a focus on the unique characteristics of big data. Organizations have several difficulties when undertaking data mining, which has an impact on their decision-making. Every day, terabytes of data are produced, yet only around 1% of that data is really analyzed. The idea of the mining and analysis of data and knowledge discovery techniques that have recently been created with practical application systems is presented in this study. This article's conclusion also includes a list of issues and difficulties for further research in the area. The report discusses the management's main big data and data mining challenges.

Keywords: big data, data mining, data analysis, knowledge discovery techniques, data mining challenges

Procedia PDF Downloads 80

24230 A Systematic Review on Challenges in Big Data Environment

Authors: Rimmy Yadav, Anmol Preet Kaur

Abstract:

Big Data has demonstrated the vast potential in streamlining, deciding, spotting business drifts in different fields, for example, producing, fund, Information Technology. This paper gives a multi-disciplinary diagram of the research issues in enormous information and its procedures, instruments, and system identified with the privacy, data storage management, network and energy utilization, adaptation to non-critical failure and information representations. Other than this, result difficulties and openings accessible in this Big Data platform have made.

Keywords: big data, privacy, data management, network and energy consumption

Procedia PDF Downloads 277

24229 Bayesian Variable Selection in Quantile Regression with Application to the Health and Retirement Study

Authors: Priya Kedia, Kiranmoy Das

Abstract:

There is a rich literature on variable selection in regression setting. However, most of these methods assume normality for the response variable under consideration for implementing the methodology and establishing the statistical properties of the estimates. In many real applications, the distribution for the response variable may be non-Gaussian, and one might be interested in finding the best subset of covariates at some predetermined quantile level. We develop dynamic Bayesian approach for variable selection in quantile regression framework. We use a zero-inflated mixture prior for the regression coefficients, and consider the asymmetric Laplace distribution for the response variable for modeling different quantiles of its distribution. An efficient Gibbs sampler is developed for our computation. Our proposed approach is assessed through extensive simulation studies, and real application of the proposed approach is also illustrated. We consider the data from health and retirement study conducted by the University of Michigan, and select the important predictors when the outcome of interest is out-of-pocket medical cost, which is considered as an important measure for financial risk. Our analysis finds important predictors at different quantiles of the outcome, and thus enhance our understanding on the effects of different predictors on the out-of-pocket medical cost.

Keywords: variable selection, quantile regression, Gibbs sampler, asymmetric Laplace distribution

Procedia PDF Downloads 125