Search results for: big data computation
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24469

24319 Recent Advances in Data Warehouse

Authors: Fahad Hanash Alzahrani

Abstract:

This paper describes recent advances in the rapidly developing area of data storage and processing based on data warehouses and data mining techniques. These advances span software, hardware, data mining algorithms and visualisation techniques, which share common features across the specific problems and tasks in which they are implemented.

Keywords: data warehouse, data mining, knowledge discovery in databases, on-line analytical processing

Procedia PDF Downloads 367
24318 How to Use Big Data in Logistics Issues

Authors: Mehmet Akif Aslan, Mehmet Simsek, Eyup Sensoy

Abstract:

Big Data stands for today’s cutting-edge technology. As the technology becomes widespread, so does data. Utilizing massive data sets enables companies to gain competitive advantages over their rivals. Among the many areas of Big Data usage, logistics plays a significant role in both the commercial sector and the military. This paper lays out what big data is and how it is used in both military and commercial logistics.

Keywords: big data, logistics, operational efficiency, risk management

Procedia PDF Downloads 609
24317 Algorithm for Automatic Real-Time Electrooculographic Artifact Correction

Authors: Norman Sinnigen, Igor Izyurov, Marina Krylova, Hamidreza Jamalabadi, Sarah Alizadeh, Martin Walter

Abstract:

Background: EEG is a non-invasive brain activity recording technique with a high temporal resolution that allows the use of real-time applications, such as neurofeedback. However, EEG data are susceptible to electrooculographic (EOG) and electromyographic (EMG) artifacts (e.g., from jaw clenching, teeth squeezing and forehead movements). Due to their non-stationary nature, these artifacts greatly obscure the information and power spectrum of EEG signals. Many EEG artifact correction methods are too time-consuming when applied to low-density EEG and have focused on offline processing or on handling a single type of EEG artifact. A software-only real-time method for correcting multiple types of EEG artifacts in high-density EEG remains a significant challenge. Methods: We demonstrate an improved approach for automatic real-time correction of EOG and EMG artifacts in EEG. The method was tested on three healthy subjects using 64 EEG channels (Brain Products GmbH) and a sampling rate of 1,000 Hz. Captured EEG signals were imported into MATLAB with the lab streaming layer interface, allowing buffering of EEG data. EMG artifacts were detected by channel variance and adaptive thresholding and corrected using channel interpolation. Real-time independent component analysis (ICA) was applied for correcting EOG artifacts. Results: Our results demonstrate that the algorithm effectively reduces EMG artifacts, such as those from jaw clenching, teeth squeezing and forehead movements, and EOG artifacts (horizontal and vertical eye movements) in high-density EEG while preserving brain neuronal activity information. The average computation time of EOG and EMG artifact correction for 80 s (80,000 data points) of 64-channel data is 300–700 ms, depending on the convergence of ICA and the type and intensity of the artifact. Conclusion: An automatic EEG artifact correction algorithm based on channel variance, adaptive thresholding, and ICA improves high-density EEG recordings contaminated with EOG and EMG artifacts in real time.
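
The detection step named in this abstract (channel variance plus an adaptive threshold, followed by channel interpolation) can be illustrated with a minimal sketch. This is not the authors' implementation: the threshold rule (median plus k times the median absolute deviation), the default `k`, and the interpolation by averaging the clean channels are all assumptions chosen for illustration. Real EEG pipelines usually interpolate from spatially neighbouring electrodes.

```python
import math
import statistics

def flag_noisy_channels(window, k=5.0):
    """Return indices of channels whose variance exceeds an adaptive threshold.

    window: list of channels, each a list of samples from one buffer.
    The threshold (median + k * MAD of the channel variances) is an assumed rule.
    """
    variances = [statistics.pvariance(ch) for ch in window]
    med = statistics.median(variances)
    mad = statistics.median([abs(v - med) for v in variances])
    threshold = med + k * mad
    return [i for i, v in enumerate(variances) if v > threshold]

def interpolate_channels(window, flagged):
    """Replace each flagged channel by the sample-wise mean of the clean channels.

    Hypothetical stand-in for spatial channel interpolation.
    """
    clean = [ch for i, ch in enumerate(window) if i not in flagged]
    if not clean:
        return window
    mean_signal = [sum(samples) / len(clean) for samples in zip(*clean)]
    return [mean_signal[:] if i in flagged else ch for i, ch in enumerate(window)]

# Synthetic buffer: five channels, channel 3 carries a large-amplitude artifact.
base = [math.sin(0.1 * t) for t in range(200)]
gains = [1.00, 1.01, 1.02, 20.0, 1.04]
window = [[g * s for s in base] for g in gains]

flagged = flag_noisy_channels(window)
cleaned = interpolate_channels(window, flagged)
```

Because the threshold is recomputed from each buffer, it adapts to the overall signal level instead of relying on a fixed variance cutoff.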

Keywords: EEG, muscle artifacts, ocular artifacts, real-time artifact correction, real-time ICA

Procedia PDF Downloads 139
24316 Heat Transfer and Diffusion Modelling

Authors: R. Whalley

Abstract:

The heat transfer modelling for a diffusion process will be considered. Difficulties in computing the time-distance dynamics of the representation will be addressed. Incomplete and irrational Laplace functions will be identified as the computational issue. Alternative approaches to the response evaluation process will be provided. An illustrative application problem will be presented. Graphical results confirming the theoretical procedures employed will be provided.

Keywords: heat, transfer, diffusion, modelling, computation

Procedia PDF Downloads 522
24315 Computer-Aided Detection of Simultaneous Abdominal Organ CT Images by Iterative Watershed Transform

Authors: Belgherbi Aicha, Hadjidj Ismahen, Bessaid Abdelhafid

Abstract:

Interpretation of medical images benefits from anatomical and physiological priors to optimize computer-aided diagnosis applications. Segmentation of the liver, spleen and kidneys is regarded as a major primary step in the computer-aided diagnosis of abdominal organ diseases. In this paper, a semi-automated method for abdominal organ segmentation in medical image data is presented, using mathematical morphology. Our proposed method is based on hierarchical segmentation and the watershed algorithm. In our approach, a powerful technique has been designed to suppress over-segmentation based on the mosaic image and on the computation of the watershed transform. Our algorithm proceeds in two parts. In the first, we seek to improve the quality of the gradient-mosaic image by applying the anisotropic diffusion filter followed by morphological filters. Thereafter, we proceed to the hierarchical segmentation of the liver, spleen and kidneys. To validate the proposed segmentation technique, we have tested it on several images. Our segmentation approach is evaluated by comparing our results with the manual segmentation performed by an expert. The experimental results are described in the last part of this work.

Keywords: anisotropic diffusion filter, CT images, morphological filter, mosaic image, simultaneous organ segmentation, the watershed algorithm

Procedia PDF Downloads 413
24314 Experimental and Computational Investigations of Baffle Position Effects on the Performance of Oil and Water Separator Tanks

Authors: Haitham A. Hussein, Rozi Abdullah, Md Azlin Md Said

Abstract:

Gravity separator tanks are used to separate oil from water in treatment units. Achieving the best flow uniformity in a separator tank will improve the maximum removal efficiency of oil globules from water. In this study, the effect on hydraulic performance of different baffle structure positions inside a tank was investigated. Experimental data and 2D computational fluid dynamics were used for analysis. In the numerical model, two-phase flow (drift flux model) was used to validate one-phase flow. For laboratory measurements, the velocity fields were measured using an acoustic Doppler velocimeter. The measurements were compared with the results of the computational model. The results of the experimental and computational simulations indicate that the best location of a baffle structure is achieved when the standard deviation of the velocity profile and the volume of the circulation zone inside the tank are minimized.

Keywords: gravity separator tanks, CFD, baffle position, two phase flow, ADV, oil droplet

Procedia PDF Downloads 279
24313 An Unbiased Profiling of Immune Repertoire via Sequencing and Analyzing T-Cell Receptor Genes

Authors: Yi-Lin Chen, Sheng-Jou Hung, Tsunglin Liu

Abstract:

The adaptive immune system recognizes a wide range of antigens by expressing a large number of structurally distinct T cell and B cell receptor genes. The distinct receptor genes arise from complex rearrangements called V(D)J recombination, and constitute the immune repertoire. A common method of profiling the immune repertoire is to amplify recombined receptor genes using multiple primers and high-throughput sequencing. This multiplex-PCR approach is efficient; however, the resulting repertoire can be distorted because of primer bias. To eliminate primer bias, 5’ RACE is an alternative amplification approach. However, the application of the RACE approach is limited by its low efficiency (i.e., the majority of data are non-regular receptor sequences, e.g., containing intronic segments) and the lack of a convenient tool for analysis. We propose a computational tool that can correctly identify non-regular receptor sequences in RACE data by aligning receptor sequences against the whole gene instead of only the exon regions, as done in all other tools. Using our tool, the remaining regular data allow for an accurate profiling of the immune repertoire. In addition, a RACE approach is improved to yield a higher fraction of regular T-cell receptor sequences. Finally, we quantify the degree of primer bias of a multiplex-PCR approach by comparing it to the RACE approach. The results reveal significant differences in the frequency of VJ combinations between the two approaches. Together, we provide a new experimental and computational pipeline for an unbiased profiling of the immune repertoire. As immune repertoire profiling has many applications, e.g., tracing bacterial and viral infection, detection of T cell lymphoma and minimal residual disease, and monitoring cancer immunotherapy, our work should benefit scientists who are interested in these applications.

Keywords: immune repertoire, T-cell receptor, 5' RACE, high-throughput sequencing, sequence alignment

Procedia PDF Downloads 159
24312 Development of Precise Ephemeris Generation Module for Thaichote Satellite Operations

Authors: Manop Aorpimai, Ponthep Navakitkanok

Abstract:

In this paper, the development of the ephemeris generation module used for the Thaichote satellite operations is presented. It is a vital part of the flight dynamics system, which comprises the orbit determination, orbit propagation, event prediction and station-keeping maneuver modules. In the generation of the spacecraft ephemeris data, the estimated orbital state vector from the orbit determination module is used as an initial condition. The equations of motion are then integrated forward in time to predict the satellite states. The higher geopotential harmonics, as well as other disturbing forces, are taken into account to resemble the environment in low-earth orbit. Using a highly accurate numerical integrator based on the Bulirsch-Stoer algorithm, the ephemeris data can be generated for long-term predictions with a relatively small computational burden and short calculation time. Events occurring during the prediction course that are related to the mission operations, such as the satellite’s rise/set as viewed from the ground station, Earth and Moon eclipses, the drift in ground track, and the drift in the local solar time of the orbital plane, are all detected and reported. When combined with other modules to form a flight dynamics system, this application is intended for use with the Thaichote satellite and Thailand’s subsequent Earth-observation missions.

Keywords: flight dynamics system, orbit propagation, satellite ephemeris, Thailand’s Earth Observation Satellite

Procedia PDF Downloads 348
24311 Unsupervised Feature Learning by Pre-Route Simulation of Auto-Encoder Behavior Model

Authors: Youngjae Jin, Daeshik Kim

Abstract:

This paper describes cycle-accurate simulation results for weight values learned by an auto-encoder behavior model in pre-route simulation. Given the results, we visualized the first-layer representations with natural images. Many common deep learning threads have focused on learning high-level abstractions of unlabeled raw data by unsupervised feature learning. However, in handling such a huge amount of data, the computational complexity and run time of the learning methods have limited advanced research. These limitations came from the fact that these algorithms were computed using only single-core CPUs. For this reason, parallel hardware such as FPGAs was seen as a possible solution to overcome these limitations. We adopted and simulated a ready-made auto-encoder to design a behavior model in Verilog HDL before designing hardware. With the auto-encoder behavior model pre-route simulation, we obtained cycle-accurate results for the parameters of each hidden layer using MODELSIM. Cycle-accurate results are a very important factor in designing parallel digital hardware. Finally, this paper shows appropriate operation of the behavior-model-based pre-route simulation. Moreover, we visualized the learned latent representations of the first hidden layer with the Kyoto natural image dataset.

Keywords: auto-encoder, behavior model simulation, digital hardware design, pre-route simulation, Unsupervised feature learning

Procedia PDF Downloads 417
24310 Evaluation of Turbulence Prediction over Washington, D.C.: Comparison of DCNet Observations and North American Mesoscale Model Outputs

Authors: Nebila Lichiheb, LaToya Myles, William Pendergrass, Bruce Hicks, Dawson Cagle

Abstract:

Atmospheric transport of hazardous materials in urban areas is increasingly under investigation due to the potential impact on human health and the environment. In response to health and safety concerns, several dispersion models have been developed to analyze and predict the dispersion of hazardous contaminants. The models of interest usually rely on meteorological information obtained from the meteorological models of NOAA’s National Weather Service (NWS). However, due to the complexity of the urban environment, NWS forecasts provide an inadequate basis for dispersion computation in urban areas. A dense meteorological network in Washington, DC, called DCNet, has been operated by NOAA since 2003 to support the development of urban monitoring methodologies and provide the driving meteorological observations for atmospheric transport and dispersion models. This study focuses on the comparison of wind observations from the DCNet station on the U.S. Department of Commerce Herbert C. Hoover Building against the North American Mesoscale (NAM) model outputs for the period 2017-2019. The goal is to develop a simple methodology for modifying NAM outputs so that the dispersion requirements of the city and its urban area can be satisfied. This methodology will allow us to quantify the prediction errors of the NAM model and propose adjustments of key variables controlling dispersion model calculation.

Keywords: meteorological data, Washington D.C., DCNet data, NAM model

Procedia PDF Downloads 205
24309 Adopting Cloud-Based Techniques to Reduce Energy Consumption: Toward a Greener Cloud

Authors: Sandesh Achar

Abstract:

The cloud computing industry has set new goals for better service delivery and deployment, so anyone can access services such as computation, application, and storage anytime. Cloud computing promises new possibilities for approaching sustainable solutions to deploy and advance services in this distributed environment. This work explores energy-efficient approaches and how cloud-based architecture can reduce energy consumption levels amongst enterprises leveraging cloud computing services. Adopting cloud-based networking, databases, and server machines provides a comprehensive means of achieving the potential gains in energy efficiency that cloud computing offers. In energy-efficient cloud computing, virtualization is one aspect that can integrate several technologies to achieve consolidation and better resource utilization. Moreover, the Green Cloud Architecture for cloud data centers is discussed in terms of cost, performance, and energy consumption, and appropriate solutions for various application areas are provided.

Keywords: greener cloud, cloud computing, energy efficiency, energy consumption, metadata tags, green cloud advisor

Procedia PDF Downloads 50
24308 Numerical Studies for Standard Bi-Conjugate Gradient Stabilized Method and the Parallel Variants for Solving Linear Equations

Authors: Kuniyoshi Abe

Abstract:

Bi-conjugate gradient (Bi-CG) is a well-known method for solving linear equations Ax = b, for x, where A is a given n-by-n matrix, and b is a given n-vector. Typically, the dimension of the linear equation is high and the matrix is sparse. A number of hybrid Bi-CG methods such as conjugate gradient squared (CGS), Bi-CG stabilized (Bi-CGSTAB), BiCGStab2, and BiCGstab(l) have been developed to improve the convergence of Bi-CG. Bi-CGSTAB has been most often used for efficiently solving the linear equation, but we have seen the convergence behavior with a long stagnation phase. In such cases, it is important to have Bi-CG coefficients that are as accurate as possible, and the stabilization strategy, which stabilizes the computation of the Bi-CG coefficients, has been proposed. It may avoid stagnation and lead to faster computation. Motivated by a large number of processors in present petascale high-performance computing hardware, the scalability of Krylov subspace methods on parallel computers has recently become increasingly prominent. The main bottleneck for efficient parallelization is the inner products which require a global reduction. The resulting global synchronization phases cause communication overhead on parallel computers. The parallel variants of Krylov subspace methods reducing the number of global communication phases and hiding the communication latency have been proposed. However, the numerical stability, specifically, the convergence speed of the parallel variants of Bi-CGSTAB may become worse than that of the standard Bi-CGSTAB. In this paper, therefore, we compare the convergence speed between the standard Bi-CGSTAB and the parallel variants by numerical experiments and show that the convergence speed of the standard Bi-CGSTAB is faster than the parallel variants. Moreover, we propose the stabilization strategy for the parallel variants.
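
For readers unfamiliar with the method, the standard (unpreconditioned) Bi-CGSTAB iteration for Ax = b can be sketched in a few lines. This is a textbook-style illustration on a small dense matrix, not the parallel variants or the stabilization strategy studied in the paper; the function name, tolerance, and test matrix are assumptions. The calls to `dot` mark the inner products that require a global reduction on parallel machines.

```python
import math

def mat_vec(A, x):
    """Dense matrix-vector product; A is a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    """Inner product (a global reduction on a parallel computer)."""
    return sum(a * b for a, b in zip(u, v))

def bicgstab(A, b, tol=1e-10, max_iter=100):
    """Standard Bi-CGSTAB with zero initial guess and no preconditioner."""
    n = len(b)
    x = [0.0] * n
    r = [float(bi) for bi in b]      # r0 = b - A*x0 with x0 = 0
    r_hat = r[:]                     # fixed shadow residual
    rho = alpha = omega = 1.0
    v = [0.0] * n
    p = [0.0] * n
    target = tol * math.sqrt(dot(b, b))
    if target == 0.0:
        return x                     # b = 0, so x = 0 solves the system
    for _ in range(max_iter):
        rho_new = dot(r_hat, r)
        beta = (rho_new / rho) * (alpha / omega)
        p = [ri + beta * (pi - omega * vi) for ri, pi, vi in zip(r, p, v)]
        v = mat_vec(A, p)
        alpha = rho_new / dot(r_hat, v)
        s = [ri - alpha * vi for ri, vi in zip(r, v)]
        if math.sqrt(dot(s, s)) <= target:
            x = [xi + alpha * pi for xi, pi in zip(x, p)]
            break
        t = mat_vec(A, s)
        omega = dot(t, s) / dot(t, t)
        x = [xi + alpha * pi + omega * si for xi, pi, si in zip(x, p, s)]
        r = [si - omega * ti for si, ti in zip(s, t)]
        rho = rho_new
        if math.sqrt(dot(r, r)) <= target:
            break
    return x

# Small nonsymmetric test system whose exact solution is (1, 2, 3).
A = [[4.0, 1.0, 0.0],
     [2.0, 5.0, 1.0],
     [0.0, 1.0, 3.0]]
b = [6.0, 15.0, 11.0]
x = bicgstab(A, b)
```

The sketch omits breakdown handling (for example, a vanishing `rho_new` or `dot(r_hat, v)`), which a production solver would need.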

Keywords: bi-conjugate gradient stabilized method, convergence speed, Krylov subspace methods, linear equations, parallel variant

Procedia PDF Downloads 138
24307 Digital Homeostasis: Tangible Computing as a Multi-Sensory Installation

Authors: Andrea Macruz

Abstract:

This paper explores computation as a process for design by examining how computers can become more than an operative strategy in a designer's toolkit. It documents this, building upon concepts of neuroscience and Antonio Damasio's Homeostasis Theory, which is the control of bodily states through feedback intended to keep conditions favorable for life. To do this, it follows a methodology through algorithmic drawing and discusses the outcomes of three multi-sensory design installations, which culminated from a course in an academic setting. It explains both the studio process that took place to create the installations and the computational process that was developed, related to the fields of algorithmic design and tangible computing. It discusses how designers can use computational range to achieve homeostasis related to sensory data in a multi-sensory installation. The outcomes show clearly how people and computers interact with different sensory modalities and affordances. They propose using computers as meta-physical stabilizers rather than tools.

Keywords: algorithmic drawing, Antonio Damasio, emotion, homeostasis, multi-sensory installation, neuroscience

Procedia PDF Downloads 75
24306 Implementation of an IoT Sensor Data Collection and Analysis Library

Authors: Jihyun Song, Kyeongjoo Kim, Minsoo Lee

Abstract:

Due to the development of information technology and wireless Internet technology, various data are being generated in various fields. These data are advantageous in that they provide real-time information to the users themselves. However, when the data are accumulated and analyzed, a wider variety of information can be extracted. In addition, the development and dissemination of boards such as Arduino and Raspberry Pi have made it possible to easily test various sensors, and sensor data can be collected directly by using database application tools such as MySQL. These directly collected data can be used for various research and can be useful as data for data mining. However, there are many difficulties in using such boards to collect data, especially when the user is not a computer programmer or is using them for the first time. Even if data are collected, a lack of expert knowledge or experience may cause difficulties in data analysis and visualization. In this paper, we aim to construct a library for sensor data collection and analysis to overcome these problems.

Keywords: clustering, data mining, DBSCAN, k-means, k-medoids, sensor data

Procedia PDF Downloads 344
24305 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Organizations, including governments, generate (big) data that are high in volume, velocity, and veracity, and that come from a variety of sources. Public administrations are using (big) data, implementing base registries, and enforcing data sharing within the entire government to deliver (big) data related integrated services, provide insights to users, and support good governance. Government (big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services such as data storage and hosting to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of the government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationships in the government (big) data ecosystem. We also discuss our research findings. We did not find many published research articles about the government (big) data ecosystem, including its definition and the classification of actors and their roles. Therefore, we borrowed ideas for the government (big) data ecosystem from numerous areas in the literature, including scientific research data, humanitarian data, open government data, and industry data.

Keywords: big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review

Procedia PDF Downloads 128
24304 Numerical Study of Flapping-Wing Flight of Hummingbird Hawkmoth during Hovering: Longitudinal Dynamics

Authors: Yao Jie, Yeo Khoon Seng

Abstract:

In recent decades, flapping wing aerodynamics has attracted great interest. Understanding the physics of biological flyers such as birds and insects can help improve the performance of micro air vehicles. The present research focuses on the aerodynamics of insect-like flapping wing flight using numerical computation. An insect model of the hawkmoth is adopted in the numerical study, currently under a rigid wing assumption. The numerical model integrates the computational fluid dynamics of the flow and active control of wing kinematics to achieve stable flight. The computation grid is a hybrid consisting of background Cartesian nodes and clouds of mesh-free grids around immersed boundaries. The generalized finite difference method is used in conjunction with singular value decomposition (SVD-GFD) in the computational fluid dynamics solver to study the dynamics of a free hovering hummingbird hawkmoth. The longitudinal dynamics of the hovering flight is governed by three control parameters, i.e., wing plane angle, mean positional angle and wing beating frequency. In the present work, a PID controller works out the appropriate control parameters with the insect motion as input. The controller is adjusted to achieve the desired maneuvering of the insect flight. The numerical scheme in the present study is shown to be accurate and stable in simulating the flight of the hummingbird hawkmoth, which has a relatively high Reynolds number. The PID controller responds quickly in providing feedback to the wing kinematics during the hovering flight. The simulated hovering flight agrees well with the real insect flight. The present numerical study offers a promising route to investigate the free flight aerodynamics of insects, which could overcome some of the limitations of experiments.
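
As background on the feedback element mentioned in this abstract, a generic discrete-time PID controller can be sketched as follows. This is not the authors' controller: the gains, the time step, and the toy plant (a pure integrator standing in for the flight dynamics) are illustrative assumptions only.

```python
class PID:
    """Textbook discrete PID controller: u = kp*e + ki*integral(e) + kd*de/dt."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None   # no derivative term on the first call

    def update(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

def simulate(steps=4000, dt=0.01, setpoint=1.0):
    """Drive a toy integrator plant (dy/dt = u) toward the setpoint."""
    pid = PID(kp=2.0, ki=0.5, kd=0.1, dt=dt)   # assumed gains
    y = 0.0
    for _ in range(steps):
        u = pid.update(setpoint, y)
        y += u * dt              # forward-Euler step of the toy plant
    return y

final_value = simulate()
```

In the setting of the abstract, the measured quantity would be the insect motion and the controller outputs would map to wing plane angle, mean positional angle and beating frequency; that mapping is not specified in the abstract and is not modelled here.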

Keywords: aerodynamics, flight control, computational fluid dynamics (CFD), flapping-wing flight

Procedia PDF Downloads 319
24303 Inner Derivations of Low-Dimensional Diassociative Algebras

Authors: M. A. Fiidow, Ahmad M. Alenezi

Abstract:

In this work, we study the inner derivations of diassociative algebras in dimensions two and three. An algorithmic approach is adopted for the computation of inner derivations, using some results on the derivations of finite dimensional diassociative algebras. Some basic properties of inner derivations of finite dimensional diassociative algebras are also provided.

Keywords: diassociative algebras, inner derivations, derivations, left and right operator

Procedia PDF Downloads 241
24302 Government Big Data Ecosystem: A Systematic Literature Review

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Data that is high in volume, velocity, and veracity and that comes from a variety of sources is generated in all sectors, including the government sector. Globally, public administrations are pursuing (big) data as a new technology and trying to adopt a data-centric architecture for hosting and sharing data. Properly executed, big data and data analytics in the government (big) data ecosystem can lead to data-driven government and have a direct impact on the way policymakers work and citizens interact with governments. In this research paper, we conduct a systematic literature review. The main aims of this paper are to highlight essential aspects of the government (big) data ecosystem and to explore the most critical socio-technical factors that contribute to the successful implementation of a government (big) data ecosystem. The essential aspects of the government (big) data ecosystem include its definition, data types, data lifecycle models, and actors and their roles. We also discuss the potential impact of (big) data in public administration and gaps in the government data ecosystems literature. As this is a new topic, we did not find specific articles on the government (big) data ecosystem and therefore focused our research on various relevant areas such as humanitarian data, open government data, scientific research data, and industry data.

Keywords: applications of big data, big data, big data types, big data ecosystem, critical success factors, data-driven government, egovernment, gaps in data ecosystems, government (big) data, literature review, public administration, systematic review

Procedia PDF Downloads 180
24301 A Machine Learning Decision Support Framework for Industrial Engineering Purposes

Authors: Anli Du Preez, James Bekker

Abstract:

Data is currently one of the most critical and influential emerging technologies. However, the true potential of data is yet to be exploited since, currently, only about 1% of generated data are ever actually analyzed for value creation. There is a data gap where data is not explored due to the lack of data analytics infrastructure and the required data analytics skills. This study developed a decision support framework for data analytics by following Jabareen’s framework development methodology. The study focused on machine learning algorithms, which are a subset of data analytics. The developed framework is designed to assist data analysts with little experience in choosing the appropriate machine learning algorithm given the purpose of their application.

Keywords: Data analytics, Industrial engineering, Machine learning, Value creation

Procedia PDF Downloads 137
24300 Design of High Sensitivity Transceiver for WSN

Authors: A. Anitha, M. Aishwariya

Abstract:

The realization of truly ubiquitous wireless sensor networks (WSN) demands ultra-low power wireless communication capability. Because the radio transceiver in a wireless sensor node consumes more power than the computation part, it is necessary to reduce its power consumption. Hence, a low power transceiver is designed and implemented in a 120 nm CMOS technology for wireless sensor nodes. The power consumption of the transceiver is reduced while still maintaining the sensitivity. The transceiver combines blocks including a differential oscillator, mixer, envelope detector, power amplifiers, and LNA. RF signal modulation and demodulation is carried out by On-Off keying at 2.4 GHz, which lies in the ISM band. The transmitter demonstrates an output power of 2.075 mW while operating from a supply voltage in the range 1.2 V-5.0 V. The LNA and power amplifier are compared to obtain an amplifier that produces a high gain of 1.608 dB at the receiver, which is suitable for achieving the desired sensitivity. A multistage RF amplifier is used to improve the gain at the receiver side. The power dissipation of the circuit is in the range of 0.183-0.323 mW. The receiver achieves a sensitivity of about -95 dBm at a data rate of 1 Mbps.
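
The abstract mixes power units (mW for the transmitter output, dBm for the receiver sensitivity). The standard conversion, dBm = 10·log10(P / 1 mW), makes the two figures comparable. The helper names below are hypothetical, and the sketch only restates the abstract's numbers in the other unit.

```python
import math

def mw_to_dbm(power_mw):
    """Convert power in milliwatts to dBm (decibels relative to 1 mW)."""
    return 10.0 * math.log10(power_mw)

def dbm_to_mw(power_dbm):
    """Convert power in dBm back to milliwatts."""
    return 10.0 ** (power_dbm / 10.0)

tx_output_dbm = mw_to_dbm(2.075)      # transmitter output power from the abstract
rx_sensitivity_mw = dbm_to_mw(-95.0)  # receiver sensitivity from the abstract
```

On this scale the 2.075 mW transmitter output is roughly 3.2 dBm, and the -95 dBm sensitivity corresponds to a few tenths of a picowatt.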

Keywords: CMOS, envelope detector, ISM band, LNA, low power electronics, PA, wireless transceiver

Procedia PDF Downloads 478
24299 Providing Security to Private Cloud Using Advanced Encryption Standard Algorithm

Authors: Annapureddy Srikant Reddy, Atthanti Mahendra, Samala Chinni Krishna, N. Neelima

Abstract:

In our present world, we are generating a lot of data, and we need specific devices to store all these data. Generally, we store data in pen drives, hard drives, etc. Sometimes we may lose the data due to the corruption of devices. To overcome these issues, we implemented a cloud space for storing the data, which provides more security for the data. We can access the data from anywhere in the world using just the internet. We implemented all of this in Java using the NetBeans IDE. Once a user uploads the data, the user does not have any rights to change the data. Users’ uploaded files are stored in the cloud with the system time as the file name, and the directory is created with some random words. The cloud accepts the data only if the size of the file is less than 2 MB.

Keywords: cloud space, AES, FTP, NetBeans IDE

Procedia PDF Downloads 174
24298 Business Intelligence for Profiling of Telecommunication Customer

Authors: Rokhmatul Insani, Hira Laksmiwati Soemitro

Abstract:

Business intelligence is a methodology that exploits data to produce information and knowledge systematically, and it can support the decision-making process. Methods in business intelligence include data warehousing and data mining. A data warehouse can store historical data from transactional data. For data modelling in the data warehouse, we apply Kimball's dimensional modelling. Data mining is used to extract patterns and gain insight from the data. Data mining has many techniques, one of which is segmentation. For profiling telecommunication customers, we use customer segmentation according to each customer’s usage of services, invoices and payments. Customers can be grouped according to their characteristics, and the profitable customers can be identified. We apply the K-Means clustering algorithm for segmentation. As input variables for that algorithm, we use the RFM (Recency, Frequency and Monetary) model. For all data mining processes, we use the IBM SPSS Modeler tool.
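
The segmentation step described above (K-Means on RFM variables) can be illustrated with a small self-contained sketch. The customer values, the min-max scaling, and the deterministic initialisation from the first k points are assumptions made for illustration; the paper itself performs this step in IBM SPSS Modeler.

```python
import math

def min_max_scale(rows):
    """Scale each column to [0, 1] so no single RFM variable dominates the distance."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        tuple((v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi))
        for row in rows
    ]

def k_means(points, k, iterations=20):
    """Plain K-Means; centroids start at the first k points (an assumed, simple choice)."""
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(d) / len(members) for d in zip(*members))
    return labels, centroids

# Hypothetical customers: (recency in days, frequency of purchases, monetary value).
rfm = [
    (5, 20, 900.0), (60, 2, 40.0), (7, 18, 850.0),
    (55, 3, 60.0), (3, 25, 1000.0), (70, 1, 20.0),
]
labels, centroids = k_means(min_max_scale(rfm), k=2)
```

With this toy data, the recent, frequent, high-spending customers fall into one segment and the lapsed, low-spending customers into the other.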

Keywords: business intelligence, customer segmentation, data warehouse, data mining

Procedia PDF Downloads 443
24297 Nadler's Fixed Point Theorem on Partial Metric Spaces and its Application to a Homotopy Result

Authors: Hemant Kumar Pathak

Abstract:

In 1994, Matthews (S.G. Matthews, Partial metric topology, in: Proc. 8th Summer Conference on General Topology and Applications, in: Ann. New York Acad. Sci., vol. 728, 1994, pp. 183-197) introduced the concept of a partial metric as a part of the study of denotational semantics of data flow networks. He gave a modified version of the Banach contraction principle, more suitable in this context. In fact, (complete) partial metric spaces constitute a suitable framework to model several distinguished examples of the theory of computation and also to model metric spaces via domain theory. In this paper, we introduce the concept of an almost partial Hausdorff metric. Using this concept, we prove a fixed point theorem for multi-valued mappings on partial metric spaces, which is an analogue of the well-known Nadler’s fixed point theorem. In the sequel, we derive a homotopy result as an application of our main result.

Keywords: fixed point, partial metric space, homotopy, physical sciences

Procedia PDF Downloads 409
24296 Imputation Technique for Feature Selection in Microarray Data Set

Authors: Younies Saeed Hassan Mahmoud, Mai Mabrouk, Elsayed Sallam

Abstract:

Analysing DNA microarray data sets is a great challenge for bioinformaticians because of the complexity of applying statistical and machine learning techniques. The challenge is doubled when the microarray data sets contain missing data, which happens regularly, because these techniques cannot deal with missing data. One of the most important data analysis processes for microarray data sets is feature selection, which finds the most important genes that affect a certain disease. In this paper, we introduce a technique for imputing the missing data in microarray data sets while performing feature selection.
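
The abstract does not specify the imputation technique, so the sketch below uses column-mean imputation purely as a stand-in to show what filling missing expression values looks like. The matrix layout (rows as samples, columns as genes), the use of `None` for missing entries, and the function name are all assumptions.

```python
def impute_column_means(matrix):
    """Replace None entries with the mean of the observed values in the same column."""
    n_cols = len(matrix[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in matrix if row[j] is not None]
        # Fall back to 0.0 if a column has no observed values at all.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(row)] for row in matrix]

# Hypothetical expression matrix: rows are samples, columns are genes.
expression = [
    [2.0, None, 1.0],
    [4.0, 3.0, None],
    [None, 5.0, 3.0],
]

filled = impute_column_means(expression)
print(filled)
```

Once the matrix is complete, any feature selection method that requires fully observed data can be applied to it.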

Keywords: DNA microarray, feature selection, missing data, bioinformatics

Procedia PDF Downloads 532
24295 PDDA: Priority-Based, Dynamic Data Aggregation Approach for Sensor-Based Big Data Framework

Authors: Lutful Karim, Mohammed S. Al-kahtani

Abstract:

Sensors are used in various applications such as agriculture, health monitoring, air and water pollution monitoring, and traffic monitoring and control, and hence play a vital role in the growth of big data. However, sensors collect redundant data. Aggregating and filtering sensor data is therefore important for designing an efficient big data framework. Current research does not focus on aggregating and filtering data at multiple layers of a sensor-based big data framework. This paper therefore introduces (i) a three-layer data aggregation framework for big data and (ii) a priority-based, dynamic data aggregation scheme (PDDA) for the lowest layer, at the sensors. Simulation results show that PDDA outperforms existing tree-based and cluster-based data aggregation schemes in terms of overall network energy consumption and end-to-end data transmission delay.
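
To illustrate why filtering redundant readings at the sensor reduces the data that must be transmitted, here is a minimal sketch. It is not the PDDA scheme, whose details the abstract does not give. It is a simple dead-band filter, and the tolerance value, the readings, and the function name are assumptions.

```python
def suppress_redundant(readings, tolerance):
    """Keep a reading only if it differs from the last kept value by more than tolerance."""
    kept = []
    last_value = None
    for timestamp, value in readings:
        if last_value is None or abs(value - last_value) > tolerance:
            kept.append((timestamp, value))
            last_value = value
    return kept

# Hypothetical temperature readings as (time step, degrees Celsius).
readings = [(0, 20.0), (1, 20.1), (2, 20.05), (3, 21.0), (4, 21.05), (5, 19.5)]

kept = suppress_redundant(readings, tolerance=0.5)
print(kept)
```

In this toy run half of the readings are dropped before transmission. A priority-based scheme would additionally decide which of the remaining readings to forward first.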

Keywords: big data, clustering, tree topology, data aggregation, sensor networks

Procedia PDF Downloads 302
24294 A TgCNN-Based Surrogate Model for Subsurface Oil-Water Phase Flow under Multi-Well Conditions

Authors: Jian Li

Abstract:

Uncertainty quantification and inversion problems for subsurface oil-water phase flow usually require extensive repeated forward calculations for new runs with changed conditions. To reduce the computational time, various forms of surrogate models have been built. Related research shows that deep learning has emerged as an effective surrogate model, but most deep learning surrogate models are purely data-driven, which often leads to poor robustness and abnormal results. To make the model more consistent with physical laws, a coupled theory-guided convolutional neural network (TgCNN) based surrogate model is built to improve computational efficiency while maintaining satisfactory accuracy. The model is a convolutional neural network based on multi-well reservoir simulation. The core notion of the proposed method is to bridge two separate blocks on top of an overall network. They underlie the TgCNN model in a coupled form, which reflects the coupled nature of pressure and water saturation in the two-phase flow equation. The model is driven not only by labeled data but also by scientific theories, including governing equations, stochastic parameterization, boundary and initial conditions, well conditions, and expert knowledge. The results show that the TgCNN-based surrogate model exhibits satisfactory accuracy and efficiency in subsurface oil-water phase flow under multi-well conditions.

Keywords: coupled theory-guided convolutional neural network, multi-well conditions, surrogate model, subsurface oil-water phase

Procedia PDF Downloads 60
24293 A Novel Algorithm for Production Scheduling

Authors: Ali Mohammadi Bolban Abad, Fariborz Ahmadi

Abstract:

Optimization in manufacturing is a method of using limited resources to obtain the best performance and reduce waste. In this paper, a new algorithm based on eurygaster life is introduced to obtain a plan in which the task order and the completion time of the resources are defined. Evaluation results show that, when the resources are allocated to products, our approach yields a smaller makespan than a genetic algorithm.
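
The makespan used as the evaluation criterion above is the completion time of the last resource to finish. The sketch below shows how it is computed for a given plan. It does not reproduce the proposed algorithm, and the task names, durations, and resource names are assumptions for illustration.

```python
def makespan(plan, durations):
    """Return the makespan and per-resource finish times.

    plan maps each resource to its ordered list of tasks;
    durations maps each task to its processing time.
    """
    finish = {
        resource: sum(durations[task] for task in tasks)
        for resource, tasks in plan.items()
    }
    return max(finish.values()), finish

# Hypothetical processing times and a candidate plan on two machines.
durations = {"A": 3, "B": 2, "C": 4, "D": 1}
plan = {"M1": ["A", "D"], "M2": ["B", "C"]}

span, finish = makespan(plan, durations)
print(span, finish)
```

A scheduling algorithm, whether evolutionary or otherwise, searches over such plans for one that minimizes this value.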

Keywords: evolutionary computation, genetic algorithm, particle swarm optimization, NP-Hard problems, production scheduling

Procedia PDF Downloads 351
24292 Continuous Functions Modeling with Artificial Neural Network: An Improvement Technique to Feed the Input-Output Mapping

Authors: A. Belayadi, A. Mougari, L. Ait-Gougam, F. Mekideche-Chafa

Abstract:

The artificial neural network is one of the interesting techniques that have been used to advantage in modeling problems. In this study, computing with artificial neural networks (CANN) is proposed. The model is applied to modulate the information processing of a one-dimensional task. We aim to integrate a new method based on a new coding approach for generating the input-output mapping. This approach is based on increasing the number of neuron units in the last layer. To show the efficiency of the approach under study, a comparison is made between the proposed method of generating the input-output set and the conventional method. The results show that increasing the number of neuron units in the last layer makes it possible to find the optimal network parameters that fit the mapping data. Moreover, it decreases the training time during the computation process, which avoids the need for computers with large memory.

Keywords: neural network computing, continuous functions generating the input-output mapping, decreasing the training time, machines with big memories

Procedia PDF Downloads 251
24291 One-Dimension Model for Positive Displacement Pump with Cavitation Algorithm

Authors: Francesco Rizzuto, Matthew Stickland, Stephan Hannot

Abstract:

Simulating a positive displacement pump system with commercial Computational Fluid Dynamics (CFD) software requires an enormous computational effort because of the complexity of the pump system. This drawback restricts its use to a specific part of the pump in any one simulation. This research focuses on developing an algorithm that provides results in agreement with experimental data without that computational effort. The compressible equations are solved with an explicit algorithm. A comparison is presented between the finite volume (FV) method with the Monotonic Upstream-centered Scheme for Conservation Laws (MUSCL) with slope limiter and experimental results. The source term for cavitation and friction is introduced into the algorithm with a slipping strategy and solved with a 4th order Runge-Kutta scheme (RK4). Different pumps are modeled and analyzed to evaluate the flexibility of the code. The simulation required minimal computation time and resources without compromising the accuracy of the results. This algorithm therefore highlights the feasibility of pressure pulsation simulation as a design tool for industrial purposes.
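
The classical 4th order Runge-Kutta scheme mentioned above can be sketched as follows. The test equation dy/dt = -y, the step count, and the function names are assumptions chosen for illustration; the pump model's actual cavitation and friction source terms are not reproduced here.

```python
import math

def rk4_step(f, t, y, h):
    """Advance y by one step of size h using the classical RK4 scheme."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

def integrate(f, t0, y0, t_end, steps):
    """Integrate dy/dt = f(t, y) from t0 to t_end with a fixed number of RK4 steps."""
    h = (t_end - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        y = rk4_step(f, t, y, h)
        t += h
    return y

# Test problem with a known solution: dy/dt = -y, y(0) = 1, so y(1) = exp(-1).
y_end = integrate(lambda t, y: -y, 0.0, 1.0, 1.0, steps=10)
error = abs(y_end - math.exp(-1.0))
print(y_end, error)
```

Even with only ten steps the result agrees closely with the exact solution, which is why RK4 is a common choice for integrating source terms at low cost.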

Keywords: cavitation, diaphragm, DVCM, finite volume, MUSCL, positive displacement pump

Procedia PDF Downloads 122
24290 Control the Flow of Big Data

Authors: Shizra Waris, Saleem Akhtar

Abstract:

Big data is a research area receiving attention from academia and IT communities. In the digital world, the amounts of data produced and stored have grown within a short period of time. This rapid growth has created many challenges. In this paper, we use the functionalism and structuralism paradigms to analyze the genesis of big data applications and their current trends. The paper presents a complete discussion of state-of-the-art big data technologies based on group and stream data processing, and analyzes the strengths and weaknesses of these technologies. The study also covers big data analytics techniques, processing methods, some reported case studies from different vendors, several open research challenges, and the opportunities brought about by big data. The similarities and differences of these techniques and technologies are also investigated on the basis of important limitations. Emerging technologies are suggested as a solution to big data problems.

Keywords: computer, IT community, industry, big data

Procedia PDF Downloads 158