Search results for: data cleaning.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7416

7386 Characterization of Candlenut Shells and Its Application to Remove Oil and Fine Solids of Produced Water in Nutshell Filters of Water Cleaning Plant

Authors: Annur Suhadi, Haris B. Harahap, Zaim Arrosyidi, Epan, Darmapala

Abstract:

Oilfields under waterflood often face injector plugging, caused either by internal filtration or by an external filter cake that builds up inside pore throats. Because corrective action after plugging is costly, the suspended-solids content must be reduced to the required level by filtration. Nutshell filters, where the filtration takes place, perform well with pecan and walnut shells. Candlenut shells were used instead of pecan and walnut shells because they are abundant in Indonesia, Malaysia, and East Africa. The physical and chemical properties of walnut, pecan, and candlenut shells were tested and compared. Full-scale nutshell filters were tested at the designed flux rate to determine oil content, turbidity, and suspended-solids removal. The performance of candlenut shells, deeply bedded in the nutshell filters, was monitored during filtration. Water leaving the nutshell filters had total suspended solids of 17 ppm, and its oil content was reduced to 15.1 ppm. Turbidity with candlenut shells met the injection-water specification of less than 10 Nephelometric Turbidity Units (NTU): the turbidity of water leaving the nutshell filter ranged from 1.7 to 5.0 NTU over various dates of operation. Walnut, pecan, and candlenut shells had moisture contents of 8.98 wt%, 10.95 wt%, and 9.95 wt%, respectively. The porosity of walnut, pecan, and candlenut shells was significantly affected by moisture content. Candlenut shells had a toluene solubility of 7.68 wt%, much higher than that of walnut shells, reflecting greater crude oil adsorption. The hardness of candlenut shells was 2.5-3 Mohs, close to that of walnut shells, which is an advantage for ensuring that the filter cake is cleaned off by fluidization during backwashing.

Keywords: Candlenut shells, walnut shells, pecan shells, nutshell filter, filtration.

Downloads: 399
7385 Computational Fluid Dynamics Study on Water Soot Blower Direction in Tangentially Fired Pulverized-Coal Boiler

Authors: Teewin Plangsrinont, Wasawat Nakkiew

Abstract:

In this study, Computational Fluid Dynamics (CFD) was used to simulate and predict the path of water from a water soot blower through an ambient flow field in a 300-megawatt tangentially fired pulverized-coal boiler that uses a water soot blower as a cleaning device. The nozzle size and water flow rate were fixed so that the position at which the water strikes the wall opposite the soot blower could be predicted under identical conditions. The simulation predicted the direction of water flow to the boiler's water wall tubes with a high degree of accuracy, as validated by comparison with experimental data: the maximum deviation of the water jet trajectory was 10.2%.

Keywords: Computational fluid dynamics, tangentially fired boiler, thermal power plant, water soot blower.

Downloads: 624
7384 Condition Monitoring for Twin-Fluid Nozzles with Internal Mixing

Authors: C. Lanzerstorfer

Abstract:

Liquid sprays of water are frequently used in air pollution control for gas cooling purposes and for gas cleaning. Twin-fluid nozzles with internal mixing are often used for these purposes because of the small size of the drops produced. In these nozzles the liquid is dispersed by compressed air or another pressurized gas. In high efficiency scrubbers for particle separation, several nozzles are operated in parallel because of the size of the cross section. In such scrubbers, the scrubbing water has to be re-circulated. Precipitation of some solid material can occur in the liquid circuit, caused by chemical reactions. When such precipitations are detached from the place of formation, they can partly or totally block the liquid flow to a nozzle. Due to the resulting unbalanced supply of the nozzles with water and gas, the efficiency of separation decreases. Thus, the nozzles have to be cleaned if a certain fraction of blockages is reached. The aim of this study was to provide a tool for continuously monitoring the status of the nozzles of a scrubber based on the available operation data (water flow, air flow, water pressure and air pressure). The difference between the air pressure and the water pressure is not well suited for this purpose, because the difference is quite small and therefore very exact calibration of the pressure measurement would be required. Therefore, an equation for the reference air flow of a nozzle at the actual water flow and operation pressure was derived. This flow can be compared with the actual air flow for assessment of the status of the nozzles.

Keywords: Twin-fluid nozzles, operation data, condition monitoring, flow equation.

Downloads: 1114
7383 Big Data: Big Challenges to Privacy and Data Protection

Authors: Abu Bakar Munir, Siti Hajar Mohd Yasin, Firdaus Muhammad-Sukki

Abstract:

This paper seeks to analyse the benefits of big data and, more importantly, the challenges it poses to privacy and data protection. First, the nature of big data is briefly discussed before presenting the potential of big data today. Afterwards, the issue of privacy and data protection is highlighted before discussing the challenges of implementing it for big data. In conclusion, the paper puts forward the debate on the adequacy of the existing legal framework for protecting personal data in the era of big data.

Keywords: Big data, data protection, information, privacy.

Downloads: 3868
7382 Extended Study on Removing Gaussian Noise in Mechanical Engineering Drawing Images using Median Filters

Authors: Low Khong Teck, Hasan S. M. Al-Khaffaf, Abdullah Zawawi Talib, Tan Kian Lam

Abstract:

This paper extends a previous study of the effect of different factors on the quality of vector data. For the noise factor, Gaussian noise, one kind of noise that appears in document images, is studied at high and low levels, whereas the previous study involved only salt-and-pepper noise. For the noise cleaning factor, algorithms not covered in the previous study are used, namely median filters and their variants. For the vectorization factor, VPstudio, one of the best available commercial raster-to-vector software packages, is used to convert raster images into vector format. Line detection performance is judged with an objective performance evaluation method. The output of the performance evaluation is then analyzed statistically to highlight the factors that affect vector quality.

Keywords: Performance Evaluation, Vectorization, Median Filter, Gaussian Noise.

Downloads: 1658
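
The noise cleaning step named in the abstract above can be illustrated with a small sketch. It applies a plain square-window median filter to a synthetic line image corrupted with Gaussian noise. The abstract does not say which median filter variants, window sizes, or noise levels were used, so the 3x3 window, the noise level, the test image, and all function names here are assumptions made for illustration only.

```python
import random
from statistics import median


def make_test_image(size=32, background=200.0, ink=50.0):
    """Hypothetical stand-in for a drawing: a 3-pixel-thick horizontal line."""
    image = [[background] * size for _ in range(size)]
    for y in range(size // 2 - 1, size // 2 + 2):
        image[y] = [ink] * size
    return image


def add_gaussian_noise(image, sigma, seed=0):
    """Add zero-mean Gaussian noise and clip the result to the 0..255 range."""
    rng = random.Random(seed)
    return [[min(255.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
            for row in image]


def median_filter(image, radius=1):
    """Replace each pixel by the median of its (2*radius+1)^2 neighbourhood.

    Near the border the window is clipped to the pixels that exist.
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [image[j][i]
                      for j in range(max(0, y - radius), min(height, y + radius + 1))
                      for i in range(max(0, x - radius), min(width, x + radius + 1))]
            out[y][x] = median(window)
    return out


def mean_abs_error(a, b):
    """Average absolute pixel difference between two images of equal size."""
    total = sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))
    return total / (len(a) * len(a[0]))


clean = make_test_image()
noisy = add_gaussian_noise(clean, sigma=20.0)
filtered = median_filter(noisy)
```

Comparing `mean_abs_error(filtered, clean)` with `mean_abs_error(noisy, clean)` gives a rough sense of how much noise the filter removes. It does not stand in for the objective vector-based evaluation the paper describes.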
7381 Load Forecasting in Microgrid Systems with R and Cortana Intelligence Suite

Authors: F. Lazzeri, I. Reiter

Abstract:

Energy production optimization has traditionally been very important for utilities seeking to improve resource consumption. However, load forecasting is a challenging task because a large number of relevant variables must be considered, and several strategies have been used to deal with this complex problem. This is especially true in microgrids, where many elements have to adjust their performance depending on future generation and consumption conditions. The goal of this paper is to present a solution for short-term load forecasting in microgrids, based on three machine learning experiments developed in R and on web services built and deployed with different components of Cortana Intelligence Suite: Azure Machine Learning, a fully managed cloud service that makes it easy to build, deploy, and share predictive analytics solutions; SQL database, a Microsoft database service for app developers; and PowerBI, a suite of business analytics tools to analyze data and share insights. Our results show that Boosted Decision Tree and Fast Forest Quantile regression methods can be very useful for predicting hourly short-term consumption in microgrids; moreover, we found that for these types of forecasting models, weather data (temperature, wind, humidity and dew point) can play a crucial role in improving the accuracy of the forecasting solution. Data cleaning and feature engineering methods performed in R and different types of machine learning algorithms (Boosted Decision Tree, Fast Forest Quantile and ARIMA) are presented, and results and performance metrics are discussed.

Keywords: Time-series, features engineering methods for forecasting, energy demand forecasting, Azure machine learning.

Downloads: 1243
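
As a rough illustration of the kind of data cleaning and feature engineering an hourly load forecast typically needs, the sketch below fills missing readings and builds lagged-load features alongside a temperature reading. The paper's own steps were performed in R and are not detailed in the abstract. Linear interpolation, the lag choices, and the function names are all assumptions here, not the authors' method.

```python
def fill_gaps(series):
    """Fill None readings by linear interpolation between known neighbours.

    Gaps at the start or end copy the nearest known value.
    """
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no known values")
    filled = list(series)
    for i, value in enumerate(series):
        if value is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled


def lag_features(load, temperature, lags=(1, 24)):
    """Build (features, target) rows for each hour that has all lags available.

    Each feature vector holds the load at every lag, then the current temperature.
    """
    start = max(lags)
    rows = []
    for t in range(start, len(load)):
        features = [load[t - lag] for lag in lags] + [temperature[t]]
        rows.append((features, load[t]))
    return rows
```

For example, `lag_features(fill_gaps(hourly_load), hourly_temperature)` would yield rows that any regression model could be trained on, where `hourly_load` and `hourly_temperature` are hypothetical input lists of equal length.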
7380 Applications of Big Data in Education

Authors: Faisal Kalota

Abstract:

Big Data and analytics have gained huge momentum in recent years. Big Data feeds into the field of Learning Analytics (LA), which may allow academic institutions to better understand learners’ needs and proactively address them. Hence, it is important to understand Big Data and its applications. The purpose of this descriptive paper is to provide an overview of Big Data, the technologies used in Big Data, and some of the applications of Big Data in education. Additionally, it discusses some of the concerns related to Big Data and current research trends. While Big Data can provide big benefits, it is important that institutions understand their own needs, infrastructure, resources, and limitations before jumping on the Big Data bandwagon.

Keywords: Analytics, Big Data in Education, Hadoop, Learning Analytics.

Downloads: 4822
7379 Coalescing Data Marts

Authors: N. Parimala, P. Pahwa

Abstract:

OLAP uses multidimensional structures to provide access to data for analysis. Traditionally, OLAP operations focus on retrieving data from a single data mart. An exception is the drill-across operator, which, however, is restricted to retrieving facts on the common dimensions of multiple data marts. Our concern is to define further operations for retrieving data from multiple data marts. To this end, we have defined six operations that coalesce data marts, considering both the common and the non-common dimensions of the data marts.

Keywords: Data warehouse, Dimension, OLAP, Star Schema.

Downloads: 1518
7378 Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity

Authors: Hoda A. Abdel Hafez

Abstract:

Mining big data is a major challenge today. Much research is concerned with mining massive amounts of data and big data streams. Mining big data faces many challenges, including scalability, speed, heterogeneity, accuracy, provenance and privacy. In the telecommunications industry, mining big data is like mining for gold: it represents a big opportunity to maximize the industry's revenue streams. This paper discusses the characteristics of big data (volume, variety, velocity and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunications, and the benefits and opportunities gained from it.

Keywords: Mining Big Data, Big Data, Machine learning, Data Streams, Telecommunication.

Downloads: 2425
7377 Comparative Analysis of Diverse Collection of Big Data Analytics Tools

Authors: S. Vidhya, S. Sarumathi, N. Shanthi

Abstract:

In recent years, much effort has gone into developing capable tools for performing various big data tasks. Big data has recently received a great deal of publicity, and for good reason. Because the datasets are large and complex, they are difficult to process with traditional data processing applications, which makes the development of dedicated big data tools all the more necessary. The main aim of big data analytics is to apply advanced analytic techniques to very large, diverse datasets that range in size from terabytes to zettabytes and in type from structured to unstructured and from batch to streaming. Big data is useful for data sets whose size or type is beyond the ability of traditional relational databases to capture, manage, and process with low latency. These challenges have led to the emergence of powerful big data tools. This survey describes a diverse collection of big data tools and compares their salient features.

Keywords: Big data, Big data analytics, Business analytics, Data analysis, Data visualization, Data discovery.

Downloads: 3733
7376 Multi-labeled Data Expressed by a Set of Labels

Authors: Tetsuya Furukawa, Masahiro Kuzunishi

Abstract:

Collected data must be organized to be used efficiently, and hierarchical classification is an efficient approach to organizing data. When data is classified into multiple categories or annotated with a set of labels, users request multi-labeled data by giving a set of labels. There are several interpretations of the data expressed by a set of labels. This paper discusses which data is expressed by a set of labels by introducing orders for sets of labels, and shows that there are four types of orders, characterized by whether the labels of the expressed data include every label of the given set of labels within the range of the set. Two desirable properties of the orders are also discussed: that data is also expressed by the higher set of labels, and that different sets of labels express different data.

Keywords: Classification Hierarchies, Multi-labeled Data, Multiple Classification, Orders of Sets of Labels

Downloads: 1263
7375 The Comparison of Data Replication in Distributed Systems

Authors: Iman Zangeneh, Mostafa Moradi, Ali Mokhtarbaf

Abstract:

The need for ever-increasing use of distributed data in computer networks is clear. One technique applied to distributed data to increase efficiency and reliability is data replication. In this paper, after introducing this technique and its advantages, we examine some dynamic data replication strategies. We examine their characteristics in some scenarios and then propose some suggestions for their improvement.

Keywords: data replication, data hiding, consistency, dynamic data replication strategy

Downloads: 1596
7374 Implementation of an IoT Sensor Data Collection and Analysis Library

Authors: Jihyun Song, Kyeongjoo Kim, Minsoo Lee

Abstract:

Due to the development of information technology and wireless Internet technology, a wide variety of data is being generated in many fields. These data are advantageous in that they provide real-time information to users. When the data are accumulated and analyzed, however, a wider variety of information can be extracted. In addition, the development and spread of boards such as Arduino and Raspberry Pi have made it easy to test various sensors, and sensor data can be collected directly by using database tools such as MySQL. These directly collected data can be used for various research and can serve as data for data mining. However, collecting data with such boards is difficult, especially for users who are not computer programmers or who are using a board for the first time. Even when data are collected, a lack of expert knowledge or experience may cause difficulties in data analysis and visualization. In this paper, we aim to construct a library for sensor data collection and analysis to overcome these problems.

Keywords: Clustering, data mining, DBSCAN, k-means, k-medoids, sensor data.

Downloads: 1965
7373 Recent Developments in Speed Control System of Pipeline PIGs for Deepwater Pipeline Applications

Authors: Mohamad Azmi Haniffa, Fakhruldin Mohd Hashim

Abstract:

Pipeline infrastructure normally represents a high investment cost, and a pipeline must be free from risks that could cause environmental hazards or threaten personnel safety. Pipeline integrity monitoring and management are therefore crucial for providing unimpeded transportation and avoiding unnecessary production deferment. Proper cleaning and inspection are the key to safe and reliable pipeline operation, play an important role in a pipeline integrity management program, and have become standard industry procedure. In view of this, understanding the motion (dynamic behavior) of the PIG and predicting and controlling its speed are important in executing a pigging operation, as they offer significant benefits such as estimating the PIG arrival time at the receiving station, planning a suitable pigging operation, and improving the efficiency of pigging tasks. The objective of this paper is to review recent developments in speed control systems for pipeline PIGs. The review is intended to serve industry as a quick reference on recent developments in pipeline PIG speed control systems, to encourage others to add to and update the list in the future, leading to a knowledge base, and to attract others to share their viewpoints.

Keywords: Pipeline Inspection Gauge (PIG), In Line Inspection Tools (ILI), PIG motion, PIG speed control system

Downloads: 3278
7372 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Organizations, including governments, generate (big) data that are high in volume, velocity, and veracity and come from a variety of sources. Public administrations are using (big) data, implementing base registries, and enforcing data sharing across the entire government to deliver integrated (big) data-related services, provide insights to users, and support good governance. Government (big) data ecosystem actors are distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services such as data storage and hosting to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of the government (big) data ecosystem and a classification of its actors and their roles. We showcase a graphical view of actors, roles, and their relationships in the government (big) data ecosystem. We also discuss our research findings. We found few published research articles about the government (big) data ecosystem, including its definition and the classification of actors and their roles. Therefore, we borrowed ideas for the government (big) data ecosystem from numerous areas in the literature, including scientific research data, humanitarian data, open government data, and industry data.

Keywords: Big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review.

Downloads: 1999
7371 A Centralized Architecture for Cooperative Air-Sea Vehicles Using UAV-USV

Authors: Salima Bella, Assia Belbachir, Ghalem Belalem

Abstract:

This paper deals with the problem of monitoring and cleaning dirty zones of oceans using unmanned vehicles. We present a centralized cooperative architecture for unmanned aerial vehicles (UAVs) to monitor ocean regions and clean dirty zones with the help of unmanned surface vehicles (USVs). Due to the rapid deployment of these unmanned vehicles, it is convenient to use them in oceanic regions where the water pollution zones are generally unknown. In order to optimize this process, our solution aims to detect and reduce the pollution level of the ocean zones while taking into account the problem of fault tolerance related to these vehicles.

Keywords: Centralized architecture, fault tolerance, UAV, USV.

Downloads: 947
7370 Long-term Irrigation with Dairy Factory Wastewater Influences Soil Quality

Authors: Yen-Yiu Liu, Richard J. Haynes

Abstract:

The effects of irrigation with dairy factory wastewater on soil properties were investigated at two sites that had received irrigation for > 60 years. Two adjoining paired sites that had never received dairy factory effluent (DFE) were also sampled, as were another seven fields from a wider area around the factory. In comparison with paired sites that had not received effluent, long-term wastewater irrigation resulted in an increase in pH, EC, extractable P, exchangeable Na and K, and ESP. These changes were related to the use of phosphoric acid, NaOH and KOH as cleaning agents in the factory. Soil organic C content was unaffected by DFE irrigation, but the size (microbial biomass C and N) and activity (basal respiration) of the soil microbial community were increased. These increases were attributed to regular inputs of soluble C (e.g. lactose) present as milk residues in the wastewater. Principal component analysis (PCA) of the soils data from all 11 sites confirmed that the main effects of DFE irrigation were an increase in exchangeable Na, extractable P and microbial biomass C, an accumulation of soluble salts, and a liming effect. PCA of soil bacterial community structure, using PCR-DGGE of 16S rDNA fragments, generally separated individual sites from one another but did not group them according to irrigation history. Thus, whilst the size and activity of the soil microbial community were increased, the structure and diversity of the bacterial community remained unaffected.

Keywords: Dairy factory, wastewater, effluent, irrigation, soil quality.

Downloads: 1532
7369 Long-term Irrigation with Dairy Factory Wastewater Influences Soil Quality

Authors: Yen-Yiu Liu, Richard J. Haynes

Abstract:

The effects of irrigation with dairy factory wastewater on soil properties were investigated at two sites that had received irrigation for > 60 years. Two adjoining paired sites that had never received dairy factory effluent (DFE) were also sampled, as were another seven fields from a wider area around the factory. In comparison with paired sites that had not received effluent, long-term wastewater irrigation resulted in an increase in pH, EC, extractable P, exchangeable Na and K, and ESP. These changes were related to the use of phosphoric acid, NaOH and KOH as cleaning agents in the factory. Soil organic C content was unaffected by DFE irrigation, but the size (microbial biomass C and N) and activity (basal respiration) of the soil microbial community were increased. These increases were attributed to regular inputs of soluble C (e.g. lactose) present as milk residues in the wastewater. Principal component analysis (PCA) of the soils data from all 11 sites confirmed that the main effects of DFE irrigation were an increase in exchangeable Na, extractable P and microbial biomass C, an accumulation of soluble salts, and a liming effect. PCA of soil bacterial community structure, using PCR-DGGE of 16S rDNA fragments, generally separated individual sites from one another but did not group them according to irrigation history. Thus, whilst the size and activity of the soil microbial community were increased, the structure and diversity of the bacterial community remained unaffected.

Keywords: Dairy factory, wastewater, effluent, irrigation, soil quality.

Downloads: 1987
7368 Imputation Technique for Feature Selection in Microarray Data Set

Authors: Younies Mahmoud, Mai Mabrouk, Elsayed Sallam

Abstract:

Analyzing DNA microarray data sets is a great challenge for bioinformaticians because of the complexity of applying statistical and machine learning techniques. The challenge is doubled when the data sets contain missing data, which happens regularly, because these techniques cannot deal with missing data. One of the most important data analysis processes on microarray data sets is feature selection, which finds the most important genes that affect a certain disease. In this paper, we introduce a technique for imputing the missing data in microarray data sets while performing feature selection.

Keywords: DNA microarray, feature selection, missing data, bioinformatics.

Downloads: 2737
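
The abstract above does not describe the imputation technique itself, so the sketch below deliberately substitutes a much simpler baseline to make the problem concrete. It fills each missing expression value with the per-gene mean, then ranks genes by the absolute difference between two class means. Both steps, the 0/1 label encoding, and the function names are assumptions for illustration and should not be read as the authors' method.

```python
from statistics import mean


def impute_column_means(rows):
    """Replace None entries with the mean of the observed values in that column.

    Each row is one sample and each column is one gene. A column with no
    observed values at all is filled with 0.0.
    """
    n_cols = len(rows[0])
    col_means = []
    for c in range(n_cols):
        observed = [row[c] for row in rows if row[c] is not None]
        col_means.append(mean(observed) if observed else 0.0)
    return [[col_means[c] if v is None else v for c, v in enumerate(row)]
            for row in rows]


def rank_genes(rows, labels):
    """Return column indices sorted by |mean(class 1) - mean(class 0)|, largest first."""
    scores = []
    for c in range(len(rows[0])):
        class0 = [row[c] for row, y in zip(rows, labels) if y == 0]
        class1 = [row[c] for row, y in zip(rows, labels) if y == 1]
        scores.append(abs(mean(class1) - mean(class0)))
    return sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)


# Tiny made-up expression matrix: 4 samples x 3 genes, with two missing values.
samples = [[1.0, None, 5.0],
           [1.2, 2.0, 5.1],
           [None, 2.2, 9.0],
           [0.8, 2.1, 9.2]]
classes = [0, 0, 1, 1]
completed = impute_column_means(samples)
ranking = rank_genes(completed, classes)
```

Imputing first and selecting second treats the two steps as separate stages, whereas the paper reports performing imputation while doing feature selection.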
7367 Automatic Real-Patient Medical Data De-Identification for Research Purposes

Authors: Petr Vcelak, Jana Kleckova

Abstract:

Our medicine-oriented research is based on a medical data set of real patients. Sharing private patient data with people other than clinicians or hospital staff is a security problem, so personally identifying information must be removed from the medical data. After de-identification, the medical data, free of private data, are available for any research purpose. In this paper, we introduce a universal, automatic, rule-based de-identification application that performs this process on heterogeneous medical data. A patient's private identification is replaced by a unique identification number, even in burned-in annotations in pixel data. The same identification number is used for all of a patient's medical data, so relationships within the data are preserved. The hospital can benefit from research feedback based on the results.

Keywords: DASTA, De-identification, DICOM, Health Level Seven, Medical data, OCR, Personal data

Downloads: 1599
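
The core idea in the abstract above is that one patient always receives the same replacement identifier, so records stay linked. A minimal sketch of that idea follows. The record layout, the field names (`patient_id`, `name`, `birth_date`), and the `P000001` number format are hypothetical. The rule engine, the DICOM/DASTA handling, and the OCR of burned-in annotations mentioned in the paper are not modelled.

```python
import itertools


class Pseudonymizer:
    """Maps each real patient identifier to one stable replacement number."""

    def __init__(self, start=1):
        self._counter = itertools.count(start)
        self._mapping = {}

    def pseudonym_for(self, patient_id):
        """Return the replacement ID, creating it on first sight of the patient."""
        if patient_id not in self._mapping:
            self._mapping[patient_id] = f"P{next(self._counter):06d}"
        return self._mapping[patient_id]

    def deidentify(self, record, id_field="patient_id",
                   drop_fields=("name", "birth_date")):
        """Copy a record, dropping private fields and replacing the patient ID."""
        clean = {k: v for k, v in record.items() if k not in drop_fields}
        clean[id_field] = self.pseudonym_for(record[id_field])
        return clean


pseudonymizer = Pseudonymizer()
scan = pseudonymizer.deidentify(
    {"patient_id": "760512/1234", "name": "Jane Doe", "modality": "MR"})
report = pseudonymizer.deidentify(
    {"patient_id": "760512/1234", "name": "Jane Doe", "text": "no findings"})
other = pseudonymizer.deidentify(
    {"patient_id": "810101/9999", "name": "John Roe", "modality": "CT"})
```

Here `scan` and `report` share one replacement ID while `other` gets a different one. In a real deployment the mapping table is itself sensitive and would need to stay inside the hospital.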
7366 Analyzing Multi-Labeled Data Based on the Roll of a Concept against a Semantic Range

Authors: Masahiro Kuzunishi, Tetsuya Furukawa, Ke Lu

Abstract:

Classifying data hierarchically is an efficient approach to analyzing data. Data is usually classified into multiple categories or annotated with a set of labels. To analyze multi-labeled data, the data must be specified by giving a set of labels as a semantic range. Data is analyzed for certain purposes. This paper shows which multi-labeled data should be the target of analysis for those purposes, and discusses the role of a label against a set of labels by investigating what changes when a label is added to the set. These discussions yield methods for the advanced analysis of multi-labeled data, based on the role of a label against a semantic range.

Keywords: Classification Hierarchies, Data Analysis, Multilabeled Data, Orders of Sets of Labels

Downloads: 1173
7365 Steganalysis of Data Hiding via Halftoning and Coordinate Projection

Authors: Woong Hee Kim, Ilhwan Park

Abstract:

Steganography is the art of hiding and transmitting data through apparently innocuous carriers in an effort to conceal the existence of the data. Many steganography algorithms have been proposed recently, and many of them use digital image data as the carrier. In the data hiding scheme based on halftoning and coordinate projection, still image data is used as the carrier, and the carrier image data are modified to embed the hidden data. In this paper, we present three features for the analysis of data hiding via halftoning and coordinate projection. We also present a classifier that uses the three proposed features.

Keywords: Steganography, steganalysis, digital halftoning, data hiding.

Downloads: 1555
7364 Biological Data Integration using SOA

Authors: Noura Meshaan Al-Otaibi, Amin Yousef Noaman

Abstract:

Nowadays scientific data is inevitably digital and stored in a wide variety of formats in heterogeneous systems. Scientists need access to an integrated view of remote or local heterogeneous data sources, with advanced tools for data access, analysis, and visualization. This research suggests the use of Service Oriented Architecture (SOA) to integrate biological data from different data sources. This work shows that SOA can solve the problems facing the integration process and allow biologists to access biological data more easily. There are several methods of implementing SOA, but web services are the most popular. The Microsoft .NET Framework was used to implement the proposed architecture.

Keywords: Bioinformatics, Biological data, Data Integration, SOA and Web Services.

Downloads: 2424
7363 STATISTICA Software: A State of the Art Review

Authors: S. Sarumathi, N. Shanthi, S. Vidhya, P. Ranjetha

Abstract:

Data mining is growing rapidly in recognition and popularity. The foremost aim of data mining is to extract data from a huge data set into forms that can be understood for further use. Data mining is a technology with rich potential that can help industries and businesses collect the information they need from their data to understand their customers' behavior. Several methods are available for extracting data, such as classification, clustering, association, discovery, and visualization, each with its own diverse algorithms that try to fit an appropriate model to the data. STATISTICA mostly deals with very large groups of data that impose severe computational constraints. These challenges led to the emergence of powerful STATISTICA Data Mining technologies. This survey gives an overview of the STATISTICA software along with its significant features.

Keywords: Data Mining, STATISTICA Data Miner, Text Miner, Enterprise Server, Classification, Association, Clustering, Regression.

Downloads: 2564
7362 Proposal of Data Collection from Probes

Authors: M. Kebisek, L. Spendla, M. Kopcek, T. Skulavik

Abstract:

In this paper we describe the security capabilities of data collection. Data are collected by probes located in the near and distant surroundings of the company. Because of numerous obstacles, e.g. forests, hills, and urban areas, data collection is realized in several ways: via wireless communication, a LAN, or a GSM network, and in certain areas by using vehicles. To ensure the connection to the server, most of the probes are able to communicate in several ways. Collected data are archived and subsequently used in supervisory applications. To ensure that the required data are collected, it is necessary to propose algorithms that allow the probes to select a suitable communication channel.

Keywords: Communication, computer network, data collection, probe.

Downloads 1750
7361 Linguistic Summarization of Structured Patent Data

Authors: E. Y. Igde, S. Aydogan, F. E. Boran, D. Akay

Abstract:

Patent data play an increasingly important role in economic growth, innovation, technical advantage, business strategy, and even competition between countries. Analyzing patent data is crucial since patents cover a large part of the world's technological information. In this paper, we use the linguistic summarization technique to prove the validity of hypotheses related to patent data stated in the literature.

Keywords: Data mining, fuzzy sets, linguistic summarization, patent data.

Downloads 1169
7360 Simulation of Polymeric Precursors Production from Wine Industrial Organic Wastes

Authors: Tanapoom Phuncharoen, Tawiwat Sriwongsa, Kanita Boonruang, Apichit Svang-ariyaskul

Abstract:

The production of Dimethyl acetal, Isovaleraldehyde, and Pyridine was simulated using Aspen Plus. Upgrading cleaning water from industrial wine production is the main objective of the project. The winery waste is composed of Acetaldehyde, Methanol, Ethyl Acetate, 1-propanol, water, iso-amyl alcohol, and iso-butyl alcohol. The project is separated into three parts: separation, reaction, and purification. Various processes were considered to maximize profit while obtaining high purity and recovery of each component with optimum heat duty. The results show a significant product value, with purity of more than 75% and recovery over 98%.

Keywords: Dimethyl acetal, Pyridine, wine, Aspen Plus, Isovaleraldehyde, polymeric precursors.

Downloads 2393
7359 Metadata Update Mechanism Improvements in Data Grid

Authors: S. Farokhzad, M. Reza Salehnamadi

Abstract:

Grid environments aggregate geographically distributed resources. Grids come in three types: computational, data, and storage. This paper presents research on data grids. A data grid is used to provide and secure access to data across many heterogeneous sources. Users need not worry about where the data is located, provided that they can access it. Metadata is used to access data in a data grid. Presently, the application metadata catalogue and the SRB middleware package are used in data grids for metadata management. In this paper, updating, streamlining, and searching are made possible simultaneously and rapidly by classifying the table that stores metadata and converting each table into numerous tables. Meanwhile, the most appropriate division is determined with regard to the specific application. As a result of this technique, some requests can be executed concurrently and pipelined execution becomes possible.

Keywords: Grids, data grid, metadata, update.

Downloads 1658
7358 Optimized Approach for Secure Data Sharing in Distributed Database

Authors: Ahmed Mateen, Zhu Qingsheng, Ahmad Bilal

Abstract:

In the current age of technology, information is a company's most precious asset. Today, companies hold large amounts of data. As the data grow, retrieving particular information from them becomes slower. Processing data quickly enough to turn it into information is the biggest issue. The major problems in distributed databases are the efficiency and the response time of data distribution. The security of data distribution is also a major issue. To address these problems, we propose a strategy that maximizes the efficiency of data distribution and improves its response time. This technique gives better results for secure data distribution from multiple heterogeneous sources. The proposed technique enables companies to share data securely, efficiently, and quickly.

Keywords: ER-schema, electronic record, P2P framework, API, query formulation.

Downloads 1023
7357 Numerical Investigation of the Flow Characteristics inside the Scrubber Unit

Authors: Kumaresh Selvakumar, Man Young Kim

Abstract:

Wet scrubbers have found widespread use in cleaning contaminated gas streams because of their ability to remove particulates, including in the scrubbing of marine engine exhaust gases by spraying seawater. To examine the flow characteristics inside the scrubber, the model is set up with the flow properties of hot air and a water sprayer. The flow dynamics of evaporation when water droplets are injected into hot air is the key factor considered in this paper. The flow behavior inside the scrubber was investigated based on previous works, and the evaporation rate is predicted with respect to the concentration of water droplets to arrive at a competent model. The numerical analysis using CFD helps in understanding the problem better and emphasizes the behavior of the model over its entire operating envelope.

Keywords: Concentration of water droplets, Evaporation rate, Scrubber, Water sprayer.

Downloads 3261