Search results for: data cleaning
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24346

24226 Recent Advances in Data Warehouse

Authors: Fahad Hanash Alzahrani

Abstract:

This paper describes recent advances in the rapidly developing area of data storage and processing based on data warehouses and data mining techniques. These advances span software, hardware, data mining algorithms and visualisation techniques that share common features across specific problems and implementation tasks.

Keywords: data warehouse, data mining, knowledge discovery in databases, on-line analytical processing

Procedia PDF Downloads 363
24225 How to Use Big Data in Logistics Issues

Authors: Mehmet Akif Aslan, Mehmet Simsek, Eyup Sensoy

Abstract:

Big Data stands for today’s cutting-edge technology. As the technology becomes widespread, so does data. Utilizing massive data sets enables companies to gain competitive advantages over their rivals. Among the many areas of Big Data usage, logistics plays a significant role in both the commercial sector and the military. This paper lays out what big data is and how it is used in both military and commercial logistics.

Keywords: big data, logistics, operational efficiency, risk management

Procedia PDF Downloads 608
24224 Wettability of Superhydrophobic Polymer Layers Filled with Hydrophobized Silica on Glass

Authors: Diana Rymuszka, Konrad Terpiłowski, Lucyna Hołysz, Elena Goncharuk, Iryna Sulym

Abstract:

Superhydrophobic surfaces exhibit extremely high water repellency. The commonly accepted basic criteria for such surfaces are a water contact angle larger than 150°, low contact angle hysteresis and a low sliding angle. These surfaces are of special interest because properties such as anti-sticking, anti-contamination and self-cleaning are expected. These properties are attractive for many applications, such as anti-sticking of snow for antennas and windows, anti-biofouling paints for boats, waterproof clothing, self-cleaning windshields for automobiles, dust-free coatings or metal refining. Over the last two decades, various methods for the preparation of superhydrophobic surfaces have been reported, such as phase separation, electrochemical deposition, the template method, the plasma method, chemical vapor deposition, wet chemical reaction, sol-gel processing and lithography. The aim of the study was to investigate the influence of modified colloidal silica, used as a filler, on the hydrophobicity of a polymer film deposited on a plasma-activated glass support. On the prepared surfaces, water advancing (θA) and receding (θR) contact angles were measured, and the total apparent surface free energy was then determined using the contact angle hysteresis (CAH) approach. The structures of the deposited films were observed with an optical microscope. Topographies of selected films were also determined using an optical profilometer. It was found that plasma treatment influences the wetting and energetic properties of the glass surface, which is observed as higher adhesion between the polymer/filler film and the glass support. Using colloidal silica particles as a filler for the polymer thin film deposited on the glass support, it is possible to produce strongly adhering layers with superhydrophobic properties. The best superhydrophobic properties were obtained for glass/polymer + modified silica film surfaces covered in 89 and 100%. The advancing contact angle measured on these surfaces exceeds 150°, which leads to an apparent surface free energy below 2 mJ/m². Such films may have many practical applications, among others as dust-free coatings or anticorrosion protection.
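The contact angle hysteresis approach mentioned in this abstract can be illustrated with a short sketch. It assumes the commonly cited form of the CAH relation, γS = γL(1 + cos θA)² / (2 + cos θR + cos θA), and a water surface tension of 72.8 mJ/m²; the abstract states neither, and the function name and the example angles are hypothetical.

```python
import math

# Assumed surface tension of water near 20 °C, in mJ/m^2 (not given in the abstract).
WATER_SURFACE_TENSION = 72.8


def apparent_surface_free_energy(theta_adv_deg, theta_rec_deg,
                                 gamma_liquid=WATER_SURFACE_TENSION):
    """Apparent surface free energy (mJ/m^2) from advancing and receding angles.

    Uses the assumed CAH relation:
        gamma_S = gamma_L * (1 + cos(theta_A))**2 / (2 + cos(theta_R) + cos(theta_A))
    """
    cos_adv = math.cos(math.radians(theta_adv_deg))
    cos_rec = math.cos(math.radians(theta_rec_deg))
    return gamma_liquid * (1.0 + cos_adv) ** 2 / (2.0 + cos_rec + cos_adv)


# Hypothetical angles for a strongly water-repellent film.
energy = apparent_surface_free_energy(160.0, 155.0)
print(f"apparent surface free energy: {energy:.2f} mJ/m^2")
```

Under these assumptions, advancing angles well above 150° with small hysteresis give values in the low single digits of mJ/m², which is the regime the abstract reports.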

Keywords: contact angle, plasma, superhydrophobic, surface free energy

Procedia PDF Downloads 445
24223 Model for Remanufacture of Medical Equipment in Cross Border Collaboration

Authors: Kingsley Oturu, Winifred Ijomah, Wale Coker, Chibueze Achi

Abstract:

With the impact of Brexit and the need for cross-border collaboration, this international research investigated the use of a conceptual model for remanufacturing medical equipment (with a focus on anesthetic machines and baby incubators). Early findings of the research suggest that contextual factors need to be taken into consideration, as well as an emphasis on cleaning (e.g., sterilization) during the process of remanufacturing medical equipment. For example, copper tubing may be more important in the remanufacturing of anesthetic equipment in tropical climates than in cold climates.

Keywords: medical equipment remanufacture, sustainability, circular business models, remanufacture process model

Procedia PDF Downloads 138
24222 Implementation of an IoT Sensor Data Collection and Analysis Library

Authors: Jihyun Song, Kyeongjoo Kim, Minsoo Lee

Abstract:

Due to the development of information technology and wireless Internet technology, a wide variety of data is being generated in many fields. These data are advantageous in that they provide real-time information to the users themselves. However, when the data are accumulated and analyzed, a wider range of information can be extracted. In addition, the development and dissemination of boards such as Arduino and Raspberry Pi have made it easy to test various sensors, and sensor data can be collected directly by using database tools such as MySQL. These directly collected data can be used for various kinds of research and can be useful as data for data mining. However, collecting data with such boards is difficult, especially for users who are not computer programmers or who are using the boards for the first time. Even if data are collected, a lack of expert knowledge or experience may cause difficulties in data analysis and visualization. In this paper, we aim to construct a library for sensor data collection and analysis to overcome these problems.

Keywords: clustering, data mining, DBSCAN, k-means, k-medoids, sensor data

Procedia PDF Downloads 339
24221 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Organizations, including governments, generate (big) data that are high in volume, velocity, and veracity, and that come from a variety of sources. Public administrations are using (big) data, implementing base registries, and enforcing data sharing across the entire government to deliver (big) data-related integrated services, provide insights to users, and support good governance. Government (big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services, such as data storage and hosting services, to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of the government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationships in the government (big) data ecosystem. We also discuss our research findings. We did not find many published research articles about the government (big) data ecosystem, including its definition and the classification of actors and their roles. Therefore, we borrowed ideas for the government (big) data ecosystem from numerous areas in the literature, including scientific research data, humanitarian data, open government data, and industry data.

Keywords: big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review

Procedia PDF Downloads 126
24220 Exceptional Cost and Time Optimization with Successful Leak Repair and Restoration of Oil Production: West Kuwait Case Study

Authors: Nasser Al-Azmi, Al-Sabea Salem, Abu-Eida Abdullah, Milan Patra, Mohamed Elyas, Daniel Freile, Larisa Tagarieva

Abstract:

Well intervention was carried out with Production Logging Tools (PLT) to detect the sources of water and to check well integrity for two West Kuwait oil wells that had started to produce 100% water. For the first well, PLT was performed to check the perforations and detect the source of water; no production was observed from the bottom two perforation intervals, and an intake of water was observed at the topmost perforation. A decision was then taken to extend the PLT survey from tag depth to the Y-tool. For the second well, the aim was to detect the source of water and to determine whether there was a leak in the 7'' liner in front of the upper zones. Data could not be recorded in flowing conditions due to casing deformation at almost 8300 ft. For the first well, the interpretation of PLT and well integrity data showed a hole in the 9 5/8'' casing from 8468 ft to 8494 ft producing the majority of the water, 2478 bbl/d. The upper perforation from 10812 ft to 10854 ft was taking 534 stb/d. For the second well, there was a hole in the 7'' liner from 8303 ft MD to 8324 ft MD producing 8334.0 stb/d of water, with an intake zone from 10322.9 to 10380.8 ft MD taking the whole fluid. To restore oil production, a workover (W/O) rig was mobilized to prevent dump flooding, and during the W/O the leaking interval was confirmed for both wells. The leak was cement squeezed and tested at 900-psi positive pressure and 500-psi drawdown pressure. The cement squeeze job was successful. After the W/O, the wells were kept producing for cleaning, and eventually the water cut (WC) fell to 0%. Regular PLT and well integrity logs are required to study well performance and well integrity issues; proper cement behind casing is essential to well longevity and well integrity; and the presence of the Y-tool is essential for monitoring well parameters and the ESP and for facilitating well intervention tasks. Cost and time optimization in oil and gas, especially during rig operations, is crucial. PLT data quality and the accuracy of the interpretations contributed greatly to identifying the leakage interval accurately and, in turn, saved a lot of time and reduced the repair cost by almost 35 to 45%. The added value here was mainly related to cost reduction and to quick, effective decision making based on the economic environment.

Keywords: leak, water shut-off, cement, water leak

Procedia PDF Downloads 87
24219 Government Big Data Ecosystem: A Systematic Literature Review

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Data that is high in volume, velocity, and veracity and that comes from a variety of sources is generated in all sectors, including the government sector. Globally, public administrations are pursuing (big) data as a new technology and trying to adopt a data-centric architecture for hosting and sharing data. Properly executed, big data and data analytics in the government (big) data ecosystem can lead to data-driven government and have a direct impact on the way policymakers work and citizens interact with governments. In this research paper, we conduct a systematic literature review. The main aims of this paper are to highlight essential aspects of the government (big) data ecosystem and to explore the most critical socio-technical factors that contribute to the successful implementation of a government (big) data ecosystem. The essential aspects of the government (big) data ecosystem include its definition, data types, data lifecycle models, and actors and their roles. We also discuss the potential impact of (big) data in public administration and gaps in the government data ecosystems literature. As this is a new topic, we did not find specific articles on the government (big) data ecosystem and therefore focused our research on various relevant areas such as humanitarian data, open government data, scientific research data, and industry data.

Keywords: applications of big data, big data, big data types, big data ecosystem, critical success factors, data-driven government, egovernment, gaps in data ecosystems, government (big) data, literature review, public administration, systematic review

Procedia PDF Downloads 179
24218 A Machine Learning Decision Support Framework for Industrial Engineering Purposes

Authors: Anli Du Preez, James Bekker

Abstract:

Data is currently one of the most critical and influential emerging technologies. However, the true potential of data is yet to be exploited, since currently only about 1% of generated data is ever actually analyzed for value creation. There is a data gap where data is not explored due to the lack of data analytics infrastructure and of the required data analytics skills. This study developed a decision support framework for data analytics by following Jabareen’s framework development methodology. The study focused on machine learning algorithms, which are a subset of data analytics. The developed framework is designed to assist data analysts with little experience in choosing the appropriate machine learning algorithm given the purpose of their application.

Keywords: Data analytics, Industrial engineering, Machine learning, Value creation

Procedia PDF Downloads 134
24217 Providing Security to Private Cloud Using Advanced Encryption Standard Algorithm

Authors: Annapureddy Srikant Reddy, Atthanti Mahendra, Samala Chinni Krishna, N. Neelima

Abstract:

In our present world, we are generating a lot of data, and we need a specific device to store all these data. Generally, we store data in pen drives, hard drives, etc. Sometimes we may lose the data due to the corruption of devices. To overcome these issues, we implemented a cloud space for storing the data, which provides more security to the data. We can access the data from anywhere in the world using just the internet. We implemented all of this in Java using the NetBeans IDE. Once a user uploads the data, they do not have any rights to change the data. Users' uploaded files are stored in the cloud with the system time as the file name, and the directory is created with some random words. The cloud accepts the data only if the size of the file is less than 2 MB.

Keywords: cloud space, AES, FTP, NetBeans IDE

Procedia PDF Downloads 172
24216 Automated Transformation of 3D Point Cloud to BIM Model: Leveraging Algorithmic Modeling for Efficient Reconstruction

Authors: Radul Shishkov, Orlin Davchev

Abstract:

The digital era has revolutionized architectural practices, with building information modeling (BIM) emerging as a pivotal tool for architects, engineers, and construction professionals. However, the transition from traditional methods to BIM-centric approaches poses significant challenges, particularly in the context of existing structures. This research introduces a technical approach to bridge this gap through the development of algorithms that facilitate the automated transformation of 3D point cloud data into detailed BIM models. The core of this research lies in the application of algorithmic modeling and computational design methods to interpret and reconstruct point cloud data (a collection of data points in space, typically produced by 3D scanners) into comprehensive BIM models. This process involves complex stages of data cleaning, feature extraction, and geometric reconstruction, which are traditionally time-consuming and prone to human error. By automating these stages, our approach significantly enhances the efficiency and accuracy of creating BIM models for existing buildings. The proposed algorithms are designed to identify key architectural elements within point clouds, such as walls, windows, doors, and other structural components, and to translate these elements into their corresponding BIM representations. This includes the integration of parametric modeling techniques to ensure that the generated BIM models are not only geometrically accurate but also embedded with essential architectural and structural information. Our methodology has been tested on several real-world case studies, demonstrating its capability to handle diverse architectural styles and complexities. The results showcase a substantial reduction in time and resources required for BIM model generation while maintaining high levels of accuracy and detail.
This research contributes significantly to the field of architectural technology by providing a scalable and efficient solution for the integration of existing structures into the BIM framework. It paves the way for more seamless and integrated workflows in renovation and heritage conservation projects, where the accuracy of existing conditions plays a critical role. The implications of this study extend beyond architectural practices, offering potential benefits in urban planning, facility management, and historic preservation.

Keywords: BIM, 3D point cloud, algorithmic modeling, computational design, architectural reconstruction

Procedia PDF Downloads 19
24215 Business Intelligence for Profiling of Telecommunication Customer

Authors: Rokhmatul Insani, Hira Laksmiwati Soemitro

Abstract:

Business intelligence is a methodology that systematically exploits data to produce information and knowledge, and it can support the decision-making process. Two methods used in business intelligence are the data warehouse and data mining. A data warehouse can store historical data derived from transactional data. For data modelling in the data warehouse, we apply Kimball's dimensional modelling. Data mining is used to extract patterns and gain insight from the data. Data mining has many techniques, one of which is segmentation. For profiling telecommunication customers, we segment customers according to their usage of services, invoices and payments. Customers can be grouped according to their characteristics, and the profitable customers can be identified. We apply the K-Means clustering algorithm for segmentation. As input variables for that algorithm, we use the RFM (Recency, Frequency and Monetary) model. For all data mining processes, we use the IBM SPSS Modeler tool.
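The RFM-plus-K-Means segmentation described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: the invoice records, the reference date, the min-max scaling step and the choice of two clusters are all hypothetical, and the authors used IBM SPSS rather than hand-written code.

```python
from datetime import date


def rfm_features(invoices, today):
    """Map each customer to (recency in days, frequency, monetary total)."""
    per_customer = {}
    for customer, day, amount in invoices:
        last, count, total = per_customer.get(customer, (day, 0, 0.0))
        per_customer[customer] = (max(last, day), count + 1, total + amount)
    return {c: ((today - last).days, count, total)
            for c, (last, count, total) in per_customer.items()}


def min_max_scale(rows):
    """Rescale every column to [0, 1] so no single RFM dimension dominates."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    spans = [(max(col) - min(col)) or 1.0 for col in columns]
    return [[(v - lo) / span for v, lo, span in zip(row, lows, spans)]
            for row in rows]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain K-Means; the first k points seed the centroids (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda j: squared_distance(p, centroids[j]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical invoices: (customer id, invoice date, amount).
invoices = [
    ("A", date(2024, 1, 5), 20.0), ("A", date(2024, 3, 1), 30.0),
    ("B", date(2024, 3, 20), 500.0), ("B", date(2024, 3, 25), 450.0),
    ("C", date(2023, 11, 2), 15.0),
    ("D", date(2024, 3, 28), 480.0), ("D", date(2024, 3, 10), 520.0),
]
features = rfm_features(invoices, today=date(2024, 4, 1))
customers = list(features)
labels, _ = kmeans(min_max_scale([features[c] for c in customers]), k=2)
print(dict(zip(customers, labels)))
```

Scaling before clustering is a design choice made here because recency (days) and monetary value (currency) live on very different scales; the abstract does not say how the inputs were prepared.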

Keywords: business intelligence, customer segmentation, data warehouse, data mining

Procedia PDF Downloads 440
24214 A Centralized Architecture for Cooperative Air-Sea Vehicles Using UAV-USV

Authors: Salima Bella, Assia Belbachir, Ghalem Belalem

Abstract:

This paper deals with the problem of monitoring and cleaning dirty zones of oceans using unmanned vehicles. We present a centralized cooperative architecture for unmanned aerial vehicles (UAVs) to monitor ocean regions and clean dirty zones with the help of unmanned surface vehicles (USVs). Due to the rapid deployment of these unmanned vehicles, it is convenient to use them in oceanic regions where the water pollution zones are generally unknown. In order to optimize this process, our solution aims to detect and reduce the pollution level of the ocean zones while taking into account the problem of fault tolerance related to these vehicles.

Keywords: centralized architecture, fault tolerance, UAV, USV

Procedia PDF Downloads 294
24213 Investigating the Effectiveness of Multilingual NLP Models for Sentiment Analysis

Authors: Othmane Touri, Sanaa El Filali, El Habib Benlahmar

Abstract:

Natural Language Processing (NLP) has gained significant attention lately. It has proved its ability to analyze and extract insights from unstructured text data in various languages. One of the most popular NLP applications is sentiment analysis, which aims to identify the sentiment expressed in a piece of text, such as positive, negative, or neutral, in multiple languages. While there are several multilingual NLP models available for sentiment analysis, there is a need to investigate their effectiveness in different contexts and applications. In this study, we aim to investigate the effectiveness of different multilingual NLP models for sentiment analysis on a dataset of online product reviews in multiple languages. The performance of several NLP models is compared, including Google Cloud Natural Language API, Microsoft Azure Cognitive Services, Amazon Comprehend, Stanford CoreNLP, spaCy, and Hugging Face Transformers. The models are evaluated on several metrics, including accuracy, precision, recall, and F1 score, and their performance is compared across different categories of product reviews. To run the study, the dataset was preprocessed by cleaning and tokenizing the text data in multiple languages. Each model was then trained and tested using a cross-validation approach, in which the dataset is randomly divided into training and testing sets and the process is repeated multiple times. A grid search was applied to optimize the hyperparameters of each model and to select the best-performing model for each category of product reviews and each language. The findings of this study provide insights into the effectiveness of different multilingual NLP models for multilingual sentiment analysis and their suitability for different languages and applications.
The strengths and limitations of each model were identified, and recommendations for selecting the most performant model based on the specific requirements of a project were provided. This study contributes to the advancement of research methods in multilingual NLP and provides a practical guide for researchers and practitioners in the field.
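The evaluation metrics named in this abstract (accuracy, precision, recall and F1 score) follow standard definitions, sketched below for one sentiment class at a time. The labels and predictions are made-up illustration data, not results from the study, and the function name is hypothetical.

```python
def classification_metrics(y_true, y_pred, positive):
    """Accuracy over all classes plus precision/recall/F1 for one target class."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    false_pos = sum(1 for t, p in pairs if t != positive and p == positive)
    false_neg = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical gold labels and model predictions for six reviews.
y_true = ["pos", "pos", "neg", "neg", "pos", "neu"]
y_pred = ["pos", "neg", "neg", "pos", "pos", "neu"]
scores = classification_metrics(y_true, y_pred, positive="pos")
print(scores)
```

In a cross-validated comparison, these scores would be computed on each held-out split and averaged per model.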

Keywords: NLP, multilingual, sentiment analysis, texts

Procedia PDF Downloads 51
24212 Cleaning of Polycyclic Aromatic Hydrocarbons (PAH) Obtained from Ferroalloys Plant

Authors: Stefan Andersson, Balram Panjwani, Bernd Wittgens, Jan Erik Olsen

Abstract:

Polycyclic aromatic hydrocarbons (PAH) are organic compounds consisting only of carbon and hydrogen arranged in aromatic rings. PAH are neutral, non-polar molecules that are produced by incomplete combustion of organic matter. These compounds are carcinogenic and interact with biological nucleophiles to inhibit the normal metabolic functions of cells. In Norway, the most important sources of PAH pollution are considered to be aluminum plants, the metallurgical industry, offshore oil activity, transport, and wood burning. Stricter governmental regulations regarding emissions to the outer and internal environment, combined with increased awareness of the potential health effects, have motivated Norwegian metal industries to increase their efforts to reduce emissions considerably. One of the objectives of the ongoing "SCORE" project, supported by industry and the Norwegian Research Council, is to reduce potential PAH emissions from the off-gas stream of a ferroalloy furnace through controlled combustion in a dedicated combustion chamber. The sizing and configuration of the combustion chamber depend on the combined properties of the bulk gas stream and the properties of the PAH itself. In order to achieve efficient and complete combustion, the residence time and minimum temperature need to be optimized. For this design approach, reliable kinetic data for the individual PAH species and/or groups thereof are necessary. However, kinetic data on the combustion of PAH are difficult to obtain, and there is only a limited number of studies. The paper presents an evaluation of the kinetic data for some of the PAH obtained from the literature. In the present study, the oxidation is modelled for pure PAH and also for PAH mixed with process gas. Using a perfectly stirred reactor modelling approach, the oxidation is modelled including advanced reaction kinetics to study the influence of residence time and temperature on the conversion of PAH to CO2 and water.
A Chemical Reactor Network (CRN) approach is developed to understand the oxidation of PAH inside the combustion chamber. Chemical reactor network modeling has been found to be a valuable tool in the evaluation of oxidation behavior of PAH under various conditions.

Keywords: PAH, PSR, energy recovery, ferro alloy furnace

Procedia PDF Downloads 240
24211 Imputation Technique for Feature Selection in Microarray Data Set

Authors: Younies Saeed Hassan Mahmoud, Mai Mabrouk, Elsayed Sallam

Abstract:

Analysing DNA microarray data sets is a great challenge for bioinformaticians because of the complexity of applying statistical and machine learning techniques. The challenge is doubled if the microarray data sets contain missing data, which happens regularly, because these techniques cannot deal with missing data. One of the most important data analysis processes on a microarray data set is feature selection. This process finds the most important genes that affect a certain disease. In this paper, we introduce a technique for imputing the missing data in microarray data sets while performing feature selection.
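The abstract does not say which imputation technique the authors use. As a point of reference only, the sketch below shows a simple baseline, per-gene mean imputation, on a hypothetical expression matrix where rows are genes, columns are samples, and `None` marks a missing value. It is not the method proposed in the paper.

```python
def impute_row_means(matrix, fallback=0.0):
    """Replace each missing value (None) with the mean of its row's observed values.

    Rows with no observed values are filled with `fallback`, an arbitrary
    choice made for this illustration.
    """
    imputed = []
    for row in matrix:
        observed = [value for value in row if value is not None]
        fill = sum(observed) / len(observed) if observed else fallback
        imputed.append([fill if value is None else value for value in row])
    return imputed


# Hypothetical expression levels: 3 genes (rows) across 4 samples (columns).
expression = [
    [1.0, None, 3.0, 2.0],
    [0.5, 0.7, None, None],
    [None, None, None, None],
]
completed = impute_row_means(expression)
print(completed)
```

A complete matrix like `completed` is what a downstream feature selection step would typically need as input.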

Keywords: DNA microarray, feature selection, missing data, bioinformatics

Procedia PDF Downloads 530
24210 PDDA: Priority-Based, Dynamic Data Aggregation Approach for Sensor-Based Big Data Framework

Authors: Lutful Karim, Mohammed S. Al-kahtani

Abstract:

Sensors are being used in various applications, such as agriculture, health monitoring, air and water pollution monitoring, and traffic monitoring and control, and hence play a vital role in the growth of big data. However, sensors collect redundant data. Thus, aggregating and filtering sensor data is important for designing an efficient big data framework. Current research does not focus on aggregating and filtering data at multiple layers of a sensor-based big data framework. Thus, this paper introduces (i) a three-layer data aggregation framework for big data and (ii) a priority-based, dynamic data aggregation scheme (PDDA) for the lowest layer, at the sensors. Simulation results show that PDDA outperforms existing tree-based and cluster-based data aggregation schemes in terms of overall network energy consumption and end-to-end data transmission delay.

Keywords: big data, clustering, tree topology, data aggregation, sensor networks

Procedia PDF Downloads 298
24209 CFD Simulation for Development of Cooling System in a Cooking Oven

Authors: V. Jagadish, Mathiyalagan V.

Abstract:

This work predicts the door touch temperature of a cooking oven using CFD simulation. A self-clean cycle is carried out in cooking ovens to convert food spills into ash, which makes cleaning easy. During this cycle, the oven cavity is exposed to a high temperature of around 460 °C. At this operating point, the user is liable to touch the door surfaces, side shield and control panel. To limit the heat experienced by the user, a cooling system is built into the oven. The most effective cooling system within the existing design constraints is developed through CFD simulations. A cross-flow fan is used for the cooling system because it is cost-effective and delivers more air flow with a low pressure drop.

Keywords: CFD, MRF, RBM, RANS, new product development, simulation, thermal analysis

Procedia PDF Downloads 123
24208 Study of Biofouling Wastewater Treatment Technology

Authors: Sangho Park, Mansoo Kim, Kyujung Chae, Junhyuk Yang

Abstract:

The International Maritime Organization (IMO) recognized the problem of invasive species and adopted the "International Convention for the Control and Management of Ships' Ballast Water and Sediments" in 2004, which came into force on September 8, 2017. In 2011, the IMO approved the "Guidelines for the Control and Management of Ships' Biofouling to Minimize the Transfer of Invasive Aquatic Species" to minimize the movement of invasive species by hull-attached organisms and required ships to manage the organisms attached to their hulls. Invasive species enter new environments through ships' ballast water and hull attachment. However, several obstacles to implementing these guidelines have been identified, including a lack of underwater cleaning equipment, regulations on underwater cleaning activities in ports, and difficulty accessing crevices in underwater areas. The shipping industry, which is the party responsible for understanding these guidelines, wants to implement them for the fuel cost savings that result from removing organisms attached to the hull, but it anticipates significant difficulties in implementing the guidelines due to the obstacles mentioned above. Robots or people remove the organisms attached to the hull underwater, and the resulting wastewater includes various species of organisms and particles of paint and other pollutants. Currently, there is no technology available to sterilize the organisms in the wastewater or stabilize the heavy metals in the paint particles. In this study, we aim to analyze the characteristics of the wastewater generated from the removal of hull-attached organisms and select the optimal treatment technology. The organisms in the wastewater generated from the removal of the attached organisms meet the biological treatment standard (D-2) using the sterilization technology applied in the ships' ballast water treatment system.
The heavy metals and other pollutants in the paint particles generated during removal are treated using stabilization technologies such as thermal decomposition. The wastewater generated is treated using a two-step process: 1) development of sterilization technology through pretreatment filtration equipment and electrolytic sterilization treatment and 2) development of technology for removing particle pollutants such as heavy metals and dissolved inorganic substances. Through this study, we will develop a biological removal technology and an environmentally friendly processing system for the waste generated after removal that meets the requirements of the government and the shipping industry and lays the groundwork for future treatment standards.

Keywords: biofouling, ballast water treatment system, filtration, sterilization, wastewater

Procedia PDF Downloads 80
24207 Phytoremediation of Hydrocarbon-Polluted Soils: Assess the Potentialities of Six Tropical Plant Species

Authors: Pulcherie Matsodoum Nguemte, Adrien Wanko Ngnien, Guy Valerie Djumyom Wafo, Ives Magloire Kengne Noumsi, Pierre Francois Djocgoue

Abstract:

The identification of plant species with the capacity to grow on hydrocarbon-polluted soils is an essential step for phytoremediation. With a view to developing phytoremediation in Cameroon, floristic surveys were conducted in 4 cities (Douala, Yaounde, Limbe, and Kribi). In each city, 13 hydrocarbon-polluted sites, as well as unpolluted (control) sites, were investigated using the quadrat method. In total, 106 species belonging to 76 genera and 30 families were identified on hydrocarbon-polluted sites, whereas floristic diversity was much higher on the control sites (166 species in 125 genera and 50 families). Poaceae, Cyperaceae, Asteraceae and Amaranthaceae had the highest taxonomic richness on polluted sites (16, 15, 10 and 8 taxa, respectively). The Shannon diversity index of the hydrocarbon-polluted sites (1.6 to 2.7 bits/ind.) was significantly lower than that of the control sites (2.7 to 3.2 bits/ind.). Based on a relative frequency > 10% and abundance > 7%, this study highlights more than ten plants predisposed to be effective in cleaning up soils contaminated by hydrocarbons. Based on the floristic indicators, 6 species (Eleusine indica (L.) Gaertn., Cynodon dactylon (L.) Pers., Alternanthera sessilis (L.) R. Br. ex DC., Commelina benghalensis L., Cleome ciliata Schum. & Thonn. and Asystasia gangetica (L.) T. Anderson) were selected for a study to determine their capacity to remediate a soil contaminated with fuel oil (82.5 ml/kg of soil). The experiment, lasting 150 days, comprised three modalities in a randomized arrangement: Tn, uncontaminated planted soil (6); To, contaminated unplanted soil (3); and Tp, contaminated planted soil (18). Three of the 6 species (Eleusine indica, Cynodon dactylon, and Alternanthera sessilis) survived the climatic and soil conditions. E. indica showed a significantly higher growth rate for density and leaf area, while C. dactylon had a significantly higher growth rate for stem size and leaf number. A. sessilis showed stunted growth and development throughout the experimental period. The species Eleusine indica (L.) Gaertn. and Cynodon dactylon (L.) Pers. can be qualified as polluo-tolerant plant species, polluo-tolerance being the ability of a species to survive and develop in an environment subject to extreme physical and chemical disturbances.
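
The Shannon diversity index reported above in bits per individual is conventionally computed as H = -Σ p_i log2(p_i), where p_i is the proportion of individuals belonging to species i. The following is a minimal sketch of that formula only; the function name and the abundance counts are illustrative assumptions and are not data from the study.

```python
import math


def shannon_index_bits(abundances):
    """Shannon diversity index in bits per individual.

    `abundances` is a list of per-species individual counts for one site
    (hypothetical input; not taken from the survey).
    """
    total = sum(abundances)
    if total <= 0:
        raise ValueError("at least one individual is required")
    # Species with zero individuals contribute nothing to the sum.
    return -sum(
        (n / total) * math.log2(n / total) for n in abundances if n > 0
    )


# Made-up quadrat counts: an even community and a dominated one.
even_site = [5, 5, 5, 5]
dominated_site = [17, 1, 1, 1]
print(shannon_index_bits(even_site))
print(shannon_index_bits(dominated_site))
```

With equal counts the index reaches its maximum, log2 of the number of species, so a site dominated by a few tolerant species scores lower than an even one with the same species count.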

Keywords: Cameroon, cleaning-up, floristic surveys, phytoremediation

Procedia PDF Downloads 214
24206 Control the Flow of Big Data

Authors: Shizra Waris, Saleem Akhtar

Abstract:

Big data is a research area receiving attention from academia and IT communities. In the digital world, the amounts of data produced and stored have increased within a short period of time. Consequently, this fast-increasing rate of data has created many challenges. In this paper, we use functionalism and structuralism paradigms to analyze the genesis of big data applications and their current trends. This paper presents a complete discussion of state-of-the-art big data technologies based on group and stream data processing. Moreover, the strengths and weaknesses of these technologies are analyzed. This study also covers big data analytics techniques, processing methods, some reported case studies from different vendors, several open research challenges and the opportunities brought about by big data. The similarities and differences of these techniques and technologies, based on important limitations, are also investigated. Emerging technologies are suggested as a solution for big data problems.

Keywords: computer, IT community, industry, big data

Procedia PDF Downloads 156
24205 Classification of Barley Varieties by Artificial Neural Networks

Authors: Alper Taner, Yesim Benal Oztekin, Huseyin Duran

Abstract:

In this study, an Artificial Neural Network (ANN) was developed in order to classify barley varieties. For this purpose, physical properties of barley varieties were determined and ANN techniques were used. The physical properties of 8 barley varieties grown in Turkey, namely thousand kernel weight, geometric mean diameter, sphericity, kernel volume, surface area, bulk density, true density, porosity and colour parameters of grain, were determined, and it was found that these properties differed significantly between varieties. Three ANN models, N-1, N-2 and N-3, were constructed and their performances were compared. It was determined that the best-fit model was N-1. In the N-1 model, the structure of the model was designed to be 11 input layers, 2 hidden layers and 1 output layer. Thousand kernel weight, geometric mean diameter, sphericity, kernel volume, surface area, bulk density, true density, porosity and colour parameters of grain were used as input parameters, and varieties as the output parameter. R², Root Mean Square Error and Mean Error for the N-1 model were found to be 99.99%, 0.00074 and 0.009%, respectively. All results obtained by the N-1 model were observed to be quite consistent with real data. With this model, it would be possible to construct automation systems for classification and cleaning in flour mills.
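
Two of the fit statistics quoted above, R² and Root Mean Square Error, have standard definitions that can be sketched as follows. This is a minimal sketch with made-up values; it does not reproduce the study's data, and the abstract does not say how its Mean Error was defined, so that statistic is omitted.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 2.0, 3.0, 4.0]
print(rmse(observed, predicted))
print(r_squared(observed, predicted))
```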

Keywords: physical properties, artificial neural networks, barley, classification

Procedia PDF Downloads 145
24204 High Performance Computing and Big Data Analytics

Authors: Branci Sarra, Branci Saadia

Abstract:

Because of rapid data growth, many computer science tools have been developed to process and analyze Big Data. High-performance computing architectures have been designed to meet the processing needs of Big Data (from the standpoints of transaction processing and of strategic and tactical analytics). The purpose of this article is to provide a historical and global perspective on the recent trend in high-performance computing architectures, especially those related to Analytics and Data Mining.

Keywords: high performance computing, HPC, big data, data analysis

Procedia PDF Downloads 481
24203 A Landscape of Research Data Repositories in Re3data.org Registry: A Case Study of Indian Repositories

Authors: Prashant Shrivastava

Abstract:

The purpose of this study is to explore the re3data.org registry in order to identify the registration workflow for research data repositories. A further objective is to chart the present development of research data repositories in India. The study first examines the framework and schema design of the re3data.org registry and then explores the status of Indian research data repositories in it. Research data repositories are gaining wider relevance due to e-research concepts. The re3data.org registry, now available, is a good tool for users and researchers to identify appropriate research data repositories for their research requirements. In the Indian environment, a compatible National Research Data Policy is needed to boost the management of research data. A registry of research data repositories is a crucial tool for discovering specific information in a specific domain. Research data repositories in India have not previously been studied. The re3data.org registry and the status of Indian research data repositories are both discussed in this study.

Keywords: research data, research data repositories, research data registry, re3data.org

Procedia PDF Downloads 293
24202 Simulation of Polymeric Precursors Production from Wine Industrial Organic Wastes

Authors: Tanapoom Phuncharoen, Tawiwat Sriwongsa, Kanita Boonruang, Apichit Svang-Ariyaskul

Abstract:

The production of dimethyl acetal, isovaleraldehyde, and pyridine was simulated using Aspen Plus. Upgrading cleaning water from industrial wine production is the main objective of the project. The winery waste is composed of acetaldehyde, methanol, ethyl acetate, 1-propanol, water, isoamyl alcohol, and isobutanol. The project is separated into three parts: separation, reaction, and purification. Various processes were considered to maximize the profit while obtaining high purity and recovery of each component with optimum heat duty. The results show a significant product value, with purity of more than 75% and recovery of over 98%.

Keywords: dimethyl acetal, pyridine, wine, aspen plus, isovaleraldehyde, polymeric precursors

Procedia PDF Downloads 295
24201 A Study of Cloud Computing Solution for Transportation Big Data Processing

Authors: Ilgin Gökaşar, Saman Ghaffarian

Abstract:

The need for fast processing of big data on transportation ridership (e.g., smartcard data) and traffic operation (e.g., traffic detector data), which requires a lot of computational power, is incontrovertible in Intelligent Transportation Systems. Nowadays, cloud computing is an important subject and a popular information technology solution for data processing. It enables users to process enormous amounts of data without having their own computing power. Thus, it can also be a good choice for transportation big data processing. This paper examines how cloud computing can enhance transportation big data processing by contrasting its advantages and disadvantages and discussing cloud computing features.

Keywords: big data, cloud computing, Intelligent Transportation Systems, ITS, traffic data processing

Procedia PDF Downloads 420
24200 Harmonic Data Preparation for Clustering and Classification

Authors: Ali Asheibi

Abstract:

The rapid increase in the size of databases required to store power quality monitoring data has demanded new techniques for analysing and understanding the data. One suggested technique to assist in analysis is data mining. Preparing raw data for data mining exploration takes up most of the effort and time spent in the whole data mining process. Clustering is an important technique in data mining and machine learning in which underlying and meaningful groups of data are discovered. Large amounts of harmonic data have been collected over three years from an actual harmonic monitoring system in a distribution system in Australia. This amount of acquired data makes it difficult to identify operational events that significantly impact the harmonics generated on the system. In this paper, harmonic data preparation processes for better understanding of the data are presented. Underlying classes in this data have then been identified using a clustering technique based on the Minimum Message Length (MML) method. The underlying operational information contained within the clusters can be rapidly visualised by engineers. The C5.0 algorithm was used for classification and interpretation of the generated clusters.

Keywords: data mining, harmonic data, clustering, classification

Procedia PDF Downloads 215
24199 Linguistic Summarization of Structured Patent Data

Authors: E. Y. Igde, S. Aydogan, F. E. Boran, D. Akay

Abstract:

Patent data have an increasingly important role in economic growth, innovation, technical advantages and business strategies, and even in competition between countries. Analyzing patent data is crucial since patents cover a large part of all the technological information in the world. In this paper, we have used the linguistic summarization technique to prove the validity of hypotheses related to patent data stated in the literature.

Keywords: data mining, fuzzy sets, linguistic summarization, patent data

Procedia PDF Downloads 244
24198 Proposal of Data Collection from Probes

Authors: M. Kebisek, L. Spendla, M. Kopcek, T. Skulavik

Abstract:

In our paper we describe the security capabilities of data collection. Data are collected with probes located in the near and distant surroundings of the company. Considering the numerous obstacles, e.g. forests, hills and urban areas, the data collection is realized in several ways. The collection of data uses wireless communication, a LAN network or a GSM network, and in certain areas data are collected by using vehicles. In order to ensure the connection to the server, most of the probes have the ability to communicate in several ways. Collected data are archived and subsequently used in supervisory applications. To ensure the collection of the required data, it is necessary to propose algorithms that will allow the probes to select a suitable communication channel.

Keywords: communication, computer network, data collection, probe

Procedia PDF Downloads 327
24197 Constructions of Linear and Robust Codes Based on Wavelet Decompositions

Authors: Alla Levina, Sergey Taranov

Abstract:

The classical approach to providing noise immunity and integrity for information processed in computing devices and communication channels is to use linear codes. Linear codes have fast and efficient algorithms for encoding and decoding information, but these codes concentrate their detection and correction abilities on certain error configurations. Robust codes can protect against any configuration of errors with a predetermined probability. This is accomplished by the use of perfect nonlinear and almost perfect nonlinear functions to calculate the code redundancy. The paper presents an error-correcting coding scheme using the biorthogonal wavelet transform. Wavelet transforms are applied in various fields of science. Some wavelet applications are cleaning signals of noise, data compression, and spectral analysis of signal components. The article suggests methods for constructing linear codes based on wavelet decomposition. For the developed constructions we build generator and check matrices that contain the scaling function coefficients of the wavelet. Based on linear wavelet codes, we develop robust codes that provide uniform protection against all errors. In the article we propose two constructions of robust codes. The first class of robust codes is based on the multiplicative inverse in a finite field. In the second robust code construction, the redundancy part is the cube of the information part. This paper also investigates the characteristics of the proposed robust and linear codes.
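
The idea of using the cube of the information part as redundancy can be illustrated with a generic toy code. The sketch below is an assumption-laden example, not the authors' wavelet-based construction: it works in the small field GF(2^4) with the reduction polynomial x^4 + x + 1 (both chosen here purely for illustration), forms codewords (x, x^3), and counts for how many messages x each nonzero additive error goes undetected.

```python
M = 4            # field GF(2^4); chosen for illustration only
POLY = 0b10011   # reduction polynomial x^4 + x + 1 (assumed, not from the paper)


def gf_mul(a, b):
    """Multiply two elements of GF(2^M) using shift-and-add with reduction."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << M):
            a ^= POLY
    return result


def cube(x):
    """Redundancy function: x^3 computed in GF(2^M)."""
    return gf_mul(gf_mul(x, x), x)


def masked_count(e_info, e_red):
    """Number of messages x for which the error (e_info, e_red) goes undetected."""
    return sum(
        1 for x in range(1 << M) if cube(x ^ e_info) == cube(x) ^ e_red
    )


def worst_case_masked():
    """Largest masked count over all nonzero errors."""
    return max(
        masked_count(e_info, e_red)
        for e_info in range(1 << M)
        for e_red in range(1 << M)
        if (e_info, e_red) != (0, 0)
    )


worst = worst_case_masked()
print(worst, "of", 1 << M, "messages mask the worst-case error")
```

In this toy setting every nonzero error is masked for at most 2 of the 16 messages, because the cube map is almost perfect nonlinear over a field of characteristic 2. A linear code, by contrast, has errors (its own nonzero codewords) that go undetected for every message.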

Keywords: robust code, linear code, wavelet decomposition, scaling function, error masking probability

Procedia PDF Downloads 458