Search results for: XML Data Stream
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7549

7279 Speech Data Compression using Vector Quantization

Authors: H. B. Kekre, Tanuja K. Sarode

Abstract:

Speech data compression mostly relies on transforms, which are lossy algorithms. Such algorithms are tolerable for speech data compression since the loss in quality is not perceived by the human ear. However, vector quantization (VQ) has the potential to give more data compression while maintaining the same quality. In this paper we propose a speech data compression algorithm using the vector quantization technique. We have used the VQ algorithms LBG, KPE and FCG. The results table shows the computational complexity of these three algorithms. Here we introduce a new performance parameter, Average Fractional Change in Speech Sample (AFCSS). Our FCG algorithm gives far better performance than the others in terms of mean absolute error, AFCSS and complexity.

Keywords: Vector Quantization, Data Compression, Encoding, Speech coding.

Downloads 2351
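
The vector quantization idea in entry 7279 can be illustrated with a minimal sketch. It assumes a fixed, hand-written codebook rather than one trained with LBG, KPE or FCG (none of which is reproduced here), and the function names are hypothetical. Each block of samples is replaced by the index of its nearest codeword, and the mean absolute error mentioned in the abstract is computed on the reconstruction. AFCSS is not defined in the abstract, so it is not implemented.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def nearest_codeword(vec: Sequence[float], codebook: Sequence[Vector]) -> int:
    """Return the index of the codeword closest to vec (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(vec, codebook[i])),
    )


def vq_encode(samples: Sequence[float], codebook: Sequence[Vector]) -> List[int]:
    """Split samples into blocks of the codeword dimension and encode each block.

    Trailing samples that do not fill a whole block are dropped (a simplification).
    """
    dim = len(codebook[0])
    usable = len(samples) - len(samples) % dim
    return [
        nearest_codeword(samples[i:i + dim], codebook)
        for i in range(0, usable, dim)
    ]


def vq_decode(indices: Sequence[int], codebook: Sequence[Vector]) -> List[float]:
    """Rebuild an approximate signal by concatenating the selected codewords."""
    out: List[float] = []
    for idx in indices:
        out.extend(codebook[idx])
    return out


def mean_absolute_error(original: Sequence[float], reconstructed: Sequence[float]) -> float:
    """Average absolute difference over the samples both sequences share."""
    n = min(len(original), len(reconstructed))
    return sum(abs(original[i] - reconstructed[i]) for i in range(n)) / n


if __name__ == "__main__":
    # Hypothetical 2-dimensional codebook with four codewords (2 bits per block).
    codebook = [(0.0, 0.0), (1.0, 1.0), (-1.0, -1.0), (1.0, -1.0)]
    samples = [0.1, -0.1, 0.9, 1.2, -0.8, -1.1, 1.1, -0.7]
    indices = vq_encode(samples, codebook)
    print(indices)  # [0, 1, 2, 3]
    print(mean_absolute_error(samples, vq_decode(indices, codebook)))
```

With this toy codebook, eight samples are stored as four 2-bit indices. The compression comes from transmitting indices instead of samples, and the loss is the distance between each block and its codeword.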
7278 Ontology and CDSS Based Intelligent Health Data Management in Health Care Server

Authors: Eun-Jung Ko, Hyung-Jik Lee, Jeun-Woo Lee

Abstract:

In a ubiquitous healthcare environment, a user's health data are transferred to the remote healthcare server by the user's wearable system or mobile phone. The collected health data should be managed and analyzed in the healthcare server so that the caregiver or the user can monitor the user's physiological state. In this paper, we design and develop an intelligent Healthcare Server that manages the user's health data using CDSS and ontology. Our system can analyze the user's health data semantically using CDSS and ontology, and report the result of analyzing the user's raw physiological data to the user and the caregiver.

Keywords: u-healthcare, CDSS, healthcare server, health data, ontology.

Downloads 2191
7277 The Development of Decision Support System for Waste Management; a Review

Authors: M. S. Bani, Z. A. Rashid, K. H. K. Hamid, M. E. Harbawi, A. B. Alias, M. J. Aris

Abstract:

Most Decision Support Systems (DSS) constructed for waste management (WM) are not widely marketed and lack practical applications. This is due to the number of variables and the complexity of the mathematical models, which include the assumptions and constraints required in decision making. The approach taken by many researchers in DSS modelling is to isolate a few key factors that have a significant influence on the DSS. This segmented approach does not provide a thorough understanding of the complex relationships among the many elements involved. The various elements in constructing the DSS must be integrated and optimized in order to produce a viable model that is marketable and has practical application. The DSS model used in assisting decision makers should be integrated with GIS, be able to give robust predictions despite the inherent uncertainties of waste generation and the plethora of waste characteristics, and give an optimal allocation of the waste stream to recycling, incineration, landfill and composting.

Keywords: Review, decision support system, GIS and waste management.

Downloads 3671
7276 A Genetic Algorithm for Clustering on Image Data

Authors: Qin Ding, Jim Gasvoda

Abstract:

Clustering is the process of subdividing an input data set into a desired number of subgroups so that members of the same subgroup are similar and members of different subgroups have diverse properties. Many heuristic algorithms have been applied to the clustering problem, which is known to be NP-hard. Genetic algorithms have been used in a wide variety of fields to perform clustering; however, the technique normally has a long running time in terms of input set size. This paper proposes an efficient genetic algorithm for clustering on very large data sets, especially image data sets. The genetic algorithm uses the most time-efficient techniques along with preprocessing of the input data set. We test our algorithm on both artificial and real image data sets, both of which are large. The experimental results show that our algorithm outperforms the k-means algorithm in terms of running time as well as the quality of the clustering.

Keywords: Clustering, data mining, genetic algorithm, image data.

Downloads 1996
7275 A Holistic Framework for Unifying Data Security and Management in Modern Enterprises

Authors: Ashly Joseph

Abstract:

Modern businesses struggle significantly to secure and manage their data properly as the volume and complexity of their data both expand exponentially. Through the use of a multi-layered defense strategy, a centralized management platform, and cutting-edge technologies like AI, this research paper presents a comprehensive framework to integrate data security and management. The constraints of current data protection and management strategies, technological advancements, and the evolving threat landscape are all examined in this article. It suggests best practices for putting into practice integrated data security and governance models, placing an emphasis on ongoing adaptation. The advantages mentioned include a strengthened security posture, simpler procedures, lower costs, and reduced complexity. Additionally, issues including skill shortages, antiquated systems, and cultural obstacles are examined. Security executives and Chief Information Security Officers are given practical advice on how to evaluate, plan, and put into place strong data-centric security and management capabilities. The goal of the paper is to provide a thorough study of the data security and management landscape and to arm contemporary businesses with the knowledge they need to be proactive in protecting their data assets.

Keywords: Data security, security management, cloud computing, cybersecurity, data governance, security architecture, data management.

Downloads 153
7274 Post Mining- Discovering Valid Rules from Different Sized Data Sources

Authors: R. Nedunchezhian, K. Anbumani

Abstract:

A big organization may have multiple branches spread across different locations. Processing the data from these branches becomes a huge task when innumerable transactions take place. Also, branches may be reluctant to forward their data for centralized processing but are ready to pass on their association rules. Local mining may also generate a large number of rules. Further, it is not practically possible for all local data sources to be of the same size. A model is proposed for discovering valid rules from different sized data sources, where the valid rules are highly weighted rules. These rules can be obtained from the high-frequency rules generated from each of the data sources. A data source selection procedure is considered in order to synthesize rules efficiently. Support Equalization is another proposed method, which focuses on eliminating low-frequency rules at the local sites themselves, thus reducing the number of rules significantly.

Keywords: Association rules, multiple data stores, synthesizing, valid rules.

Downloads 1358
7273 RFID-ready Master Data Management for Reverse Logistics

Authors: Jincheol Han, Hyunsun Ju, Jonghoon Chun

Abstract:

Sharing consistent and correct master data among disparate applications in a reverse-logistics chain has long been recognized as an intricate problem. Although a master data management (MDM) system can assume that responsibility, applications that need to co-operate with it must comply with the proprietary query interfaces provided by the specific MDM system. In this paper, we present an RFID-ready MDM system which makes master data readily available to any participating application in a reverse-logistics chain. We propose an RFID-wrapper as a part of our MDM. It acts as a gateway between any data retrieval request and the query interfaces that process it. With the RFID-wrapper, any participating application in a reverse-logistics chain can easily retrieve master data in a way that is analogous to the retrieval of any other RFID-based logistics transactional data.

Keywords: Reverse Logistics, Master Data Management, RFID.

Downloads 1923
7272 Dynamic Models versus Frailty Models for Recurrent Event Data

Authors: Entisar A. Elgmati

Abstract:

Recurrent event data is a special type of multivariate survival data. Dynamic and frailty models are among the approaches that deal with this kind of data. These two models are compared using the empirical standard deviation of the standardized martingale residual processes as a way of assessing their fit, based on the Aalen additive regression model. We found that both approaches take heterogeneity into account and produce residual standard deviations close to each other, both in the simulation study and in the real data set.

Keywords: Dynamic, frailty, misspecification, recurrent events.

Downloads 2313
7271 Issues and Architecture for Supporting Data Warehouse Queries in Web Portals

Authors: Minsoo Lee, Yoon-kyung Lee, Hyejung Yoon, Soo-kyung Song, Sujeong Cheong

Abstract:

Data Warehousing tools have become very popular and currently many of them have moved to Web-based user interfaces to make it easier to access and use the tools. The next step is to enable these tools to be used within a portal framework. The portal framework consists of pages having several small windows that contain individual data warehouse query results. There are several issues that need to be considered when designing the architecture for a portal enabled data warehouse query tool. Some issues need special techniques that can overcome the limitations that are imposed by the nature of data warehouse queries. Issues such as single sign-on, query result caching and sharing, customization, scheduling and authorization need to be considered. This paper discusses such issues and suggests an architecture to support data warehouse queries within Web portal frameworks.

Keywords: Data Warehousing tools, data warehousing queries, web portal frameworks.

Downloads 2075
7270 The Reliability of the Improved e-N Method for Transition Prediction as Checked by PSE Method

Authors: Caihong Su

Abstract:

Transition prediction of boundary layers has always been an important problem in fluid mechanics, both theoretically and practically, yet notwithstanding the great effort made by many investigators, there is no satisfactory answer to this problem. The most popular method available is the so-called e-N method, which is heavily dependent on experiments and experience. The author has proposed improvements to the e-N method so as to reduce its dependence on experiments and experience to a certain extent. One of the key assumptions is that transition occurs whenever the velocity amplitude of the disturbance reaches 1-2% of the free stream velocity. However, the reliability of this assumption needs to be verified. In this paper, transition prediction on a flat plate is investigated by using both the improved e-N method and the parabolized stability equations (PSE) method. The results show that the transition locations predicted by both methods agree reasonably well with each other under the above assumption. For the supersonic case, the critical velocity amplitude in the improved e-N method should be taken as 0.013, whereas in the subsonic case it should be 0.018; both are within the range 1-2%.

Keywords: Boundary layer, e-N method, PSE, Transition

Downloads 1460
7269 Effects of Oilfield Water Treated by Electroflocculation and Reverse Osmosis in a Typical Brazilian Semiarid Soil

Authors: P. S. A. Souza, M. R. C. Marques, M. M. Rigo, A. A. Cerqueira, J. L. Paiva, F. Merçon, D. V. Perez

Abstract:

Produced water (PW), which is water extracted along with oil, is the largest waste stream in the oil and gas industry. With the proper treatment, this wastewater can be used in agricultural irrigation. This study evaluated the effects of applying PW treated by electroflocculation (EF) and by combined electroflocculation-reverse osmosis (EF-RO) on soil salinity and sodification parameters. Excessive sodium levels in PW treated by EF may affect soil structural stability and plant growth, and the sodium tends to accumulate in the upper layers, displacing the nutrient K to deeper layers of the soil profile. PW treated by EF-RO did not promote salinization or soil sodification, indicating that this combined technique may be a viable alternative for treating oily water intended for irrigation use in semiarid regions.

Keywords: Electroflocculation, irrigation, produced water, reverse osmosis, soil.

Downloads 530
7268 Data Mining Using Learning Automata

Authors: M. R. Aghaebrahimi, S. H. Zahiri, M. Amiri

Abstract:

In this paper a data miner based on learning automata, called LA-miner, is proposed. The LA-miner extracts classification rules from data sets automatically. The proposed algorithm is based on function optimization using learning automata. The experimental results on three benchmarks indicate that the performance of the proposed LA-miner is comparable with (and sometimes better than) that of the Ant-miner (a data mining algorithm based on the Ant Colony optimization algorithm) and CN2 (a well-known data mining algorithm for classification).

Keywords: Data mining, Learning automata, Classification rules, Knowledge discovery.

Downloads 1893
7267 Secure and Efficient Transmission of Aggregated Data for Mobile Wireless Sensor Networks

Authors: A. Krishna Veni, R. Geetha

Abstract:

Wireless Sensor Networks (WSNs) are suitable for many scenarios in the real world. Data aggregation techniques make the retrieval of data efficient. Many data aggregation techniques have been proposed, but most of the existing schemes are neither energy efficient nor secure. Moreover, the existing techniques use the traditional clustering approach, in which packet transmission is delayed because there is no proper scheduling. The presented system uses the Velocity Energy-efficient and Link-aware Cluster-Tree (VELCT) scheme, in which a Data Collection Tree (DCT) improves the lifetime of the network. The VELCT scheme and the construction of the DCT reduce delay and traffic. The network lifetime can be increased by avoiding frequent changes in cluster topology. Secure and Efficient Transmission of Aggregated data (SETA) improves the security of data transmission by using the trust value of the nodes prior to the aggregation of data. Since SETA considers data only from trustworthy nodes for aggregation, it is more secure in transmitting the data, thereby improving the accuracy of the aggregated data.

Keywords: Aggregation, lifetime, network security, wireless sensor network.

Downloads 1173
7266 Development of Greenhouse Analysis Tools for Home Agriculture Project

Authors: M. Amir Abas, M. Dahlui

Abstract:

This paper presents the development of analysis tools for the Home Agriculture project. The tools are required for monitoring the condition of a greenhouse and involve two components: measurement hardware and a data analysis engine. The measurement hardware measures environmental parameters such as temperature, humidity, air quality and dust, while the analysis tool is used to analyse and interpret the integrated data against weather conditions, quality of health, irradiance, quality of soil, etc. The current development of the tools is complete for the off-line data recording technique. The data are saved on an MMC and transferred via ZigBee to the Environment Data Manager (EDM) for data analysis. The EDM converts the raw data and plots three combination graphs. It has been applied to three months of measurement data for irradiance, temperature and humidity of the greenhouse.

Keywords: Monitoring, Environment, Greenhouse, Analysis tools

Downloads 1967
7265 Technical Aspects of Closing the Loop in Depth-of-Anesthesia Control

Authors: Gorazd Karer

Abstract:

When performing a diagnostic procedure or surgery in general anesthesia (GA), a proper introduction and dosing of anesthetic agents is one of the main tasks of the anesthesiologist. That being said, depth of anesthesia (DoA) also seems to be a suitable process for closed-loop control implementation. To implement such a system, one must be able to acquire the relevant signals online and in real-time, as well as stream the calculated control signal to the infusion pump. However, during a procedure, patient monitors and infusion pumps are purposely unable to connect to an external (possibly medically unapproved) device for safety reasons, thus preventing closed-loop control. This paper proposes a conceptual solution to the aforementioned problem. First, it presents some important aspects of contemporary clinical practice. Next, it introduces the closed-loop-control-system structure and the relevant information flow. Focusing on transferring the data from the patient to the computer, it presents a non-invasive image-based system for signal acquisition from a patient monitor for online depth-of-anesthesia assessment. Furthermore, it introduces a User-Datagram-Protocol-based (UDP-based) communication method that can be used for transmitting the calculated anesthetic inflow to the infusion pump. The proposed system is independent of medical-device manufacturer and is implemented in MATLAB-Simulink, which can be conveniently used for DoA control implementation. The proposed scheme has been tested in a simulated GA setting and is ready to be evaluated in an operating theatre. However, the proposed system is only a step towards a proper closed-loop control system for DoA, which could routinely be used in clinical practice.

Keywords: Closed-loop control, Depth of Anesthesia, DoA, optical signal acquisition, Patient State index, PSi, UDP communication protocol.

Downloads 410
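
The UDP-based transmission step described in entry 7265 can be sketched as follows. This is not the authors' MATLAB-Simulink implementation. It is a minimal Python illustration that assumes a hypothetical payload of one big-endian 64-bit float per datagram. The function names and the loopback address are also assumptions, and a real infusion pump interface would define its own message format.

```python
import socket
import struct
from typing import Tuple


def send_rate(sock: socket.socket, address: Tuple[str, int], rate: float) -> None:
    """Send one control value as a single datagram (network byte order, 8 bytes)."""
    sock.sendto(struct.pack("!d", rate), address)


def receive_rate(sock: socket.socket) -> float:
    """Block until one datagram arrives and decode the control value it carries."""
    data, _sender = sock.recvfrom(8)
    (rate,) = struct.unpack("!d", data)
    return rate


def loopback_demo(rate: float) -> float:
    """Send a value from one local socket to another and return what was received."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2.0)         # UDP has no delivery guarantee
        send_rate(sender, receiver.getsockname(), rate)
        return receive_rate(receiver)


if __name__ == "__main__":
    print(loopback_demo(12.5))
```

UDP gives low latency but no acknowledgement or retransmission, so a closed-loop controller built on it would need its own handling of lost or stale datagrams. The timeout above is only a placeholder for that.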
7264 A Normalization-based Robust Watermarking Scheme Using Zernike Moments

Authors: Say Wei Foo, Qi Dong

Abstract:

Digital watermarking has become an important technique for copyright protection, but its robustness against attacks remains a major problem. In this paper, we propose a normalization-based robust image watermarking scheme. In the proposed scheme, the original host image is first normalized to a standard form. The Zernike transform is then applied to the normalized image to calculate Zernike moments. Dither modulation is adopted to quantize the magnitudes of the Zernike moments according to the watermark bit stream. The watermark extraction method is blind. Security analysis and false alarm analysis are then performed. The quality degradation of the watermarked image caused by the embedded watermark is visually transparent. Experimental results show that the proposed scheme has very high robustness against various image processing operations and geometric attacks.

Keywords: Image watermarking, Image normalization, Zernike moments, Robustness.

Downloads 1713
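
The dither modulation step in entry 7264 can be illustrated with a minimal sketch of binary dither modulation applied to a single magnitude. It assumes a fixed quantization step and deterministic dither values of 0 and step/2. The abstract does not give the step size, the dither sequence, or which Zernike moments are used, so those details and the function names here are hypothetical.

```python
def embed_bit(magnitude: float, bit: int, step: float) -> float:
    """Quantize a magnitude onto the lattice that encodes `bit`.

    Bit 0 uses the lattice {k * step}; bit 1 uses {k * step + step / 2}.
    """
    offset = bit * step / 2.0
    return round((magnitude - offset) / step) * step + offset


def extract_bit(magnitude: float, step: float) -> int:
    """Blind extraction: pick the bit whose lattice lies closest to the magnitude."""
    d0 = abs(magnitude - embed_bit(magnitude, 0, step))
    d1 = abs(magnitude - embed_bit(magnitude, 1, step))
    return 0 if d0 <= d1 else 1


if __name__ == "__main__":
    step = 4.0
    marked = embed_bit(10.3, 1, step)
    print(marked)                           # 10.0
    print(extract_bit(marked + 0.9, step))  # 1
```

Extraction needs only the step size, not the original image, which is what makes the method blind. A perturbation smaller than step/4 leaves the extracted bit unchanged, so a larger step trades image quality for robustness.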
7263 Comprehensive Analysis of Data Mining Tools

Authors: S. Sarumathi, N. Shanthi

Abstract:

Due to fast and flawless technological innovation, a tremendous amount of data is accumulating all over the world in every domain, such as Pattern Recognition, Machine Learning, Spatial Data Mining, Image Analysis, Fraudulent Analysis and the World Wide Web. This makes it essential to develop tools for data mining functionalities. The major aim of this paper is to analyze various tools that are used to build a resourceful analytical or descriptive model for handling large amounts of information more efficiently and in a more user-friendly way. In this survey the diverse tools are illustrated with their extensive technical paradigms, outstanding graphical interfaces and inbuilt multipath algorithms, which are very useful for handling significant amounts of data.

Keywords: Classification, Clustering, Data Mining, Machine learning, Visualization.

Downloads 2396
7262 Edible Oil Industry Wastewater Treatment by Microfiltration with Ceramic Membrane

Authors: Zita Šereš, Dragana Šoronja Simović, Ljubica Dokić, Lidietta Giorno, Biljana Pajin, Cecilia Hodur, Nikola Maravić

Abstract:

Membrane technology is convenient for the separation of suspended solids, colloids and high molecular weight materials that are present. The idea is that the waste stream from the edible oil industry, after the separation of oil by skimmers, is subjected to microfiltration, and the obtained permeate can be used again in the production process. Wastewater from the edible oil industry was used for the microfiltration. For the microfiltration of this effluent, a tubular membrane with a pore size of 200 nm was used at transmembrane pressures up to 3 bar and flow rates up to 300 L/h. A Box–Behnken design was selected for the experimental work, and the responses considered were permeate flux and chemical oxygen demand (COD) reduction. The reduction of the permeate COD was in the range 40-60% relative to the feed. The highest permeate flux achieved during the microfiltration process was 160 L/m²h.

Keywords: Ceramic membrane, edible oil, microfiltration, wastewater.

Downloads 1577
7261 A Prediction of Attractive Evaluation Objects Based On Complex Sequential Data

Authors: Shigeaki Sakurai, Makino Kyoko, Shigeru Matsumoto

Abstract:

This paper proposes a method that predicts attractive evaluation objects. In the learning phase, the method inductively acquires trend rules from complex sequential data. The data is composed of two types of data. One is numerical sequential data. Each evaluation object has respective numerical sequential data. The other is text sequential data. Each evaluation object is described in texts. The trend rules represent changes of numerical values related to evaluation objects. In the prediction phase, the method applies new text sequential data to the trend rules and evaluates which evaluation objects are attractive. This paper verifies the effect of the proposed method by using stock price sequences and news headline sequences. In these sequences, each stock brand corresponds to an evaluation object. This paper discusses validity of predicted attractive evaluation objects, the process time of each phase, and the possibility of application tasks.

Keywords: Trend rule, frequent pattern, numerical sequential data, text sequential data, evaluation object.

Downloads 1188
7260 Methods for Distinction of Cattle Using Supervised Learning

Authors: Radoslav Židek, Veronika Šidlová, Radovan Kasarda, Birgit Fuerst-Waltl

Abstract:

Machine learning represents a set of topics dealing with the creation and evaluation of algorithms that facilitate pattern recognition, classification, and prediction, based on models derived from existing data. The data can present identification patterns which are used for classification into groups. The result of the analysis is a pattern which can be used to identify a data set without the need to obtain the input data used to create this pattern. An important requirement in this process is careful data preparation, validation of the model used, and its suitable interpretation. For breeders, it is important to know the origin of animals from the point of view of genetic diversity. When pedigree information is missing, other methods can be used to trace an animal's origin. The genetic diversity recorded in genetic data holds relatively useful information for identifying animals originating from individual countries. We can conclude that the application of data mining to molecular genetic data using supervised learning is an appropriate tool for hypothesis testing and for identifying an individual.

Keywords: Genetic data, Pinzgau cattle, supervised learning.

Downloads 2270
7259 A Comparative Study of Fine Grained Security Techniques Based on Data Accessibility and Inference

Authors: Azhar Rauf, Sareer Badshah, Shah Khusro

Abstract:

This paper analyzes different techniques for fine grained security of relational databases with respect to two variables: data accessibility and inference. Data accessibility measures the amount of data available to the users after applying a security technique to a table. Inference is the proportion of information leakage after suppressing a cell containing secret data. A row containing a suppressed secret cell can become a security threat if an intruder generates useful information from the related visible information in the same row. This paper measures the data accessibility and inference associated with row, cell, and column level security techniques. Cell level security offers the greatest data accessibility, as it suppresses secret data only. On the other hand, there is a high probability of inference in cell level security. Row and column level security techniques have the least data accessibility and inference. This paper introduces the cell plus innocent security technique, which utilizes the cell level security method but also suppresses some innocent data to mislead an intruder, so that a suppressed cell does not necessarily contain secret data. Four variations of the technique, namely cell plus innocent 1/4, cell plus innocent 2/4, cell plus innocent 3/4, and cell plus innocent 4/4, have been introduced to suppress innocent data equal to 1/4, 2/4, 3/4, and 4/4 percent of the true secret data inside the database, respectively. Results show that the new technique offers better control over data accessibility and inference than the state-of-the-art security techniques. This paper further discusses how the techniques can be combined. The paper shows that the cell plus innocent 1/4, 2/4, and 3/4 techniques can be used as a replacement for cell level security.

Keywords: Fine Grained Security, Data Accessibility, Inference, Row, Cell, Column Level Security.

Downloads 1427
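
The trade-off described in entry 7259 between row, cell and column level suppression can be made concrete with a small sketch. It assumes that data accessibility is measured as the fraction of cells left visible, which is one plausible reading of the abstract rather than the paper's stated formula. The table, the secret-cell positions and the function names are hypothetical, and inference is not modelled.

```python
from typing import Any, List, Optional, Set, Tuple

Table = List[List[Any]]
Cell = Tuple[int, int]  # (row index, column index)


def suppress(table: Table, secret_cells: Set[Cell], level: str) -> List[List[Optional[Any]]]:
    """Return a copy of the table with suppressed values replaced by None.

    level = "cell" hides only the secret cells, "row" hides every row that
    contains a secret cell, and "column" hides every column that contains one.
    """
    secret_rows = {r for r, _ in secret_cells}
    secret_cols = {c for _, c in secret_cells}
    masked = []
    for r, row in enumerate(table):
        new_row = []
        for c, value in enumerate(row):
            if level == "cell":
                hide = (r, c) in secret_cells
            elif level == "row":
                hide = r in secret_rows
            elif level == "column":
                hide = c in secret_cols
            else:
                raise ValueError(f"unknown level: {level}")
            new_row.append(None if hide else value)
        masked.append(new_row)
    return masked


def accessibility(masked: List[List[Optional[Any]]]) -> float:
    """Fraction of cells still visible (assumed accessibility measure)."""
    total = sum(len(row) for row in masked)
    visible = sum(value is not None for row in masked for value in row)
    return visible / total


if __name__ == "__main__":
    table = [
        ["Ann", "Cardiology", 91000],
        ["Bob", "Oncology", 64000],
        ["Cho", "Cardiology", 87000],
    ]
    secrets = {(0, 2), (2, 2)}  # two salary cells are secret
    for level in ("cell", "row", "column"):
        print(level, round(accessibility(suppress(table, secrets, level)), 3))
```

In this toy table, cell level suppression leaves the most data visible, matching the abstract's observation, while the names and departments that remain visible in the affected rows are exactly the kind of related information an intruder could use for inference.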
7258 Weka Based Desktop Data Mining as Web Service

Authors: Sujala D. Shetty, S. Vadivel, Sakshi Vaghella

Abstract:

Data mining is the process of sifting through large volumes of data, analyzing data from different perspectives and summarizing it into useful information. One of the widely used desktop applications for data mining is the Weka tool which is nothing but a collection of machine learning algorithms implemented in Java and open sourced under the General Public License (GPL). A web service is a software system designed to support interoperable machine to machine interaction over a network using SOAP messages. Unlike a desktop application, a web service is easy to upgrade, deliver and access and does not occupy any memory on the system. Keeping in mind the advantages of a web service over a desktop application, in this paper we are demonstrating how this Java based desktop data mining application can be implemented as a web service to support data mining across the internet.

Keywords: desktop application, Weka mining, web service

Downloads 4024
7257 Influence of Parameters of Modeling and Data Distribution for Optimal Condition on Locally Weighted Projection Regression Method

Authors: Farhad Asadi, Mohammad Javad Mollakazemi, Aref Ghafouri

Abstract:

Recent research in neural networks and neuroscience on modeling complex time series data and on statistical learning has focused mostly on learning from high-dimensional input spaces and signals. Local linear models are a strong choice for modeling local nonlinearity in data series. Locally weighted projection regression is a flexible and powerful algorithm for nonlinear approximation in high-dimensional signal spaces. In this paper, different learning scenarios for one- and two-dimensional data series with different distributions are investigated in simulation. Noise is then added to the data to produce differently disordered time series and to evaluate how well the algorithm predicts local nonlinearity. Finally, the performance of the algorithm is simulated, and its sensitivity to the data distribution when the data are widely distributed or the number of data points is small is explained, together with the influence of the important local-validity parameter under different data distributions.

Keywords: Local nonlinear estimation, LWPR algorithm, Online training method.

Downloads 1556
7256 Noise Reduction in Web Data: A Learning Approach Based on Dynamic User Interests

Authors: Julius Onyancha, Valentina Plekhanova

Abstract:

One of the significant issues facing web users is the amount of noise in web data, which hinders the process of finding useful information in relation to their dynamic interests. Current research works consider noise to be any data that does not form part of the main web page and propose noise web data reduction tools which mainly focus on eliminating noise in relation to the content and layout of web data. This paper argues that not all data that form part of the main web page are of interest to a user, and not all noise data are actually noise to a given user. Therefore, learning the noise web data allocated to user requests ensures not only a reduction of the noisiness level in a web user profile but also a decrease in the loss of useful information, and hence improves the quality of a web user profile. A Noise Web Data Learning (NWDL) tool/algorithm capable of learning noise web data in a web user profile is proposed. The proposed work considers the elimination of noise data in relation to dynamic user interest. In order to validate the performance of the proposed work, an experimental design setup is presented. The results obtained are compared with the current algorithms applied in the noise web data reduction process. The experimental results show that the proposed work considers the dynamic change of user interest prior to the elimination of noise data. The proposed work contributes towards improving the quality of a web user profile by reducing the amount of useful information eliminated as noise.

Keywords: Web log data, web user profile, user interest, noise web data learning, machine learning.

Downloads 1676
7255 Hybrid Reliability-Similarity-Based Approach for Supervised Machine Learning

Authors: Walid Cherif

Abstract:

Data mining has, over recent years, seen big advances because of the spread of the internet, which generates a tremendous volume of data every day, and because of the immense advances in technologies that facilitate the analysis of these data. In particular, classification techniques are a subdomain of data mining that determines to which group each data instance in a given dataset belongs. They are used to classify data into different classes according to desired criteria. Generally, a classification technique is either statistical or based on machine learning. Each type of technique has its own limits. Nowadays, data are becoming increasingly heterogeneous; consequently, current classification techniques are encountering many difficulties. This paper defines new measure functions to quantify the resemblance between instances and then combines them in a new approach which differs from existing algorithms in its reliability computations. The results of the proposed approach exceeded those of the most common classification techniques, with an f-measure exceeding 97% on the IRIS Dataset.

Keywords: Data mining, knowledge discovery, machine learning, similarity measurement, supervised classification.

Downloads: 1475
7254 Vortex Shedding on Combined Bodies at Incidence to a Uniform Air Stream

Authors: T. Yavuz, Y. E. Akansu, M. Sarıoglu, M. Ozmert

Abstract:

The vortex-shedding phenomenon of the flow around two combined bodies having various geometries and sizes has been investigated experimentally in the Reynolds number range between 4.1×10³ and 1.75×10⁴. To see the effect of the rotation of the bodies on the vortex shedding, the combined bodies were rotated from 0° to 180°. The combined models have a cross section composed of a main circular cylinder and an attached circular or square cylinder. Results have shown that the Strouhal numbers for the two cases changed considerably with the angle of incidence, while they were found to be largely independent of Reynolds number at 150. Characteristics of the vortex formation region and the locations of flow attachments, reattachments, and separations were observed by means of flow visualizations. The effects of flow attachment, separation, and reattachment on the vortex-shedding phenomenon are discussed as a function of the inclination angle.

Keywords: Bluff body, vortex shedding, flow separation, flow reattachment

Downloads: 2075
7253 Water Pollution in Soshanguve Environs of South Africa

Authors: O. I. Nkwonta, G. M. Ochieng

Abstract:

Surface water pollution is one of the serious environmental problems in rural areas of South Africa, due to the discharge of household waste into streams, which turns them into open sewers. In this study, samples of water were collected from a stream in Soshanguve and analysed. The results showed that pollution in the area was caused by human activities. The water quality in the area was found to have deteriorated significantly after runoff from farms and household wastes. The results show that fertilizer runoff contributes 50% of the pollution in the streams, pesticides and sediments each contribute up to 10%, and household waste contributes up to 30%. This study gives an outline of the sources of water pollution in the area and provides a process for creating a clean and unpolluted environment for the Soshanguve community in Pretoria North, in order to achieve the 7th of the Millennium Development Goals by 2015, which is ensuring environmental sustainability.

Keywords: Fertilizer, Household waste, Pollution, Roughing filters.

Downloads: 3771
7252 Moving Data Mining Tools toward a Business Intelligence System

Authors: Nittaya Kerdprasop, Kittisak Kerdprasop

Abstract:

Data mining (DM) is the process of finding and extracting frequent patterns that can describe the data, or predict unknown or future values. These goals are achieved by using various learning algorithms. Each algorithm may produce a mining result completely different from the others. Some algorithms may find millions of patterns. It is thus a difficult job for data analysts to select appropriate models and interpret the discovered knowledge. In this paper, we describe a framework for an intelligent and complete data mining system called SUT-Miner. Our system comprises a full complement of major DM algorithms, along with pre-DM and post-DM functionalities. It is the post-DM packages that ease DM deployment for business intelligence applications.

Keywords: Business intelligence, data mining, functional programming, intelligent system.

Downloads: 1682
7251 Analysis of Diverse Clustering Tools in Data Mining

Authors: S. Sarumathi, N. Shanthi, M. Sharmila

Abstract:

Clustering in data mining is an unsupervised learning technique for aggregating data objects into meaningful groups such that the intra-cluster similarity of objects is maximized and the inter-cluster similarity of objects is minimized. Over the past decades, several clustering tools have emerged in which clustering algorithms are built in, making it easier to use them and extract the expected results. Data mining mainly deals with huge databases, which imposes additional rigorous computational constraints on cluster analysis. These challenges pave the way for the emergence of powerful, expansive data mining clustering software. In this survey, a variety of clustering tools used in data mining are elucidated, along with the pros and cons of each.
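
The clustering objective described above can be illustrated with a small sketch. It assumes Euclidean distance as an inverse proxy for similarity (small distance means high similarity), so a good clustering shows a low mean intra-cluster distance and a high mean inter-cluster distance. The function name `mean_pairwise_distances` and the sample points are hypothetical and not drawn from any of the surveyed tools.

```python
from itertools import combinations
from math import dist  # Euclidean distance, available in Python 3.8+


def mean_pairwise_distances(points, labels):
    """Return (mean intra-cluster distance, mean inter-cluster distance).

    Assumption: distance stands in for dissimilarity, so maximizing
    intra-cluster similarity corresponds to minimizing the first value.
    """
    intra, inter = [], []
    for (p, label_p), (q, label_q) in combinations(zip(points, labels), 2):
        # Pairs in the same cluster count as intra, all others as inter.
        (intra if label_p == label_q else inter).append(dist(p, q))

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return mean(intra), mean(inter)


# Two well-separated toy clusters (hypothetical data).
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
labels = [0, 0, 1, 1]
intra, inter = mean_pairwise_distances(points, labels)
print(intra, inter)
```

Comparing the two values for different label assignments gives a rough way to judge which grouping better matches the stated objective.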

Keywords: Cluster Analysis, Clustering Algorithms, Clustering Techniques, Association, Visualization.

Downloads: 2146
7250 Evaluation of Water Quality for the Kurtbogazi Dam Outlet and the Streams Feeding the Dam in Ankara, Turkey

Authors: G. Tozsin, F. Bakir, C. Acar, E. Koç

Abstract:

Kurtbogazi Dam has gained special importance for Ankara, Turkey over the last decade due to the rapid depletion of nearby drinking water resources. In this study, the results of the analyses of the Kurtbogazi Dam outlet water and the rivers flowing into the dam are discussed for the five-year period between 2008 and 2012. Some physical and chemical properties (pH, temperature, biochemical oxygen demand (BOD5), nitrate, phosphate and chlorine) of these water resources were evaluated. They were classified according to the Council Directive (75/440/EEC). Moreover, the properties of these surface waters were assessed to determine the quality of the water for drinking and irrigation purposes using Piper, US Salinity Laboratory and Wilcox diagrams. The results showed that all the water resources are at an acceptable level as surface water, except for Pazar Stream in terms of ortho-phosphate and BOD5 concentration for 2008.

Keywords: Kurtbogazi dam, water quality assessment, Ankara water, water supply.

Downloads: 1843