Search results for: Correlated Binary data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7844

Search results for: Correlated Binary data

7304 Assessment of Landfill Pollution Load on Hydroecosystem by Use of Heavy Metal Bioaccumulation Data in Fish

Authors: Gintarė Sauliutė, Gintaras Svecevičius

Abstract:

Landfill leachates contain a number of persistent pollutants, including heavy metals. They have the ability to spread in ecosystems and accumulate in fish which most of them are classified as top-consumers of trophic chains. Fish are freely swimming organisms; but perhaps, due to their species-specific ecological and behavioral properties, they often prefer the most suitable biotopes and therefore, did not avoid harmful substances or environments. That is why it is necessary to evaluate the persistent pollutant dispersion in hydroecosystem using fish tissue metal concentration. In hydroecosystems of hybrid type (e.g. river-pond-river) the distance from the pollution source could be a perfect indicator of such a kind of metal distribution. The studies were carried out in the Kairiai landfill neighboring hybrid-type ecosystem which is located 5 km east of the Šiauliai City. Fish tissue (gills, liver, and muscle) metal concentration measurements were performed on two types of ecologically-different fishes according to their feeding characteristics: benthophagous (Gibel carp, roach) and predatory (Northern pike, perch). A number of mathematical models (linear, non-linear, using log and other transformations) have been applied in order to identify the most satisfactorily description of the interdependence between fish tissue metal concentration and the distance from the pollution source. However, the only one log-multiple regression model revealed the pattern that the distance from the pollution source is closely and positively correlated with metal concentration in all predatory fish tissues studied (gills, liver, and muscle).

Keywords: Bioaccumulation in fish, heavy metals, hydroecosystem, landfill leachate, mathematical model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1819
7303 Pesticides Use in Rural Settings in Romania

Authors: Anca E. Gurzau, Alexandru Coman, Eugen S. Gurzau, Marinela Penes, Daniela Dumitrescu, DorinMarchean, Ioan Chera

Abstract:

The environment pollution with pesticides and heavy metals is a recognized problem nowadays, with extension to the global scale the tendency of amplification. Even with all the progress in the environmental field, both in the emphasize of the effect of the pollutants upon health, the linked studies environment-health are insufficient, not only in Romania but all over the world also. We aim to describe the particular situation in Romania regarding the uncontrolled use of pesticides, to identify and evaluate the risk zones for health and the environment in Romania, with the final goal of designing adequate programs for reduction and control of the risk sources. An exploratory study was conducted to determine the magnitude of the pesticide use problem in a population living in Saliste, a rural setting in Transylvania, Romania. The significant stakeholders in Saliste region were interviewed and a sample from the population living in Saliste area was selected to fill in a designed questionnaire. All the selected participants declared that they used pesticides in their activities for more than one purpose. They declared they annually applied pesticides for a period of time between 11 and 30 years, from 5 to 9 days per year on average, mainly on crops situated at some distance from the houses but high risk behavior was identified as the volunteers declared the use of pesticides in the backyard gardens, near their homes, where children were playing. The pesticide applicators did not have the necessary knowledge about safety and exposure. The health data must be correlated with exposure biomarkers in attempt to identify the possible health effects of the pesticides exposure. Future plans include educational campaigns to raise the awareness of the population on the danger of uncontrolled use of pesticides.

Keywords: Pesticides, health effects, Romania, Saliste.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1808
7302 Network Analysis in a Natural Perturbed Ecosystem

Authors: Nelson F.F. Ebecken, Gilberto C. Pereira

Abstract:

The objective of this work is to explicit knowledge on the interactions between the chlorophyll-a and nine meroplankton larvae of epibenthonic fauna. The studied case is the Arraial do Cabo upwelling system, Southeastern of Brazil, which provides different environmental conditions. To assess this information a network approach based in probability estimative was used. Comparisons among the generated graphs are made in the light of different water masses, application of Shannon biodiversity index, and the closeness and betweenness centralities measurements. Our results show the main pattern among different water masses and how the core organisms belonging to the network skeleton are correlated to the main environmental variable. We conclude that the approach of complex networks is a promising tool for environmental diagnostic.

Keywords: Coastal upwelling, Ecological networks, Plankton - interactions, Environmental analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1629
7301 EPR Hiding in Medical Images for Telemedicine

Authors: K. A. Navas, S. Archana Thampy, M. Sasikumar

Abstract:

Medical image data hiding has strict constrains such as high imperceptibility, high capacity and high robustness. Achieving these three requirements simultaneously is highly cumbersome. Some works have been reported in the literature on data hiding, watermarking and stegnography which are suitable for telemedicine applications. None is reliable in all aspects. Electronic Patient Report (EPR) data hiding for telemedicine demand it blind and reversible. This paper proposes a novel approach to blind reversible data hiding based on integer wavelet transform. Experimental results shows that this scheme outperforms the prior arts in terms of zero BER (Bit Error Rate), higher PSNR (Peak Signal to Noise Ratio), and large EPR data embedding capacity with WPSNR (Weighted Peak Signal to Noise Ratio) around 53 dB, compared with the existing reversible data hiding schemes.

Keywords: Biomedical imaging, Data security, Datacommunication, Teleconferencing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2737
7300 A Robust Method for Encrypted Data Hiding Technique Based on Neighborhood Pixels Information

Authors: Ali Shariq Imran, M. Younus Javed, Naveed Sarfraz Khattak

Abstract:

This paper presents a novel method for data hiding based on neighborhood pixels information to calculate the number of bits that can be used for substitution and modified Least Significant Bits technique for data embedding. The modified solution is independent of the nature of the data to be hidden and gives correct results along with un-noticeable image degradation. The technique, to find the number of bits that can be used for data hiding, uses the green component of the image as it is less sensitive to human eye and thus it is totally impossible for human eye to predict whether the image is encrypted or not. The application further encrypts the data using a custom designed algorithm before embedding bits into image for further security. The overall process consists of three main modules namely embedding, encryption and extraction cm.

Keywords: Data hiding, image processing, information security, stagonography.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2331
7299 Unsupervised Outlier Detection in Streaming Data Using Weighted Clustering

Authors: Yogita, Durga Toshniwal

Abstract:

Outlier detection in streaming data is very challenging because streaming data cannot be scanned multiple times and also new concepts may keep evolving. Irrelevant attributes can be termed as noisy attributes and such attributes further magnify the challenge of working with data streams. In this paper, we propose an unsupervised outlier detection scheme for streaming data. This scheme is based on clustering as clustering is an unsupervised data mining task and it does not require labeled data, both density based and partitioning clustering are combined for outlier detection. In this scheme partitioning clustering is also used to assign weights to attributes depending upon their respective relevance and weights are adaptive. Weighted attributes are helpful to reduce or remove the effect of noisy attributes. Keeping in view the challenges of streaming data, the proposed scheme is incremental and adaptive to concept evolution. Experimental results on synthetic and real world data sets show that our proposed approach outperforms other existing approach (CORM) in terms of outlier detection rate, false alarm rate, and increasing percentages of outliers.

Keywords: Concept Evolution, Irrelevant Attributes, Streaming Data, Unsupervised Outlier Detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2623
7298 Stress Relaxation of Date at Different Temperature and Moisture Content of Product: A New Approach

Authors: D. Zare, M. Alirezaei, S.M. Nassiri

Abstract:

Iran is one of the greatest producers of date in the world. However due to lack of information about its viscoelastic properties, much of the production downgraded during harvesting and postharvesting processes. In this study the effect of temperature and moisture content of product were investigated on stress relaxation characteristics. Therefore, the freshly harvested date (kabkab) at tamar stage were put in controlled environment chamber to obtain different temperature levels (25, 35, 45, and 55 0C) and moisture contents (8.5, 8.7, 9.2, 15.3, 20, 32.2 %d.b.). A texture analyzer TAXT2 (Stable Microsystems, UK) was used to apply uniaxial compression tests. A chamber capable to control temperature was designed and fabricated around the plunger of texture analyzer to control the temperature during the experiment. As a new approach a CCD camera (A4tech, 30 fps) was mounted on a cylindrical glass probe to scan and record contact area between date and disk. Afterwards, pictures were analyzed using image processing toolbox of Matlab software. Individual date fruit was uniaxially compressed at speed of 1 mm/s. The constant strain of 30% of thickness of date was applied to the horizontally oriented fruit. To select a suitable model for describing stress relaxation of date, experimental data were fitted with three famous stress relaxation models including the generalized Maxwell, Nussinovitch, and Pelege. The constant in mentioned model were determined and correlated with temperature and moisture content of product using non-linear regression analysis. It was found that Generalized Maxwell and Nussinovitch models appropriately describe viscoelastic characteristics of date fruits as compared to Peleg mode.

Keywords: Stress relaxation, Viscoelastic properties, Date, Texture analyzer.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1899
7297 The Effect of Measurement Distribution on System Identification and Detection of Behavior of Nonlinearities of Data

Authors: Mohammad Javad Mollakazemi, Farhad Asadi, Aref Ghafouri

Abstract:

In this paper, we considered and applied parametric modeling for some experimental data of dynamical system. In this study, we investigated the different distribution of output measurement from some dynamical systems. Also, with variance processing in experimental data we obtained the region of nonlinearity in experimental data and then identification of output section is applied in different situation and data distribution. Finally, the effect of the spanning the measurement such as variance to identification and limitation of this approach is explained.

Keywords: Gaussian process, Nonlinearity distribution, Particle filter.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1714
7296 Exponentially Weighted Simultaneous Estimation of Several Quantiles

Authors: Valeriy Naumov, Olli Martikainen

Abstract:

In this paper we propose new method for simultaneous generating multiple quantiles corresponding to given probability levels from data streams and massive data sets. This method provides a basis for development of single-pass low-storage quantile estimation algorithms, which differ in complexity, storage requirement and accuracy. We demonstrate that such algorithms may perform well even for heavy-tailed data.

Keywords: Quantile estimation, data stream, heavy-taileddistribution, tail index.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1525
7295 Emotional Intelligence as Predictor of Academic Success among Third Year College Students of PIT

Authors: Sonia Arradaza-Pajaron

Abstract:

College students are expected to engage in an on-the-job training or internship for completion of a course requirement prior to graduation. In this scenario, they are exposed to the real world of work outside their training institution. To find out their readiness both emotionally and academically, this study has been conducted. A descriptive-correlational research design was employed and random sampling technique method was utilized among 265 randomly selected third year college students of PIT, SY 2014-15. A questionnaire on Emotional Intelligence (bearing the four components namely; emotional literacy, emotional quotient competence, values and beliefs and emotional quotient outcomes) was fielded to the respondents and GWA was extracted from the school automate. Data collected were statistically treated using percentage, weighted mean and Pearson-r for correlation.

Results revealed that respondents’ emotional intelligence level is moderately high while their academic performance is good. A high significant relationship was found between the EI component; Emotional Literacy and their academic performance while only significant relationship was found between Emotional Quotient Outcomes and their academic performance. Therefore, if EI influences academic performance significantly when correlated, a possibility that their OJT performance can also be affected either positively or negatively. Thus, EI can be considered predictor of their academic and academic-related performance. Based on the result, it is then recommended that the institution would try to look deeply into the consideration of embedding emotional intelligence as part of the (especially on Emotional Literacy and Emotional Quotient Outcomes of the students) college curriculum. It can be done if the school shall have an effective Emotional Intelligence framework or program manned by qualified and competent teachers, guidance counselors in different colleges in its implementation.

Keywords: Academic performance, emotional intelligence, emotional literacy, emotional quotient competence, emotional quotient outcomes, values and beliefs.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1838
7294 Enhanced Data Access Control of Cooperative Environment used for DMU Based Design

Authors: Wei Lifan, Zhang Huaiyu, Yang Yunbin, Li Jia

Abstract:

Through the analysis of the process digital design based on digital mockup, the fact indicates that a distributed cooperative supporting environment is the foundation conditions to adopt design approach based on DMU. Data access authorization is concerned firstly because the value and sensitivity of the data for the enterprise. The access control for administrators is often rather weak other than business user. So authors established an enhanced system to avoid the administrators accessing the engineering data by potential approach and without authorization. Thus the data security is improved.

Keywords: access control, DMU, PLM, virtual prototype.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1458
7293 Pattern Recognition Using Feature Based Die-Map Clusteringin the Semiconductor Manufacturing Process

Authors: Seung Hwan Park, Cheng-Sool Park, Jun Seok Kim, Youngji Yoo, Daewoong An, Jun-Geol Baek

Abstract:

Depending on the big data analysis becomes important, yield prediction using data from the semiconductor process is essential. In general, yield prediction and analysis of the causes of the failure are closely related. The purpose of this study is to analyze pattern affects the final test results using a die map based clustering. Many researches have been conducted using die data from the semiconductor test process. However, analysis has limitation as the test data is less directly related to the final test results. Therefore, this study proposes a framework for analysis through clustering using more detailed data than existing die data. This study consists of three phases. In the first phase, die map is created through fail bit data in each sub-area of die. In the second phase, clustering using map data is performed. And the third stage is to find patterns that affect final test result. Finally, the proposed three steps are applied to actual industrial data and experimental results showed the potential field application.

Keywords: Die-Map Clustering, Feature Extraction, Pattern Recognition, Semiconductor Manufacturing Process.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3136
7292 Object Motion Tracking Based On Color Detection for Android Devices

Authors: Zacharenia I. Garofalaki, John T. Amorginos, John N. Ellinas

Abstract:

This paper presents the development of a robot car that can track the motion of an object by detecting its color through an Android device. The employed computer vision algorithm uses the OpenCV library, which is embedded into an Android application of a smartphone, for manipulating the captured image of the object. The captured image of the object is subjected to color conversion and is transformed to a binary image for further processing after color filtering. The desired object is clearly determined after removing pixel noise by applying image morphology operations and contour definition. Finally, the area and the center of the object are determined so that object’s motion to be tracked. The smartphone application has been placed on a robot car and transmits by Bluetooth to an Arduino assembly the motion directives so that to follow objects of a specified color. The experimental evaluation of the proposed algorithm shows reliable color detection and smooth tracking characteristics.

Keywords: Android, Arduino Uno, Image processing, Object motion detection, OpenCV library.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4558
7291 Speed Characteristics of Mixed Traffic Flow on Urban Arterials

Authors: Ashish Dhamaniya, Satish Chandra

Abstract:

Speed and traffic volume data are collected on different sections of four lane and six lane roads in three metropolitan cities in India. Speed data are analyzed to fit the statistical distribution to individual vehicle speed data and all vehicles speed data. It is noted that speed data of individual vehicle generally follows a normal distribution but speed data of all vehicle combined at a section of urban road may or may not follow the normal distribution depending upon the composition of traffic stream. A new term Speed Spread Ratio (SSR) is introduced in this paper which is the ratio of difference in 85th and 50th percentile speed to the difference in 50th and 15th percentile speed. If SSR is unity then speed data are truly normally distributed. It is noted that on six lane urban roads, speed data follow a normal distribution only when SSR is in the range of 0.86 – 1.11. The range of SSR is validated on four lane roads also.

Keywords: Normal distribution, percentile speed, speed spread ratio, traffic volume.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4232
7290 A Comparative Study between Discrete Wavelet Transform and Maximal Overlap Discrete Wavelet Transform for Testing Stationarity

Authors: Amel Abdoullah Ahmed Dghais, Mohd Tahir Ismail

Abstract:

In this paper the core objective is to apply discrete wavelet transform and maximal overlap discrete wavelet transform functions namely Haar, Daubechies2, Symmlet4, Coiflet2 and discrete approximation of the Meyer wavelets in non stationary financial time series data from Dow Jones index (DJIA30) of US stock market. The data consists of 2048 daily data of closing index from December 17, 2004 to October 23, 2012. Unit root test affirms that the data is non stationary in the level. A comparison between the results to transform non stationary data to stationary data using aforesaid transforms is given which clearly shows that the decomposition stock market index by discrete wavelet transform is better than maximal overlap discrete wavelet transform for original data.

Keywords: Discrete wavelet transform, maximal overlap discrete wavelet transform, stationarity, autocorrelation function.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4717
7289 Adaptive Digital Watermarking Integrating Fuzzy Inference HVS Perceptual Model

Authors: Sherin M. Youssef, Ahmed Abouelfarag, Noha M. Ghatwary

Abstract:

An adaptive Fuzzy Inference Perceptual model has been proposed for watermarking of digital images. The model depends on the human visual characteristics of image sub-regions in the frequency multi-resolution wavelet domain. In the proposed model, a multi-variable fuzzy based architecture has been designed to produce a perceptual membership degree for both candidate embedding sub-regions and strength watermark embedding factor. Different sizes of benchmark images with different sizes of watermarks have been applied on the model. Several experimental attacks have been applied such as JPEG compression, noises and rotation, to ensure the robustness of the scheme. In addition, the model has been compared with different watermarking schemes. The proposed model showed its robustness to attacks and at the same time achieved a high level of imperceptibility.

Keywords: Watermarking, The human visual system (HVS), Fuzzy Inference System (FIS), Local Binary Pattern (LBP), Discrete Wavelet Transform (DWT).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1807
7288 Comparative Study of Transformed and Concealed Data in Experimental Designs and Analyses

Authors: K. Chinda, P. Luangpaiboon

Abstract:

This paper presents the comparative study of coded data methods for finding the benefit of concealing the natural data which is the mercantile secret. Influential parameters of the number of replicates (rep), treatment effects (τ) and standard deviation (σ) against the efficiency of each transformation method are investigated. The experimental data are generated via computer simulations under the specified condition of the process with the completely randomized design (CRD). Three ways of data transformation consist of Box-Cox, arcsine and logit methods. The difference values of F statistic between coded data and natural data (Fc-Fn) and hypothesis testing results were determined. The experimental results indicate that the Box-Cox results are significantly different from natural data in cases of smaller levels of replicates and seem to be improper when the parameter of minus lambda has been assigned. On the other hand, arcsine and logit transformations are more robust and obviously, provide more precise numerical results. In addition, the alternate ways to select the lambda in the power transformation are also offered to achieve much more appropriate outcomes.

Keywords: Experimental Designs, Box-Cox, Arcsine, Logit Transformations.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1608
7287 Design of a Low Cost Motion Data Acquisition Setup for Mechatronic Systems

Authors: Barış Can Yalçın

Abstract:

Motion sensors have been commonly used as a valuable component in mechatronic systems, however, many mechatronic designs and applications that need motion sensors cost enormous amount of money, especially high-tech systems. Design of a software for communication protocol between data acquisition card and motion sensor is another issue that has to be solved. This study presents how to design a low cost motion data acquisition setup consisting of MPU 6050 motion sensor (gyro and accelerometer in 3 axes) and Arduino Mega2560 microcontroller. Design parameters are calibration of the sensor, identification and communication between sensor and data acquisition card, interpretation of data collected by the sensor.

Keywords: Calibration of sensors, data acquisition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4324
7286 Conceptual Multidimensional Model

Authors: Manpreet Singh, Parvinder Singh, Suman

Abstract:

The data is available in abundance in any business organization. It includes the records for finance, maintenance, inventory, progress reports etc. As the time progresses, the data keep on accumulating and the challenge is to extract the information from this data bank. Knowledge discovery from these large and complex databases is the key problem of this era. Data mining and machine learning techniques are needed which can scale to the size of the problems and can be customized to the application of business. For the development of accurate and required information for particular problem, business analyst needs to develop multidimensional models which give the reliable information so that they can take right decision for particular problem. If the multidimensional model does not possess the advance features, the accuracy cannot be expected. The present work involves the development of a Multidimensional data model incorporating advance features. The criterion of computation is based on the data precision and to include slowly change time dimension. The final results are displayed in graphical form.

Keywords: Multidimensional, data precision.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1445
7285 Real Time Approach for Data Placement in Wireless Sensor Networks

Authors: Sanjeev Gupta, Mayank Dave

Abstract:

The issue of real-time and reliable report delivery is extremely important for taking effective decision in a real world mission critical Wireless Sensor Network (WSN) based application. The sensor data behaves differently in many ways from the data in traditional databases. WSNs need a mechanism to register, process queries, and disseminate data. In this paper we propose an architectural framework for data placement and management. We propose a reliable and real time approach for data placement and achieving data integrity using self organized sensor clusters. Instead of storing information in individual cluster heads as suggested in some protocols, in our architecture we suggest storing of information of all clusters within a cell in the corresponding base station. For data dissemination and action in the wireless sensor network we propose to use Action and Relay Stations (ARS). To reduce average energy dissipation of sensor nodes, the data is sent to the nearest ARS rather than base station. We have designed our architecture in such a way so as to achieve greater energy savings, enhanced availability and reliability.

Keywords: Cluster head, data reliability, real time communication, wireless sensor networks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1807
7284 Data Mining in Medicine Domain Using Decision Trees and Vector Support Machine

Authors: Djamila Benhaddouche, Abdelkader Benyettou

Abstract:

In this paper, we used data mining to extract biomedical knowledge. In general, complex biomedical data collected in studies of populations are treated by statistical methods, although they are robust, they are not sufficient in themselves to harness the potential wealth of data. For that you used in step two learning algorithms: the Decision Trees and Support Vector Machine (SVM). These supervised classification methods are used to make the diagnosis of thyroid disease. In this context, we propose to promote the study and use of symbolic data mining techniques.

Keywords: A classifier, Algorithms decision tree, knowledge extraction, Support Vector Machine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1862
7283 A Hybrid Machine Learning System for Stock Market Forecasting

Authors: Rohit Choudhry, Kumkum Garg

Abstract:

In this paper, we propose a hybrid machine learning system based on Genetic Algorithm (GA) and Support Vector Machines (SVM) for stock market prediction. A variety of indicators from the technical analysis field of study are used as input features. We also make use of the correlation between stock prices of different companies to forecast the price of a stock, making use of technical indicators of highly correlated stocks, not only the stock to be predicted. The genetic algorithm is used to select the set of most informative input features from among all the technical indicators. The results show that the hybrid GA-SVM system outperforms the stand alone SVM system.

Keywords: Genetic Algorithms, Support Vector Machines, Stock Market Forecasting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9302
7282 A Software Framework for Predicting Oil-Palm Yield from Climate Data

Authors: Mohd. Noor Md. Sap, A. Majid Awan

Abstract:

Intelligent systems based on machine learning techniques, such as classification, clustering, are gaining wide spread popularity in real world applications. This paper presents work on developing a software system for predicting crop yield, for example oil-palm yield, from climate and plantation data. At the core of our system is a method for unsupervised partitioning of data for finding spatio-temporal patterns in climate data using kernel methods which offer strength to deal with complex data. This work gets inspiration from the notion that a non-linear data transformation into some high dimensional feature space increases the possibility of linear separability of the patterns in the transformed space. Therefore, it simplifies exploration of the associated structure in the data. Kernel methods implicitly perform a non-linear mapping of the input data into a high dimensional feature space by replacing the inner products with an appropriate positive definite function. In this paper we present a robust weighted kernel k-means algorithm incorporating spatial constraints for clustering the data. The proposed algorithm can effectively handle noise, outliers and auto-correlation in the spatial data, for effective and efficient data analysis by exploring patterns and structures in the data, and thus can be used for predicting oil-palm yield by analyzing various factors affecting the yield.

Keywords: Pattern analysis, clustering, kernel methods, spatial data, crop yield

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1967
7281 A Proposal for U-City (Smart City) Service Method Using Real-Time Digital Map

Authors: SangWon Han, MuWook Pyeon, Sujung Moon, DaeKyo Seo

Abstract:

Recently, technologies based on three-dimensional (3D) space information are being developed and quality of life is improving as a result. Research on real-time digital map (RDM) is being conducted now to provide 3D space information. RDM is a service that creates and supplies 3D space information in real time based on location/shape detection. Research subjects on RDM include the construction of 3D space information with matching image data, complementing the weaknesses of image acquisition using multi-source data, and data collection methods using big data. Using RDM will be effective for space analysis using 3D space information in a U-City and for other space information utilization technologies.

Keywords: RDM, multi-source data, big data, U-City.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 799
7280 Agile Methodology for Modeling and Design of Data Warehouses -AM4DW-

Authors: Nieto Bernal Wilson, Carmona Suarez Edgar

Abstract:

The organizations have structured and unstructured information in different formats, sources, and systems. Part of these come from ERP under OLTP processing that support the information system, however these organizations in OLAP processing level, presented some deficiencies, part of this problematic lies in that does not exist interesting into extract knowledge from their data sources, as also the absence of operational capabilities to tackle with these kind of projects.  Data Warehouse and its applications are considered as non-proprietary tools, which are of great interest to business intelligence, since they are repositories basis for creating models or patterns (behavior of customers, suppliers, products, social networks and genomics) and facilitate corporate decision making and research. The following paper present a structured methodology, simple, inspired from the agile development models as Scrum, XP and AUP. Also the models object relational, spatial data models, and the base line of data modeling under UML and Big data, from this way sought to deliver an agile methodology for the developing of data warehouses, simple and of easy application. The methodology naturally take into account the application of process for the respectively information analysis, visualization and data mining, particularly for patterns generation and derived models from the objects facts structured.

Keywords: Data warehouse, model data, big data, object fact, object relational fact, process developed data warehouse.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1471
7279 Distributed Data-Mining by Probability-Based Patterns

Authors: M. Kargar, F. Gharbalchi

Abstract:

In this paper a new method is suggested for distributed data-mining by the probability patterns. These patterns use decision trees and decision graphs. The patterns are cared to be valid, novel, useful, and understandable. Considering a set of functions, the system reaches to a good pattern or better objectives. By using the suggested method we will be able to extract the useful information from massive and multi-relational data bases.

Keywords: Data-mining, Decision tree, Decision graph, Pattern, Relationship.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1548
7278 K-Means for Spherical Clusters with Large Variance in Sizes

Authors: A. M. Fahim, G. Saake, A. M. Salem, F. A. Torkey, M. A. Ramadan

Abstract:

Data clustering is an important data exploration technique with many applications in data mining. The k-means algorithm is well known for its efficiency in clustering large data sets. However, this algorithm is suitable for spherical shaped clusters of similar sizes and densities. The quality of the resulting clusters decreases when the data set contains spherical shaped with large variance in sizes. In this paper, we introduce a competent procedure to overcome this problem. The proposed method is based on shifting the center of the large cluster toward the small cluster, and recomputing the membership of small cluster points, the experimental results reveal that the proposed algorithm produces satisfactory results.

Keywords: K-Means, Data Clustering, Cluster Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3268
7277 Representing Data without Lost Compression Properties in Time Series: A Review

Authors: Nabilah Filzah Mohd Radzuan, Zalinda Othman, Azuraliza Abu Bakar, Abdul Razak Hamdan

Abstract:

Uncertain data is believed to be an important issue in building up a prediction model. The main objective in the time series uncertainty analysis is to formulate uncertain data in order to gain knowledge and fit low dimensional model prior to a prediction task. This paper discusses the performance of a number of techniques in dealing with uncertain data specifically those which solve uncertain data condition by minimizing the loss of compression properties.

Keywords: Compression properties, uncertainty, uncertain time series, mining technique, weather prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1611
7276 Are XBRL-based Financial Reports Better than Non-XBRL Reports? A Quality Assessment

Authors: Zhenkun Wang, Simon S. Gao

Abstract:

Using a scoring system, this paper provides a comparative assessment of the quality of data between XBRL formatted financial reports and non-XBRL financial reports. It shows a major improvement in the quality of data of XBRL formatted financial reports. Although XBRL formatted financial reports do not show much advantage in the quality at the beginning, XBRL financial reports lately display a large improvement in the quality of data in almost all aspects. With the improved XBRL web data managing, presentation and analysis applications, XBRL formatted financial reports have a much better accessibility, are more accurate and better in timeliness.

Keywords: Data Quality; Financial Report; Information; XBRL

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2548
7275 Modeling of Random Variable with Digital Probability Hyper Digraph: Data-Oriented Approach

Authors: A. Habibizad Navin, M. Naghian Fesharaki, M. Mirnia, M. Kargar

Abstract:

In this paper we introduce Digital Probability Hyper Digraph for modeling random variable as the hierarchical data-oriented model.

Keywords: Data-Oriented Models, Data Structure, DigitalProbability Hyper Digraph, Random Variable, Statistic andProbability.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1258