Search results for: gaps in data ecosystems

7333 Data Mining Classification Methods Applied in Drug Design

Authors: Mária Stachová, Lukáš Sobíšek

Abstract:

Data mining comprises a group of statistical methods used to analyze a set of information, or a data set. It relies on models and algorithms, which are powerful tools with great potential. They help people understand the patterns in a given body of information, so data mining tools have a wide range of applications. For example, in theoretical chemistry, data mining tools can be used to predict molecule properties or to improve computer-assisted drug design. Classification analysis is one of the major data mining methodologies. The aim of this contribution is to create a classification model that can handle a huge data set with high accuracy. For this purpose, logistic regression, Bayesian logistic regression and random forest models were built using R software. A Bayesian logistic regression model was also created in Latent GOLD software. These classification methods belong to the supervised learning methods. It was necessary to reduce the dimension of the data matrix before constructing the models, and thus factor analysis (FA) was used. The models were applied to predict the biological activity of molecules that are potential new drug candidates.

Keywords: data mining, classification, drug design, QSAR

Downloads 2851

7332 EPR Hiding in Medical Images for Telemedicine

Authors: K. A. Navas, S. Archana Thampy, M. Sasikumar

Abstract:

Medical image data hiding has strict constraints such as high imperceptibility, high capacity and high robustness. Achieving these three requirements simultaneously is highly cumbersome. Some works on data hiding, watermarking and steganography that are suitable for telemedicine applications have been reported in the literature, but none is reliable in all aspects. Electronic Patient Report (EPR) data hiding for telemedicine demands that the scheme be blind and reversible. This paper proposes a novel approach to blind reversible data hiding based on the integer wavelet transform. Experimental results show that this scheme outperforms prior art in terms of zero BER (Bit Error Rate), higher PSNR (Peak Signal to Noise Ratio), and large EPR data embedding capacity, with WPSNR (Weighted Peak Signal to Noise Ratio) around 53 dB, compared with existing reversible data hiding schemes.
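The two unweighted quality measures named in this abstract, PSNR and BER, can be illustrated with a minimal Python sketch. It uses the standard textbook definitions on flat pixel and bit sequences; the function names and the 8-bit peak value of 255 are assumptions made here for illustration, and the weighted variant (WPSNR) and the paper's embedding scheme are not reproduced.

```python
import math


def psnr(original, modified, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    max_value=255 assumes 8-bit samples (an assumption, not stated in the abstract).
    """
    if len(original) != len(modified):
        raise ValueError("sequences must have the same length")
    mse = sum((a - b) ** 2 for a, b in zip(original, modified)) / len(original)
    if mse == 0:
        return math.inf  # identical signals: no distortion
    return 10 * math.log10(max_value ** 2 / mse)


def bit_error_rate(sent_bits, received_bits):
    """Fraction of payload bits that differ after extraction."""
    if len(sent_bits) != len(received_bits):
        raise ValueError("bit sequences must have the same length")
    errors = sum(s != r for s, r in zip(sent_bits, received_bits))
    return errors / len(sent_bits)
```

A reversible scheme that recovers the payload exactly would give `bit_error_rate(...) == 0.0`, which is what "zero BER" refers to.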

Keywords: Biomedical imaging, Data security, Data communication, Teleconferencing.

Downloads 2757

7331 A Robust Method for Encrypted Data Hiding Technique Based on Neighborhood Pixels Information

Authors: Ali Shariq Imran, M. Younus Javed, Naveed Sarfraz Khattak

Abstract:

This paper presents a novel method for data hiding that uses neighborhood pixel information to calculate the number of bits that can be used for substitution, together with a modified Least Significant Bits technique for data embedding. The modified solution is independent of the nature of the data to be hidden and gives correct results with unnoticeable image degradation. To find the number of bits that can be used for data hiding, the technique uses the green component of the image, as it is less sensitive to the human eye, and thus it is impossible for the human eye to tell whether the image is encrypted or not. The application further encrypts the data using a custom-designed algorithm before embedding the bits into the image, for additional security. The overall process consists of three main modules, namely embedding, encryption and extraction.
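For orientation, plain one-bit LSB substitution, the baseline that the abstract's modified technique builds on, can be sketched as follows. This is not the authors' method: the neighborhood-based choice of how many bits to replace and the custom encryption step are omitted, and the function names and the flat list standing in for the green channel are assumptions.

```python
def embed_lsb(channel, bits):
    """Return a copy of `channel` with the least significant bit of the
    first len(bits) values replaced by the payload bits (0 or 1)."""
    if len(bits) > len(channel):
        raise ValueError("payload is longer than the cover channel")
    stego = list(channel)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit  # clear the LSB, then set it to the payload bit
    return stego


def extract_lsb(channel, n_bits):
    """Read back the first n_bits least significant bits."""
    return [value & 1 for value in channel[:n_bits]]


# Hypothetical green-channel values of four pixels and a 4-bit payload.
green = [200, 201, 54, 77]
payload = [1, 0, 1, 0]
stego_green = embed_lsb(green, payload)
recovered = extract_lsb(stego_green, len(payload))
```

Each pixel value changes by at most 1, which is why single-bit LSB substitution is hard to see.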

Keywords: Data hiding, image processing, information security, steganography.

Downloads 2341

7330 Unsupervised Outlier Detection in Streaming Data Using Weighted Clustering

Authors: Yogita, Durga Toshniwal

Abstract:

Outlier detection in streaming data is very challenging because streaming data cannot be scanned multiple times and new concepts may keep evolving. Irrelevant attributes can be regarded as noisy attributes, and such attributes further magnify the challenge of working with data streams. In this paper, we propose an unsupervised outlier detection scheme for streaming data. The scheme is based on clustering, since clustering is an unsupervised data mining task and does not require labeled data. Both density-based and partitioning clustering are combined for outlier detection. Partitioning clustering is also used to assign weights to attributes according to their relevance, and the weights are adaptive. Weighted attributes help to reduce or remove the effect of noisy attributes. In view of the challenges of streaming data, the proposed scheme is incremental and adaptive to concept evolution. Experimental results on synthetic and real-world data sets show that the proposed approach outperforms another existing approach (CORM) in terms of outlier detection rate and false alarm rate, and under increasing percentages of outliers.

Keywords: Concept Evolution, Irrelevant Attributes, Streaming Data, Unsupervised Outlier Detection.

Downloads 2637

7329 The Effect of Measurement Distribution on System Identification and Detection of Behavior of Nonlinearities of Data

Authors: Mohammad Javad Mollakazemi, Farhad Asadi, Aref Ghafouri

Abstract:

In this paper, we consider and apply parametric modeling to experimental data from a dynamical system. We investigate the different distributions of output measurements from several dynamical systems. By processing the variance of the experimental data, we obtain the region of nonlinearity in the data, and identification of the output section is then applied under different conditions and data distributions. Finally, the effect of the spread of the measurements, such as the variance, on identification is explained, along with the limitations of this approach.

Keywords: Gaussian process, Nonlinearity distribution, Particle filter.

Downloads 1722

7328 Exponentially Weighted Simultaneous Estimation of Several Quantiles

Authors: Valeriy Naumov, Olli Martikainen

Abstract:

In this paper we propose a new method for simultaneously generating multiple quantiles corresponding to given probability levels from data streams and massive data sets. This method provides a basis for the development of single-pass, low-storage quantile estimation algorithms, which differ in complexity, storage requirements and accuracy. We demonstrate that such algorithms may perform well even for heavy-tailed data.
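To make the idea of single-pass, low-storage quantile tracking concrete, here is a minimal Python sketch of a generic stochastic-approximation update with a constant step size, which gives older observations exponentially decaying influence. It is a stand-in chosen for illustration, not the estimator proposed in the paper; the function names, the step size and the zero initial value are assumptions.

```python
import random


def update_quantiles(estimates, levels, x, step):
    """Apply one observation x to every quantile estimate.

    Each estimate moves up by step * p when x lies above it and down by
    step * (1 - p) otherwise, so it settles where P(X <= q) is close to p.
    """
    return [
        q + step * (p - (1.0 if x <= q else 0.0))
        for q, p in zip(estimates, levels)
    ]


def track_quantiles(stream, levels, step=0.002, initial=0.0):
    """Single pass over `stream`, storing only one number per probability level."""
    estimates = [initial] * len(levels)
    for x in stream:
        estimates = update_quantiles(estimates, levels, x, step)
    return estimates


# Usage on a hypothetical stream of uniform(0, 1) samples.
random.seed(0)
samples = (random.random() for _ in range(50_000))
q25, q50, q75 = track_quantiles(samples, [0.25, 0.5, 0.75])
```

A smaller step gives smoother but slower-adapting estimates. This simple update does not guarantee that the estimates stay ordered, which is one reason more careful simultaneous schemes are of interest.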

Keywords: Quantile estimation, data stream, heavy-tailed distribution, tail index.

Downloads 1533

7327 Underwater Wireless Sensor Network Layer Design for Reef Restoration

Authors: T. T. Manikandan, Rajeev Sukumaran

Abstract:

Coral reefs are very important for the majority of marine ecosystems, but they are under major threat due to factors such as ocean acidification, overfishing, and coral bleaching. To conserve coral reefs, reef restoration activities are carried out across the world. After reef restoration, various parameters have to be monitored in order to ensure the overall effectiveness of the restoration. Underwater Wireless Sensor Network (UWSN) based monitoring is widely adopted for such long-term monitoring activities. Since monitoring of coral reef restoration activities is time sensitive, the QoS guarantee offered by the network with respect to delay is vital. This research therefore focuses on the analytical modeling of network layer delay using Stochastic Network Calculus (SNC). The core focus of the proposed model is on analyzing the stochastic dependencies between network flows and deriving stochastic delay bounds for flows that traverse in tandem in UWSNs. The derived analytical bounds are evaluated for their effectiveness using discrete event simulations.

Keywords: Coral Reef Restoration, SNC, SFA, PMOO, Tandem of Queues, Delay Bound.

Downloads 426

7326 Review of Innovation Management Frameworks and Assessment Tools

Authors: Qiang Fu, Md. Abu Saleh

Abstract:

Research studies are highly fragmented when an Innovation Management Framework is being discussed. With the aim to identify an Innovation Management Framework/Assessment Tool suitable for Small & Medium Enterprises (SMEs) in the service industry, this researcher critically reviewed existing innovation management frameworks and assessment models/tools and discovered a number of literature gaps. It is established that the existing literature lacks generally agreed innovation management dimensions, commonly accepted knowledge creation through empirical studies on innovation management in SMEs, effective innovation management performance measurements, suitable innovation management framework in SMEs, and studies on innovation management in the service industry, in particular in retail SMEs. As such, there is a dire need to develop an appropriate firm-level innovation management framework suitable for SMEs in the service industry for future research projects and further studies. In addition, this researcher also discussed the significance of establishing such an innovation management framework.

Keywords: innovation management, innovation management framework, innovation management assessment tools, SMEs, service industry

Downloads 744

7325 Enhanced Data Access Control of Cooperative Environment used for DMU Based Design

Authors: Wei Lifan, Zhang Huaiyu, Yang Yunbin, Li Jia

Abstract:

An analysis of the digital design process based on the digital mockup (DMU) indicates that a distributed cooperative supporting environment is the foundation for adopting a DMU-based design approach. Data access authorization is the first concern because of the value and sensitivity of the data to the enterprise. Access control for administrators is often weaker than for business users. The authors therefore established an enhanced system that prevents administrators from accessing engineering data through potential unauthorized approaches. Thus data security is improved.

Keywords: access control, DMU, PLM, virtual prototype.

Downloads 1463

7324 Pattern Recognition Using Feature Based Die-Map Clustering in the Semiconductor Manufacturing Process

Authors: Seung Hwan Park, Cheng-Sool Park, Jun Seok Kim, Youngji Yoo, Daewoong An, Jun-Geol Baek

Abstract:

As big data analysis becomes more important, yield prediction using data from the semiconductor process is essential. In general, yield prediction and analysis of the causes of failure are closely related. The purpose of this study is to analyze patterns that affect the final test results using die-map based clustering. Much research has been conducted using die data from the semiconductor test process. However, such analysis is limited because the test data is less directly related to the final test results. Therefore, this study proposes a framework for analysis through clustering using more detailed data than existing die data. The study consists of three phases. In the first phase, a die map is created from fail bit data in each sub-area of the die. In the second phase, clustering is performed on the map data. The third phase finds patterns that affect the final test result. Finally, the three phases are applied to actual industrial data, and the experimental results show the potential for field application.

Keywords: Die-Map Clustering, Feature Extraction, Pattern Recognition, Semiconductor Manufacturing Process.

Downloads 3151

7323 Towards Better Quality in Healthcare and Operations Management: A Developmental Literature Review

Authors: Towards Better Quality in Healthcare, Operations Management: A Developmental Literature Review

Abstract:

This work presents the various perspectives, dimensions, components and definitions given to quality in the operations management (OM) and healthcare services (HCS) literature over time, highlighting gaps and learning opportunities between the two disciplines through a thorough search of their rich and distinct bodies of knowledge. New and greater insights into the general nature of quality are obtained. In OM, quality has been approached through six fairly distinct paradigms (excellence, value, conformity to specifications, attributes, satisfaction and meeting or exceeding customer expectations), whereas in HCS, two approaches are prominent (Donabedian’s structure, process and outcomes model and Lohr and Schroeder’s circumscribed definition). The two disciplines’ views on quality seem to have progressed largely in parallel, with little cross-learning between them. This work then proposes an encompassing definition of quality as a lever and suggests further research and development avenues for a better use of the concept of quality by academics and practitioners alike, toward the goals of greater organizational performance and improved management in healthcare and possibly other service domains.

Keywords: Healthcare, management, operations, quality, services.

Downloads 1277

7322 Speed Characteristics of Mixed Traffic Flow on Urban Arterials

Authors: Ashish Dhamaniya, Satish Chandra

Abstract:

Speed and traffic volume data were collected on different sections of four-lane and six-lane roads in three metropolitan cities in India. The speed data are analyzed to fit statistical distributions to individual vehicle speed data and to the speed data of all vehicles. It is noted that the speed data of an individual vehicle generally follow a normal distribution, but the speed data of all vehicles combined at a section of urban road may or may not follow the normal distribution, depending upon the composition of the traffic stream. A new term, Speed Spread Ratio (SSR), is introduced in this paper; it is the ratio of the difference between the 85th and 50th percentile speeds to the difference between the 50th and 15th percentile speeds. If SSR is unity, the speed data are truly normally distributed. It is noted that on six-lane urban roads, speed data follow a normal distribution only when SSR is in the range of 0.86 to 1.11. The range of SSR is also validated on four-lane roads.
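The SSR definition above translates directly into a short computation. The Python sketch below follows the stated ratio; the linear-interpolation percentile rule, the function names and the sample speeds are assumptions, since the abstract does not say how the percentiles were estimated.

```python
def percentile(sorted_values, p):
    """Percentile (p in 0..100) of an already sorted list, by linear interpolation."""
    k = (len(sorted_values) - 1) * p / 100
    lower = int(k)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = k - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


def speed_spread_ratio(speeds):
    """SSR = (V85 - V50) / (V50 - V15), as defined in the abstract."""
    ordered = sorted(speeds)
    v15, v50, v85 = (percentile(ordered, p) for p in (15, 50, 85))
    if v50 == v15:
        raise ValueError("15th and 50th percentile speeds coincide; SSR is undefined")
    return (v85 - v50) / (v50 - v15)


# Hypothetical spot speeds in km/h: one symmetric sample, one right-skewed sample.
symmetric = list(range(30, 71))
skewed = [30, 31, 32, 33, 34, 35, 40, 50, 60, 70, 80]
ssr_symmetric = speed_spread_ratio(symmetric)
ssr_skewed = speed_spread_ratio(skewed)
```

A symmetric sample gives an SSR of 1, while a long tail of fast vehicles pushes the ratio above 1, outside the 0.86 to 1.11 band reported for normally distributed speed data.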

Keywords: Normal distribution, percentile speed, speed spread ratio, traffic volume.

Downloads 4246

7321 A Comparative Study between Discrete Wavelet Transform and Maximal Overlap Discrete Wavelet Transform for Testing Stationarity

Authors: Amel Abdoullah Ahmed Dghais, Mohd Tahir Ismail

Abstract:

In this paper the core objective is to apply the discrete wavelet transform and the maximal overlap discrete wavelet transform, with the Haar, Daubechies2, Symmlet4, Coiflet2 and discrete approximation of the Meyer wavelet functions, to non-stationary financial time series data from the Dow Jones index (DJIA30) of the US stock market. The data consist of 2048 daily closing index values from December 17, 2004 to October 23, 2012. A unit root test affirms that the data are non-stationary in level. A comparison of the results of transforming the non-stationary data to stationary data using the aforesaid transforms is given, which clearly shows that decomposing the stock market index by the discrete wavelet transform is better than by the maximal overlap discrete wavelet transform for the original data.
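As a small illustration of the simplest transform in this comparison, the sketch below implements one level of the Haar discrete wavelet transform and its inverse in plain Python. The function names and the sample series are assumptions, the sign convention for the detail coefficients is one of several in use, and neither the other wavelet families nor the maximal overlap variant are covered.

```python
import math


def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients.

    Each output has half the input length, so the input length must be even.
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    root2 = math.sqrt(2)
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approximation = [(a + b) / root2 for a, b in pairs]
    detail = [(a - b) / root2 for a, b in pairs]
    return approximation, detail


def haar_idwt(approximation, detail):
    """Invert one level of the Haar DWT."""
    root2 = math.sqrt(2)
    signal = []
    for a, d in zip(approximation, detail):
        signal.extend([(a + d) / root2, (a - d) / root2])
    return signal


# Hypothetical closing values; any even-length series works.
closes = [100.0, 102.0, 101.0, 105.0, 107.0, 106.0, 110.0, 108.0]
approx, detail = haar_dwt(closes)
rebuilt = haar_idwt(approx, detail)
```

Because 2048 is a power of two, a series of that length can be halved repeatedly by applying `haar_dwt` to the approximation coefficients at each level.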

Keywords: Discrete wavelet transform, maximal overlap discrete wavelet transform, stationarity, autocorrelation function.

Downloads 4728

7320 Application of a Theoretical Framework as a Context for a Travel Behavior Change Policy Intervention

Authors: F. Moghtaderi, M. Burke, J. Troelsen

Abstract:

There has been a significant decline in active travel and a massive increase in car-dependent travel in many countries during the past two decades. Evident risks to people’s physical and mental health are correlated with this increased use of motorized travel. These health-related problems range from overweight and obesity to increased air pollution. In response to these rising concerns, health professionals, traffic planners, local authorities and others have introduced a variety of initiatives to counterbalance the dominance of cars for daily journeys. However, travel behavior change interventions that aim to reduce car use are very complex and challenging in their interactions with human behavior. To change travel behavior, at least two aspects have to be taken into consideration: first, how to alter attitudes and perceptions toward sustainable and healthy modes of travel, in competition with experiences of private car use; and second, how to make these behavior change processes irreversible and sustainable. There are no comprehensive models available to guide policy interventions and increase the level of success of travel behavior change interventions across both these dimensions. A comprehensive theoretical framework is required to facilitate and guide the processes of data collection and analysis and so achieve the best possible guidelines for policy makers. Given the gaps in the travel behavior change research literature, this paper attempts to identify and suggest a multidimensional framework to facilitate the planning of travel behavior change interventions. A structured mixed-method model is suggested to improve the analytic power of the results in line with the complexity of human behavior. To recognize people’s attitudes towards a specific travel mode, the Theory of Planned Behavior (TPB) was operationalized. To capture decision-making processes, the Transtheoretical Model of Behavior Change (TTM) was also used. Consequently, the combination of these two theories (TTM and TPB) has resulted in a synthesis with appropriate concepts for identifying and designing travel behavior change interventions.

Keywords: Behavior change theories, Theoretical framework, Travel behavior change interventions.

Downloads 2869

7319 Comparative Study of Transformed and Concealed Data in Experimental Designs and Analyses

Authors: K. Chinda, P. Luangpaiboon

Abstract:

This paper presents a comparative study of coded data methods to assess the benefit of concealing natural data that is a trade secret. The influence of the number of replicates (rep), treatment effects (τ) and standard deviation (σ) on the efficiency of each transformation method is investigated. The experimental data are generated via computer simulations under the specified process conditions with a completely randomized design (CRD). Three data transformations are considered: the Box-Cox, arcsine and logit methods. The differences in the F statistic between coded data and natural data (Fc-Fn) and the hypothesis testing results were determined. The experimental results indicate that the Box-Cox results differ significantly from those of the natural data when the number of replicates is small, and that the method seems to be improper when a negative lambda is assigned. On the other hand, the arcsine and logit transformations are more robust and provide more precise numerical results. In addition, alternative ways to select the lambda in the power transformation are offered to achieve more appropriate outcomes.
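The three transformations named in this abstract have standard textbook forms, sketched below in Python. The abstract does not state which variants were used, so the arcsine square-root form for proportions and the function names are assumptions; the simulation design and the F-statistic comparison are not reproduced.

```python
import math


def box_cox(y, lam):
    """Box-Cox power transform of a positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    if lam == 0:
        return math.log(y)  # limiting case as lam approaches 0
    return (y ** lam - 1) / lam


def arcsine(p):
    """Arcsine square-root transform of a proportion p in [0, 1] (assumed variant)."""
    if not 0 <= p <= 1:
        raise ValueError("arcsine transform requires 0 <= p <= 1")
    return math.asin(math.sqrt(p))


def logit(p):
    """Log-odds transform of a proportion p strictly between 0 and 1."""
    if not 0 < p < 1:
        raise ValueError("logit requires 0 < p < 1")
    return math.log(p / (1 - p))
```

With a negative `lam`, the Box-Cox transform compresses large values strongly (for example, `box_cox(4, -1)` is 0.75 and no input can exceed 1), which may be related to the sensitivity the abstract reports for negative lambda.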

Keywords: Experimental Designs, Box-Cox, Arcsine, Logit Transformations.

Downloads 1622

7318 Design of a Low Cost Motion Data Acquisition Setup for Mechatronic Systems

Authors: Barış Can Yalçın

Abstract:

Motion sensors are commonly used as a valuable component in mechatronic systems. However, many mechatronic designs and applications that need motion sensors cost an enormous amount of money, especially high-tech systems. Designing software for the communication protocol between the data acquisition card and the motion sensor is another issue that has to be solved. This study presents how to design a low-cost motion data acquisition setup consisting of an MPU 6050 motion sensor (gyro and accelerometer in 3 axes) and an Arduino Mega2560 microcontroller. The design parameters are calibration of the sensor, identification and communication between the sensor and the data acquisition card, and interpretation of the data collected by the sensor.

Keywords: Calibration of sensors, data acquisition.

Downloads 4338

7317 Conceptual Multidimensional Model

Authors: Manpreet Singh, Parvinder Singh, Suman

Abstract:

Data is available in abundance in any business organization. It includes records for finance, maintenance, inventory, progress reports, etc. As time progresses, the data keeps accumulating, and the challenge is to extract information from this data bank. Knowledge discovery from these large and complex databases is the key problem of this era. Data mining and machine learning techniques are needed that can scale to the size of the problems and can be customized to the business application. To develop accurate and relevant information for a particular problem, business analysts need to develop multidimensional models that give reliable information, so that they can make the right decision for that problem. If the multidimensional model does not possess advanced features, accuracy cannot be expected. The present work involves the development of a multidimensional data model incorporating advanced features. The computation criterion is based on data precision and includes a slowly changing time dimension. The final results are displayed in graphical form.

Keywords: Multidimensional, data precision.

Downloads 1458

7316 Real Time Approach for Data Placement in Wireless Sensor Networks

Authors: Sanjeev Gupta, Mayank Dave

Abstract:

Real-time and reliable report delivery is extremely important for making effective decisions in real-world, mission-critical applications based on Wireless Sensor Networks (WSNs). Sensor data behaves differently in many ways from the data in traditional databases. WSNs need a mechanism to register, process queries, and disseminate data. In this paper we propose an architectural framework for data placement and management. We propose a reliable, real-time approach for data placement and for achieving data integrity using self-organized sensor clusters. Instead of storing information in individual cluster heads, as suggested in some protocols, our architecture stores the information of all clusters within a cell in the corresponding base station. For data dissemination and action in the wireless sensor network, we propose to use Action and Relay Stations (ARS). To reduce the average energy dissipation of sensor nodes, the data is sent to the nearest ARS rather than to the base station. We have designed our architecture to achieve greater energy savings and enhanced availability and reliability.

Keywords: Cluster head, data reliability, real time communication, wireless sensor networks.

Downloads 1815

7315 Early Requirement Engineering for Design of Learner Centric Dynamic LMS

Authors: Kausik Halder, Nabendu Chaki, Ranjan Dasgupta

Abstract:

We present a modeling framework that supports the engineering of early requirements specifications for the design of a learner-centric dynamic Learning Management System. The framework is based on the i* modeling tool and Means End Analysis, and adopts primitive concepts for modeling early requirements (such as actor, goal, and strategic dependency). We show how pedagogical and computational requirements for designing a learner-centric Learning Management System can be adapted for automatic early requirement engineering specifications. Finally, we present a model of a Learner Quanta based adaptive courseware. Our early requirement analysis shows how means end analysis reveals gaps and inconsistencies in early requirements specifications that are by no means trivial to discover without the help of a formal analysis tool.

Keywords: Adaptive Courseware, Early Requirement Engineering, Means End Analysis, Organizational Modeling, Requirement Modeling.

Downloads 1648

7314 Data Mining in Medicine Domain Using Decision Trees and Vector Support Machine

Authors: Djamila Benhaddouche, Abdelkader Benyettou

Abstract:

In this paper, we use data mining to extract biomedical knowledge. In general, complex biomedical data collected in population studies are treated with statistical methods; although these are robust, they are not sufficient in themselves to harness the potential wealth of the data. To that end, we use two learning algorithms: Decision Trees and the Support Vector Machine (SVM). These supervised classification methods are used to diagnose thyroid disease. In this context, we propose to promote the study and use of symbolic data mining techniques.

Keywords: A classifier, Algorithms decision tree, knowledge extraction, Support Vector Machine.

Downloads 1870

7313 A Software Framework for Predicting Oil-Palm Yield from Climate Data

Authors: Mohd. Noor Md. Sap, A. Majid Awan

Abstract:

Intelligent systems based on machine learning techniques, such as classification and clustering, are gaining widespread popularity in real-world applications. This paper presents work on developing a software system for predicting crop yield, for example oil-palm yield, from climate and plantation data. At the core of our system is a method for unsupervised partitioning of data to find spatio-temporal patterns in climate data using kernel methods, which are well suited to complex data. This work is inspired by the notion that a non-linear transformation of the data into some high-dimensional feature space increases the possibility of linear separability of the patterns in the transformed space, and therefore simplifies exploration of the associated structure in the data. Kernel methods implicitly perform a non-linear mapping of the input data into a high-dimensional feature space by replacing the inner products with an appropriate positive definite function. In this paper we present a robust weighted kernel k-means algorithm incorporating spatial constraints for clustering the data. The proposed algorithm can effectively handle noise, outliers and auto-correlation in the spatial data, enabling effective and efficient data analysis by exploring patterns and structures in the data, and can thus be used for predicting oil-palm yield by analyzing the various factors affecting the yield.

Keywords: Pattern analysis, clustering, kernel methods, spatial data, crop yield

Downloads 1980

7312 Predictive Modelling Techniques in Sediment Yield and Hydrological Modelling

Authors: Adesoji T. Jaiyeola, Josiah Adeyemo

Abstract:

This paper presents an extensive review of literature relevant to the modelling techniques adopted in sediment yield and hydrological modelling. Several studies relating to sediment yield are discussed. Many research areas of sedimentation in rivers, runoff and reservoirs are presented. Different types of hydrological models, different methods employed in selecting appropriate models for different case studies are analysed. Applications of evolutionary algorithms and artificial intelligence techniques are discussed and compared especially in water resources management and modelling. This review concentrates on Genetic Programming (GP) and fully discusses its theories and applications. The successful applications of GP as a soft computing technique were reviewed in sediment modelling. Some fundamental issues such as benchmark, generalization ability, bloat, over-fitting and other open issues relating to the working principles of GP are highlighted. This paper concludes with the identification of some research gaps in hydrological modelling and sediment yield.

Keywords: Artificial intelligence, evolutionary algorithm, genetic programming, sediment yield.

Downloads 1861

7311 A Proposal for U-City (Smart City) Service Method Using Real-Time Digital Map

Authors: SangWon Han, MuWook Pyeon, Sujung Moon, DaeKyo Seo

Abstract:

Recently, technologies based on three-dimensional (3D) space information are being developed and quality of life is improving as a result. Research on real-time digital map (RDM) is being conducted now to provide 3D space information. RDM is a service that creates and supplies 3D space information in real time based on location/shape detection. Research subjects on RDM include the construction of 3D space information with matching image data, complementing the weaknesses of image acquisition using multi-source data, and data collection methods using big data. Using RDM will be effective for space analysis using 3D space information in a U-City and for other space information utilization technologies.

Keywords: RDM, multi-source data, big data, U-City.

Downloads 805

7310 Agile Methodology for Modeling and Design of Data Warehouses -AM4DW-

Authors: Nieto Bernal Wilson, Carmona Suarez Edgar

Abstract:

Organizations hold structured and unstructured information in different formats, sources, and systems. Part of it comes from ERP systems under OLTP processing that support the information system. At the OLAP processing level, however, these organizations show some deficiencies. Part of the problem is a lack of interest in extracting knowledge from their data sources, as well as the absence of operational capabilities to tackle this kind of project. Data warehouses and their applications are considered non-proprietary tools that are of great interest to business intelligence, since they are the base repositories for creating models or patterns (behavior of customers, suppliers, products, social networks and genomics) and they facilitate corporate decision making and research. This paper presents a structured, simple methodology inspired by agile development models such as Scrum, XP and AUP, as well as object-relational models, spatial data models, and the baseline of data modeling under UML and big data. In this way it seeks to deliver an agile methodology for developing data warehouses that is simple and easy to apply. The methodology naturally takes into account the application of processes for the respective information analysis, visualization and data mining, particularly for pattern generation and for models derived from the structured fact objects.

Keywords: Data warehouse, model data, big data, object fact, object relational fact, process developed data warehouse.

Downloads 1478

7309 Towards Incorporating Context Awareness into Business Process Management

Authors: Xiaohui Zhao, Shahan Mafuz

Abstract:

Context-aware technologies give system applications an awareness of environmental conditions, customer behaviours, object movements, etc. With this capability, system applications can intelligently adapt their responses to changing conditions. For business operations, this promises that business processes can run more intelligently, adaptively and flexibly, and thereby improve customer experience, enhance the reliability of service delivery, or lower operational cost, making the business more competitive and sustainable. Aiming at realising such context-aware business process management, this paper first explores its potential benefits and then identifies some gaps between current business process management support and what is expected. Some preliminary solutions are also discussed with regard to context definition, rule-based process execution, run-time process evolution, etc. A framework is presented that gives a conceptual architecture of a context-aware business process management system to guide system implementation.

Keywords: Business process adaptation, business process evolution, business process modelling, and context awareness.

Downloads 1972

7308 Distributed Data-Mining by Probability-Based Patterns

Authors: M. Kargar, F. Gharbalchi

Abstract:

This paper proposes a new method for distributed data mining based on probability patterns. These patterns use decision trees and decision graphs, and care is taken that they are valid, novel, useful, and understandable. By considering a set of functions, the system arrives at a good pattern or better objectives. The proposed method makes it possible to extract useful information from massive, multi-relational databases.

Keywords: Data-mining, Decision tree, Decision graph, Pattern, Relationship.

Downloads 1557

7307 K-Means for Spherical Clusters with Large Variance in Sizes

Authors: A. M. Fahim, G. Saake, A. M. Salem, F. A. Torkey, M. A. Ramadan

Abstract:

Data clustering is an important data exploration technique with many applications in data mining. The k-means algorithm is well known for its efficiency in clustering large data sets. However, this algorithm is best suited to spherical clusters of similar sizes and densities. The quality of the resulting clusters decreases when the data set contains spherical clusters with a large variance in size. In this paper, we introduce a procedure to overcome this problem. The proposed method is based on shifting the center of the large cluster toward the small cluster and recomputing the membership of the small cluster's points. The experimental results show that the proposed algorithm produces satisfactory results.
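The failure mode and the shifting idea described in this abstract can be illustrated with a small sketch. The abstract does not give the algorithm's details, so what follows is only a minimal Python illustration under stated assumptions, not the authors' method: the shift fraction `alpha`, the function names, and the toy data are all hypothetical. It runs plain Lloyd's k-means on one wide cluster and one tight cluster, then moves the large cluster's center part of the way toward the small cluster's center and re-checks only the points currently assigned to the small cluster.

```python
from math import dist  # Euclidean distance, Python 3.8+


def nearest(point, centers):
    """Index of the center closest to point."""
    return min(range(len(centers)), key=lambda i: dist(point, centers[i]))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def kmeans(points, centers, iters=100):
    """Plain Lloyd's k-means starting from the given initial centers."""
    centers = list(centers)
    for _ in range(iters):
        labels = [nearest(p, centers) for p in points]
        new_centers = []
        for i, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == i]
            new_centers.append(mean(members) if members else old)
        if new_centers == centers:  # assignments are stable
            break
        centers = new_centers
    labels = [nearest(p, centers) for p in points]
    return labels, centers


def shift_and_reassign(points, labels, centers, large, small, alpha):
    """Hypothetical correction step: move the large cluster's center a
    fraction alpha of the way toward the small cluster's center, then
    re-check only the points currently labelled as the small cluster."""
    shifted = tuple(a + alpha * (b - a)
                    for a, b in zip(centers[large], centers[small]))
    new_labels = list(labels)
    for idx, p in enumerate(points):
        if labels[idx] == small and dist(p, shifted) < dist(p, centers[small]):
            new_labels[idx] = large
    return new_labels


# Toy data (assumed): a wide cluster at x = 0..10 and a tight one near x = 12.5.
points = [(float(x), 0.0) for x in range(11)] + [(12.0, 0.0), (12.5, 0.0), (13.0, 0.0)]
labels, centers = kmeans(points, [(5.0, 0.0), (12.5, 0.0)])
fixed = shift_and_reassign(points, labels, centers, large=0, small=1, alpha=0.8)
print(labels)  # plain k-means gives x = 8, 9, 10 to the small cluster
print(fixed)   # after the shift, those three points return to the large cluster
```

On this toy data, plain k-means splits the wide cluster because the decision boundary sits halfway between the two centers. The value `alpha=0.8` was chosen by hand for this example; how far to shift in general is exactly what the paper's procedure would have to specify.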

Keywords: K-Means, Data Clustering, Cluster Analysis.

Downloads 3281

7306 Representing Data without Lost Compression Properties in Time Series: A Review

Authors: Nabilah Filzah Mohd Radzuan, Zalinda Othman, Azuraliza Abu Bakar, Abdul Razak Hamdan

Abstract:

Uncertain data is considered an important issue in building a prediction model. The main objective of time series uncertainty analysis is to formulate uncertain data in order to gain knowledge and fit a low-dimensional model prior to a prediction task. This paper discusses the performance of a number of techniques for dealing with uncertain data, specifically those that handle uncertain data conditions by minimizing the loss of compression properties.

Keywords: Compression properties, uncertainty, uncertain time series, mining technique, weather prediction.

Downloads 1620

7305 Deterioration of Groundwater in Arid Environments: What Impact in Oasis Dynamics? Case Study of Tafilalet, Morocco

Authors: W. EL Khoumsi, A. Hammani, M. Kuper, A. Bouaziz

Abstract:

Oases are complex and fragile agro-ecosystems. They have always existed in environments characterized by an arid climate, scarce rainfall, high temperatures and high evaporation. Their palm groves have developed despite these severe physical conditions thanks to the presence of water and the practice of irrigation. Oases are generally spread along non-perennial rivers (wadis), shallow water tables or deep artesian groundwater. However, the sustainability of the oasis system is threatened by water scarcity and declining water table levels, particularly in arid areas. Located in southeastern Morocco, the Tafilalet plain encompasses one of the largest palm groves in the kingdom. In recent years, this area has become increasingly threatened by water shortage and has seen a sharp deterioration under the effect of several combined anthropogenic and climatic factors. The Bayoud disease, successive years of drought, and the construction of the Hassan Addakhil dam, among others, are all factors that have affected both the water and the date palm (phoenicicole) heritage of the area. The objective of this study is to understand the interaction between the qualitative and quantitative degradation of groundwater resources and the palm grove dynamics, while reviewing the assumption that groundwater resources contribute directly to the conservation of this oasis agroecosystem. A historical analysis tracing both the oasis dynamics and the groundwater evolution has been established. Data were collected from satellite images and from surveys with different actors (farmers, Regional Office for Agricultural Development, Basin agency...). They were complemented by a synthesis of numerous technical reports on the area. The results showed that within 40 years, the thickness of the groundwater table has dropped by 50%. Along with this, there has been a downsizing of date palm by 50%. Areas with a higher groundwater level were the least affected by the downsizing. We can therefore say that shallow groundwater contributes significantly and directly to the water supply of date palms through their root systems, and largely ensures the sustainability of the oasis ecosystem.

Keywords: Oasis dynamics, Arid environments, Groundwater deterioration, Date palm.

Downloads 2506

7304 Are XBRL-based Financial Reports Better than Non-XBRL Reports? A Quality Assessment

Authors: Zhenkun Wang, Simon S. Gao

Abstract:

Using a scoring system, this paper provides a comparative assessment of data quality between XBRL-formatted financial reports and non-XBRL financial reports. It shows a major improvement in the data quality of XBRL-formatted financial reports. Although XBRL-formatted financial reports did not show much advantage in quality at the beginning, more recent XBRL financial reports display a large improvement in data quality in almost all aspects. With improved XBRL applications for web data management, presentation and analysis, XBRL-formatted financial reports are much more accessible, more accurate and more timely.

Keywords: Data Quality; Financial Report; Information; XBRL

Downloads 2568