Search results for: multiple data stores
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8477

Search results for: multiple data stores

7907 Evaluation of AR-4BL-MAST with Multiple Markers Interaction Technique for Augmented Reality Based Engineering Application

Authors: Waleed Maqableh, Ahmad Al-Hamad, Manjit Sidhu

Abstract:

Augmented reality (AR) technology has the capability to provide many benefits in the field of education as a modern technology which aided learning and improved the learning experience. This paper evaluates AR based application with multiple markers interaction technique (touch-to-print) which is designed for analyzing the kinematics of 4BL mechanism in mechanical engineering. The application is termed as AR-4BL-MAST and it allows the users to touch the symbols on a paper in natural way of interaction. The evaluation of this application was performed with mechanical engineering students and human–computer interaction (HCI) experts to test its effectiveness as a tangible user interface application where the statistical results show its ability as an interaction technique, and it gives the users more freedom in interaction with the virtual mechanical objects.

Keywords: Augmented reality, engineering, four-bar linkage, Multimedia, user interface, visualization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1413
7906 Opening up Government Datasets for Big Data Analysis to Support Policy Decisions

Authors: K. Hardy, A. Maurushat

Abstract:

Policy makers are increasingly looking to make evidence-based decisions. Evidence-based decisions have historically used rigorous methodologies of empirical studies by research institutes, as well as less reliable immediate survey/polls often with limited sample sizes. As we move into the era of Big Data analytics, policy makers are looking to different methodologies to deliver reliable empirics in real-time. The question is not why did these people do this for the last 10 years, but why are these people doing this now, and if the this is undesirable, and how can we have an impact to promote change immediately. Big data analytics rely heavily on government data that has been released in to the public domain. The open data movement promises greater productivity and more efficient delivery of services; however, Australian government agencies remain reluctant to release their data to the general public. This paper considers the barriers to releasing government data as open data, and how these barriers might be overcome.

Keywords: Big data, open data, productivity, transparency.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1604
7905 Forthcoming Big Data on Smart Buildings and Cities: An Experimental Study on Correlations among Urban Data

Authors: Yu-Mi Song, Sung-Ah Kim, Dongyoun Shin

Abstract:

Cities are complex systems of diverse and inter-tangled activities. These activities and their complex interrelationships create diverse urban phenomena. And such urban phenomena have considerable influences on the lives of citizens. This research aimed to develop a method to reveal the causes and effects among diverse urban elements in order to enable better understanding of urban activities and, therefrom, to make better urban planning strategies. Specifically, this study was conducted to solve a data-recommendation problem found on a Korean public data homepage. First, a correlation analysis was conducted to find the correlations among random urban data. Then, based on the results of that correlation analysis, the weighted data network of each urban data was provided to people. It is expected that the weights of urban data thereby obtained will provide us with insights into cities and show us how diverse urban activities influence each other and induce feedback.

Keywords: Big data, correlation analysis, data recommendation system, urban data network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1087
7904 On the Combination of Patient-Generated Data with Data from a Secure Clinical Network Environment – A Practical Example

Authors: Jeroen S. de Bruin, Karin Schindler, Christian Schuh

Abstract:

With increasingly more mobile health applications appearing due to the popularity of smartphones, the possibility arises that these data can be used to improve the medical diagnostic process, as well as the overall quality of healthcare, while at the same time lowering costs. However, as of yet there have been no reports of a successful combination of patient-generated data from smartphones with data from clinical routine. In this paper we describe how these two types of data can be combined in a secure way without modification to hospital information systems, and how they can together be used in a medical expert system for automatic nutritional classification and triage.

Keywords: Data integration, disease-related malnutrition, expert systems, mobile health.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2185
7903 Automatic Map Simplification for Visualization on Mobile Devices

Authors: Hang Yu

Abstract:

The visualization of geographic information on mobile devices has become popular as the widespread use of mobile Internet. The mobility of these devices brings about much convenience to people-s life. By the add-on location-based services of the devices, people can have an access to timely information relevant to their tasks. However, visual analysis of geographic data on mobile devices presents several challenges due to the small display and restricted computing resources. These limitations on the screen size and resources may impair the usability aspects of the visualization applications. In this paper, a variable-scale visualization method is proposed to handle the challenge of small mobile display. By merging multiple scales of information into a single image, the viewer is able to focus on the interesting region, while having a good grasp of the surrounding context. This is essentially visualizing the map through a fisheye lens. However, the fisheye lens induces undesirable geometric distortion in the peripheral, which renders the information meaningless. The proposed solution is to apply map generalization that removes excessive information around the peripheral and an automatic smoothing process to correct the distortion while keeping the local topology consistent. The proposed method is applied on both artificial and real geographical data for evaluation.

Keywords: Map simplification, visualization, mobile devices.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1420
7902 Program Memories Error Detection and Correction On-Board Earth Observation Satellites

Authors: Y. Bentoutou

Abstract:

Memory Errors Detection and Correction aim to secure the transaction of data between the central processing unit of a satellite onboard computer and its local memory. In this paper, the application of a double-bit error detection and correction method is described and implemented in Field Programmable Gate Array (FPGA) technology. The performance of the proposed EDAC method is measured and compared with two different EDAC devices, using the same FPGA technology. Statistical analysis of single-event upset (SEU) and multiple-bit upset (MBU) activity in commercial memories onboard the first Algerian microsatellite Alsat-1 is given.

Keywords: Error Detection and Correction, On-board computer, small satellite missions.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2201
7901 Comparison of Imputation Techniques for Efficient Prediction of Software Fault Proneness in Classes

Authors: Geeta Sikka, Arvinder Kaur Takkar, Moin Uddin

Abstract:

Missing data is a persistent problem in almost all areas of empirical research. The missing data must be treated very carefully, as data plays a fundamental role in every analysis. Improper treatment can distort the analysis or generate biased results. In this paper, we compare and contrast various imputation techniques on missing data sets and make an empirical evaluation of these methods so as to construct quality software models. Our empirical study is based on NASA-s two public dataset. KC4 and KC1. The actual data sets of 125 cases and 2107 cases respectively, without any missing values were considered. The data set is used to create Missing at Random (MAR) data Listwise Deletion(LD), Mean Substitution(MS), Interpolation, Regression with an error term and Expectation-Maximization (EM) approaches were used to compare the effects of the various techniques.

Keywords: Missing data, Imputation, Missing Data Techniques.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1651
7900 Cluster Analysis for the Statistical Modeling of Aesthetic Judgment Data Related to Comics Artists

Authors: George E. Tsekouras, Evi Sampanikou

Abstract:

We compare three categorical data clustering algorithms with respect to the problem of classifying cultural data related to the aesthetic judgment of comics artists. Such a classification is very important in Comics Art theory since the determination of any classes of similarities in such kind of data will provide to art-historians very fruitful information of Comics Art-s evolution. To establish this, we use a categorical data set and we study it by employing three categorical data clustering algorithms. The performances of these algorithms are compared each other, while interpretations of the clustering results are also given.

Keywords: Aesthetic judgment, comics artists, cluster analysis, categorical data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1620
7899 IoT Device Cost Effective Storage Architecture and Real-Time Data Analysis/Data Privacy Framework

Authors: Femi Elegbeleye, Seani Rananga

Abstract:

This paper focused on cost effective storage architecture using fog and cloud data storage gateway, and presented the design of the framework for the data privacy model and data analytics framework on a real-time analysis when using machine learning method. The paper began with the system analysis, system architecture and its component design, as well as the overall system operations. Several results obtained from this study on data privacy models show that when two or more data privacy models are integrated via a fog storage gateway, we often have more secure data. Our main focus in the study is to design a framework for the data privacy model, data storage, and real-time analytics. This paper also shows the major system components and their framework specification. And lastly, the overall research system architecture was shown, including its structure, and its interrelationships.

Keywords: IoT, fog storage, cloud storage, data analysis, data privacy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 200
7898 Production Plan and Technological Variants Optimization by Goal Programming Methods

Authors: Tunjo Perić, Franjo Bratić

Abstract:

In this paper, the goal programming methodology for solving multiple objective problem of the technological variants and production plan optimization has been applied. The optimization criteria are determined and the multiple objective linear programming model for solving a problem of the technological variants and production plan optimization is formed and solved. Then the obtained results are analysed. The obtained results point out to the possibility of efficient application of the goal programming methodology in solving the problem of the technological variants and production plan optimization. The paper points out on the advantages of the application of the goal programming methodology compare to the Surrogat Worth Trade-off method in solving this problem.

Keywords: Goal programming, multi objective programming, production plan, SWT method, technological variants.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1901
7897 Robust Coordinated Design of Multiple Power System Stabilizers Using Particle Swarm Optimization Technique

Authors: Sidhartha Panda, C. Ardil

Abstract:

Power system stabilizers (PSS) are now routinely used in the industry to damp out power system oscillations. In this paper, particle swarm optimization (PSO) technique is applied to coordinately design multiple power system stabilizers (PSS) in a multi-machine power system. The design problem of the proposed controllers is formulated as an optimization problem and PSO is employed to search for optimal controller parameters. By minimizing the time-domain based objective function, in which the deviation in the oscillatory rotor speed of the generator is involved; stability performance of the system is improved. The non-linear simulation results are presented for various severe disturbances and small disturbance at different locations as well as for various fault clearing sequences to show the effectiveness and robustness of the proposed controller and their ability to provide efficient damping of low frequency oscillations.

Keywords: Low frequency oscillations, Particle swarm optimization, power system stability, power system stabilizer, multimachine power system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 856
7896 Socio-Spatial Transformations in Obsolete Port Regions: A Case for Istanbul-Karakoy District

Authors: Umut Tuğlu Karslı

Abstract:

Istanbul-Karakoy Port, field of this study, has lost its former significance in time due to the transformation of urban functions. Today, activities for regeneration of this region continue in two forms and scales. First of these activities is the "planned transformation projects," which also includes “Galataport project”, and the second one is "spontaneous transformation," which consists of individual interventions. Galataport project that based on the idea of arranging the area specifically for tourists was prepared in 2005 and became a topic of tremendous public debate. On the other hand, the "spontaneous transformation" that is observed in Karakoy District starts in 2004 with the foundation of “Istanbul Modern Museum” which allowed the cultural integration of old naval warehouses of the port to the daily life. Following this adaptive reuse intervention, the district started to accommodate numerous art galleries, studios, caféworkshops and design stores. In this context, this paper first examines regeneration studies in obsolete port regions, analyzes the planned and ongoing socio-spatial transformations in the specific case of Karakoy and performs a critical review of the sustainability of the proposals on how to reinstate the district in the active life of Istanbul.

Keywords: Port Cities, Socio-Spatial Transformation, Urban Regeneration, Urban Revitalization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2007
7895 Literature-Based Discoveries in Lupus Treatment

Authors: Oluwaseyi Jaiyeoba, Vetria Byrd

Abstract:

Systemic lupus erythematosus (aka lupus) is a chronic disease known for its chameleon-like ability to mimic symptoms of other diseases rendering it hard to detect, diagnose and treat. The heterogeneous nature of the disease generates disparate data that are often multifaceted and multi-dimensional. Musculoskeletal manifestation of lupus is one of the most common clinical manifestations of lupus. This research links disparate literature on the treatment of lupus as it affects the musculoskeletal system using the discoveries from literature-based research articles available on the PubMed database. Several Natural Language Processing (NPL) tools exist to connect disjointed but related literature, such as Connected Papers, Bitola, and Gopalakrishnan. Literature-based discovery (LBD) has been used to bridge unconnected disciplines based on text mining procedures. The technical/medical literature consists of many technical/medical concepts, each having its  sub-literature. This approach has been used to link Parkinson’s, Raynaud, and Multiple Sclerosis treatment within works of literature.  Literature-based discovery methods can connect two or more related but disjointed literature concepts to produce a novel and plausible approach to solving a research problem. Data visualization techniques with the help of natural language processing tools are used to visually represent the result of literature-based discoveries. Literature search results can be voluminous, but Data visualization processes can provide insight and detect subtle patterns in large data. These insights and patterns can lead to discoveries that would have otherwise been hidden from disjointed literature. In this research, literature data are mined and combined with visualization techniques for heterogeneous data to discover viable treatments reported in the literature for lupus expression in the musculoskeletal system. This research answers the question of using literature-based discovery to identify potential treatments for a multifaceted disease like lupus. A three-pronged methodology is used in this research: text mining, natural language processing, and data visualization. These three research-related fields are employed to identify patterns in lupus-related data that, when visually represented, could aid research in the treatment of lupus. This work introduces a method for visually representing interconnections of various lupus-related literature. The methodology outlined in this work is the first step toward literature-based research and treatment planning for the musculoskeletal manifestation of lupus. The results also outline the interconnection of complex, disparate data associated with the manifestation of lupus in the musculoskeletal system. The societal impact of this work is broad. Advances in this work will improve the quality of life for millions of persons in the workforce currently diagnosed and silently living with a musculoskeletal disease associated with lupus.

Keywords: Systemic lupus erythematosus, LBD, Data Visualization, musculoskeletal system, treatment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 469
7894 Event Template Generation for News Articles

Authors: A. Kowcika, E. Umamaheswari, T.V. Geetha

Abstract:

In this paper we focus on event extraction from Tamil news article. This system utilizes a scoring scheme for extracting and grouping event-specific sentences. Using this scoring scheme eventspecific clustering is performed for multiple documents. Events are extracted from each document using a scoring scheme based on feature score and condition score. Similarly event specific sentences are clustered from multiple documents using this scoring scheme. The proposed system builds the Event Template based on user specified query. The templates are filled with event specific details like person, location and timeline extracted from the formed clusters. The proposed system applies these methodologies for Tamil news articles that have been enconverted into UNL graphs using a Tamil to UNL-enconverter. The main intention of this work is to generate an event based template.

Keywords: Event Extraction, Score based Clustering, Segmentation, Template Generation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1684
7893 New Features for Specific JPEG Steganalysis

Authors: Johann Barbier, Eric Filiol, Kichenakoumar Mayoura

Abstract:

We present in this paper a new approach for specific JPEG steganalysis and propose studying statistics of the compressed DCT coefficients. Traditionally, steganographic algorithms try to preserve statistics of the DCT and of the spatial domain, but they cannot preserve both and also control the alteration of the compressed data. We have noticed a deviation of the entropy of the compressed data after a first embedding. This deviation is greater when the image is a cover medium than when the image is a stego image. To observe this deviation, we pointed out new statistic features and combined them with the Multiple Embedding Method. This approach is motivated by the Avalanche Criterion of the JPEG lossless compression step. This criterion makes possible the design of detectors whose detection rates are independent of the payload. Finally, we designed a Fisher discriminant based classifier for well known steganographic algorithms, Outguess, F5 and Hide and Seek. The experiemental results we obtained show the efficiency of our classifier for these algorithms. Moreover, it is also designed to work with low embedding rates (< 10-5) and according to the avalanche criterion of RLE and Huffman compression step, its efficiency is independent of the quantity of hidden information.

Keywords: Compressed frequency domain, Fisher discriminant, specific JPEG steganalysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2145
7892 Post-Traumatic Stress Disorder: Management at the Montfort Hospital

Authors: Kay-Anne Haykal, Issack Biyong

Abstract:

The post-traumatic stress disorder (PTSD) rises from exposure to a traumatic event and appears by a persistent experience of this event. Several psychiatric co-morbidities are associated with PTSD and include mood disorders, anxiety disorders, and substance abuse. The main objective was to compare the criteria for PTSD according to the literature to those used to diagnose a patient in a francophone hospital and to check the correspondence of these two criteria. 700 medical charts of admitted patients on the medicine or psychiatric unit at the Montfort Hospital were identified with the following diagnoses: major depressive disorder, bipolar disorder, anxiety disorder, substance abuse, and PTSD for the period of time between April 2005 and March 2006. Multiple demographic criteria were assembled. Also, for every chart analyzed, the PTSD criteria, according to the Manual of Mental Disorders (DSM) IV were found, identified, and grouped according to pre-established codes. An analysis using the receiver operating characteristic (ROC) method was elaborated for the study of data. A sample of 57 women and 50 men was studied. Age was varying between 18 and 88 years with a median age of 48. According to the PTSD criteria in the DSM IV, 12 patients should have the diagnosis of PTSD in opposition to only two identified in the medical charts. The ROC method establishes that with the combination of data from PTSD and depression, the sensitivity varies between 0,127 and 0,282, and the specificity varies between 0,889 and 0,917. Otherwise, if we examine the PTSD data alone, the sensibility jumps to 0.50, and the specificity varies between 0,781 and 0,895. This study confirms the presence of an underdiagnosed and treated PTSD that causes severe perturbations for the affected individual.

Keywords: Post-Traumatic Stress Disorder, diagnosis, co-morbidities, mental health disorders.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1071
7891 The Impact of System and Data Quality on Organizational Success in the Kingdom of Bahrain

Authors: Amal M. Alrayes

Abstract:

Data and system quality play a central role in organizational success, and the quality of any existing information system has a major influence on the effectiveness of overall system performance. Given the importance of system and data quality to an organization, it is relevant to highlight their importance on organizational performance in the Kingdom of Bahrain. This research aims to discover whether system quality and data quality are related, and to study the impact of system and data quality on organizational success. A theoretical model based on previous research is used to show the relationship between data and system quality, and organizational impact. We hypothesize, first, that system quality is positively associated with organizational impact, secondly that system quality is positively associated with data quality, and finally that data quality is positively associated with organizational impact. A questionnaire was conducted among public and private organizations in the Kingdom of Bahrain. The results show that there is a strong association between data and system quality, that affects organizational success.

Keywords: Data quality, performance, system quality.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2101
7890 The Impact of Recommendation Sources on Online Purchase Intentions: The Moderating Effects of Gender and Perceived Risk

Authors: Chiao-Chen Chang, Yang-Chieh Chin

Abstract:

This study examines the issue of recommendation sources from the perspectives of gender and consumers- perceived risk, and validates a model for the antecedents of consumer online purchases. The method of obtaining quantitative data was that of the instrument of a survey questionnaire. Data were collected via questionnaires from 396 undergraduate students aged 18-24, and a multiple regression analysis was conducted to identify causal relationships. Empirical findings established the link between recommendation sources (word-of-mouth, advertising, and recommendation systems) and the likelihood of making online purchases and demonstrated the role of gender and perceived risk as moderators in this context. The results showed that the effects of word-of-mouth on online purchase intentions were stronger than those of advertising and recommendation systems. In addition, female consumers have less experience with online purchases, so they may be more likely than males to refer to recommendations during the decision-making process. The findings of the study will help marketers to address the recommendation factor which influences consumers- intention to purchase and to improve firm performances to meet consumer needs.

Keywords: Recommendation sources, Online purchaseintentions, Gender differences, Perceived risk.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2991
7889 Multiple Moving Talker Tracking by Integration of Two Successive Algorithms

Authors: Kenji Suyama, Masahiro Oshida, Noboru Owada

Abstract:

In this paper, an estimation accuracy of multiple moving talker tracking using a microphone array is improved. The tracking can be achieved by the adaptive method in which two algorithms are integrated, namely, the PAST (Projection Approximation Subspace Tracking) algorithm and the IPLS (Interior Point Least Square) algorithm. When either talker begins to speak again after a silent period, an appropriate feasible region for an evaluation function of the IPLS algorithm might not be set. Then, the tracking fails due to the incorrect updating. Therefore, if an increment of the number of active talkers is detected, the feasible region must be reset. Then, a low cost realization is required for the high speed tracking and a high accuracy realization is desired for the precise tracking. In this paper, the directions roughly estimated using the delayed-sum-array method are used for the resetting. Several results of experiments performed in an actual room environment show the effectiveness of the proposed method.

Keywords: moving talkers tracking, microphone array, signal subspace

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1321
7888 Integration of Multi-Source Data to Monitor Coral Biodiversity

Authors: K. Jitkue, W. Srisang, C. Yaiprasert, K. Jaroensutasinee, M. Jaroensutasinee

Abstract:

This study aims at using multi-source data to monitor coral biodiversity and coral bleaching. We used coral reef at Racha Islands, Phuket as a study area. There were three sources of data: coral diversity, sensor based data and satellite data.

Keywords: Coral reefs, Remote sensing, Sea surfacetemperatue, Satellite imagery.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1535
7887 Global GMRES with Deflated Restarting for Families of Shifted Linear Systems

Authors: Jing Meng, Peiyong Zhu, Houbiao Li

Abstract:

Many problems in science and engineering field require the solution of shifted linear systems with multiple right hand sides and multiple shifts. To solve such systems efficiently, the implicitly restarted global GMRES algorithm is extended in this paper. However, the shift invariant property could no longer hold over the augmented global Krylov subspace due to adding the harmonic Ritz matrices. To remedy this situation, we enforce the collinearity condition on the shifted system and propose shift implicitly restarted global GMRES. The new method not only improves the convergence but also has a potential to simultaneously compute approximate solution for the shifted systems using only as many matrix vector multiplications as the solution of the seed system requires. In addition, some numerical experiments also confirm the effectiveness of our method.

Keywords: Shifted linear systems, global Krylov subspace, GLGMRESIR, GLGMRESIRsh, harmonic Ritz matrix, harmonic Ritz vector.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1950
7886 Decision Support System Based on Data Warehouse

Authors: Yang Bao, LuJing Zhang

Abstract:

Typical Intelligent Decision Support System is 4-based, its design composes of Data Warehouse, Online Analytical Processing, Data Mining and Decision Supporting based on models, which is called Decision Support System Based on Data Warehouse (DSSBDW). This way takes ETL,OLAP and DM as its implementing means, and integrates traditional model-driving DSS and data-driving DSS into a whole. For this kind of problem, this paper analyzes the DSSBDW architecture and DW model, and discusses the following key issues: ETL designing and Realization; metadata managing technology using XML; SQL implementing, optimizing performance, data mapping in OLAP; lastly, it illustrates the designing principle and method of DW in DSSBDW.

Keywords: Decision Support System, Data Warehouse, Data Mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3842
7885 A New History Based Method to Handle the Recurring Concept Shifts in Data Streams

Authors: Hossein Morshedlou, Ahmad Abdollahzade Barforoush

Abstract:

Recent developments in storage technology and networking architectures have made it possible for broad areas of applications to rely on data streams for quick response and accurate decision making. Data streams are generated from events of real world so existence of associations, which are among the occurrence of these events in real world, among concepts of data streams is logical. Extraction of these hidden associations can be useful for prediction of subsequent concepts in concept shifting data streams. In this paper we present a new method for learning association among concepts of data stream and prediction of what the next concept will be. Knowing the next concept, an informed update of data model will be possible. The results of conducted experiments show that the proposed method is proper for classification of concept shifting data streams.

Keywords: Data Stream, Classification, Concept Shift, History.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1266
7884 Incremental Learning of Independent Topic Analysis

Authors: Takahiro Nishigaki, Katsumi Nitta, Takashi Onoda

Abstract:

In this paper, we present a method of applying Independent Topic Analysis (ITA) to increasing the number of document data. The number of document data has been increasing since the spread of the Internet. ITA was presented as one method to analyze the document data. ITA is a method for extracting the independent topics from the document data by using the Independent Component Analysis (ICA). ICA is a technique in the signal processing; however, it is difficult to apply the ITA to increasing number of document data. Because ITA must use the all document data so temporal and spatial cost is very high. Therefore, we present Incremental ITA which extracts the independent topics from increasing number of document data. Incremental ITA is a method of updating the independent topics when the document data is added after extracted the independent topics from a just previous the data. In addition, Incremental ITA updates the independent topics when the document data is added. And we show the result applied Incremental ITA to benchmark datasets.

Keywords: Text mining, topic extraction, independent, incremental, independent component analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1033
7883 A Framework for Data Mining Based Multi-Agent: An Application to Spatial Data

Authors: H. Baazaoui Zghal, S. Faiz, H. Ben Ghezala

Abstract:

Data mining is an extraordinarily demanding field referring to extraction of implicit knowledge and relationships, which are not explicitly stored in databases. A wide variety of methods of data mining have been introduced (classification, characterization, generalization...). Each one of these methods includes more than algorithm. A system of data mining implies different user categories,, which mean that the user-s behavior must be a component of the system. The problem at this level is to know which algorithm of which method to employ for an exploratory end, which one for a decisional end, and how can they collaborate and communicate. Agent paradigm presents a new way of conception and realizing of data mining system. The purpose is to combine different algorithms of data mining to prepare elements for decision-makers, benefiting from the possibilities offered by the multi-agent systems. In this paper the agent framework for data mining is introduced, and its overall architecture and functionality are presented. The validation is made on spatial data. Principal results will be presented.

Keywords: Databases, data mining, multi-agent, spatial datamart.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2030
7882 Latent Topic Based Medical Data Classification

Authors: Jian-hua Yeh, Shi-yi Kuo

Abstract:

This paper discusses the classification process for medical data. In this paper, we use the data from ACM KDDCup 2008 to demonstrate our classification process based on latent topic discovery. In this data set, the target set and outliers are quite different in their nature: target set is only 0.6% size in total, while the outliers consist of 99.4% of the data set. We use this data set as an example to show how we dealt with this extremely biased data set with latent topic discovery and noise reduction techniques. Our experiment faces two major challenge: (1) extremely distributed outliers, and (2) positive samples are far smaller than negative ones. We try to propose a suitable process flow to deal with these issues and get a best AUC result of 0.98.

Keywords: classification, latent topics, outlier adjustment, feature scaling

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1628
7881 Three Computational Mathematics Techniques: Comparative Determination of Area under Curve

Authors: Khalid Pervaiz Akhter, Mahmood Ahmad, Ghulam Murtaza, Ishrat Shafi, Zafar Javed

Abstract:

The objective of this manuscript is to find area under the plasma concentration- time curve (AUC) for multiple doses of salbutamol sulphate sustained release tablets (Ventolin® oral tablets SR 8 mg, GSK, Pakistan) in the group of 18 healthy adults by using computational mathematics techniques. Following the administration of 4 doses of Ventolin® tablets 12 hourly to 24 healthy human subjects and bioanalysis of obtained plasma samples, plasma drug concentration-time profile was constructed. AUC, an important pharmacokinetic parameter, was measured using integrated equation of multiple oral dose regimens. The approximated AUC was also calculated by using computational mathematics techniques such as repeated rectangular, repeated trapezium and repeated Simpson's rule and compared with exact value of AUC calculated by using integrated equation of multiple oral dose regimens to find best computational mathematics method that gives AUC values closest to exact. The exact values of AUC for four consecutive doses of Ventolin® oral tablets were 150.5819473, 157.8131756, 164.4178231 and 162.78 ng.h/ml while the closest values approximated AUC values were 149.245962, 157.336171, 164.2585768 and 162.289224 ng.h/ml, respectively as found by repeated rectangular rule. The errors in the approximated values of AUC were negligible. It is concluded that all computational tools approximated values of AUC accurately but the repeated rectangular rule gives slightly better approximated values of AUC as compared to repeated trapezium and repeated Simpson's rules.

Keywords: Salbutamol sulphate, Area under curve (AUC), repeated rectangular rule, repeated trapezium rule, repeated Simpson's rule.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1827
7880 Data Collection in Hospital Emergencies: A Questionnaire Survey

Authors: Nouha Mhimdi, Wahiba Ben Abdessalem Karaa, Henda Ben Ghezala

Abstract:

Many methods are used to collect data like questionnaires, surveys, focus group interviews. Or the collection of poor-quality data resulting, for example, from poorly designed questionnaires, the absence of good translators or interpreters, and the incorrect recording of data allow conclusions to be drawn that are not supported by the data or to focus only on the average effect of the program or policy. There are several solutions to avoid or minimize the most frequent errors, including obtaining expert advice on the design or adaptation of data collection instruments; or use technologies allowing better "anonymity" in the responses. In this context, and to overcome the aforementioned problems, we suggest in this paper an approach to achieve the collection of relevant data, by carrying out a large-scale questionnaire-based survey. We have been able to collect good quality, consistent and practical data on hospital emergencies to improve emergency services in hospitals, especially in the case of epidemics or pandemics.

Keywords: Data collection, survey, database, data analysis, hospital emergencies.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 619
7879 Extraction of Data from Web Pages: A Vision Based Approach

Authors: P. S. Hiremath, Siddu P. Algur

Abstract:

With the explosive growth of information sources available on the World Wide Web, it has become increasingly difficult to identify the relevant pieces of information, since web pages are often cluttered with irrelevant content like advertisements, navigation-panels, copyright notices etc., surrounding the main content of the web page. Hence, tools for the mining of data regions, data records and data items need to be developed in order to provide value-added services. Currently available automatic techniques to mine data regions from web pages are still unsatisfactory because of their poor performance and tag-dependence. In this paper a novel method to extract data items from the web pages automatically is proposed. It comprises of two steps: (1) Identification and Extraction of the data regions based on visual clues information. (2) Identification of data records and extraction of data items from a data region. For step1, a novel and more effective method is proposed based on visual clues, which finds the data regions formed by all types of tags using visual clues. For step2 a more effective method namely, Extraction of Data Items from web Pages (EDIP), is adopted to mine data items. The EDIP technique is a list-based approach in which the list is a linear data structure. The proposed technique is able to mine the non-contiguous data records and can correctly identify data regions, irrespective of the type of tag in which it is bound. Our experimental results show that the proposed technique performs better than the existing techniques.

Keywords: Web data records, web data regions, web mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1884
7878 Visual-Graphical Methods for Exploring Longitudinal Data

Authors: H. W. Ker

Abstract:

Longitudinal data typically have the characteristics of changes over time, nonlinear growth patterns, between-subjects variability, and the within errors exhibiting heteroscedasticity and dependence. The data exploration is more complicated than that of cross-sectional data. The purpose of this paper is to organize/integrate of various visual-graphical techniques to explore longitudinal data. From the application of the proposed methods, investigators can answer the research questions include characterizing or describing the growth patterns at both group and individual level, identifying the time points where important changes occur and unusual subjects, selecting suitable statistical models, and suggesting possible within-error variance.

Keywords: Data exploration, exploratory analysis, HLMs/LMEs, longitudinal data, visual-graphical methods.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2076