Search results for: Data specification
7393 A Survey of Semantic Integration Approaches in Bioinformatics
Authors: Chaimaa Messaoudi, Rachida Fissoune, Hassan Badir
Abstract:
Technological advances of computer science and data analysis are helping to provide continuously huge volumes of biological data, which are available on the web. Such advances involve and require powerful techniques for data integration to extract pertinent knowledge and information for a specific question. Biomedical exploration of these big data often requires the use of complex queries across multiple autonomous, heterogeneous and distributed data sources. Semantic integration is an active area of research in several disciplines, such as databases, information-integration, and ontology. We provide a survey of some approaches and techniques for integrating biological data, we focus on those developed in the ontology community.Keywords: Semantic data integration, biological ontology, linked data, semantic web, OWL, RDF.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18197392 A Framework for Teaching Distributed Requirements Engineering in Latin American Universities
Authors: G. Sevilla, S. Zapata, F. Giraldo, E. Torres, C. Collazos
Abstract:
This work describes a framework for teaching of global software engineering (GSE) in university undergraduate programs. This framework proposes a method of teaching that incorporates adequate techniques of software requirements elicitation and validated tools of communication, critical aspects to global software development scenarios. The use of proposed framework allows teachers to simulate small software development companies formed by Latin American students, which build information systems. Students from three Latin American universities played the roles of engineers by applying an iterative development of a requirements specification in a global software project. The proposed framework involves the use of a specific purpose Wiki for asynchronous communication between the participants of the process. It is also a practice to improve the quality of software requirements that are formulated by the students. The additional motivation of students to participate in these practices, in conjunction with peers from other countries, is a significant additional factor that positively contributes to the learning process. The framework promotes skills for communication, negotiation, and other complementary competencies that are useful for working on GSE scenarios.Keywords: Requirements analysis, distributed requirements engineering, practical experiences, collaborative support.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7067391 Summarizing Data Sets for Data Mining by Using Statistical Methods in Coastal Engineering
Authors: Yunus Doğan, Ahmet Durap
Abstract:
Coastal regions are the one of the most commonly used places by the natural balance and the growing population. In coastal engineering, the most valuable data is wave behaviors. The amount of this data becomes very big because of observations that take place for periods of hours, days and months. In this study, some statistical methods such as the wave spectrum analysis methods and the standard statistical methods have been used. The goal of this study is the discovery profiles of the different coast areas by using these statistical methods, and thus, obtaining an instance based data set from the big data to analysis by using data mining algorithms. In the experimental studies, the six sample data sets about the wave behaviors obtained by 20 minutes of observations from Mersin Bay in Turkey and converted to an instance based form, while different clustering techniques in data mining algorithms were used to discover similar coastal places. Moreover, this study discusses that this summarization approach can be used in other branches collecting big data such as medicine.
Keywords: Clustering algorithms, coastal engineering, data mining, data summarization, statistical methods.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12447390 Dimensional Modeling of HIV Data Using Open Source
Authors: Charles D. Otine, Samuel B. Kucel, Lena Trojer
Abstract:
Selecting the data modeling technique for an information system is determined by the objective of the resultant data model. Dimensional modeling is the preferred modeling technique for data destined for data warehouses and data mining, presenting data models that ease analysis and queries which are in contrast with entity relationship modeling. The establishment of data warehouses as components of information system landscapes in many organizations has subsequently led to the development of dimensional modeling. This has been significantly more developed and reported for the commercial database management systems as compared to the open sources thereby making it less affordable for those in resource constrained settings. This paper presents dimensional modeling of HIV patient information using open source modeling tools. It aims to take advantage of the fact that the most affected regions by the HIV virus are also heavily resource constrained (sub-Saharan Africa) whereas having large quantities of HIV data. Two HIV data source systems were studied to identify appropriate dimensions and facts these were then modeled using two open source dimensional modeling tools. Use of open source would reduce the software costs for dimensional modeling and in turn make data warehousing and data mining more feasible even for those in resource constrained settings but with data available.Keywords: About Database, Data Mining, Data warehouse, Dimensional Modeling, Open Source.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19597389 Development of Cooling Load Demand Program for Building in Malaysia
Authors: Zamri Noranai, Dayang Siti Zainab Abang Bujang, Rosli Asmawi, Hamidon Salleh, Mohammad Zainal Md Yusof
Abstract:
Air conditioning is mainly to be used as human comfort medium. It has been use more often in country in which the daily temperatures are high. In scientific, air conditioning is defined as a process of controlling the moisture, cooling, heating and cleaning air. Without proper estimation of cooling load, big amount of waste energy been used because of unsuitable of air conditioning system are not considering to overcoming heat gains from surrounding. This is due to the size of the room is too big and the air conditioning has to use more energy to cool the room and the air conditioning is too small for the room. The studies are basically to develop a program to calculate cooling load. Through this study it is easy to calculate cooling load estimation. Furthermore it-s help to compare the cooling load estimation by hourly and yearly. Base on the last study that been done, the developed software are not user-friendly. For individual without proper knowledge of calculating cooling load estimation might be problem. Easy excess and user-friendly should be the main objective to design something. This program will allow cooling load able be estimate by any users rather than estimation by using rule of thumb. Several of limitation of case study is judged to sure it-s meeting to Malaysia building specification. Finally validation is done by comparison manual calculation and by developed program.Keywords: Building, Energy and Coaling Load
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 29467388 Efficient Lossless Compression of Weather Radar Data
Authors: Wei-hua Ai, Wei Yan, Xiang Li
Abstract:
Data compression is used operationally to reduce bandwidth and storage requirements. An efficient method for achieving lossless weather radar data compression is presented. The characteristics of the data are taken into account and the optical linear prediction is used for the PPI images in the weather radar data in the proposed method. The next PPI image is identical to the current one and a dramatic reduction in source entropy is achieved by using the prediction algorithm. Some lossless compression methods are used to compress the predicted data. Experimental results show that for the weather radar data, the method proposed in this paper outperforms the other methods.
Keywords: Lossless compression, weather radar data, optical linear prediction, PPI image
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 22587387 Conceptualizing the Knowledge to Manage and Utilize Data Assets in the Context of Digitization: Case Studies of Multinational Industrial Enterprises
Authors: Martin Böhmer, Agatha Dabrowski, Boris Otto
Abstract:
The trend of digitization significantly changes the role of data for enterprises. Data turn from an enabler to an intangible organizational asset that requires management and qualifies as a tradeable good. The idea of a networked economy has gained momentum in the data domain as collaborative approaches for data management emerge. Traditional organizational knowledge consequently needs to be extended by comprehensive knowledge about data. The knowledge about data is vital for organizations to ensure that data quality requirements are met and data can be effectively utilized and sovereignly governed. As this specific knowledge has been paid little attention to so far by academics, the aim of the research presented in this paper is to conceptualize it by proposing a “data knowledge model”. Relevant model entities have been identified based on a design science research (DSR) approach that iteratively integrates insights of various industry case studies and literature research.
Keywords: Data management, digitization, Industry 4.0, knowledge engineering, metamodel.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14587386 A Methodology for Data Migration between Different Database Management Systems
Authors: Bogdan Walek, Cyril Klimes
Abstract:
In present days the area of data migration is very topical. Current tools for data migration in the area of relational database have several disadvantages that are presented in this paper. We propose a methodology for data migration of the database tables and their data between various types of relational database systems (RDBMS). The proposed methodology contains an expert system. The expert system contains a knowledge base that is composed of IFTHEN rules and based on the input data suggests appropriate data types of columns of database tables. The proposed tool, which contains an expert system, also includes the possibility of optimizing the data types in the target RDBMS database tables based on processed data of the source RDBMS database tables. The proposed expert system is shown on data migration of selected database of the source RDBMS to the target RDBMS.
Keywords: Expert system, fuzzy, data migration, database, relational database, data type, relational database management system.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 34927385 Opening up Government Datasets for Big Data Analysis to Support Policy Decisions
Authors: K. Hardy, A. Maurushat
Abstract:
Policy makers are increasingly looking to make evidence-based decisions. Evidence-based decisions have historically used rigorous methodologies of empirical studies by research institutes, as well as less reliable immediate survey/polls often with limited sample sizes. As we move into the era of Big Data analytics, policy makers are looking to different methodologies to deliver reliable empirics in real-time. The question is not why did these people do this for the last 10 years, but why are these people doing this now, and if the this is undesirable, and how can we have an impact to promote change immediately. Big data analytics rely heavily on government data that has been released in to the public domain. The open data movement promises greater productivity and more efficient delivery of services; however, Australian government agencies remain reluctant to release their data to the general public. This paper considers the barriers to releasing government data as open data, and how these barriers might be overcome.
Keywords: Big data, open data, productivity, transparency.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16377384 Forthcoming Big Data on Smart Buildings and Cities: An Experimental Study on Correlations among Urban Data
Authors: Yu-Mi Song, Sung-Ah Kim, Dongyoun Shin
Abstract:
Cities are complex systems of diverse and inter-tangled activities. These activities and their complex interrelationships create diverse urban phenomena. And such urban phenomena have considerable influences on the lives of citizens. This research aimed to develop a method to reveal the causes and effects among diverse urban elements in order to enable better understanding of urban activities and, therefrom, to make better urban planning strategies. Specifically, this study was conducted to solve a data-recommendation problem found on a Korean public data homepage. First, a correlation analysis was conducted to find the correlations among random urban data. Then, based on the results of that correlation analysis, the weighted data network of each urban data was provided to people. It is expected that the weights of urban data thereby obtained will provide us with insights into cities and show us how diverse urban activities influence each other and induce feedback.
Keywords: Big data, correlation analysis, data recommendation system, urban data network.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11057383 On the Combination of Patient-Generated Data with Data from a Secure Clinical Network Environment – A Practical Example
Authors: Jeroen S. de Bruin, Karin Schindler, Christian Schuh
Abstract:
With increasingly more mobile health applications appearing due to the popularity of smartphones, the possibility arises that these data can be used to improve the medical diagnostic process, as well as the overall quality of healthcare, while at the same time lowering costs. However, as of yet there have been no reports of a successful combination of patient-generated data from smartphones with data from clinical routine. In this paper we describe how these two types of data can be combined in a secure way without modification to hospital information systems, and how they can together be used in a medical expert system for automatic nutritional classification and triage.
Keywords: Data integration, disease-related malnutrition, expert systems, mobile health.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 22007382 A Mapping Approach of Code Generation for Arinc653-Based Avionics Software
Authors: Lu Zou, Dianfu MA, Ying Wang, Xianqi Zhao
Abstract:
Avionic software architecture has transit from a federated avionics architecture to an integrated modular avionics (IMA) .ARINC 653 (Avionics Application Standard Software Interface) is a software specification for space and time partitioning in Safety-critical avionics Real-time operating systems. Methods to transform the abstract avionics application logic function to the executable model have been brought up, however with less consideration about the code generating input and output model specific for ARINC 653 platform and inner-task synchronous dynamic interaction order sequence. In this paper, we proposed an AADL-based model-driven design methodology to fulfill the purpose to automatically generating Cµ executable model on ARINC 653 platform from the ARINC653 architecture which defined as AADL653 in order to facilitate the development of the avionics software constructed on ARINC653 OS. This paper presents the mapping rules between the AADL653 elements and the elements in Cµ language, and define the code generating rules , designs an automatic C µ code generator .Then, we use a case to illustrate our approach. Finally, we give the related work and future research directions.Keywords: IMA, ARINC653, AADL653, code generation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 30417381 Comparison of Imputation Techniques for Efficient Prediction of Software Fault Proneness in Classes
Authors: Geeta Sikka, Arvinder Kaur Takkar, Moin Uddin
Abstract:
Missing data is a persistent problem in almost all areas of empirical research. The missing data must be treated very carefully, as data plays a fundamental role in every analysis. Improper treatment can distort the analysis or generate biased results. In this paper, we compare and contrast various imputation techniques on missing data sets and make an empirical evaluation of these methods so as to construct quality software models. Our empirical study is based on NASA-s two public dataset. KC4 and KC1. The actual data sets of 125 cases and 2107 cases respectively, without any missing values were considered. The data set is used to create Missing at Random (MAR) data Listwise Deletion(LD), Mean Substitution(MS), Interpolation, Regression with an error term and Expectation-Maximization (EM) approaches were used to compare the effects of the various techniques.Keywords: Missing data, Imputation, Missing Data Techniques.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16687380 Cluster Analysis for the Statistical Modeling of Aesthetic Judgment Data Related to Comics Artists
Authors: George E. Tsekouras, Evi Sampanikou
Abstract:
We compare three categorical data clustering algorithms with respect to the problem of classifying cultural data related to the aesthetic judgment of comics artists. Such a classification is very important in Comics Art theory since the determination of any classes of similarities in such kind of data will provide to art-historians very fruitful information of Comics Art-s evolution. To establish this, we use a categorical data set and we study it by employing three categorical data clustering algorithms. The performances of these algorithms are compared each other, while interpretations of the clustering results are also given.Keywords: Aesthetic judgment, comics artists, cluster analysis, categorical data.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16357379 Controller Design of Discrete Systems by Order Reduction Technique Employing Differential Evolution Optimization Algorithm
Authors: J. S. Yadav, N. P. Patidar, J. Singhai
Abstract:
One of the main objectives of order reduction is to design a controller of lower order which can effectively control the original high order system so that the overall system is of lower order and easy to understand. In this paper, a simple method is presented for controller design of a higher order discrete system. First the original higher order discrete system in reduced to a lower order model. Then a Proportional Integral Derivative (PID) controller is designed for lower order model. An error minimization technique is employed for both order reduction and controller design. For the error minimization purpose, Differential Evolution (DE) optimization algorithm has been employed. DE method is based on the minimization of the Integral Squared Error (ISE) between the desired response and actual response pertaining to a unit step input. Finally the designed PID controller is connected to the original higher order discrete system to get the desired specification. The validity of the proposed method is illustrated through a numerical example.Keywords: Discrete System, Model Order Reduction, PIDController, Integral Squared Error, Differential Evolution.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19047378 The Impact of System and Data Quality on Organizational Success in the Kingdom of Bahrain
Authors: Amal M. Alrayes
Abstract:
Data and system quality play a central role in organizational success, and the quality of any existing information system has a major influence on the effectiveness of overall system performance. Given the importance of system and data quality to an organization, it is relevant to highlight their importance on organizational performance in the Kingdom of Bahrain. This research aims to discover whether system quality and data quality are related, and to study the impact of system and data quality on organizational success. A theoretical model based on previous research is used to show the relationship between data and system quality, and organizational impact. We hypothesize, first, that system quality is positively associated with organizational impact, secondly that system quality is positively associated with data quality, and finally that data quality is positively associated with organizational impact. A questionnaire was conducted among public and private organizations in the Kingdom of Bahrain. The results show that there is a strong association between data and system quality, that affects organizational success.Keywords: Data quality, performance, system quality.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21197377 Formal Analysis of a Public-Key Algorithm
Authors: Markus Kaiser, Johannes Buchmann
Abstract:
In this article, a formal specification and verification of the Rabin public-key scheme in a formal proof system is presented. The idea is to use the two views of cryptographic verification: the computational approach relying on the vocabulary of probability theory and complexity theory and the formal approach based on ideas and techniques from logic and programming languages. A major objective of this article is the presentation of the first computer-proved implementation of the Rabin public-key scheme in Isabelle/HOL. Moreover, we explicate a (computer-proven) formalization of correctness as well as a computer verification of security properties using a straight-forward computation model in Isabelle/HOL. The analysis uses a given database to prove formal properties of our implemented functions with computer support. The main task in designing a practical formalization of correctness as well as efficient computer proofs of security properties is to cope with the complexity of cryptographic proving. We reduce this complexity by exploring a light-weight formalization that enables both appropriate formal definitions as well as efficient formal proofs. Consequently, we get reliable proofs with a minimal error rate augmenting the used database, what provides a formal basis for more computer proof constructions in this area.
Keywords: public-key encryption, Rabin public-key scheme, formalproof system, higher-order logic, formal verification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15377376 Study on Optimization Design of Pressure Hull for Underwater Vehicle
Authors: Qasim Idrees, Gao Liangtian, Liu Bo, Miao Yiran
Abstract:
In order to improve the efficiency and accuracy of the pressure hull structure, optimization of underwater vehicle based on response surface methodology, a method for optimizing the design of pressure hull structure was studied. To determine the pressure shell of five dimensions as a design variable, the application of thin shell theory and the Chinese Classification Society (CCS) specification was carried on the preliminary design. In order to optimize variables of the feasible region, different methods were studied and implemented such as Opt LHD method (to determine the design test sample points in the feasible domain space), parametric ABAQUS solution for each sample point response, and the two-order polynomial response for the surface model of the limit load of structures. Based on the ultimate load of the structure and the quality of the shell, the two-generation genetic algorithm was used to solve the response surface, and the Pareto optimal solution set was obtained. The final optimization result was 41.68% higher than that of the initial design, and the shell quality was reduced by about 27.26%. The parametric method can ensure the accuracy of the test and improve the efficiency of optimization.
Keywords: Parameterization, response surface, structure optimization, pressure hull.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11627375 The Dynamics of Algeria’s Natural Gas Exports to Europe: Evidence from ARDL Bounds Testing Approach with Breakpoints
Authors: Hicham Benamirouche, Oum Elkheir Moussi
Abstract:
The purpose of the study is to examine the dynamics of Algeria’s natural gas exports through the Autoregressive Distributed Lag (ARDL) bounds testing approach with break points. The analysis was carried out for the period from 1967 to 2015. Based on imperfect substitution specification, the ARDL approach reveals a long-run equilibrium relationship between Algeria’s Natural gas exports and their determinant factors (Algeria’s gas reserves, Domestic gas consumption, Europe’s GDP per capita, relative prices, the European gas production and the market share of competitors). All the long-run elasticities estimated are statistically significant with a large impact of domestic factors, which constitute the supply constraints. In short term, the elasticities are statistically significant, and almost comparable to those of the long term. Furthermore, the speed of adjustment towards long-run equilibrium is less than one year because of the little flexibility of the long term export contracts. Two break points have been estimated when we employ the domestic gas consumption as a break variable; 1984 and 2010, which reflect the arbitration policy between the domestic gas market and gas exports.
Keywords: Natural gas exports, elasticity, ARDL bounds testing, break points, Algeria.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7527374 Integration of Multi-Source Data to Monitor Coral Biodiversity
Authors: K. Jitkue, W. Srisang, C. Yaiprasert, K. Jaroensutasinee, M. Jaroensutasinee
Abstract:
This study aims at using multi-source data to monitor coral biodiversity and coral bleaching. We used coral reef at Racha Islands, Phuket as a study area. There were three sources of data: coral diversity, sensor based data and satellite data.Keywords: Coral reefs, Remote sensing, Sea surfacetemperatue, Satellite imagery.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15537373 Decision Support System Based on Data Warehouse
Authors: Yang Bao, LuJing Zhang
Abstract:
Typical Intelligent Decision Support System is 4-based, its design composes of Data Warehouse, Online Analytical Processing, Data Mining and Decision Supporting based on models, which is called Decision Support System Based on Data Warehouse (DSSBDW). This way takes ETL,OLAP and DM as its implementing means, and integrates traditional model-driving DSS and data-driving DSS into a whole. For this kind of problem, this paper analyzes the DSSBDW architecture and DW model, and discusses the following key issues: ETL designing and Realization; metadata managing technology using XML; SQL implementing, optimizing performance, data mapping in OLAP; lastly, it illustrates the designing principle and method of DW in DSSBDW.
Keywords: Decision Support System, Data Warehouse, Data Mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 38637372 A New History Based Method to Handle the Recurring Concept Shifts in Data Streams
Authors: Hossein Morshedlou, Ahmad Abdollahzade Barforoush
Abstract:
Recent developments in storage technology and networking architectures have made it possible for broad areas of applications to rely on data streams for quick response and accurate decision making. Data streams are generated from events of real world so existence of associations, which are among the occurrence of these events in real world, among concepts of data streams is logical. Extraction of these hidden associations can be useful for prediction of subsequent concepts in concept shifting data streams. In this paper we present a new method for learning association among concepts of data stream and prediction of what the next concept will be. Knowing the next concept, an informed update of data model will be possible. The results of conducted experiments show that the proposed method is proper for classification of concept shifting data streams.Keywords: Data Stream, Classification, Concept Shift, History.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12787371 Incremental Learning of Independent Topic Analysis
Authors: Takahiro Nishigaki, Katsumi Nitta, Takashi Onoda
Abstract:
In this paper, we present a method of applying Independent Topic Analysis (ITA) to increasing the number of document data. The number of document data has been increasing since the spread of the Internet. ITA was presented as one method to analyze the document data. ITA is a method for extracting the independent topics from the document data by using the Independent Component Analysis (ICA). ICA is a technique in the signal processing; however, it is difficult to apply the ITA to increasing number of document data. Because ITA must use the all document data so temporal and spatial cost is very high. Therefore, we present Incremental ITA which extracts the independent topics from increasing number of document data. Incremental ITA is a method of updating the independent topics when the document data is added after extracted the independent topics from a just previous the data. In addition, Incremental ITA updates the independent topics when the document data is added. And we show the result applied Incremental ITA to benchmark datasets.Keywords: Text mining, topic extraction, independent, incremental, independent component analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 10597370 A Framework for Data Mining Based Multi-Agent: An Application to Spatial Data
Authors: H. Baazaoui Zghal, S. Faiz, H. Ben Ghezala
Abstract:
Data mining is an extraordinarily demanding field referring to extraction of implicit knowledge and relationships, which are not explicitly stored in databases. A wide variety of methods of data mining have been introduced (classification, characterization, generalization...). Each one of these methods includes more than algorithm. A system of data mining implies different user categories,, which mean that the user-s behavior must be a component of the system. The problem at this level is to know which algorithm of which method to employ for an exploratory end, which one for a decisional end, and how can they collaborate and communicate. Agent paradigm presents a new way of conception and realizing of data mining system. The purpose is to combine different algorithms of data mining to prepare elements for decision-makers, benefiting from the possibilities offered by the multi-agent systems. In this paper the agent framework for data mining is introduced, and its overall architecture and functionality are presented. The validation is made on spatial data. Principal results will be presented.
Keywords: Databases, data mining, multi-agent, spatial datamart.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20457369 Latent Topic Based Medical Data Classification
Authors: Jian-hua Yeh, Shi-yi Kuo
Abstract:
This paper discusses the classification process for medical data. In this paper, we use the data from ACM KDDCup 2008 to demonstrate our classification process based on latent topic discovery. In this data set, the target set and outliers are quite different in their nature: target set is only 0.6% size in total, while the outliers consist of 99.4% of the data set. We use this data set as an example to show how we dealt with this extremely biased data set with latent topic discovery and noise reduction techniques. Our experiment faces two major challenge: (1) extremely distributed outliers, and (2) positive samples are far smaller than negative ones. We try to propose a suitable process flow to deal with these issues and get a best AUC result of 0.98.
Keywords: classification, latent topics, outlier adjustment, feature scaling
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16427368 Adaptive Fuzzy Control for Air-Fuel Ratio of Automobile Spark Ignition Engine
Authors: Ali Ghaffari, A. Hosein Shamekhi, Akbar Saki, Ehsan Kamrani
Abstract:
In order to meet the limits imposed on automotive emissions, engine control systems are required to constrain air/fuel ratio (AFR) in a narrow band around the stoichiometric value, due to the strong decay of catalyst efficiency in case of rich or lean mixture. This paper presents a model of a sample spark ignition engine and demonstrates Simulink-s capabilities to model an internal combustion engine from the throttle to the crankshaft output. We used welldefined physical principles supplemented, where appropriate, with empirical relationships that describe the system-s dynamic behavior without introducing unnecessary complexity. We also presents a PID tuning method that uses an adaptive fuzzy system to model the relationship between the controller gains and the target output response, with the response specification set by desired percent overshoot and settling time. The adaptive fuzzy based input-output model is then used to tune on-line the PID gains for different response specifications. Experimental results demonstrate that better performance can be achieved with adaptive fuzzy tuning relative to similar alternative control strategies. The actual response specifications with adaptive fuzzy matched the desired response specifications.Keywords: Modelling, Air–fuel ratio control, SI engine, Adaptive fuzzy Control.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 25257367 Data Collection in Hospital Emergencies: A Questionnaire Survey
Authors: Nouha Mhimdi, Wahiba Ben Abdessalem Karaa, Henda Ben Ghezala
Abstract:
Many methods are used to collect data like questionnaires, surveys, focus group interviews. Or the collection of poor-quality data resulting, for example, from poorly designed questionnaires, the absence of good translators or interpreters, and the incorrect recording of data allow conclusions to be drawn that are not supported by the data or to focus only on the average effect of the program or policy. There are several solutions to avoid or minimize the most frequent errors, including obtaining expert advice on the design or adaptation of data collection instruments; or use technologies allowing better "anonymity" in the responses. In this context, and to overcome the aforementioned problems, we suggest in this paper an approach to achieve the collection of relevant data, by carrying out a large-scale questionnaire-based survey. We have been able to collect good quality, consistent and practical data on hospital emergencies to improve emergency services in hospitals, especially in the case of epidemics or pandemics.
Keywords: Data collection, survey, database, data analysis, hospital emergencies.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6687366 Data Transformation Services (DTS): Creating Data Mart by Consolidating Multi-Source Enterprise Operational Data
Authors: J. D. D. Daniel, K. N. Goh, S. M. Yusop
Abstract:
Trends in business intelligence, e-commerce and remote access make it necessary and practical to store data in different ways on multiple systems with different operating systems. As business evolve and grow, they require efficient computerized solution to perform data update and to access data from diverse enterprise business applications. The objective of this paper is to demonstrate the capability of DTS [1] as a database solution for automatic data transfer and update in solving business problem. This DTS package is developed for the sales of variety of plants and eventually expanded into commercial supply and landscaping business. Dimension data modeling is used in DTS package to extract, transform and load data from heterogeneous database systems such as MySQL, Microsoft Access and Oracle that consolidates into a Data Mart residing in SQL Server. Hence, the data transfer from various databases is scheduled to run automatically every quarter of the year to review the efficient sales analysis. Therefore, DTS is absolutely an attractive solution for automatic data transfer and update which meeting today-s business needs.Keywords: Data Transformation Services (DTS), ObjectLinking and Embedding Database (OLEDB), Data Mart, OnlineAnalytical Processing (OLAP), Online Transactional Processing(OLTP).
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20407365 Extraction of Data from Web Pages: A Vision Based Approach
Authors: P. S. Hiremath, Siddu P. Algur
Abstract:
With the explosive growth of information sources available on the World Wide Web, it has become increasingly difficult to identify the relevant pieces of information, since web pages are often cluttered with irrelevant content like advertisements, navigation-panels, copyright notices etc., surrounding the main content of the web page. Hence, tools for the mining of data regions, data records and data items need to be developed in order to provide value-added services. Currently available automatic techniques to mine data regions from web pages are still unsatisfactory because of their poor performance and tag-dependence. In this paper a novel method to extract data items from the web pages automatically is proposed. It comprises of two steps: (1) Identification and Extraction of the data regions based on visual clues information. (2) Identification of data records and extraction of data items from a data region. For step1, a novel and more effective method is proposed based on visual clues, which finds the data regions formed by all types of tags using visual clues. For step2 a more effective method namely, Extraction of Data Items from web Pages (EDIP), is adopted to mine data items. The EDIP technique is a list-based approach in which the list is a linear data structure. The proposed technique is able to mine the non-contiguous data records and can correctly identify data regions, irrespective of the type of tag in which it is bound. Our experimental results show that the proposed technique performs better than the existing techniques.
Keywords: Web data records, web data regions, web mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19017364 Visual-Graphical Methods for Exploring Longitudinal Data
Authors: H. W. Ker
Abstract:
Longitudinal data typically have the characteristics of changes over time, nonlinear growth patterns, between-subjects variability, and the within errors exhibiting heteroscedasticity and dependence. The data exploration is more complicated than that of cross-sectional data. The purpose of this paper is to organize/integrate of various visual-graphical techniques to explore longitudinal data. From the application of the proposed methods, investigators can answer the research questions include characterizing or describing the growth patterns at both group and individual level, identifying the time points where important changes occur and unusual subjects, selecting suitable statistical models, and suggesting possible within-error variance.Keywords: Data exploration, exploratory analysis, HLMs/LMEs, longitudinal data, visual-graphical methods.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2095