Search results for: private data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7623

Search results for: private data

7443 Effects of the In-Situ Upgrading Project in Afghanistan: A Case Study on the Formally and Informally Developed Areas in Kabul

Authors: Maisam Rafiee, Chikashi Deguchi, Akio Odake, Minoru Matsui, Takanori Sata

Abstract:

Cities in Afghanistan have been rapidly urbanized; however, many parts of these cities have been developed with no detailed land use plan or infrastructure. In other words, they have been informally developed without any government leadership. The new government started the In-situ Upgrading Project in Kabul to upgrade roads, the water supply network system, and the surface water drainage system on the existing street layout in 2002, with the financial support of international agencies. This project is an appropriate emergency improvement for living life, but not an essential improvement of living conditions and infrastructure problems because the life expectancies of the improved facilities are as short as 10–15 years, and residents cannot obtain land tenure in the unplanned areas. The Land Readjustment System (LRS) conducted in Japan has good advantages that rearrange irregularly shaped land lots and develop the infrastructure effectively. This study investigates the effects of the In-situ Upgrading Project on private investment, land prices, and residents’ satisfaction with projects in Kart-e-Char, where properties are registered, and in Afshar-e-Silo Lot 1, where properties are unregistered. These projects are located 5 km and 7 km from the CBD area of Kabul, respectively. This study discusses whether LRS should be applied to the unplanned area based on the questionnaire and interview responses of experts experienced in the In-situ Upgrading Project who have knowledge of LRS. The analysis results reveal that, in Kart-e-Char, a lot of private investment has been made in the construction of medium-rise (five- to nine-story) buildings for commercial and residential purposes. Land values have also incrementally increased since the project, and residents are commonly satisfied with the road pavement, drainage systems, and water supplies, but dissatisfied with the poor delivery of electricity as well as the lack of public facilities (e.g., parks and sport facilities). In Afshar-e-Silo Lot 1, basic infrastructures like paved roads and surface water drainage systems have improved from the project. After the project, a few four- and five-story residential buildings were built with very low-level private investments, but significant increases in land prices were not evident. The residents are satisfied with the contribution ratio, drainage system, and small increase in land price, but there is still no drinking water supply system or tenure security; moreover, there are substandard paved roads and a lack of public facilities, such as parks, sport facilities, mosques, and schools. The results of the questionnaire and interviews with the four engineers highlight the problems that remain to be solved in the unplanned areas if LRS is applied—namely, land use differences, types and conditions of the infrastructure still to be installed by the project, and time spent for positive consensus building among the residents, given the project’s budget limitation.

Keywords: In-Situ Upgrading, Kabul, Land Readjustment, Land value, Planned areas, Private investment, Resident satisfaction, Unplanned areas.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1123
7442 Verifying X.509 Certificates on Smart Cards

Authors: Olaf Henniger, Karim Lafou, Dirk Scheuermann, Bruno Struif

Abstract:

This paper presents a smart-card applet that is able to verify X.509 certificates and to use the public key contained in the certificate for verifying digital signatures that have been created using the corresponding private key, e.g. for the purpose of authenticating the certificate owner against the card. The approach has been implemented as an operating prototype on Java cards.

Keywords: Public key cryptographic applications, smart cards.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2835
7441 Modeling the Influence of Socioeconomic and Land-Use Factors on Mode Choice: A Comparison of Riyadh, Saudi Arabia, and Melbourne, Australia

Authors: M. Alqhatani, S. Bajwa, S. Setunge

Abstract:

Metropolitan areas have suffered from traffic problems, which have steadily increased in many monocentric cities. Urban expansion, population growth, and road network development have resulted in a structural shift toward urban sprawl, increasing commuters’ dependence on private modes of transport. This paper aims to model the influence of socioeconomic and land-use factors on mode choice using a multinomial and nested logit model. Land-use patterns—such as residential, commercial, retail, educational and employment related—affect the choice of mode and destination in the short and medium term. Socioeconomic factors—such as age, gender, income, household size, and house type—also affect choice, while residential location is affected in the long term. Riyadh in Saudi Arabia and Melbourne in Australia were chosen as case studies. Riyadh is a car-dependent city with limited public transport, whereas Melbourne has good public transport but an increase in car dependence. Aggregate level land-use data and disaggregate level individual, household, and journey-to-work data are used to determine the effects of land use and socioeconomic factors on mode choice. The model results determined that urban sprawl is the main factor that affects mode choice, income, and house type.

Keywords: Socioeconomic, land use, mode choice, multinomial logit and nested logit.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2414
7440 Opening up Government Datasets for Big Data Analysis to Support Policy Decisions

Authors: K. Hardy, A. Maurushat

Abstract:

Policy makers are increasingly looking to make evidence-based decisions. Evidence-based decisions have historically used rigorous methodologies of empirical studies by research institutes, as well as less reliable immediate survey/polls often with limited sample sizes. As we move into the era of Big Data analytics, policy makers are looking to different methodologies to deliver reliable empirics in real-time. The question is not why did these people do this for the last 10 years, but why are these people doing this now, and if the this is undesirable, and how can we have an impact to promote change immediately. Big data analytics rely heavily on government data that has been released in to the public domain. The open data movement promises greater productivity and more efficient delivery of services; however, Australian government agencies remain reluctant to release their data to the general public. This paper considers the barriers to releasing government data as open data, and how these barriers might be overcome.

Keywords: Big data, open data, productivity, transparency.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1595
7439 Forthcoming Big Data on Smart Buildings and Cities: An Experimental Study on Correlations among Urban Data

Authors: Yu-Mi Song, Sung-Ah Kim, Dongyoun Shin

Abstract:

Cities are complex systems of diverse and inter-tangled activities. These activities and their complex interrelationships create diverse urban phenomena. And such urban phenomena have considerable influences on the lives of citizens. This research aimed to develop a method to reveal the causes and effects among diverse urban elements in order to enable better understanding of urban activities and, therefrom, to make better urban planning strategies. Specifically, this study was conducted to solve a data-recommendation problem found on a Korean public data homepage. First, a correlation analysis was conducted to find the correlations among random urban data. Then, based on the results of that correlation analysis, the weighted data network of each urban data was provided to people. It is expected that the weights of urban data thereby obtained will provide us with insights into cities and show us how diverse urban activities influence each other and induce feedback.

Keywords: Big data, correlation analysis, data recommendation system, urban data network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1085
7438 A Consumption-Based Hybrid Life Cycle Assessment of Carbon Footprints in California: High Footprints in Small Urban Households

Authors: Jukka Heinonen

Abstract:

Higher density reduces distances, private car dependency and thus reduces greenhouse gas emissions (GHGs). As a result, increased density has been given a central role among urban development targets. However, it is not just travel behavior that changes along with density. Rather, the consumption patterns, or overall lifestyles, change along with changing urban structure, particularly with changing housing types and consumption opportunities. Furthermore, elevated consumption of services, more frequent flying and less intra-household sharing have been shown to potentially outweigh the gains from reduced driving in more dense urban settlements. In this study, the geography of carbon footprints (CFs) in California is analyzed paying close attention to the household size differences and the resulting economies-of-scale advantages and disadvantages. A hybrid life cycle assessment (LCA) framework is employed together with consumer expenditure data to assess the CFs. According to the study, small urban households have the highest CFs in California. Their transport related emissions are significantly lower than those of the residents of less urbanized areas, but higher emissions from other consumption categories, together with the low degree of sharing of goods, overweigh the gains. Two functional units, per capita and per household, are used to analyze the CFs and to demonstrate the importance of household size. The lifestyle impacts visible through the consumption data are also discussed. The study suggests that there are still significant gaps in our understanding of the premises of low-carbon human settlements.

Keywords: Carbon footprint, life cycle assessment, consumption, lifestyle, household size, economies-of-scale.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1201
7437 On the Combination of Patient-Generated Data with Data from a Secure Clinical Network Environment – A Practical Example

Authors: Jeroen S. de Bruin, Karin Schindler, Christian Schuh

Abstract:

With increasingly more mobile health applications appearing due to the popularity of smartphones, the possibility arises that these data can be used to improve the medical diagnostic process, as well as the overall quality of healthcare, while at the same time lowering costs. However, as of yet there have been no reports of a successful combination of patient-generated data from smartphones with data from clinical routine. In this paper we describe how these two types of data can be combined in a secure way without modification to hospital information systems, and how they can together be used in a medical expert system for automatic nutritional classification and triage.

Keywords: Data integration, disease-related malnutrition, expert systems, mobile health.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2179
7436 Comparison of Imputation Techniques for Efficient Prediction of Software Fault Proneness in Classes

Authors: Geeta Sikka, Arvinder Kaur Takkar, Moin Uddin

Abstract:

Missing data is a persistent problem in almost all areas of empirical research. The missing data must be treated very carefully, as data plays a fundamental role in every analysis. Improper treatment can distort the analysis or generate biased results. In this paper, we compare and contrast various imputation techniques on missing data sets and make an empirical evaluation of these methods so as to construct quality software models. Our empirical study is based on NASA-s two public dataset. KC4 and KC1. The actual data sets of 125 cases and 2107 cases respectively, without any missing values were considered. The data set is used to create Missing at Random (MAR) data Listwise Deletion(LD), Mean Substitution(MS), Interpolation, Regression with an error term and Expectation-Maximization (EM) approaches were used to compare the effects of the various techniques.

Keywords: Missing data, Imputation, Missing Data Techniques.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1644
7435 Cluster Analysis for the Statistical Modeling of Aesthetic Judgment Data Related to Comics Artists

Authors: George E. Tsekouras, Evi Sampanikou

Abstract:

We compare three categorical data clustering algorithms with respect to the problem of classifying cultural data related to the aesthetic judgment of comics artists. Such a classification is very important in Comics Art theory since the determination of any classes of similarities in such kind of data will provide to art-historians very fruitful information of Comics Art-s evolution. To establish this, we use a categorical data set and we study it by employing three categorical data clustering algorithms. The performances of these algorithms are compared each other, while interpretations of the clustering results are also given.

Keywords: Aesthetic judgment, comics artists, cluster analysis, categorical data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1617
7434 IoT Device Cost Effective Storage Architecture and Real-Time Data Analysis/Data Privacy Framework

Authors: Femi Elegbeleye, Seani Rananga

Abstract:

This paper focused on cost effective storage architecture using fog and cloud data storage gateway, and presented the design of the framework for the data privacy model and data analytics framework on a real-time analysis when using machine learning method. The paper began with the system analysis, system architecture and its component design, as well as the overall system operations. Several results obtained from this study on data privacy models show that when two or more data privacy models are integrated via a fog storage gateway, we often have more secure data. Our main focus in the study is to design a framework for the data privacy model, data storage, and real-time analytics. This paper also shows the major system components and their framework specification. And lastly, the overall research system architecture was shown, including its structure, and its interrelationships.

Keywords: IoT, fog storage, cloud storage, data analysis, data privacy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 188
7433 Design of Service-Oriented Pervasive System for Urban Computing in Cali Zoo (OpenZoo)

Authors: Claudia L. Zuñiga, Andres F. Millan, Jose L. Abadia, Monica Lora, Andres Navarro, Juan C. Burguillo, Pedro S. Rodriguez

Abstract:

The increasing popularity of wireless technologies and mobile computing devices has enabled new application areas and research. One of these new areas is pervasive systems in urban environments, because urban environments are characterized by high concentration of these technologies and devices. In this paper we will show the process of pervasive system design in urban environments, using as use case a local zoo in Cali, Colombia. Based on an ethnographic studio, we present the design of a pervasive system for urban computing based on service oriented architecture to controlled environment of Cali Zoo. In this paper, the reader will find a methodological approach for the design of similar systems, using data collection methods, conceptual frameworks for urban environments and considerations of analysis and design of service oriented systems.

Keywords: Service Oriented Architecture, Urban Computing, Design of pervasive systems for urban environments, PSP Design Framework (Public Social Private), Cali Zoo.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1535
7432 Comparison of Different Neural Network Approaches for the Prediction of Kidney Dysfunction

Authors: Ali Hussian Ali AlTimemy, Fawzi M. Al Naima

Abstract:

This paper presents the prediction of kidney dysfunction using different neural network (NN) approaches. Self organization Maps (SOM), Probabilistic Neural Network (PNN) and Multi Layer Perceptron Neural Network (MLPNN) trained with Back Propagation Algorithm (BPA) are used in this study. Six hundred and sixty three sets of analytical laboratory tests have been collected from one of the private clinical laboratories in Baghdad. For each subject, Serum urea and Serum creatinin levels have been analyzed and tested by using clinical laboratory measurements. The collected urea and cretinine levels are then used as inputs to the three NN models in which the training process is done by different neural approaches. SOM which is a class of unsupervised network whereas PNN and BPNN are considered as class of supervised networks. These networks are used as a classifier to predict whether kidney is normal or it will have a dysfunction. The accuracy of prediction, sensitivity and specificity were found for each type of the proposed networks .We conclude that PNN gives faster and more accurate prediction of kidney dysfunction and it works as promising tool for predicting of routine kidney dysfunction from the clinical laboratory data.

Keywords: Kidney Dysfunction, Prediction, SOM, PNN, BPNN, Urea and Creatinine levels.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1911
7431 Exploring Utility and Intrinsic Value among UAE Arabic Teachers in Integrating M-Learning

Authors: Dina Tareq Ismail, Alexandria A. Proff

Abstract:

The United Arab Emirates (UAE) is a nation seeking to advance in all fields, particularly education. One area of focus for UAE 2021 agenda is to restructure UAE schools and universities by equipping them with highly developed technology. The agenda also advises educational institutions to prepare students with applicable and transferrable Information and Communication Technology (ICT) skills. Despite the emphasis on ICT and computer literacy skills, there exists limited empirical data on the use of M-Learning in the literature. This qualitative study explores the motivation of higher primary Arabic teachers in private schools toward implementing and integrating M-Learning apps in their classrooms. This research employs a phenomenological approach through the use of semistructured interviews with nine purposefully selected Arabic teachers. The data were analyzed using a content analysis via multiple stages of coding: open, axial, and thematic. Findings reveal three primary themes: (1) Arabic teachers with high levels of procedural knowledge in ICT are more motivated to implement M-Learning; (2) Arabic teachers' perceptions of self-efficacy influence their motivation toward implementation of M-Learning; (3) Arabic teachers implement M-Learning when they possess high utility and/or intrinsic value in these applications. These findings indicate a strong need for further training, equipping, and creating buy-in among Arabic teachers to enhance their ICT skills in implementing M-Learning. Further, given the limited availability of M-Learning apps designed for use in the Arabic language on the market, it is imperative that developers consider designing M-Learning tools that Arabic teachers, and Arabic-speaking students, can use and access more readily. This study contributes to closing the knowledge gap on teacher-motivation for implementing M-Learning in their classrooms in the UAE.

Keywords: ICT Skills, M-Learning, self-efficacy, teachermotivation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 456
7430 Integration of Multi-Source Data to Monitor Coral Biodiversity

Authors: K. Jitkue, W. Srisang, C. Yaiprasert, K. Jaroensutasinee, M. Jaroensutasinee

Abstract:

This study aims at using multi-source data to monitor coral biodiversity and coral bleaching. We used coral reef at Racha Islands, Phuket as a study area. There were three sources of data: coral diversity, sensor based data and satellite data.

Keywords: Coral reefs, Remote sensing, Sea surfacetemperatue, Satellite imagery.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1527
7429 Decision Support System Based on Data Warehouse

Authors: Yang Bao, LuJing Zhang

Abstract:

Typical Intelligent Decision Support System is 4-based, its design composes of Data Warehouse, Online Analytical Processing, Data Mining and Decision Supporting based on models, which is called Decision Support System Based on Data Warehouse (DSSBDW). This way takes ETL,OLAP and DM as its implementing means, and integrates traditional model-driving DSS and data-driving DSS into a whole. For this kind of problem, this paper analyzes the DSSBDW architecture and DW model, and discusses the following key issues: ETL designing and Realization; metadata managing technology using XML; SQL implementing, optimizing performance, data mapping in OLAP; lastly, it illustrates the designing principle and method of DW in DSSBDW.

Keywords: Decision Support System, Data Warehouse, Data Mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3837
7428 A New History Based Method to Handle the Recurring Concept Shifts in Data Streams

Authors: Hossein Morshedlou, Ahmad Abdollahzade Barforoush

Abstract:

Recent developments in storage technology and networking architectures have made it possible for broad areas of applications to rely on data streams for quick response and accurate decision making. Data streams are generated from events of real world so existence of associations, which are among the occurrence of these events in real world, among concepts of data streams is logical. Extraction of these hidden associations can be useful for prediction of subsequent concepts in concept shifting data streams. In this paper we present a new method for learning association among concepts of data stream and prediction of what the next concept will be. Knowing the next concept, an informed update of data model will be possible. The results of conducted experiments show that the proposed method is proper for classification of concept shifting data streams.

Keywords: Data Stream, Classification, Concept Shift, History.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1263
7427 Incremental Learning of Independent Topic Analysis

Authors: Takahiro Nishigaki, Katsumi Nitta, Takashi Onoda

Abstract:

In this paper, we present a method of applying Independent Topic Analysis (ITA) to increasing the number of document data. The number of document data has been increasing since the spread of the Internet. ITA was presented as one method to analyze the document data. ITA is a method for extracting the independent topics from the document data by using the Independent Component Analysis (ICA). ICA is a technique in the signal processing; however, it is difficult to apply the ITA to increasing number of document data. Because ITA must use the all document data so temporal and spatial cost is very high. Therefore, we present Incremental ITA which extracts the independent topics from increasing number of document data. Incremental ITA is a method of updating the independent topics when the document data is added after extracted the independent topics from a just previous the data. In addition, Incremental ITA updates the independent topics when the document data is added. And we show the result applied Incremental ITA to benchmark datasets.

Keywords: Text mining, topic extraction, independent, incremental, independent component analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1028
7426 A Framework for Data Mining Based Multi-Agent: An Application to Spatial Data

Authors: H. Baazaoui Zghal, S. Faiz, H. Ben Ghezala

Abstract:

Data mining is an extraordinarily demanding field referring to extraction of implicit knowledge and relationships, which are not explicitly stored in databases. A wide variety of methods of data mining have been introduced (classification, characterization, generalization...). Each one of these methods includes more than algorithm. A system of data mining implies different user categories,, which mean that the user-s behavior must be a component of the system. The problem at this level is to know which algorithm of which method to employ for an exploratory end, which one for a decisional end, and how can they collaborate and communicate. Agent paradigm presents a new way of conception and realizing of data mining system. The purpose is to combine different algorithms of data mining to prepare elements for decision-makers, benefiting from the possibilities offered by the multi-agent systems. In this paper the agent framework for data mining is introduced, and its overall architecture and functionality are presented. The validation is made on spatial data. Principal results will be presented.

Keywords: Databases, data mining, multi-agent, spatial datamart.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2025
7425 Latent Topic Based Medical Data Classification

Authors: Jian-hua Yeh, Shi-yi Kuo

Abstract:

This paper discusses the classification process for medical data. In this paper, we use the data from ACM KDDCup 2008 to demonstrate our classification process based on latent topic discovery. In this data set, the target set and outliers are quite different in their nature: target set is only 0.6% size in total, while the outliers consist of 99.4% of the data set. We use this data set as an example to show how we dealt with this extremely biased data set with latent topic discovery and noise reduction techniques. Our experiment faces two major challenge: (1) extremely distributed outliers, and (2) positive samples are far smaller than negative ones. We try to propose a suitable process flow to deal with these issues and get a best AUC result of 0.98.

Keywords: classification, latent topics, outlier adjustment, feature scaling

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1625
7424 Data Collection in Hospital Emergencies: A Questionnaire Survey

Authors: Nouha Mhimdi, Wahiba Ben Abdessalem Karaa, Henda Ben Ghezala

Abstract:

Many methods are used to collect data like questionnaires, surveys, focus group interviews. Or the collection of poor-quality data resulting, for example, from poorly designed questionnaires, the absence of good translators or interpreters, and the incorrect recording of data allow conclusions to be drawn that are not supported by the data or to focus only on the average effect of the program or policy. There are several solutions to avoid or minimize the most frequent errors, including obtaining expert advice on the design or adaptation of data collection instruments; or use technologies allowing better "anonymity" in the responses. In this context, and to overcome the aforementioned problems, we suggest in this paper an approach to achieve the collection of relevant data, by carrying out a large-scale questionnaire-based survey. We have been able to collect good quality, consistent and practical data on hospital emergencies to improve emergency services in hospitals, especially in the case of epidemics or pandemics.

Keywords: Data collection, survey, database, data analysis, hospital emergencies.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 604
7423 Data Transformation Services (DTS): Creating Data Mart by Consolidating Multi-Source Enterprise Operational Data

Authors: J. D. D. Daniel, K. N. Goh, S. M. Yusop

Abstract:

Trends in business intelligence, e-commerce and remote access make it necessary and practical to store data in different ways on multiple systems with different operating systems. As business evolve and grow, they require efficient computerized solution to perform data update and to access data from diverse enterprise business applications. The objective of this paper is to demonstrate the capability of DTS [1] as a database solution for automatic data transfer and update in solving business problem. This DTS package is developed for the sales of variety of plants and eventually expanded into commercial supply and landscaping business. Dimension data modeling is used in DTS package to extract, transform and load data from heterogeneous database systems such as MySQL, Microsoft Access and Oracle that consolidates into a Data Mart residing in SQL Server. Hence, the data transfer from various databases is scheduled to run automatically every quarter of the year to review the efficient sales analysis. Therefore, DTS is absolutely an attractive solution for automatic data transfer and update which meeting today-s business needs.

Keywords: Data Transformation Services (DTS), ObjectLinking and Embedding Database (OLEDB), Data Mart, OnlineAnalytical Processing (OLAP), Online Transactional Processing(OLTP).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2010
7422 Extraction of Data from Web Pages: A Vision Based Approach

Authors: P. S. Hiremath, Siddu P. Algur

Abstract:

With the explosive growth of information sources available on the World Wide Web, it has become increasingly difficult to identify the relevant pieces of information, since web pages are often cluttered with irrelevant content like advertisements, navigation-panels, copyright notices etc., surrounding the main content of the web page. Hence, tools for the mining of data regions, data records and data items need to be developed in order to provide value-added services. Currently available automatic techniques to mine data regions from web pages are still unsatisfactory because of their poor performance and tag-dependence. In this paper a novel method to extract data items from the web pages automatically is proposed. It comprises of two steps: (1) Identification and Extraction of the data regions based on visual clues information. (2) Identification of data records and extraction of data items from a data region. For step1, a novel and more effective method is proposed based on visual clues, which finds the data regions formed by all types of tags using visual clues. For step2 a more effective method namely, Extraction of Data Items from web Pages (EDIP), is adopted to mine data items. The EDIP technique is a list-based approach in which the list is a linear data structure. The proposed technique is able to mine the non-contiguous data records and can correctly identify data regions, irrespective of the type of tag in which it is bound. Our experimental results show that the proposed technique performs better than the existing techniques.

Keywords: Web data records, web data regions, web mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1882
7421 Visual-Graphical Methods for Exploring Longitudinal Data

Authors: H. W. Ker

Abstract:

Longitudinal data typically have the characteristics of changes over time, nonlinear growth patterns, between-subjects variability, and the within errors exhibiting heteroscedasticity and dependence. The data exploration is more complicated than that of cross-sectional data. The purpose of this paper is to organize/integrate of various visual-graphical techniques to explore longitudinal data. From the application of the proposed methods, investigators can answer the research questions include characterizing or describing the growth patterns at both group and individual level, identifying the time points where important changes occur and unusual subjects, selecting suitable statistical models, and suggesting possible within-error variance.

Keywords: Data exploration, exploratory analysis, HLMs/LMEs, longitudinal data, visual-graphical methods.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2070
7420 A Materialized Approach to the Integration of XML Documents: the OSIX System

Authors: H. Ahmad, S. Kermanshahani, A. Simonet, M. Simonet

Abstract:

The data exchanged on the Web are of different nature from those treated by the classical database management systems; these data are called semi-structured data since they do not have a regular and static structure like data found in a relational database; their schema is dynamic and may contain missing data or types. Therefore, the needs for developing further techniques and algorithms to exploit and integrate such data, and extract relevant information for the user have been raised. In this paper we present the system OSIX (Osiris based System for Integration of XML Sources). This system has a Data Warehouse model designed for the integration of semi-structured data and more precisely for the integration of XML documents. The architecture of OSIX relies on the Osiris system, a DL-based model designed for the representation and management of databases and knowledge bases. Osiris is a viewbased data model whose indexing system supports semantic query optimization. We show that the problem of query processing on a XML source is optimized by the indexing approach proposed by Osiris.

Keywords: Data integration, semi-structured data, views, XML.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1570
7419 Energy Consumption Forecast Procedure for an Industrial Facility

Authors: Tatyana Aleksandrovna Barbasova, Lev Sergeevich Kazarinov, Olga Valerevna Kolesnikova, Aleksandra Aleksandrovna Filimonova

Abstract:

We regard forecasting of energy consumption by private production areas of a large industrial facility as well as by the facility itself. As for production areas, the forecast is made based on empirical dependencies of the specific energy consumption and the production output. As for the facility itself, implementation of the task to minimize the energy consumption forecasting error is based on adjustment of the facility’s actual energy consumption values evaluated with the metering device and the total design energy consumption of separate production areas of the facility. The suggested procedure of optimal energy consumption was tested based on the actual data of core product output and energy consumption by a group of workshops and power plants of the large iron and steel facility. Test results show that implementation of this procedure gives the mean accuracy of energy consumption forecasting for winter 2014 of 0.11% for the group of workshops and 0.137% for the power plants.

Keywords: Energy consumption, energy consumption forecasting error, energy efficiency, forecasting accuracy, forecasting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1704
7418 The Most Secure Smartphone Operating System: A Survey

Authors: Sundus Ayyaz, Saad Rehman

Abstract:

In the recent years, a fundamental revolution in the Mobile Phone technology from just being able to provide voice and short message services to becoming the most essential part of our lives by connecting to network and various app stores for downloading software apps of almost every activity related to our life from finding location to banking from getting news updates to downloading HD videos and so on. This progress in Smart Phone industry has modernized and transformed our way of living into a trouble-free world. The smart phone has become our personal computers with the addition of significant features such as multi core processors, multi-tasking, large storage space, bluetooth, WiFi, including large screen and cameras. With this evolution, the rise in the security threats have also been amplified. In Literature, different threats related to smart phones have been highlighted and various precautions and solutions have been proposed to keep the smart phone safe which carries all the private data of a user. In this paper, a survey has been carried out to find out the most secure and the most unsecure smart phone operating system among the most popular smart phones in use today.

Keywords: Smart phone, operating system, security threats, Android, iOS, Balckberry, Windows.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4153
7417 Data-Driven Decision-Making in Digital Entrepreneurship

Authors: Abeba Nigussie Turi, Xiangming Samuel Li

Abstract:

Data-driven business models are more typical for established businesses than early-stage startups that strive to penetrate a market. This paper provided an extensive discussion on the principles of data analytics for early-stage digital entrepreneurial businesses. Here, we developed data-driven decision-making (DDDM) framework that applies to startups prone to multifaceted barriers in the form of poor data access, technical and financial constraints, to state some. The startup DDDM framework proposed in this paper is novel in its form encompassing startup data analytics enablers and metrics aligning with startups' business models ranging from customer-centric product development to servitization which is the future of modern digital entrepreneurship.

Keywords: Startup data analytics, data-driven decision-making, data acquisition, data generation, digital entrepreneurship.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 777
7416 Classifying Bio-Chip Data using an Ant Colony System Algorithm

Authors: Minsoo Lee, Yearn Jeong Kim, Yun-mi Kim, Sujeung Cheong, Sookyung Song

Abstract:

Bio-chips are used for experiments on genes and contain various information such as genes, samples and so on. The two-dimensional bio-chips, in which one axis represent genes and the other represent samples, are widely being used these days. Instead of experimenting with real genes which cost lots of money and much time to get the results, bio-chips are being used for biological experiments. And extracting data from the bio-chips with high accuracy and finding out the patterns or useful information from such data is very important. Bio-chip analysis systems extract data from various kinds of bio-chips and mine the data in order to get useful information. One of the commonly used methods to mine the data is classification. The algorithm that is used to classify the data can be various depending on the data types or number characteristics and so on. Considering that bio-chip data is extremely large, an algorithm that imitates the ecosystem such as the ant algorithm is suitable to use as an algorithm for classification. This paper focuses on finding the classification rules from the bio-chip data using the Ant Colony algorithm which imitates the ecosystem. The developed system takes in consideration the accuracy of the discovered rules when it applies it to the bio-chip data in order to predict the classes.

Keywords: Ant Colony System, DNA chip data, Classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1443
7415 Analysis of Relation between Unlabeled and Labeled Data to Self-Taught Learning Performance

Authors: Ekachai Phaisangittisagul, Rapeepol Chongprachawat

Abstract:

Obtaining labeled data in supervised learning is often difficult and expensive, and thus the trained learning algorithm tends to be overfitting due to small number of training data. As a result, some researchers have focused on using unlabeled data which may not necessary to follow the same generative distribution as the labeled data to construct a high-level feature for improving performance on supervised learning tasks. In this paper, we investigate the impact of the relationship between unlabeled and labeled data for classification performance. Specifically, we will apply difference unlabeled data which have different degrees of relation to the labeled data for handwritten digit classification task based on MNIST dataset. Our experimental results show that the higher the degree of relation between unlabeled and labeled data, the better the classification performance. Although the unlabeled data that is completely from different generative distribution to the labeled data provides the lowest classification performance, we still achieve high classification performance. This leads to expanding the applicability of the supervised learning algorithms using unsupervised learning.

Keywords: Autoencoder, high-level feature, MNIST dataset, selftaught learning, supervised learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1805
7414 Towards Development of Solution for Business Process-Oriented Data Analysis

Authors: M. Klimavicius

Abstract:

This paper proposes a modeling methodology for the development of data analysis solution. The Author introduce the approach to address data warehousing issues at the at enterprise level. The methodology covers the process of the requirements eliciting and analysis stage as well as initial design of data warehouse. The paper reviews extended business process model, which satisfy the needs of data warehouse development. The Author considers that the use of business process models is necessary, as it reflects both enterprise information systems and business functions, which are important for data analysis. The Described approach divides development into three steps with different detailed elaboration of models. The Described approach gives possibility to gather requirements and display them to business users in easy manner.

Keywords: Data warehouse, data analysis, business processmanagement.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1373