Search results for: structured data

7146 A Modified Fuzzy C-Means Algorithm for Natural Data Exploration

Authors: Binu Thomas, Raju G., Sonam Wangmo

Abstract:

In Data mining, Fuzzy clustering algorithms have demonstrated advantage over crisp clustering algorithms in dealing with the challenges posed by large collections of vague and uncertain natural data. This paper reviews concept of fuzzy logic and fuzzy clustering. The classical fuzzy c-means algorithm is presented and its limitations are highlighted. Based on the study of the fuzzy c-means algorithm and its extensions, we propose a modification to the cmeans algorithm to overcome the limitations of it in calculating the new cluster centers and in finding the membership values with natural data. The efficiency of the new modified method is demonstrated on real data collected for Bhutan-s Gross National Happiness (GNH) program.

Keywords: Adaptive fuzzy clustering, clustering, fuzzy logic, fuzzy clustering, c-means.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1955

7145 A Research of the Influence that MP3 Sound Gives EEG of the Person

Authors: Seiya Teshima, Kazushige Magatani

Abstract:

Currently, many types of no-reversible compressed sound source, represented by MP3 (MPEG Audio Layer-3) are popular in the world and they are widely used to make the music file size smaller. The sound data created in this way has less information as compared to pre-compressed data. The objective of this study is by analyzing EEG to determine if people can recognize such difference as differences in sound. A measurement system that can measure and analyze EEG when a subject listens to music were experimentally developed. And ten subjects were studied with this system. In this experiment, a WAVE formatted music data and a MP3 compressed music data that is made from the WAVE formatted data were prepared. Each subject was made to hear these music sources at the same volume. From the results of this experiment, clear differences were confirmed between two wound sources.

Keywords: EEG, Biological signal , Sound , MP3

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1754

7144 Real-Time Visualization Using GPU-Accelerated Filtering of LiDAR Data

Authors: Sašo Pečnik, Borut Žalik

Abstract:

This paper presents a real-time visualization technique and filtering of classified LiDAR point clouds. The visualization is capable of displaying filtered information organized in layers by the classification attribute saved within LiDAR datasets. We explain the used data structure and data management, which enables real-time presentation of layered LiDAR data. Real-time visualization is achieved with LOD optimization based on the distance from the observer without loss of quality. The filtering process is done in two steps and is entirely executed on the GPU and implemented using programmable shaders.

Keywords: Filtering, graphics, level-of-details, LiDAR, realtime visualization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2524

7143 Business-Intelligence Mining of Large Decentralized Multimedia Datasets with a Distributed Multi-Agent System

Authors: Karima Qayumi, Alex Norta

Abstract:

The rapid generation of high volume and a broad variety of data from the application of new technologies pose challenges for the generation of business-intelligence. Most organizations and business owners need to extract data from multiple sources and apply analytical methods for the purposes of developing their business. Therefore, the recently decentralized data management environment is relying on a distributed computing paradigm. While data are stored in highly distributed systems, the implementation of distributed data-mining techniques is a challenge. The aim of this technique is to gather knowledge from every domain and all the datasets stemming from distributed resources. As agent technologies offer significant contributions for managing the complexity of distributed systems, we consider this for next-generation data-mining processes. To demonstrate agent-based business intelligence operations, we use agent-oriented modeling techniques to develop a new artifact for mining massive datasets.

Keywords: Agent-oriented modeling, business Intelligence management, distributed data mining, multi-agent system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1353

7142 Abnormal IP Packets on 3G Mobile Data Networks

Authors: Joo-Hyung Oh, Dongwan Kang, JunHyung Cho, Chaetae Im

Abstract:

As the mobile Internet has become widespread in recent years, communication based on mobile networks is increasing. As a result, security threats have been posed with regard to the abnormal traffic of mobile networks, but mobile security has been handled with focus on threats posed by mobile malicious codes, and researches on security threats to the mobile network itself have not attracted much attention. In mobile networks, the IP address of the data packet is a very important factor for billing purposes. If one mobile terminal use an incorrect IP address that either does not exist or could be assigned to another mobile terminal, billing policy will cause problems. We monitor and analyze 3G mobile data networks traffics for a period of time and finds some abnormal IP packets. In this paper, we analyze the reason for abnormal IP packets on 3G Mobile Data Networks. And we also propose an algorithm based on IP address table that contains addresses currently in use within the mobile data network to detect abnormal IP packets.

Keywords: WCDMA, 3G, Abnormal IP address, Mobile Data Network Attack

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2312

7141 Identifying Knowledge Gaps in Incorporating Toxicity of Particulate Matter Constituents for Developing Regulatory Limits on Particulate Matter

Authors: Ananya Das, Arun Kumar, Gazala Habib, Vivekanandan Perumal

Abstract:

Regulatory bodies has proposed limits on Particulate Matter (PM) concentration in air; however, it does not explicitly indicate the incorporation of effects of toxicities of constituents of PM in developing regulatory limits. This study aimed to provide a structured approach to incorporate toxic effects of components in developing regulatory limits on PM. A four-step human health risk assessment framework consists of - (1) hazard identification (parameters: PM and its constituents and their associated toxic effects on health), (2) exposure assessment (parameters: concentrations of PM and constituents, information on size and shape of PM; fate and transport of PM and constituents in respiratory system), (3) dose-response assessment (parameters: reference dose or target toxicity dose of PM and its constituents), and (4) risk estimation (metric: hazard quotient and/or lifetime incremental risk of cancer as applicable). Then parameters required at every step were obtained from literature. Using this information, an attempt has been made to determine limits on PM using component-specific information. An example calculation was conducted for exposures of PM_2.5 and its metal constituents from Indian ambient environment to determine limit on PM values. Identified data gaps were: (1) concentrations of PM and its constituents and their relationship with sampling regions, (2) relationship of toxicity of PM with its components.

Keywords: Air, component-specific toxicity, human health risks, particulate matter.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1160

7140 Ontology for a Voice Transcription of OpenStreetMap Data: The Case of Space Apprehension by Visually Impaired Persons

Authors: Said Boularouk, Didier Josselin, Eitan Altman

Abstract:

In this paper, we present a vocal ontology of OpenStreetMap data for the apprehension of space by visually impaired people. Indeed, the platform based on produsage gives a freedom to data producers to choose the descriptors of geocoded locations. Unfortunately, this freedom, called also folksonomy leads to complicate subsequent searches of data. We try to solve this issue in a simple but usable method to extract data from OSM databases in order to send them to visually impaired people using Text To Speech technology. We focus on how to help people suffering from visual disability to plan their itinerary, to comprehend a map by querying computer and getting information about surrounding environment in a mono-modal human-computer dialogue.

Keywords: Ontology, OpenStreetMap, visually impaired people, TTS, taxonomy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 861

7139 A Data Mining Model for Detecting Financial and Operational Risk Indicators of SMEs

Authors: Ali Serhan Koyuncugil, Nermin Ozgulbas

Abstract:

In this paper, a data mining model to SMEs for detecting financial and operational risk indicators by data mining is presenting. The identification of the risk factors by clarifying the relationship between the variables defines the discovery of knowledge from the financial and operational variables. Automatic and estimation oriented information discovery process coincides the definition of data mining. During the formation of model; an easy to understand, easy to interpret and easy to apply utilitarian model that is far from the requirement of theoretical background is targeted by the discovery of the implicit relationships between the data and the identification of effect level of every factor. In addition, this paper is based on a project which was funded by The Scientific and Technological Research Council of Turkey (TUBITAK).

Keywords: Risk Management, Financial Risk, Operational Risk, Financial Early Warning System, Data Mining, CHAID Decision Tree Algorithm, SMEs.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3097

7138 Satellite Data Classification Accuracy Assessment Based from Reference Dataset

Authors: Mohd Hasmadi Ismail, Kamaruzaman Jusoff

Abstract:

In order to develop forest management strategies in tropical forest in Malaysia, surveying the forest resources and monitoring the forest area affected by logging activities is essential. There are tremendous effort has been done in classification of land cover related to forest resource management in this country as it is a priority in all aspects of forest mapping using remote sensing and related technology such as GIS. In fact classification process is a compulsory step in any remote sensing research. Therefore, the main objective of this paper is to assess classification accuracy of classified forest map on Landsat TM data from difference number of reference data (200 and 388 reference data). This comparison was made through observation (200 reference data), and interpretation and observation approaches (388 reference data). Five land cover classes namely primary forest, logged over forest, water bodies, bare land and agricultural crop/mixed horticultural can be identified by the differences in spectral wavelength. Result showed that an overall accuracy from 200 reference data was 83.5 % (kappa value 0.7502459; kappa variance 0.002871), which was considered acceptable or good for optical data. However, when 200 reference data was increased to 388 in the confusion matrix, the accuracy slightly improved from 83.5% to 89.17%, with Kappa statistic increased from 0.7502459 to 0.8026135, respectively. The accuracy in this classification suggested that this strategy for the selection of training area, interpretation approaches and number of reference data used were importance to perform better classification result.

Keywords: Image Classification, Reference Data, Accuracy Assessment, Kappa Statistic, Forest Land Cover

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3113

7137 Analysis of Construction Waste Generation and Its Effect in a Construction Site

Authors: R. K. D. G. Kaluarachchi

Abstract:

The generation of solid waste and its effective management are debated topics in Sri Lanka as well as in the global environment. It was estimated that the most of the waste generated in global was originated from construction and demolition of buildings. Thus, the proportion of construction waste in solid waste generation cannot be underestimated. The construction waste, which is the by-product generated and removed from work sites is collected in direct and indirect processes. Hence, the objectives of this research are to identify the proportion of construction waste which can be reused and identify the methods to reduce the waste generation without reducing the quality of the process. A 6-storey building construction site was selected for this research. The site was divided into six zones depending on the process. Ten waste materials were identified by considering the adverse effects on safety and health of people and the economic value of them. The generated construction waste in each zone was recorded per week for a period of five months. The data revealed that sand, cement, wood used for form work and rusted steel rods were the generated waste which has higher economic value in all zones. Structured interviews were conducted to gather information on how the materials are categorized as waste and the capability of reducing, reusing and recycling the waste. It was identified that waste is generated in following processes; ineffective storage of material for a longer time and improper handling of material during the work process. Further, the alteration of scheduled activities of construction work also yielded more waste. Finally, a proper management of construction waste is suggested to reduce and reuse waste.

Keywords: Construction waste, effective management, reduce, reuse.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1228

7136 Cultural Practices as a Coping Measure for Women who Terminated a Pregnancy in Adolescence: A Qualitative Study

Authors: Botshelo R. Sebola

Abstract:

Unintended pregnancy often results in pregnancy termination. Most countries have legalised the termination of a pregnancy and pregnant adolescents can visit designated clinics without their parents’ consent. In most African and Asian countries, certain cultural practices are performed following any form of childbirth, including abortion, and such practices are ingrained in societies. The aim of this paper was to understand how women who terminated a pregnancy during adolescence coped by embracing cultural practices. A descriptive multiple case study design was adopted for the study. In-depth, semi-structured interviews and reflective diaries were used for data collection. Participants were 13 women aged 20 to 35 years who had terminated a pregnancy in adolescence. Three women kept their soiled sanitary pads, burned them to ash and waited for the rainy season to scatter the ash in a flowing stream. This ritual was performed to appease the ancestors, ask them for forgiveness and as a send-off for the aborted foetus. Five women secretly consulted Sangoma (traditional healers) to perform certain rituals. Three women isolated themselves to perform herbal cleansings, and the last two chose not to engage in any sexual activity for one year, which led to the loss of their partners. This study offers a unique contribution to understanding the solitary journey of women who terminate a pregnancy. The study challenges healthcare professionals who work in clinics that offer pregnancy termination services to look beyond releasing the foetus to advocating and providing women with the necessary care and support in performing cultural practices.

Keywords: Adolescence, case study, cultural rituals, pregnancy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 308

7135 Analysis of Diverse Cluster Ensemble Techniques

Authors: S. Sarumathi, N. Shanthi, P. Ranjetha

Abstract:

Data mining is the procedure of determining interesting patterns from the huge amount of data. With the intention of accessing the data faster the most supporting processes needed is clustering. Clustering is the process of identifying similarity between data according to the individuality present in the data and grouping associated data objects into clusters. Cluster ensemble is the technique to combine various runs of different clustering algorithms to obtain a general partition of the original dataset, aiming for consolidation of outcomes from a collection of individual clustering outcomes. The performances of clustering ensembles are mainly affecting by two principal factors such as diversity and quality. This paper presents the overview about the different cluster ensemble algorithm along with their methods used in cluster ensemble to improve the diversity and quality in the several cluster ensemble related papers and shows the comparative analysis of different cluster ensemble also summarize various cluster ensemble methods. Henceforth this clear analysis will be very useful for the world of clustering experts and also helps in deciding the most appropriate one to determine the problem in hand.

Keywords: Cluster Ensemble, Consensus Function, CSPA, Diversity, HGPA, MCLA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1812

7134 A Distributed Approach to Extract High Utility Itemsets from XML Data

Authors: S. Kannimuthu, K. Premalatha

Abstract:

This paper investigates a new data mining capability that entails mining of High Utility Itemsets (HUI) in a distributed environment. Existing research in data mining deals with only presence or absence of an items and do not consider the semantic measures like weight or cost of the items. Thus, HUI mining algorithm has evolved. HUI mining is the one kind of utility mining concept, aims to identify itemsets whose utility satisfies a given threshold. Although, the approach of mining HUIs in a distributed environment and mining of the same from XML data have not explored yet. In this work, a novel approach is proposed to mine HUIs from the XML based data in a distributed environment. This work utilizes Service Oriented Computing (SOC) paradigm which provides Knowledge as a Service (KaaS). The interesting patterns are provided via the web services with the help of knowledge server to answer the queries of the consumers. The performance of the approach is evaluated on various databases using execution time and memory consumption.

Keywords: Data mining, Knowledge as a Service, service oriented computing, utility mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2428

7133 A Comparative Analysis of Solid Waste Treatment Technologies on Cost and Environmental Basis

Authors: Nesli Aydin

Abstract:

Waste management decision making in developing countries has moved towards being more pragmatic, transparent, sustainable and comprehensive. Turkey is required to make its waste related legislation compatible with European Legislation as it is a candidate country of the European Union. Improper Turkish practices such as open burning and open dumping practices must be abandoned urgently, and robust waste management systems have to be structured. The determination of an optimum waste management system in any region requires a comprehensive analysis in which many criteria are taken into account by stakeholders. In conducting this sort of analysis, there are two main criteria which are evaluated by waste management analysts; economic viability and environmentally friendliness. From an analytical point of view, a central characteristic of sustainable development is an economic-ecological integration. It is predicted that building a robust waste management system will need significant effort and cooperation between the stakeholders in developing countries such as Turkey. In this regard, this study aims to provide data regarding the cost and environmental burdens of waste treatment technologies such as an incinerator, an autoclave (with different capacities), a hydroclave and a microwave coupled with updated information on calculation methods, and a framework for comparing any proposed scenario performances on a cost and environmental basis.

Keywords: Decision making, economic viability, environmentally friendliness, stakeholder, waste management systems.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1244

7132 On the Network Packet Loss Tolerance of SVM Based Activity Recognition

Authors: Gamze Uslu, Sebnem Baydere, Alper K. Demir

Abstract:

In this study, data loss tolerance of Support Vector Machines (SVM) based activity recognition model and multi activity classification performance when data are received over a lossy wireless sensor network is examined. Initially, the classification algorithm we use is evaluated in terms of resilience to random data loss with 3D acceleration sensor data for sitting, lying, walking and standing actions. The results show that the proposed classification method can recognize these activities successfully despite high data loss. Secondly, the effect of differentiated quality of service performance on activity recognition success is measured with activity data acquired from a multi hop wireless sensor network, which introduces high data loss. The effect of number of nodes on the reliability and multi activity classification success is demonstrated in simulation environment. To the best of our knowledge, the effect of data loss in a wireless sensor network on activity detection success rate of an SVM based classification algorithm has not been studied before.

Keywords: Activity recognition, support vector machines, acceleration sensor, wireless sensor networks, packet loss.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2846

7131 Performance and Availability Analyses of PV Generation Systems in Taiwan

Authors: H. S. Huang, J. C. Jao, K. L. Yen, C. T. Tsai

Abstract:

The purpose of this article applies the monthly final energy yield and failure data of 202 PV systems installed in Taiwan to analyze the PV operational performance and system availability. This data is collected by Industrial Technology Research Institute through manual records. Bad data detection and failure data estimation approaches are proposed to guarantee the quality of the received information. The performance ratio value and system availability are then calculated and compared with those of other countries. It is indicated that the average performance ratio of Taiwan-s PV systems is 0.74 and the availability is 95.7%. These results are similar with those of Germany, Switzerland, Italy and Japan.

Keywords: availability, performance ratio, PV system, Taiwan

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4403

7130 Stealthy Network Transfer of Data

Authors: N. Veerasamy, C. J. Cheyne

Abstract:

Users of computer systems may often require the private transfer of messages/communications between parties across a network. Information warfare and the protection and dominance of information in the military context is a prime example of an application area in which the confidentiality of data needs to be maintained. The safe transportation of critical data is therefore often a vital requirement for many private communications. However, unwanted interception/sniffing of communications is also a possibility. An elementary stealthy transfer scheme is therefore proposed by the authors. This scheme makes use of encoding, splitting of a message and the use of a hashing algorithm to verify the correctness of the reconstructed message. For this proof-of-concept purpose, the authors have experimented with the random sending of encoded parts of a message and the construction thereof to demonstrate how data can stealthily be transferred across a network so as to prevent the obvious retrieval of data.

Keywords: Construction, encode, interception, stealthy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1171

7129 Survey on Arabic Sentiment Analysis in Twitter

Authors: Sarah O. Alhumoud, Mawaheb I. Altuwaijri, Tarfa M. Albuhairi, Wejdan M. Alohaideb

Abstract:

Large-scale data stream analysis has become one of the important business and research priorities lately. Social networks like Twitter and other micro-blogging platforms hold an enormous amount of data that is large in volume, velocity and variety. Extracting valuable information and trends out of these data would aid in a better understanding and decision-making. Multiple analysis techniques are deployed for English content. Moreover, one of the languages that produce a large amount of data over social networks and is least analyzed is the Arabic language. The proposed paper is a survey on the research efforts to analyze the Arabic content in Twitter focusing on the tools and methods used to extract the sentiments for the Arabic content on Twitter.

Keywords: Big Data, Social Networks, Sentiment Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4312

7128 Mean Shift-based Preprocessing Methodology for Improved 3D Buildings Reconstruction

Authors: Nikolaos Vassilas, Theocharis Tsenoglou, Djamchid Ghazanfarpour

Abstract:

In this work, we explore the capability of the mean shift algorithm as a powerful preprocessing tool for improving the quality of spatial data, acquired from airborne scanners, from densely built urban areas. On one hand, high resolution image data corrupted by noise caused by lossy compression techniques are appropriately smoothed while at the same time preserving the optical edges and, on the other, low resolution LiDAR data in the form of normalized Digital Surface Map (nDSM) is upsampled through the joint mean shift algorithm. Experiments on both the edge-preserving smoothing and upsampling capabilities using synthetic RGB-z data show that the mean shift algorithm is superior to bilateral filtering as well as to other classical smoothing and upsampling algorithms. Application of the proposed methodology for 3D reconstruction of buildings of a pilot region of Athens, Greece results in a significant visual improvement of the 3D building block model.

Keywords: 3D buildings reconstruction, data fusion, data upsampling, mean shift.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1972

7127 Identification of Common Indicators of Family Environment of Pupils of Alternative Schools

Authors: Yveta Pohnětalová, Veronika Nováková, Lucie Hrašová

Abstract:

The paper presents the results of research in which we were looking for common characteristics of the family environment of students alternative and innovative education systems. Topicality comes from the fact that nowadays in the Czech Republic there are several civic and parental initiatives held with the aim to establish schools for their children. The goal of our research was to reveal key aspects of these families and to identify their common indicators. Among other things, we were interested what reasons lead parents to decide to enroll their child into different education than standard (common). The survey was qualitative and there were eighteen respondents of parents of alternative schools´ pupils. The reason to implement qualitative design was the opportunity to gain deeper insight into the essence of phenomena and to obtain detailed information, which would become the basis for subsequent quantitative research. There have been semi structured interviews done with the respondents which had been recorded and transcribed. By an analysis of gained data (categorization and by coding), we found out that common indicator of our respondents is higher education and higher economic level. This issue should be at the forefront of the researches because there is lack of analysis which would provide a comparison of common and alternative schools in the Czech Republic especially with regard to quality of education. Based on results, we consider questions whether approaches of these parents towards standard education come from their own experience or from the lack of knowledge of current goals and objectives of education policy of the Czech Republic.

Keywords: Alternative schools, family environment, quality of education, parents´ approach.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 976

7126 Comparative Analysis of the Third Generation of Research Data for Evaluation of Solar Energy Potential

Authors: Claudineia Brazil, Elison Eduardo Jardim Bierhals, Luciane Teresa Salvi, Rafael Haag

Abstract:

Renewable energy sources are dependent on climatic variability, so for adequate energy planning, observations of the meteorological variables are required, preferably representing long-period series. Despite the scientific and technological advances that meteorological measurement systems have undergone in the last decades, there is still a considerable lack of meteorological observations that form series of long periods. The reanalysis is a system of assimilation of data prepared using general atmospheric circulation models, based on the combination of data collected at surface stations, ocean buoys, satellites and radiosondes, allowing the production of long period data, for a wide gamma. The third generation of reanalysis data emerged in 2010, among them is the Climate Forecast System Reanalysis (CFSR) developed by the National Centers for Environmental Prediction (NCEP), these data have a spatial resolution of 0.50 x 0.50. In order to overcome these difficulties, it aims to evaluate the performance of solar radiation estimation through alternative data bases, such as data from Reanalysis and from meteorological satellites that satisfactorily meet the absence of observations of solar radiation at global and/or regional level. The results of the analysis of the solar radiation data indicated that the reanalysis data of the CFSR model presented a good performance in relation to the observed data, with determination coefficient around 0.90. Therefore, it is concluded that these data have the potential to be used as an alternative source in locations with no seasons or long series of solar radiation, important for the evaluation of solar energy potential.

Keywords: Climate, reanalysis, renewable energy, solar radiation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 875

7125 Explorative Data Mining of Constructivist Learning Experiences and Activities with Multiple Dimensions

Authors: Patrick Wessa, Bart Baesens

Abstract:

This paper discusses the use of explorative data mining tools that allow the educator to explore new relationships between reported learning experiences and actual activities, even if there are multiple dimensions with a large number of measured items. The underlying technology is based on the so-called Compendium Platform for Reproducible Computing (http://www.freestatistics.org) which was built on top the computational R Framework (http://www.wessa.net).

Keywords: Reproducible computing, data mining, explorative data analysis, compendium technology, computer assisted education

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1231

7124 Assessment of Knowledge, Attitudes and Practices of Street Vendors in Mangaung Metro South Africa

Authors: Gaofetoge Lenetha, Malerato Moloi, Ntsoaki Malebo

Abstract:

Microbial contamination of ready-to-eat foods and beverages sold by street vendors has become an important public health issue. In developing countries including South Africa, health risks related to such kinds of foods are thought to be common. Thus, this study assessed knowledge, attitude and practices of street food vendors. Street vendors in the city of Mangaung Metro were investigated in order to assess their knowledge, attitudes and handling practices. A semi-structured questionnaire and checklist were used in interviews to determine the status of the vending sites and associa. ted food-handling practices. Data was collected by means of a face-to-face interview. The majority of respondents were black females. Hundred percent (100%) of the participants did not have any food safety training. However, street vendors showed a positive attitude towards food safety. Despite the positive attitude, vendors showed some non-compliance when it comes to handling food. During the survey, it was also observed that the vending stalls lack basic infrastructures like toilets and potable water that is currently a major problem. This study indicates a need for improvements in the environmental conditions at these sites to prevent foodborne diseases. Moreover, based on the results observed food safety and food hygiene training or workshops for street vendors are highly recommended.

Keywords: Food hygiene, foodborne illnesses, food safety, street foods.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1094

7123 Analysis of Textual Data Based On Multiple 2-Class Classification Models

Authors: Shigeaki Sakurai, Ryohei Orihara

Abstract:

This paper proposes a new method for analyzing textual data. The method deals with items of textual data, where each item is described based on various viewpoints. The method acquires 2- class classification models of the viewpoints by applying an inductive learning method to items with multiple viewpoints. The method infers whether the viewpoints are assigned to the new items or not by using the models. The method extracts expressions from the new items classified into the viewpoints and extracts characteristic expressions corresponding to the viewpoints by comparing the frequency of expressions among the viewpoints. This paper also applies the method to questionnaire data given by guests at a hotel and verifies its effect through numerical experiments.

Keywords: Text mining, Multiple viewpoints, Differential analysis, Questionnaire data

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1269

7122 Cross-Cultural Cooperation and Innovation: An Exploration of Chinese Foreign Direct Investment in Europe

Authors: Yongsheng Guo, Shuchao Li

Abstract:

This study explores Chinese Foreign Direct Investment (FDI) in Europe and the cross-cultural cooperation between Chinese and European managers. The aim of this research is to shed light on the phenomenon of investments in developed countries from an emerging market and to gain insights into the cooperation process. A grounded theory approach is adopted, and 46 semi-structured interviews were conducted with 10 case companies in Germany and 13 case companies in the UK. Grounded theory models are developed from primary data and interview quotes are used to support the themes. The interviewees perceived differences between the two parties in cultural traits, management concepts, knowledge structure and resource endowment between the two parties. Chinese and European partners can take advantage of different resources and cooperate in innovative ways to improve corporate performance. Moreover, both parties appreciate different ethical and cultural characteristics and complement each other to develop a combined organizational culture. This study proposes an ethical and cultural diversity theory in international management arguing that a team with diversified values and behaviours may be more excited and motivated. This study suggests that “resource complement” and “cross-cultural cooperation” might be an advantage for international investment. Firms are encouraged to open their minds and cooperate with partners with different resources and cultures. The authorities may review the FDI policies to reduce social and political barriers.

Keywords: Cross-culture, FDI, China, Europe.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 110

7121 Using Automated Database Reverse Engineering for Database Integration

Authors: M. R. Abbasifard, M. Rahgozar, A. Bayati, P. Pournemati

Abstract:

One important problem in today organizations is the existence of non-integrated information systems, inconsistency and lack of suitable correlations between legacy and modern systems. One main solution is to transfer the local databases into a global one. In this regards we need to extract the data structures from the legacy systems and integrate them with the new technology systems. In legacy systems, huge amounts of a data are stored in legacy databases. They require particular attention since they need more efforts to be normalized, reformatted and moved to the modern database environments. Designing the new integrated (global) database architecture and applying the reverse engineering requires data normalization. This paper proposes the use of database reverse engineering in order to integrate legacy and modern databases in organizations. The suggested approach consists of methods and techniques for generating data transformation rules needed for the data structure normalization.

Keywords: Reverse Engineering, Database Integration, System Integration, Data Structure Normalization

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1826

7120 Multi-Label Hierarchical Classification for Protein Function Prediction

Authors: Helyane B. Borges, Julio Cesar Nievola

Abstract:

Hierarchical classification is a problem with applications in many areas as protein function prediction where the dates are hierarchically structured. Therefore, it is necessary the development of algorithms able to induce hierarchical classification models. This paper presents experimenters using the algorithm for hierarchical classification called Multi-label Hierarchical Classification using a Competitive Neural Network (MHC-CNN). It was tested in ten datasets the Gene Ontology (GO) Cellular Component Domain. The results are compared with the Clus-HMC and Clus-HSC using the hF-Measure.

Keywords: Hierarchical Classification, Competitive Neural Network, Global Classifier.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2360

7119 Analysis of Cooperative Learning Behavior Based on the Data of Students' Movement

Authors: Wang Lin, Li Zhiqiang

Abstract:

The purpose of this paper is to analyze the cooperative learning behavior pattern based on the data of students' movement. The study firstly reviewed the cooperative learning theory and its research status, and briefly introduced the k-means clustering algorithm. Then, it used clustering algorithm and mathematical statistics theory to analyze the activity rhythm of individual student and groups in different functional areas, according to the movement data provided by 10 first-year graduate students. It also focused on the analysis of students' behavior in the learning area and explored the law of cooperative learning behavior. The research result showed that the cooperative learning behavior analysis method based on movement data proposed in this paper is feasible. From the results of data analysis, the characteristics of behavior of students and their cooperative learning behavior patterns could be found.

Keywords: Behavior pattern, cooperative learning, data analyze, K-means clustering algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 787

7118 A Security Cloud Storage Scheme Based Accountable Key-Policy Attribute-Based Encryption without Key Escrow

Authors: Ming Lun Wang, Yan Wang, Ning Ruo Sun

Abstract:

With the development of cloud computing, more and more users start to utilize the cloud storage service. However, there exist some issues: 1) cloud server steals the shared data, 2) sharers collude with the cloud server to steal the shared data, 3) cloud server tampers the shared data, 4) sharers and key generation center (KGC) conspire to steal the shared data. In this paper, we use advanced encryption standard (AES), hash algorithms, and accountable key-policy attribute-based encryption without key escrow (WOKE-AKP-ABE) to build a security cloud storage scheme. Moreover, the data are encrypted to protect the privacy. We use hash algorithms to prevent the cloud server from tampering the data uploaded to the cloud. Analysis results show that this scheme can resist conspired attacks.

Keywords: Cloud storage security, sharing storage, attributes, Hash algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1012

7117 Multimethod Approach to Research in Interlanguage Pragmatics

Authors: Saad Al-Gahtani, Ghassan H Al Shatter

Abstract:

Argument over the use of particular method in interlanguage pragmatics has increased recently. Researchers argued the advantages and disadvantages of each method either natural or elicited. Findings of different studies indicated that the use of one method may not provide enough data to answer all its questions. The current study investigated the validity of using multimethod approach in interlanguage pragmatics to understand the development of requests in Arabic as a second language (Arabic L2). To this end, the study adopted two methods belong to two types of data sources: the institutional discourse (natural data), and the role play (elicited data). Participants were 117 learners of Arabic L2 at the university level, representing four levels (beginners, low-intermediate, highintermediate, and advanced). Results showed that using two or more methods in interlanguage pragmatics affect the size and nature of data.

Keywords: Arabic L2, Development of requests, Interlanguage Pragmatics, Multimethod approach.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1807