Search results for: Distributed Data Mining

7133 Design of a Low Cost Motion Data Acquisition Setup for Mechatronic Systems

Abstract:

Motion sensors have been commonly used as a valuable component in mechatronic systems, however, many mechatronic designs and applications that need motion sensors cost enormous amount of money, especially high-tech systems. Design of a software for communication protocol between data acquisition card and motion sensor is another issue that has to be solved. This study presents how to design a low cost motion data acquisition setup consisting of MPU 6050 motion sensor (gyro and accelerometer in 3 axes) and Arduino Mega2560 microcontroller. Design parameters are calibration of the sensor, identification and communication between sensor and data acquisition card, interpretation of data collected by the sensor.

Keywords: Calibration of sensors, data acquisition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4336

7132 Real Time Approach for Data Placement in Wireless Sensor Networks

Authors: Sanjeev Gupta, Mayank Dave

Abstract:

The issue of real-time and reliable report delivery is extremely important for taking effective decision in a real world mission critical Wireless Sensor Network (WSN) based application. The sensor data behaves differently in many ways from the data in traditional databases. WSNs need a mechanism to register, process queries, and disseminate data. In this paper we propose an architectural framework for data placement and management. We propose a reliable and real time approach for data placement and achieving data integrity using self organized sensor clusters. Instead of storing information in individual cluster heads as suggested in some protocols, in our architecture we suggest storing of information of all clusters within a cell in the corresponding base station. For data dissemination and action in the wireless sensor network we propose to use Action and Relay Stations (ARS). To reduce average energy dissipation of sensor nodes, the data is sent to the nearest ARS rather than base station. We have designed our architecture in such a way so as to achieve greater energy savings, enhanced availability and reliability.

Keywords: Cluster head, data reliability, real time communication, wireless sensor networks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1814

7131 A Software Framework for Predicting Oil-Palm Yield from Climate Data

Authors: Mohd. Noor Md. Sap, A. Majid Awan

Abstract:

Intelligent systems based on machine learning techniques, such as classification, clustering, are gaining wide spread popularity in real world applications. This paper presents work on developing a software system for predicting crop yield, for example oil-palm yield, from climate and plantation data. At the core of our system is a method for unsupervised partitioning of data for finding spatio-temporal patterns in climate data using kernel methods which offer strength to deal with complex data. This work gets inspiration from the notion that a non-linear data transformation into some high dimensional feature space increases the possibility of linear separability of the patterns in the transformed space. Therefore, it simplifies exploration of the associated structure in the data. Kernel methods implicitly perform a non-linear mapping of the input data into a high dimensional feature space by replacing the inner products with an appropriate positive definite function. In this paper we present a robust weighted kernel k-means algorithm incorporating spatial constraints for clustering the data. The proposed algorithm can effectively handle noise, outliers and auto-correlation in the spatial data, for effective and efficient data analysis by exploring patterns and structures in the data, and thus can be used for predicting oil-palm yield by analyzing various factors affecting the yield.

Keywords: Pattern analysis, clustering, kernel methods, spatial data, crop yield

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1979

7130 A Proposal for U-City (Smart City) Service Method Using Real-Time Digital Map

Authors: SangWon Han, MuWook Pyeon, Sujung Moon, DaeKyo Seo

Abstract:

Recently, technologies based on three-dimensional (3D) space information are being developed and quality of life is improving as a result. Research on real-time digital map (RDM) is being conducted now to provide 3D space information. RDM is a service that creates and supplies 3D space information in real time based on location/shape detection. Research subjects on RDM include the construction of 3D space information with matching image data, complementing the weaknesses of image acquisition using multi-source data, and data collection methods using big data. Using RDM will be effective for space analysis using 3D space information in a U-City and for other space information utilization technologies.

Keywords: RDM, multi-source data, big data, U-City.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 805

7129 New PTH Moment Stable Criteria of Stochastic Neural Networks

Authors: Zixin Liu, Huawei Yang, Fangwei Chen

Abstract:

In this paper, the issue of pth moment stability of a class of stochastic neural networks with mixed delays is investigated. By establishing two integro-differential inequalities, some new sufficient conditions ensuring pth moment exponential stability are obtained. Compared with some previous publications, our results generalize some earlier works reported in the literature, and remove some strict constraints of time delays and kernel functions. Two numerical examples are presented to illustrate the validity of the main results.

Keywords: Neural networks, stochastic, PTH moment stable, time varying delays, distributed delays.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1470

7128 Are XBRL-based Financial Reports Better than Non-XBRL Reports? A Quality Assessment

Authors: Zhenkun Wang, Simon S. Gao

Abstract:

Using a scoring system, this paper provides a comparative assessment of the quality of data between XBRL formatted financial reports and non-XBRL financial reports. It shows a major improvement in the quality of data of XBRL formatted financial reports. Although XBRL formatted financial reports do not show much advantage in the quality at the beginning, XBRL financial reports lately display a large improvement in the quality of data in almost all aspects. With the improved XBRL web data managing, presentation and analysis applications, XBRL formatted financial reports have a much better accessibility, are more accurate and better in timeliness.

Keywords: Data Quality; Financial Report; Information; XBRL

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2566

7127 Modeling of Random Variable with Digital Probability Hyper Digraph: Data-Oriented Approach

Authors: A. Habibizad Navin, M. Naghian Fesharaki, M. Mirnia, M. Kargar

Abstract:

In this paper we introduce Digital Probability Hyper Digraph for modeling random variable as the hierarchical data-oriented model.

Keywords: Data-Oriented Models, Data Structure, DigitalProbability Hyper Digraph, Random Variable, Statistic andProbability.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1273

7126 Wireless Transmission of Big Data Using Novel Secure Algorithm

Authors: K. Thiagarajan, K. Saranya, A. Veeraiah, B. Sudha

Abstract:

This paper presents a novel algorithm for secure, reliable and flexible transmission of big data in two hop wireless networks using cooperative jamming scheme. Two hop wireless networks consist of source, relay and destination nodes. Big data has to transmit from source to relay and from relay to destination by deploying security in physical layer. Cooperative jamming scheme determines transmission of big data in more secure manner by protecting it from eavesdroppers and malicious nodes of unknown location. The novel algorithm that ensures secure and energy balance transmission of big data, includes selection of data transmitting region, segmenting the selected region, determining probability ratio for each node (capture node, non-capture and eavesdropper node) in every segment, evaluating the probability using binary based evaluation. If it is secure transmission resume with the two- hop transmission of big data, otherwise prevent the attackers by cooperative jamming scheme and transmit the data in two-hop transmission.

Keywords: Big data, cooperative jamming, energy balance, physical layer, two-hop transmission, wireless security.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2180

7125 Minimization of Power Loss in Distribution Networks by Different Techniques

Authors: L.Ramesh, S.P.Chowdhury, S.Chowdhury, A.A.Natarajan, C.T.Gaunt

Abstract:

Accurate loss minimization is the critical component for efficient electrical distribution power flow .The contribution of this work presents loss minimization in power distribution system through feeder restructuring, incorporating DG and placement of capacitor. The study of this work was conducted on IEEE distribution network and India Electricity Board benchmark distribution system. The executed experimental result of Indian system is recommended to board and implement practically for regulated stable output.

Keywords: Distribution system, Distributed Generation LossMinimization, Network Restructuring

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6233

7124 Study of Efficiency and Capability LZW++ Technique in Data Compression

Authors: Yusof. Mohd Kamir, Mat Deris. Mohd Sufian, Abidin. Ahmad Faisal Amri

Abstract:

The purpose of this paper is to show efficiency and capability LZWµ in data compression. The LZWµ technique is enhancement from existing LZW technique. The modification the existing LZW is needed to produce LZWµ technique. LZW read one by one character at one time. Differ with LZWµ technique, where the LZWµ read three characters at one time. This paper focuses on data compression and tested efficiency and capability LZWµ by different data format such as doc type, pdf type and text type. Several experiments have been done by different types of data format. The results shows LZWµ technique is better compared to existing LZW technique in term of file size.

Keywords: Data Compression, Huffman Encoding, LZW, LZWµ, RLL, Size.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2089

7123 Periodic Solutions for a Higher Order Nonlinear Neutral Functional Differential Equation

Authors: Yanling Zhu

Abstract:

In this paper, a higher order nonlinear neutral functional differential equation with distributed delay is studied by using the continuation theorem of coincidence degree theory. Some new results on the existence of periodic solutions are obtained.

Keywords: Neutral functional differential equation, higher order, periodic solution, coincidence degree theory.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1263

7122 A Self-Consistent Scheme for Elastic-Plastic Asperity Contact

Authors: Xu Jianguo

Abstract:

In this paper, a generalized self-consistent scheme, or “three phase model", is used to set up a micro-mechanics model for rough surface contact with randomly distributed asperities. The dimensionless average real pressure p is obtained as function of the ratio of the real contact area to the apparent contact area, 0 A / A r . Both elastic and plastic materials are considered, and the influence of the plasticity of material on p is discussed. Both two-dimensional and three-dimensional rough surface contact problems are considered.

Keywords: Contact mechanics, plastic deformation, self-consistent scheme.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1756

7121 Impact of Stack Caches: Locality Awareness and Cost Effectiveness

Authors: Abdulrahman K. Alshegaifi, Chun-Hsi Huang

Abstract:

Treating data based on its location in memory has received much attention in recent years due to its different properties, which offer important aspects for cache utilization. Stack data and non-stack data may interfere with each other’s locality in the data cache. One of the important aspects of stack data is that it has high spatial and temporal locality. In this work, we simulate non-unified cache design that split data cache into stack and non-stack caches in order to maintain stack data and non-stack data separate in different caches. We observe that the overall hit rate of non-unified cache design is sensitive to the size of non-stack cache. Then, we investigate the appropriate size and associativity for stack cache to achieve high hit ratio especially when over 99% of accesses are directed to stack cache. The result shows that on average more than 99% of stack cache accuracy is achieved by using 2KB of capacity and 1-way associativity. Further, we analyze the improvement in hit rate when adding small, fixed, size of stack cache at level1 to unified cache architecture. The result shows that the overall hit rate of unified cache design with adding 1KB of stack cache is improved by approximately, on average, 3.9% for Rijndael benchmark. The stack cache is simulated by using SimpleScalar toolset.

Keywords: Hit rate, Locality of program, Stack cache, and Stack data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1508

7120 Cross Project Software Fault Prediction at Design Phase

Authors: Pradeep Singh, Shrish Verma

Abstract:

Software fault prediction models are created by using the source code, processed metrics from the same or previous version of code and related fault data. Some company do not store and keep track of all artifacts which are required for software fault prediction. To construct fault prediction model for such company, the training data from the other projects can be one potential solution. Earlier we predicted the fault the less cost it requires to correct. The training data consists of metrics data and related fault data at function/module level. This paper investigates fault predictions at early stage using the cross-project data focusing on the design metrics. In this study, empirical analysis is carried out to validate design metrics for cross project fault prediction. The machine learning techniques used for evaluation is Naïve Bayes. The design phase metrics of other projects can be used as initial guideline for the projects where no previous fault data is available. We analyze seven datasets from NASA Metrics Data Program which offer design as well as code metrics. Overall, the results of cross project is comparable to the within company data learning.

Keywords: Software Metrics, Fault prediction, Cross project, Within project.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2546

7119 Numerical Calculation of the Ionization Energy of Donors in a Cubic Quantum well and Wire

Authors: Sara Sedaghat, Mahmood Barati, Iraj Kazeminezhad

Abstract:

The ionization energy in semiconductor systems in nano scale was investigated by using effective mass approximation. By introducing the Hamiltonian of the system, the variational technique was employed to calculate the ground state and the ionization energy of a donor at the center and in the case that the impurities are randomly distributed inside a cubic quantum well. The numerical results for GaAs/GaAlAs show that the ionization energy strongly depends on the well width for both cases and it decreases as the well width increases. The ionization energy of a quantum wire was also calculated and compared with the results for the well.

Keywords: quantum well, quantum wire, quantum dot, impuritystate

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1728

7118 Construction 4.0: The Future of the Construction Industry in South Africa

Authors: Temidayo. O. Osunsanmi, Clinton Aigbavboa, Ayodeji Oke

Abstract:

The construction industry is a renowned latecomer to the efficiency offered by the adoption of information technology. Whereas, the banking, manufacturing, retailing industries have keyed into the future by using digitization and information technology as a new approach for ensuring competitive gain and efficiency. The construction industry has yet to fully realize similar benefits because the adoption of ICT is still at the infancy stage with a major concentration on the use of software. Thus, this study evaluates the awareness and readiness of construction professionals towards embracing a full digitalization of the construction industry using construction 4.0. The term ‘construction 4.0’ was coined from the industry 4.0 concept which is regarded as the fourth industrial revolution that originated from Germany. A questionnaire was utilized for sourcing data distributed to practicing construction professionals through a convenience sampling method. Using SPSS v24, the hypotheses posed were tested with the Mann Whitney test. The result revealed that there are no differences between the consulting and contracting organizations on the readiness for adopting construction 4.0 concepts in the construction industry. Using factor analysis, the study discovers that adopting construction 4.0 will improve the performance of the construction industry regarding cost and time savings and also create sustainable buildings. In conclusion, the study determined that construction professionals have a low awareness towards construction 4.0 concepts. The study recommends an increase in awareness of construction 4.0 concepts through seminars, workshops and training, while construction professionals should take hold of the benefits of adopting construction 4.0 concepts. The study contributes to the roadmap for the implementation of construction industry 4.0 concepts in the South African construction industry.

Keywords: Building information technology, Construction 4.0, Industry 4.0, Smart Site.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5814

7117 Association between ADHD Medication, Cannabis, and Nicotine Use, Mental Distress, and Other Psychoactive Substances

Authors: Nicole Scott, Emily Dwyer, Cara Patrissy, Samantha Bonventre, Lina Begdache

Abstract:

Across North America, the use and abuse of Attention Deficit Hyperactivity Disorder (ADHD) medication, cannabis, nicotine, and other psychoactive substances across college campuses have become an increasingly prevalent problem. Students frequently use these substances to aid their studying or deal with their mental health issues. However, it is still unknown what psychoactive substances are likely to be abused when college students illicitly use ADHD medication. In addition, it is not clear which psychoactive substance is associated with mental distress. Thus, the purpose of this study is to fill these gaps by assessing the use of different psychoactive substances when illicit ADHD medication is used; and how this association relates to mental stress. A total of 702 undergraduate students from different college campuses in the US completed an anonymous survey distributed online. Data were self-reported on demographics, the use of ADHD medications, cannabis, nicotine, other psychoactive drugs, and mental distress, and feelings and opinions on the use of illicit study drugs were all included in the survey. Mental distress was assessed using the Kessler Psychological Distress 6 Scale. Data were analyzed in SPSS, Version 25.0, using Pearson’s Correlation Coefficient. Our results show use of ADHD medication, cannabis use (non-frequent and very frequent), and nicotine use (non-frequent and very frequent); there were both statistically significant positive and negative correlations to specific psychoactive substances and their corresponding frequencies. Along the same lines, ADHD medication, cannabis use (non-frequent and very frequent), and nicotine use (non-frequent and very frequent) had statistically significant positive and negative correlations to specific mental distress experiences. As these findings are combined, a vicious loop can initiate a cycle where individuals who abuse psychoactive substances may or may not be inclined to use other psychoactive substances. This may later inhibit brain functions in those main areas of the brain stem, amygdala, and prefrontal cortex where this vicious cycle may or may not impact their mental distress. Addressing the impact of study drug abuse and its potential to be associated with further substance abuse may provide an educational framework and support proactive approaches to promote awareness among college students.

Keywords: Stimulant, depressant, nicotine, ADHD medication, psychoactive substances, mental health, illicit, ecstasy, adrenochrome.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1318

7116 Extreme Temperature Forecast in Mbonge, Cameroon through Return Level Analysis of the Generalized Extreme Value (GEV) Distribution

Authors: Nkongho Ayuketang Arreyndip, Ebobenow Joseph

Abstract:

In this paper, temperature extremes are forecast by employing the block maxima method of the Generalized extreme value(GEV) distribution to analyse temperature data from the Cameroon Development Corporation (C.D.C). By considering two sets of data (Raw data and simulated data) and two (stationary and non-stationary) models of the GEV distribution, return levels analysis is carried out and it was found that in the stationary model, the return values are constant over time with the raw data while in the simulated data, the return values show an increasing trend but with an upper bound. In the non-stationary model, the return levels of both the raw data and simulated data show an increasing trend but with an upper bound. This clearly shows that temperatures in the tropics even-though show a sign of increasing in the future, there is a maximum temperature at which there is no exceedence. The results of this paper are very vital in Agricultural and Environmental research.

Keywords: Return level, Generalized extreme value (GEV), Meteorology, Forecasting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2106

7115 Mining News Sites to Create Special Domain News Collections

Authors: David B. Bracewell, Fuji Ren, Shingo Kuroiwa

Abstract:

We present a method to create special domain collections from news sites. The method only requires a single sample article as a seed. No prior corpus statistics are needed and the method is applicable to multiple languages. We examine various similarity measures and the creation of document collections for English and Japanese. The main contributions are as follows. First, the algorithm can build special domain collections from as little as one sample document. Second, unlike other algorithms it does not require a second “general" corpus to compute statistics. Third, in our testing the algorithm outperformed others in creating collections made up of highly relevant articles.

Keywords: Information Retrieval, News, Special DomainCollections,

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1487

7114 An Ant-based Clustering System for Knowledge Discovery in DNA Chip Analysis Data

Authors: Minsoo Lee, Yun-mi Kim, Yearn Jeong Kim, Yoon-kyung Lee, Hyejung Yoon

Abstract:

Biological data has several characteristics that strongly differentiate it from typical business data. It is much more complex, usually large in size, and continuously changes. Until recently business data has been the main target for discovering trends, patterns or future expectations. However, with the recent rise in biotechnology, the powerful technology that was used for analyzing business data is now being applied to biological data. With the advanced technology at hand, the main trend in biological research is rapidly changing from structural DNA analysis to understanding cellular functions of the DNA sequences. DNA chips are now being used to perform experiments and DNA analysis processes are being used by researchers. Clustering is one of the important processes used for grouping together similar entities. There are many clustering algorithms such as hierarchical clustering, self-organizing maps, K-means clustering and so on. In this paper, we propose a clustering algorithm that imitates the ecosystem taking into account the features of biological data. We implemented the system using an Ant-Colony clustering algorithm. The system decides the number of clusters automatically. The system processes the input biological data, runs the Ant-Colony algorithm, draws the Topic Map, assigns clusters to the genes and displays the output. We tested the algorithm with a test data of 100 to1000 genes and 24 samples and show promising results for applying this algorithm to clustering DNA chip data.

Keywords: Ant colony system, biological data, clustering, DNA chip.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1974

7113 XML Data Management in Compressed Relational Database

Authors: Hongzhi Wang, Jianzhong Li, Hong Gao

Abstract:

XML is an important standard of data exchange and representation. As a mature database system, using relational database to support XML data may bring some advantages. But storing XML in relational database has obvious redundancy that wastes disk space, bandwidth and disk I/O when querying XML data. For the efficiency of storage and query XML, it is necessary to use compressed XML data in relational database. In this paper, a compressed relational database technology supporting XML data is presented. Original relational storage structure is adaptive to XPath query process. The compression method keeps this feature. Besides traditional relational database techniques, additional query process technologies on compressed relations and for special structure for XML are presented. In this paper, technologies for XQuery process in compressed relational database are presented..

Keywords: XML, compression, query processing

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1806

7112 Framework for the Modeling of the Supply Chain Collaborative Planning Process

Authors: D. Pérez, M. M. E. Alemany

Abstract:

In this work, a framework to model the Supply Chain (SC) Collaborative Planning (CP) process is proposed. The main contributions of this framework concern 1) the presentation of the decision view, the most important one due to the characteristics of the process, jointly within the physical, organisation and information views, and 2) the simultaneous consideration of the spatial and temporal integration among the different supply chain decision centres. This framework provides the basis for a realistic and integrated perspective of the supply chain collaborative planning process and also the analytical modeling of each of its decisional activities.

Keywords: Collaborative Planning, Decision View, Distributed Decision-Making, Framework.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1324

7111 Event Monitoring Based On Web Services for Heterogeneous Event Sources

Authors: Arne Koschel

Abstract:

This article discusses event monitoring options for heterogeneous event sources as they are given in nowadays heterogeneous distributed information systems. It follows the central assumption, that a fully generic event monitoring solution cannot provide complete support for event monitoring; instead, event source specific semantics such as certain event types or support for certain event monitoring techniques have to be taken into account. Following from this, the core result of the work presented here is the extension of a configurable event monitoring (Web) service for a variety of event sources. A service approach allows us to trade genericity for the exploitation of source specific characteristics. It thus delivers results for the areas of SOA, Web services, CEP and EDA.

Keywords: Event monitoring, ECA, CEP, SOA, Web services.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2343

7110 Svision: Visual Identification of Scanning and Denial of Service Attacks

Authors: Iosif-Viorel Onut, Bin Zhu, Ali A. Ghorbani

Abstract:

We propose a novel graphical technique (SVision) for intrusion detection, which pictures the network as a community of hosts independently roaming in a 3D space defined by the set of services that they use. The aim of SVision is to graphically cluster the hosts into normal and abnormal ones, highlighting only the ones that are considered as a threat to the network. Our experimental results using DARPA 1999 and 2000 intrusion detection and evaluation datasets show the proposed technique as a good candidate for the detection of various threats of the network such as vertical and horizontal scanning, Denial of Service (DoS), and Distributed DoS (DDoS) attacks.

Keywords: Anomaly Visualization, Network Security, Intrusion Detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1711

7109 Gated Communities and Sense of Community: A Review on the Social Features of Gated Communities

Authors: R. Rafiemanzelat

Abstract:

Since the mid-1970s, gated communities distributed in Latin America. They are a kind of residential development where there are privatized public spaces, and access to the area is restricted. They have specific impacts on the neighborhoods that located outside their walls such as threatening security, limiting access, and spreading social inequality. This research mainly focused on social features of gated community as; segregation, fragmentation, exclusion, specifically on sense of community and typology of gated communities. The conclusion will clarify the pros and cons of gated communities and how it could be successful or not.

Keywords: Walled community, gated community, urban development, urban sociology, sense of community.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1927

7108 Upgraded Rough Clustering and Outlier Detection Method on Yeast Dataset by Entropy Rough K-Means Method

Authors: P. Ashok, G. M. Kadhar Nawaz

Abstract:

Rough set theory is used to handle uncertainty and incomplete information by applying two accurate sets, Lower approximation and Upper approximation. In this paper, the rough clustering algorithms are improved by adopting the Similarity, Dissimilarity–Similarity and Entropy based initial centroids selection method on three different clustering algorithms namely Entropy based Rough K-Means (ERKM), Similarity based Rough K-Means (SRKM) and Dissimilarity-Similarity based Rough K-Means (DSRKM) were developed and executed by yeast dataset. The rough clustering algorithms are validated by cluster validity indexes namely Rand and Adjusted Rand indexes. An experimental result shows that the ERKM clustering algorithm perform effectively and delivers better results than other clustering methods. Outlier detection is an important task in data mining and very much different from the rest of the objects in the clusters. Entropy based Rough Outlier Factor (EROF) method is seemly to detect outlier effectively for yeast dataset. In rough K-Means method, by tuning the epsilon (ᶓ) value from 0.8 to 1.08 can detect outliers on boundary region and the RKM algorithm delivers better results, when choosing the value of epsilon (ᶓ) in the specified range. An experimental result shows that the EROF method on clustering algorithm performed very well and suitable for detecting outlier effectively for all datasets. Further, experimental readings show that the ERKM clustering method outperformed the other methods.

Keywords: Clustering, Entropy, Outlier, Rough K-Means, validity index.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1412

7107 Improved K-Modes for Categorical Clustering Using Weighted Dissimilarity Measure

Authors: S.Aranganayagi, K.Thangavel

Abstract:

K-Modes is an extension of K-Means clustering algorithm, developed to cluster the categorical data, where the mean is replaced by the mode. The similarity measure proposed by Huang is the simple matching or mismatching measure. Weight of attribute values contribute much in clustering; thus in this paper we propose a new weighted dissimilarity measure for K-Modes, based on the ratio of frequency of attribute values in the cluster and in the data set. The new weighted measure is experimented with the data sets obtained from the UCI data repository. The results are compared with K-Modes and K-representative, which show that the new measure generates clusters with high purity.

Keywords: Clustering, categorical data, K-Modes, weighted dissimilarity measure

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3689

7106 Mobile Phone as a Tool for Data Collection in Field Research

Authors: Sandro Mourão, Karla Okada

Abstract:

The necessity of accurate and timely field data is shared among organizations engaged in fundamentally different activities, public services or commercial operations. Basically, there are three major components in the process of the qualitative research: data collection, interpretation and organization of data, and analytic process. Representative technological advancements in terms of innovation have been made in mobile devices (mobile phone, PDA-s, tablets, laptops, etc). Resources that can be potentially applied on the data collection activity for field researches in order to improve this process. This paper presents and discuss the main features of a mobile phone based solution for field data collection, composed of basically three modules: a survey editor, a server web application and a client mobile application. The data gathering process begins with the survey creation module, which enables the production of tailored questionnaires. The field workforce receives the questionnaire(s) on their mobile phones to collect the interviews responses and sending them back to a server for immediate analysis.

Keywords: Data Gathering, Field Research, Mobile Phone, Survey.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2058

7105 On Pooling Different Levels of Data in Estimating Parameters of Continuous Meta-Analysis

Authors: N. R. N. Idris, S. Baharom

Abstract:

A meta-analysis may be performed using aggregate data (AD) or an individual patient data (IPD). In practice, studies may be available at both IPD and AD level. In this situation, both the IPD and AD should be utilised in order to maximize the available information. Statistical advantages of combining the studies from different level have not been fully explored. This study aims to quantify the statistical benefits of including available IPD when conducting a conventional summary-level meta-analysis. Simulated meta-analysis were used to assess the influence of the levels of data on overall meta-analysis estimates based on IPD-only, AD-only and the combination of IPD and AD (mixed data, MD), under different study scenario. The percentage relative bias (PRB), root mean-square-error (RMSE) and coverage probability were used to assess the efficiency of the overall estimates. The results demonstrate that available IPD should always be included in a conventional meta-analysis using summary level data as they would significantly increased the accuracy of the estimates.On the other hand, if more than 80% of the available data are at IPD level, including the AD does not provide significant differences in terms of accuracy of the estimates. Additionally, combining the IPD and AD has moderating effects on the biasness of the estimates of the treatment effects as the IPD tends to overestimate the treatment effects, while the AD has the tendency to produce underestimated effect estimates. These results may provide some guide in deciding if significant benefit is gained by pooling the two levels of data when conducting meta-analysis.

Keywords: Aggregate data, combined-level data, Individual patient data, meta analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1740

7104 Multivariate Assessment of Mathematics Test Scores of Students in Qatar

Authors: Ali Rashash Alzahrani, Elizabeth Stojanovski

Abstract:

Data on various aspects of education are collected at the institutional and government level regularly. In Australia, for example, students at various levels of schooling undertake examinations in numeracy and literacy as part of NAPLAN testing, enabling longitudinal assessment of such data as well as comparisons between schools and states within Australia. Another source of educational data collected internationally is via the PISA study which collects data from several countries when students are approximately 15 years of age and enables comparisons in the performance of science, mathematics and English between countries as well as ranking of countries based on performance in these standardised tests. As well as student and school outcomes based on the tests taken as part of the PISA study, there is a wealth of other data collected in the study including parental demographics data and data related to teaching strategies used by educators. Overall, an abundance of educational data is available which has the potential to be used to help improve educational attainment and teaching of content in order to improve learning outcomes. A multivariate assessment of such data enables multiple variables to be considered simultaneously and will be used in the present study to help develop profiles of students based on performance in mathematics using data obtained from the PISA study.

Keywords: Cluster analysis, education, mathematics, profiles.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 892