Search results for: Classification rule mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1820

Search results for: Classification rule mining

950 An Effective Islanding Detection and Classification Method Using Neuro-Phase Space Technique

Authors: Aziah Khamis, H. Shareef

Abstract:

The purpose of planned islanding is to construct a power island during system disturbances which are commonly formed for maintenance purpose. However, in most of the cases island mode operation is not allowed. Therefore distributed generators (DGs) must sense the unplanned disconnection from the main grid. Passive technique is the most commonly used method for this purpose. However, it needs improvement in order to identify the islanding condition. In this paper an effective method for identification of islanding condition based on phase space and neural network techniques has been developed. The captured voltage waveforms at the coupling points of DGs are processed to extract the required features. For this purposed a method known as the phase space techniques is used. Based on extracted features, two neural network configuration namely radial basis function and probabilistic neural networks are trained to recognize the waveform class. According to the test result, the investigated technique can provide satisfactory identification of the islanding condition in the distribution system.

Keywords: Classification, Islanding detection, Neural network, Phase space.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2112
949 Automatic Musical Genre Classification Using Divergence and Average Information Measures

Authors: Hassan Ezzaidi, Jean Rouat

Abstract:

Recently many research has been conducted to retrieve pertinent parameters and adequate models for automatic music genre classification. In this paper, two measures based upon information theory concepts are investigated for mapping the features space to decision space. A Gaussian Mixture Model (GMM) is used as a baseline and reference system. Various strategies are proposed for training and testing sessions with matched or mismatched conditions, long training and long testing, long training and short testing. For all experiments, the file sections used for testing are never been used during training. With matched conditions all examined measures yield the best and similar scores (almost 100%). With mismatched conditions, the proposed measures yield better scores than the GMM baseline system, especially for the short testing case. It is also observed that the average discrimination information measure is most appropriate for music category classifications and on the other hand the divergence measure is more suitable for music subcategory classifications.

Keywords: Audio feature, information measures, music genre.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1559
948 Characterization, Classification and Agricultural Potentials of Soils on a Toposequence in Southern Guinea Savanna of Nigeria

Authors: B. A. Lawal, A. G. Ojanuga, P. A. Tsado, A. Mohammed

Abstract:

This work assessed some properties of three pedons on a toposequence in Ijah-Gbagyi district in Niger State, Nigeria. The pedons were designated as JG1, JG2 and JG3 representing the upper, middle and lower slopes respectively. The surface soil was characterized by dark yellowish brown (10YR3/4) color at the JG1 and JG2 and very dark grayish brown (10YR3/2) color at JG3. Sand dominated the mineral fraction and its content in the surface horizon decreased down the slope, whereas silt content increased down the slope due to sorting by geological and pedogenic processes. Although organic carbon (OC), total nitrogen (TN) and available phosphorus (P) were rated high, TN and available P decreased down the slope. High cation exchange capacity (CEC) was an indication that the soils have high potential for plant nutrients retention. The pedons were classified as Typic Haplustepts/ Haplic Cambisols (Eutric), Plinthic Petraquepts/ Petric Plinthosols (Abruptic) and Typic Endoaquepts/ Endogleyic Cambisols (Endoclayic).

Keywords: Ecological region, landscape positions, soil characterization, soil classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4301
947 Cluster Algorithm for Genetic Diversity

Authors: Manpreet Singh, Keerat Kaur, Bhavdeep Singh

Abstract:

With the hardware technology advancing, the cost of storing is decreasing. Thus there is an urgent need for new techniques and tools that can intelligently and automatically assist us in transferring this data into useful knowledge. Different techniques of data mining are developed which are helpful for handling these large size databases [7]. Data mining is also finding its role in the field of biotechnology. Pedigree means the associated ancestry of a crop variety. Genetic diversity is the variation in the genetic composition of individuals within or among species. Genetic diversity depends upon the pedigree information of the varieties. Parents at lower hierarchic levels have more weightage for predicting genetic diversity as compared to the upper hierarchic levels. The weightage decreases as the level increases. For crossbreeding, the two varieties should be more and more genetically diverse so as to incorporate the useful characters of the two varieties in the newly developed variety. This paper discusses the searching and analyzing of different possible pairs of varieties selected on the basis of morphological characters, Climatic conditions and Nutrients so as to obtain the most optimal pair that can produce the required crossbreed variety. An algorithm was developed to determine the genetic diversity between the selected wheat varieties. Cluster analysis technique is used for retrieving the results.

Keywords: Genetic diversity, pedigree, nutrients.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1781
946 Stabilization of Clay Soil Using A-3 Soil

Authors: Mohammed Mustapha Alhaji, Salawu Sadiku

Abstract:

A clay soil classified as A-7-6 and CH soil according to AASHTO and unified soil classification system respectively, was stabilized using A-3 soil (AASHTO soil classification system). The clay soil was replaced with 0%, 10%, 20%, to 100% A-3 soil, compacted at both British Standard Light (BSL) and British Standard Heavy (BSH) compaction energy levels and using Unconfined Compressive Strength (UCS) as evaluation criteria. The Maximum Dry Density (MDD) of the treated soils at both the BSL and BSH compaction energy levels showed increase from 0% to 40% A-3 soil replacement after which the values reduced to 100% replacement. The trend of the Optimum Moisture Content (OMC) with varied A-3 soil replacement was similar to that of MDD but in a reversed order. The OMC reduced from 0% to 40% A-3 soil replacement after which the values increased to 100% replacement. This trend was attributed to the observed reduction in void ratio from 0% to 40% replacement after which the void ratio increased to 100% replacement. The maximum UCS for the soil at varied A-3 soil replacement increased from 272 and 770 kN/m2 for BSL and BSH compaction energy level at 0% replacement to 295 and 795 kN/m2 for BSL and BSH compaction energy level respectively at 10% replacement after which the values reduced to 22 and 60 kN/m2 for BSL and BSH compaction energy level respectively at 70% replacement. Beyond 70% replacement, the mixtures could not be moulded for UCS test.

Keywords: A-3 soil, clay soil, pozzolanic action, stabilization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2377
945 Increasing the Capacity of Plant Bottlenecks by Using of Improving the Ratio of Mean Time between Failures to Mean Time to Repair

Authors: Jalal Soleimannejad, Mohammad Asadizeidabadi, Mahmoud Koorki, Mojtaba Azarpira

Abstract:

A significant percentage of production costs is the maintenance costs, and analysis of maintenance costs could to achieve greater productivity and competitiveness. With this is mind, the maintenance of machines and installations is considered as an essential part of organizational functions and applying effective strategies causes significant added value in manufacturing activities. Organizations are trying to achieve performance levels on a global scale with emphasis on creating competitive advantage by different methods consist of RCM (Reliability-Center-Maintenance), TPM (Total Productivity Maintenance) etc. In this study, increasing the capacity of Concentration Plant of Golgohar Iron Ore Mining & Industrial Company (GEG) was examined by using of reliability and maintainability analyses. The results of this research showed that instead of increasing the number of machines (in order to solve the bottleneck problems), the improving of reliability and maintainability would solve bottleneck problems in the best way. It should be mention that in the abovementioned study, the data set of Concentration Plant of GEG as a case study, was applied and analyzed.

Keywords: Bottleneck, Golgohar Iron Ore Mining and Industrial Company, maintainability, maintenance costs, reliability.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 938
944 Reducing SAGE Data Using Genetic Algorithms

Authors: Cheng-Hong Yang, Tsung-Mu Shih, Li-Yeh Chuang

Abstract:

Serial Analysis of Gene Expression is a powerful quantification technique for generating cell or tissue gene expression data. The profile of the gene expression of cell or tissue in several different states is difficult for biologists to analyze because of the large number of genes typically involved. However, feature selection in machine learning can successfully reduce this problem. The method allows reducing the features (genes) in specific SAGE data, and determines only relevant genes. In this study, we used a genetic algorithm to implement feature selection, and evaluate the classification accuracy of the selected features with the K-nearest neighbor method. In order to validate the proposed method, we used two SAGE data sets for testing. The results of this study conclusively prove that the number of features of the original SAGE data set can be significantly reduced and higher classification accuracy can be achieved.

Keywords: Serial Analysis of Gene Expression, Feature selection, Genetic Algorithm, K-nearest neighbor method.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1588
943 Classification of Health Risk Factors to Predict the Risk of Falling in Older Adults

Authors: L. Lindsay, S. A. Coleman, D. Kerr, B. J. Taylor, A. Moorhead

Abstract:

Cognitive decline and frailty is apparent in older adults leading to an increased likelihood of the risk of falling. Currently health care professionals have to make professional decisions regarding such risks, and hence make difficult decisions regarding the future welfare of the ageing population. This study uses health data from The Irish Longitudinal Study on Ageing (TILDA), focusing on adults over the age of 50 years, in order to analyse health risk factors and predict the likelihood of falls. This prediction is based on the use of machine learning algorithms whereby health risk factors are used as inputs to predict the likelihood of falling. Initial results show that health risk factors such as long-term health issues contribute to the number of falls. The identification of such health risk factors has the potential to inform health and social care professionals, older people and their family members in order to mitigate daily living risks.

Keywords: Classification, falls, health risk factors, machine learning, older adults.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1015
942 Uncontrollable Inaccuracy in Inverse Problems

Authors: Yu. Menshikov

Abstract:

In this paper the influence of errors of function derivatives in initial time which have been obtained by experiment (uncontrollable inaccuracy) to the results of inverse problem solution was investigated. It was shown that these errors distort the inverse problem solution as a rule near the beginning of interval where the solutions are analyzed. Several methods for removing the influence of uncontrollable inaccuracy have been suggested. 

Keywords: Inverse problems, uncontrollable inaccuracy, filtration.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1151
941 Improving Fake News Detection Using K-means and Support Vector Machine Approaches

Authors: Kasra Majbouri Yazdi, Adel Majbouri Yazdi, Saeid Khodayi, Jingyu Hou, Wanlei Zhou, Saeed Saedy

Abstract:

Fake news and false information are big challenges of all types of media, especially social media. There is a lot of false information, fake likes, views and duplicated accounts as big social networks such as Facebook and Twitter admitted. Most information appearing on social media is doubtful and in some cases misleading. They need to be detected as soon as possible to avoid a negative impact on society. The dimensions of the fake news datasets are growing rapidly, so to obtain a better result of detecting false information with less computation time and complexity, the dimensions need to be reduced. One of the best techniques of reducing data size is using feature selection method. The aim of this technique is to choose a feature subset from the original set to improve the classification performance. In this paper, a feature selection method is proposed with the integration of K-means clustering and Support Vector Machine (SVM) approaches which work in four steps. First, the similarities between all features are calculated. Then, features are divided into several clusters. Next, the final feature set is selected from all clusters, and finally, fake news is classified based on the final feature subset using the SVM method. The proposed method was evaluated by comparing its performance with other state-of-the-art methods on several specific benchmark datasets and the outcome showed a better classification of false information for our work. The detection performance was improved in two aspects. On the one hand, the detection runtime process decreased, and on the other hand, the classification accuracy increased because of the elimination of redundant features and the reduction of datasets dimensions.

Keywords: Fake news detection, feature selection, support vector machine, K-means clustering, machine learning, social media.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4454
940 Assessing Land Cover Change Trajectories in Olomouc, Czech Republic

Authors: Mukesh Singh Boori, Vít Voženílek

Abstract:

Olomouc is a unique and complex landmark with widespread forestation and land use. This research work was conducted to assess important and complex land use change trajectories in Olomouc region. Multi-temporal satellite data from 1991, 2001 and 2013 were used to extract land use/cover types by object oriented classification method. To achieve the objectives, three different aspects were used: (1) Calculate the quantity of each transition; (2) Allocate location based landscape pattern (3) Compare land use/cover evaluation procedure. Land cover change trajectories shows that 16.69% agriculture, 54.33% forest and 21.98% other areas (settlement, pasture and water-body) were stable in all three decade. Approximately 30% of the study area maintained as a same land cove type from 1991 to 2013. Here broad scale of political and socioeconomic factors was also affect the rate and direction of landscape changes. Distance from the settlements was the most important predictor of land cover change trajectories. This showed that most of landscape trajectories were caused by socio-economic activities and mainly led to virtuous change on the ecological environment.

Keywords: Remote Sensing, land use/cover, Change trajectories, Image classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2825
939 Educational Data Mining: The Case of Department of Mathematics and Computing in the Period 2009-2018

Authors: M. Sitoe, O. Zacarias

Abstract:

University education is influenced by several factors that range from the adoption of strategies to strengthen the whole process to the academic performance improvement of the students themselves. This work uses data mining techniques to develop a predictive model to identify students with a tendency to evasion and retention. To this end, a database of real students’ data from the Department of University Admission (DAU) and the Department of Mathematics and Informatics (DMI) was used. The data comprised 388 undergraduate students admitted in the years 2009 to 2014. The Weka tool was used for model building, using three different techniques, namely: K-nearest neighbor, random forest, and logistic regression. To allow for training on multiple train-test splits, a cross-validation approach was employed with a varying number of folds. To reduce bias variance and improve the performance of the models, ensemble methods of Bagging and Stacking were used. After comparing the results obtained by the three classifiers, Logistic Regression using Bagging with seven folds obtained the best performance, showing results above 90% in all evaluated metrics: accuracy, rate of true positives, and precision. Retention is the most common tendency.

Keywords: Evasion and retention, cross validation, bagging, stacking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 73
938 Author Profiling: Prediction of Learners’ Gender on a MOOC Platform Based on Learners’ Comments

Authors: Tahani Aljohani, Jialin Yu, Alexandra. I. Cristea

Abstract:

The more an educational system knows about a learner, the more personalised interaction it can provide, which leads to better learning. However, asking a learner directly is potentially disruptive, and often ignored by learners. Especially in the booming realm of MOOC Massive Online Learning platforms, only a very low percentage of users disclose demographic information about themselves. Thus, in this paper, we aim to predict learners’ demographic characteristics, by proposing an approach using linguistically motivated Deep Learning Architectures for Learner Profiling, particularly targeting gender prediction on a FutureLearn MOOC platform. Additionally, we tackle here the difficult problem of predicting the gender of learners based on their comments only – which are often available across MOOCs. The most common current approaches to text classification use the Long Short-Term Memory (LSTM) model, considering sentences as sequences. However, human language also has structures. In this research, rather than considering sentences as plain sequences, we hypothesise that higher semantic - and syntactic level sentence processing based on linguistics will render a richer representation. We thus evaluate, the traditional LSTM versus other bleeding edge models, which take into account syntactic structure, such as tree-structured LSTM, Stack-augmented Parser-Interpreter Neural Network (SPINN) and the Structure-Aware Tag Augmented model (SATA). Additionally, we explore using different word-level encoding functions. We have implemented these methods on Our MOOC dataset, which is the most performant one comparing with a public dataset on sentiment analysis that is further used as a cross-examining for the models' results.

Keywords: Deep learning, data mining, gender predication, MOOCs.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1329
937 Motor Imaginary Signal Classification Using Adaptive Recursive Bandpass Filter and Adaptive Autoregressive Models for Brain Machine Interface Designs

Authors: Vickneswaran Jeyabalan, Andrews Samraj, Loo Chu Kiong

Abstract:

The noteworthy point in the advancement of Brain Machine Interface (BMI) research is the ability to accurately extract features of the brain signals and to classify them into targeted control action with the easiest procedures since the expected beneficiaries are of disabled. In this paper, a new feature extraction method using the combination of adaptive band pass filters and adaptive autoregressive (AAR) modelling is proposed and applied to the classification of right and left motor imagery signals extracted from the brain. The introduction of the adaptive bandpass filter improves the characterization process of the autocorrelation functions of the AAR models, as it enhances and strengthens the EEG signal, which is noisy and stochastic in nature. The experimental results on the Graz BCI data set have shown that by implementing the proposed feature extraction method, a LDA and SVM classifier outperforms other AAR approaches of the BCI 2003 competition in terms of the mutual information, the competition criterion, or misclassification rate.

Keywords: Adaptive autoregressive, adaptive bandpass filter, brain machine Interface, EEG, motor imaginary.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2871
936 Site Selection of Traffic Camera based on Dempster-Shafer and Bagging Theory

Authors: S. Rokhsari, M. Delavar, A. Sadeghi-Niaraki, A. Abed-Elmdoust, B. Moshiri

Abstract:

Traffic incident has bad effect on all parts of society so controlling road networks with enough traffic devices could help to decrease number of accidents, so using the best method for optimum site selection of these devices could help to implement good monitoring system. This paper has considered here important criteria for optimum site selection of traffic camera based on aggregation methods such as Bagging and Dempster-Shafer concepts. In the first step, important criteria such as annual traffic flow, distance from critical places such as parks that need more traffic controlling were identified for selection of important road links for traffic camera installation, Then classification methods such as Artificial neural network and Decision tree algorithms were employed for classification of road links based on their importance for camera installation. Then for improving the result of classifiers aggregation methods such as Bagging and Dempster-Shafer theories were used.

Keywords: Aggregation, Bagging theory, Dempster-Shafer theory, Site selection

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1681
935 A Comparison of SVM-based Criteria in Evolutionary Method for Gene Selection and Classification of Microarray Data

Authors: Rameswar Debnath, Haruhisa Takahashi

Abstract:

An evolutionary method whose selection and recombination operations are based on generalization error-bounds of support vector machine (SVM) can select a subset of potentially informative genes for SVM classifier very efficiently [7]. In this paper, we will use the derivative of error-bound (first-order criteria) to select and recombine gene features in the evolutionary process, and compare the performance of the derivative of error-bound with the error-bound itself (zero-order) in the evolutionary process. We also investigate several error-bounds and their derivatives to compare the performance, and find the best criteria for gene selection and classification. We use 7 cancer-related human gene expression datasets to evaluate the performance of the zero-order and first-order criteria of error-bounds. Though both criteria have the same strategy in theoretically, experimental results demonstrate the best criterion for microarray gene expression data.

Keywords: support vector machine, generalization error-bound, feature selection, evolutionary algorithm, microarray data

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1513
934 An Improved K-Means Algorithm for Gene Expression Data Clustering

Authors: Billel Kenidra, Mohamed Benmohammed

Abstract:

Data mining technique used in the field of clustering is a subject of active research and assists in biological pattern recognition and extraction of new knowledge from raw data. Clustering means the act of partitioning an unlabeled dataset into groups of similar objects. Each group, called a cluster, consists of objects that are similar between themselves and dissimilar to objects of other groups. Several clustering methods are based on partitional clustering. This category attempts to directly decompose the dataset into a set of disjoint clusters leading to an integer number of clusters that optimizes a given criterion function. The criterion function may emphasize a local or a global structure of the data, and its optimization is an iterative relocation procedure. The K-Means algorithm is one of the most widely used partitional clustering techniques. Since K-Means is extremely sensitive to the initial choice of centers and a poor choice of centers may lead to a local optimum that is quite inferior to the global optimum, we propose a strategy to initiate K-Means centers. The improved K-Means algorithm is compared with the original K-Means, and the results prove how the efficiency has been significantly improved.

Keywords: Microarray data mining, biological pattern recognition, partitional clustering, k-means algorithm, centroid initialization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1256
933 Multivariate High Order Fuzzy Time Series Forecasting for Car Road Accidents

Authors: Tahseen A. Jilani, S. M. Aqil Burney, C. Ardil

Abstract:

In this paper, we have presented a new multivariate fuzzy time series forecasting method. This method assumes mfactors with one main factor of interest. History of past three years is used for making new forecasts. This new method is applied in forecasting total number of car accidents in Belgium using four secondary factors. We also make comparison of our proposed method with existing methods of fuzzy time series forecasting. Experimentally, it is shown that our proposed method perform better than existing fuzzy time series forecasting methods. Practically, actuaries are interested in analysis of the patterns of causalities in road accidents. Thus using fuzzy time series, actuaries can define fuzzy premium and fuzzy underwriting of car insurance and life insurance for car insurance. National Institute of Statistics, Belgium provides region of risk classification for each road. Thus using this risk classification, we can predict premium rate and underwriting of insurance policy holders.

Keywords: Average forecasting error rate (AFER), Fuzziness offuzzy sets Fuzzy, If-Then rules, Multivariate fuzzy time series.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2462
932 Clustering Categorical Data Using the K-Means Algorithm and the Attribute’s Relative Frequency

Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami

Abstract:

Clustering is a well known data mining technique used in pattern recognition and information retrieval. The initial dataset to be clustered can either contain categorical or numeric data. Each type of data has its own specific clustering algorithm. In this context, two algorithms are proposed: the k-means for clustering numeric datasets and the k-modes for categorical datasets. The main encountered problem in data mining applications is clustering categorical dataset so relevant in the datasets. One main issue to achieve the clustering process on categorical values is to transform the categorical attributes into numeric measures and directly apply the k-means algorithm instead the k-modes. In this paper, it is proposed to experiment an approach based on the previous issue by transforming the categorical values into numeric ones using the relative frequency of each modality in the attributes. The proposed approach is compared with a previously method based on transforming the categorical datasets into binary values. The scalability and accuracy of the two methods are experimented. The obtained results show that our proposed method outperforms the binary method in all cases.

Keywords: Clustering, k-means, categorical datasets, pattern recognition, unsupervised learning, knowledge discovery.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3511
931 Machine Learning Approach for Identifying Dementia from MRI Images

Authors: S. K. Aruna, S. Chitra

Abstract:

This research paper presents a framework for classifying Magnetic Resonance Imaging (MRI) images for Dementia. Dementia, an age-related cognitive decline is indicated by degeneration of cortical and sub-cortical structures. Characterizing morphological changes helps understand disease development and contributes to early prediction and prevention of the disease. Modelling, that captures the brain’s structural variability and which is valid in disease classification and interpretation is very challenging. Features are extracted using Gabor filter with 0, 30, 60, 90 orientations and Gray Level Co-occurrence Matrix (GLCM). It is proposed to normalize and fuse the features. Independent Component Analysis (ICA) selects features. Support Vector Machine (SVM) classifier with different kernels is evaluated, for efficiency to classify dementia. This study evaluates the presented framework using MRI images from OASIS dataset for identifying dementia. Results showed that the proposed feature fusion classifier achieves higher classification accuracy.

Keywords: Magnetic resonance imaging, dementia, Gabor filter, gray level co-occurrence matrix, support vector machine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2087
930 Exploring the Correlation between Population Distribution and Urban Heat Island under Urban Data: Taking Shenzhen Urban Heat Island as an Example

Authors: Wang Yang

Abstract:

Shenzhen is a modern city of China's reform and opening-up policy, the development of urban morphology has been established on the administration of the Chinese government. This city`s planning paradigm is primarily affected by the spatial structure and human behavior. The subjective urban agglomeration center is divided into several groups and centers. In comparisons of this effect, the city development law has better to be neglected. With the continuous development of the internet, extensive data technology has been introduced in China. Data mining and data analysis has become important tools in municipal research. Data mining has been utilized to improve data cleaning such as receiving business data, traffic data and population data. Prior to data mining, government data were collected by traditional means, then were analyzed using city-relationship research, delaying the timeliness of urban development, especially for the contemporary city. Data update speed is very fast and based on the Internet. The city's point of interest (POI) in the excavation serves as data source affecting the city design, while satellite remote sensing is used as a reference object, city analysis is conducted in both directions, the administrative paradigm of government is broken and urban research is restored. Therefore, the use of data mining in urban analysis is very important. The satellite remote sensing data of the Shenzhen city in July 2018 were measured by the satellite Modis sensor and can be utilized to perform land surface temperature inversion, and analyze city heat island distribution of Shenzhen. This article acquired and classified the data from Shenzhen by using Data crawler technology. Data of Shenzhen heat island and interest points were simulated and analyzed in the GIS platform to discover the main features of functional equivalent distribution influence. Shenzhen is located in the east-west area of China. The city’s main streets are also determined according to the direction of city development. Therefore, it is determined that the functional area of the city is also distributed in the east-west direction. The urban heat island can express the heat map according to the functional urban area. Regional POI has correspondence. The research result clearly explains that the distribution of the urban heat island and the distribution of urban POIs are one-to-one correspondence. Urban heat island is primarily influenced by the properties of the underlying surface, avoiding the impact of urban climate. Using urban POIs as analysis object, the distribution of municipal POIs and population aggregation are closely connected, so that the distribution of the population corresponded with the distribution of the urban heat island.

Keywords: POI, satellite remote sensing, the population distribution, urban heat island thermal map.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 883
929 Platform-as-a-Service Sticky Policies for Privacy Classification in the Cloud

Authors: Maha Shamseddine, Amjad Nusayr, Wassim Itani

Abstract:

In this paper, we present a Platform-as-a-Service (PaaS) model for controlling the privacy enforcement mechanisms applied on user data when stored and processed in Cloud data centers. The proposed architecture consists of establishing user configurable ‘sticky’ policies on the Graphical User Interface (GUI) data-bound components during the application development phase to specify the details of privacy enforcement on the contents of these components. Various privacy classification classes on the data components are formally defined to give the user full control on the degree and scope of privacy enforcement including the type of execution containers to process the data in the Cloud. This not only enhances the privacy-awareness of the developed Cloud services, but also results in major savings in performance and energy efficiency due to the fact that the privacy mechanisms are solely applied on sensitive data units and not on all the user content. The proposed design is implemented in a real PaaS cloud computing environment on the Microsoft Azure platform.

Keywords: Privacy enforcement, Platform-as-a-Service privacy awareness, cloud computing privacy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 728
928 A Preference-Based Multi-Agent Data Mining Framework for Social Network Service Users' Decision Making

Authors: Ileladewa Adeoye Abiodun, Cheng Wai Khuen

Abstract:

Multi-Agent Systems (MAS) emerged in the pursuit to improve our standard of living, and hence can manifest complex human behaviors such as communication, decision making, negotiation and self-organization. The Social Network Services (SNSs) have attracted millions of users, many of whom have integrated these sites into their daily practices. The domains of MAS and SNS have lots of similarities such as architecture, features and functions. Exploring social network users- behavior through multiagent model is therefore our research focus, in order to generate more accurate and meaningful information to SNS users. An application of MAS is the e-Auction and e-Rental services of the Universiti Cyber AgenT(UniCAT), a Social Network for students in Universiti Tunku Abdul Rahman (UTAR), Kampar, Malaysia, built around the Belief- Desire-Intention (BDI) model. However, in spite of the various advantages of the BDI model, it has also been discovered to have some shortcomings. This paper therefore proposes a multi-agent framework utilizing a modified BDI model- Belief-Desire-Intention in Dynamic and Uncertain Situations (BDIDUS), using UniCAT system as a case study.

Keywords: Distributed Data Mining, Multi-Agent Systems, Preference-Based, SNS.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1470
927 Automatic Building an Extensive Arabic FA Terms Dictionary

Authors: El-Sayed Atlam, Masao Fuketa, Kazuhiro Morita, Jun-ichi Aoe

Abstract:

Field Association (FA) terms are a limited set of discriminating terms that give us the knowledge to identify document fields which are effective in document classification, similar file retrieval and passage retrieval. But the problem lies in the lack of an effective method to extract automatically relevant Arabic FA Terms to build a comprehensive dictionary. Moreover, all previous studies are based on FA terms in English and Japanese, and the extension of FA terms to other language such Arabic could be definitely strengthen further researches. This paper presents a new method to extract, Arabic FA Terms from domain-specific corpora using part-of-speech (POS) pattern rules and corpora comparison. Experimental evaluation is carried out for 14 different fields using 251 MB of domain-specific corpora obtained from Arabic Wikipedia dumps and Alhyah news selected average of 2,825 FA Terms (single and compound) per field. From the experimental results, recall and precision are 84% and 79% respectively. Therefore, this method selects higher number of relevant Arabic FA Terms at high precision and recall.

Keywords: Arabic Field Association Terms, information extraction, document classification, information retrieval.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1715
926 Voltage Problem Location Classification Using Performance of Least Squares Support Vector Machine LS-SVM and Learning Vector Quantization LVQ

Authors: Khaled Abduesslam. M, Mohammed Ali, Basher H Alsdai, Muhammad Nizam, Inayati

Abstract:

This paper presents the voltage problem location classification using performance of Least Squares Support Vector Machine (LS-SVM) and Learning Vector Quantization (LVQ) in electrical power system for proper voltage problem location implemented by IEEE 39 bus New- England. The data was collected from the time domain simulation by using Power System Analysis Toolbox (PSAT). Outputs from simulation data such as voltage, phase angle, real power and reactive power were taken as input to estimate voltage stability at particular buses based on Power Transfer Stability Index (PTSI).The simulation data was carried out on the IEEE 39 bus test system by considering load bus increased on the system. To verify of the proposed LS-SVM its performance was compared to Learning Vector Quantization (LVQ). The results showed that LS-SVM is faster and better as compared to LVQ. The results also demonstrated that the LS-SVM was estimated by 0% misclassification whereas LVQ had 7.69% misclassification.

Keywords: IEEE 39 bus, Least Squares Support Vector Machine, Learning Vector Quantization, Voltage Collapse.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2387
925 Multi-Sensor Image Fusion for Visible and Infrared Thermal Images

Authors: Amit Kr. Happy

Abstract:

This paper is motivated by the importance of multi-sensor image fusion with specific focus on Infrared (IR) and Visible image (VI) fusion for various applications including military reconnaissance. Image fusion can be defined as the process of combining two or more source images into a single composite image with extended information content that improves visual perception or feature extraction. These images can be from different modalities like Visible camera & IR Thermal Imager. While visible images are captured by reflected radiations in the visible spectrum, the thermal images are formed from thermal radiation (IR) that may be reflected or self-emitted. A digital color camera captures the visible source image and a thermal IR camera acquires the thermal source image. In this paper, some image fusion algorithms based upon Multi-Scale Transform (MST) and region-based selection rule with consistency verification have been proposed and presented. This research includes implementation of the proposed image fusion algorithm in MATLAB along with a comparative analysis to decide the optimum number of levels for MST and the coefficient fusion rule. The results are presented, and several commonly used evaluation metrics are used to assess the suggested method's validity. Experiments show that the proposed approach is capable of producing good fusion results. While deploying our image fusion algorithm approaches, we observe several challenges from the popular image fusion methods. While high computational cost and complex processing steps of image fusion algorithms provide accurate fused results, but they also make it hard to become deployed in system and applications that require real-time operation, high flexibility and low computation ability. So, the methods presented in this paper offer good results with minimum time complexity.

Keywords: Image fusion, IR thermal imager, multi-sensor, Multi-Scale Transform.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 381
924 A Learning-Community Recommendation Approach for Web-Based Cooperative Learning

Authors: Jian-Wei Li, Yao-Tien Wang, Yi-Chun Chang

Abstract:

Cooperative learning has been defined as learners working together as a team to solve a problem to complete a task or to accomplish a common goal, which emphasizes the importance of interactions among members to promote the whole learning performance. With the popularity of society networks, cooperative learning is no longer limited to traditional classroom teaching activities. Since society networks facilitate to organize online learners, to establish common shared visions, and to advance learning interaction, the online community and online learning community have triggered the establishment of web-based societies. Numerous research literatures have indicated that the collaborative learning community is a critical issue to enhance learning performance. Hence, this paper proposes a learning community recommendation approach to facilitate that a learner joins the appropriate learning communities, which is based on k-nearest neighbor (kNN) classification. To demonstrate the viability of the proposed approach, the proposed approach is implemented for 117 students to recommend learning communities. The experimental results indicate that the proposed approach can effectively recommend appropriate learning communities for learners.

Keywords: k-nearest neighbor classification, learning community, Cooperative/Collaborative Learning and Environments.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1883
923 A Review: Comparative Analysis of Different Categorical Data Clustering Ensemble Methods

Authors: S. Sarumathi, N. Shanthi, M. Sharmila

Abstract:

Over the past epoch a rampant amount of work has been done in the data clustering research under the unsupervised learning technique in Data mining. Furthermore several algorithms and methods have been proposed focusing on clustering different data types, representation of cluster models, and accuracy rates of the clusters. However no single clustering algorithm proves to be the most efficient in providing best results. Accordingly in order to find the solution to this issue a new technique, called Cluster ensemble method was bloomed. This cluster ensemble is a good alternative approach for facing the cluster analysis problem. The main hope of the cluster ensemble is to merge different clustering solutions in such a way to achieve accuracy and to improve the quality of individual data clustering. Due to the substantial and unremitting development of new methods in the sphere of data mining and also the incessant interest in inventing new algorithms, makes obligatory to scrutinize a critical analysis of the existing techniques and the future novelty. This paper exposes the comparative study of different cluster ensemble methods along with their features, systematic working process and the average accuracy and error rates of each ensemble methods. Consequently this speculative and comprehensive analysis will be very useful for the community of clustering practitioners and also helps in deciding the most suitable one to rectify the problem in hand.

Keywords: Clustering, Cluster Ensemble methods, Co-association matrix, Consensus function, Median partition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2577
922 A Real-Time Specific Weed Recognition System Using Statistical Methods

Authors: Imran Ahmed, Muhammad Islam, Syed Inayat Ali Shah, Awais Adnan

Abstract:

The identification and classification of weeds are of major technical and economical importance in the agricultural industry. To automate these activities, like in shape, color and texture, weed control system is feasible. The goal of this paper is to build a real-time, machine vision weed control system that can detect weed locations. In order to accomplish this objective, a real-time robotic system is developed to identify and locate outdoor plants using machine vision technology and pattern recognition. The algorithm is developed to classify images into broad and narrow class for real-time selective herbicide application. The developed algorithm has been tested on weeds at various locations, which have shown that the algorithm to be very effectiveness in weed identification. Further the results show a very reliable performance on weeds under varying field conditions. The analysis of the results shows over 90 percent classification accuracy over 140 sample images (broad and narrow) with 70 samples from each category of weeds.

Keywords: Weed detection, Image Processing, real-timerecognition, Standard Deviation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2239
921 Validation and Selection between Machine Learning Technique and Traditional Methods to Reduce Bullwhip Effects: a Data Mining Approach

Authors: Hamid R. S. Mojaveri, Seyed S. Mousavi, Mojtaba Heydar, Ahmad Aminian

Abstract:

The aim of this paper is to present a methodology in three steps to forecast supply chain demand. In first step, various data mining techniques are applied in order to prepare data for entering into forecasting models. In second step, the modeling step, an artificial neural network and support vector machine is presented after defining Mean Absolute Percentage Error index for measuring error. The structure of artificial neural network is selected based on previous researchers' results and in this article the accuracy of network is increased by using sensitivity analysis. The best forecast for classical forecasting methods (Moving Average, Exponential Smoothing, and Exponential Smoothing with Trend) is resulted based on prepared data and this forecast is compared with result of support vector machine and proposed artificial neural network. The results show that artificial neural network can forecast more precisely in comparison with other methods. Finally, forecasting methods' stability is analyzed by using raw data and even the effectiveness of clustering analysis is measured.

Keywords: Artificial Neural Networks (ANN), bullwhip effect, demand forecasting, Support Vector Machine (SVM).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1985