Search results for: Spatial data mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7903

Search results for: Spatial data mining

7633 Segmentation of Breast Lesions in Ultrasound Images Using Spatial Fuzzy Clustering and Structure Tensors

Authors: Yan Xu, Toshihiro Nishimura

Abstract:

Segmentation in ultrasound images is challenging due to the interference from speckle noise and fuzziness of boundaries. In this paper, a segmentation scheme using fuzzy c-means (FCM) clustering incorporating both intensity and texture information of images is proposed to extract breast lesions in ultrasound images. Firstly, the nonlinear structure tensor, which can facilitate to refine the edges detected by intensity, is used to extract speckle texture. And then, a spatial FCM clustering is applied on the image feature space for segmentation. In the experiments with simulated and clinical ultrasound images, the spatial FCM clustering with both intensity and texture information gets more accurate results than the conventional FCM or spatial FCM without texture information.

Keywords: fuzzy c-means, spatial information, structure tensor, ultrasound image segmentation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1747
7632 Performance Comparison of ADTree and Naive Bayes Algorithms for Spam Filtering

Authors: Thanh Nguyen, Andrei Doncescu, Pierre Siegel

Abstract:

Classification is an important data mining technique and could be used as data filtering in artificial intelligence. The broad application of classification for all kind of data leads to be used in nearly every field of our modern life. Classification helps us to put together different items according to the feature items decided as interesting and useful. In this paper, we compare two classification methods Naïve Bayes and ADTree use to detect spam e-mail. This choice is motivated by the fact that Naive Bayes algorithm is based on probability calculus while ADTree algorithm is based on decision tree. The parameter settings of the above classifiers use the maximization of true positive rate and minimization of false positive rate. The experiment results present classification accuracy and cost analysis in view of optimal classifier choice for Spam Detection. It is point out the number of attributes to obtain a tradeoff between number of them and the classification accuracy.

Keywords: Classification, data mining, spam filtering, naive Bayes, decision tree.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1448
7631 Benefits and Issues of Open-Cut Coal Mining on the Socio-Economic Environment - The Iban Community in Mukah, Sarawak, Malaysia

Authors: Edward Lim

Abstract:

This paper deals principally with the socio-economic impact on the local Iban community in Mukah Division, Sarawak; with the commencement of the open-cut coal mining industry since 2003. To-date there are no actual studies being carried out by either the public or private sector to truly analyze how the Iban community is coping with the advent of a large influx of cash into their society. The Iban community has traditionally been practicing shifting cultivation and farming of domesticated animals; with a portion of the younger generation working as laborers and professional. This paper represents the views and observations of the author supported by some statistical facts extracted from published articles and non-published reports. The paper deals primarily in the following areas: • Background of the coal mining industry in Mukah Division, Sarawak; • Benefits of the coal mining industry towards the Iban community; • Issues / Problems arise in the Iban community because of the presence of the coal mining industry; and • Possible actions that need to be taken to overcome these issues/ problems.

Keywords: Coal Mining, Iban Community, Malaysia, Sub-Bituminous Coal.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2402
7630 Apply Super-SVA to SAR Imaging with Both Aperture Gaps and Bandwidth Gaps

Authors: Wenshuai Zhai, Yunhua Zhang

Abstract:

Synthetic aperture radar (SAR) imaging usually requires echo data collected continuously pulse by pulse with certain bandwidth. However in real situation, data collection or part of signal spectrum can be interrupted due to various reasons, i.e. there will be gaps in spatial spectrum. In this case we need to find ways to fill out the resulted gaps and get image with defined resolution. In this paper we introduce our work on how to apply iterative spatially variant apodization (Super-SVA) technique to extrapolate the spatial spectrum in both azimuthal and range directions so as to fill out the gaps and get correct radar image.

Keywords: SAR imaging, Sparse aperture, Stepped frequencychirp signal, high resolution, Super-SVA

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1913
7629 An Application for Web Mining Systems with Services Oriented Architecture

Authors: Thiago M. R. Dias, Gray F. Moita, Paulo E. M. Almeida

Abstract:

Although the World Wide Web is considered the largest source of information there exists nowadays, due to its inherent dynamic characteristics, the task of finding useful and qualified information can become a very frustrating experience. This study presents a research on the information mining systems in the Web; and proposes an implementation of these systems by means of components that can be built using the technology of Web services. This implies that they can encompass features offered by a services oriented architecture (SOA) and specific components may be used by other tools, independent of platforms or programming languages. Hence, the main objective of this work is to provide an architecture to Web mining systems, divided into stages, where each step is a component that will incorporate the characteristics of SOA. The separation of these steps was designed based upon the existing literature. Interesting results were obtained and are shown here.

Keywords: Web Mining, Service Oriented Architecture, WebServices.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1430
7628 Analysis of Road Repairs in Undermined Areas

Authors: Tomáš Seidler, Marek Mihola, Denisa Cihlarova

Abstract:

The article presents analysis results of maps of expected subsidence in undermined areas for road repair management. The analysis was done in the area of Karvina district in the Czech Republic, including undermined areas with ongoing deep mining activities or finished deep mining in years 2003 - 2009. The article discusses the possibilities of local road maintenance authorities to determine areas that will need most repairs in the future with limited data available. Using the expected subsidence maps new map of surface curvature was calculated. Combined with road maps and historical data about repairs the result came for five main categories of undermined areas, proving very simple tool for management.

Keywords: GIS, Map of Subsidence, Road, Undermined Area

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1266
7627 An Exploration of the Dimensions of Place-Making: A South African Case Study

Authors: W. J. Strydom, K. Puren

Abstract:

Place-making is viewed here as an empowering process in which people represent, improve and maintain their spatial (natural or built) environment. With the above-mentioned in mind, place-making is multi-dimensional and include a spatial dimension (including visual properties or the end product/plan), a procedural dimension during which (negotiation/discussion of ideas with all relevant stakeholders in terms of end product/plan) and a psychological dimension (inclusion of intrinsic values and meanings related to a place in the end product/plan). These three represent dimensions of place-making. The purpose of this paper is to explore these dimensions of place-making in a case study of a local community in Ikageng, Potchefstroom, North-West Province, South Africa. This case study represents an inclusive process that strives to empower a local community (forcefully relocated due to Apartheid legislation in South Africa). This case study focussed on the inclusion of participants in the decision-making process regarding their daily environment. By means of focus group discussions and a collaborative design workshop, data is generated and ultimately creates a linkage with the theoretical dimensions of place-making. This paper contributes to the field of spatial planning due to the exploration of the dimensions of place-making and the relevancy of this process on spatial planning (especially in a South African setting).

Keywords: Case study, place-making, spatial planning, spatial dimension, procedural dimension, psychological dimension.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1648
7626 The Use of Classifiers in Image Analysis of Oil Wells Profiling Process and the Automatic Identification of Events

Authors: Jaqueline M. R. Vieira

Abstract:

Different strategies and tools are available at the oil and gas industry for detecting and analyzing tension and possible fractures in borehole walls. Most of these techniques are based on manual observation of the captured borehole images. While this strategy may be possible and convenient with small images and few data, it may become difficult and suitable to errors when big databases of images must be treated. While the patterns may differ among the image area, depending on many characteristics (drilling strategy, rock components, rock strength, etc.). In this work we propose the inclusion of data-mining classification strategies in order to create a knowledge database of the segmented curves. These classifiers allow that, after some time using and manually pointing parts of borehole images that correspond to tension regions and breakout areas, the system will indicate and suggest automatically new candidate regions, with higher accuracy. We suggest the use of different classifiers methods, in order to achieve different knowledge dataset configurations.

Keywords: Brazil, classifiers, data-mining, Image Segmentation, oil well visualization, classifiers.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2508
7625 Gender Differences in Spatial Navigation

Authors: Bia Kim, Sewon Lee, Jaesik Lee

Abstract:

This study aims to investigate the gender differences in spatial navigation using the tasks of 2-D matrix navigation and recognition of real driving scene. The results can be summarized as followings. First, female subjects responded faster in 2-D matrix navigation task than male subjects when landmark instructions were provided. Second, in recognition task, male subjects recognized the key elements involved in the past driving scene more accurately than female subjects. In particular, female subjects tended to miss peripheral information. These results suggest the possibility of gender differences in spatial navigation.

Keywords: Gender differences, Spatial navigation, 2-D matrixnavigation, Recognition of driving scene.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2688
7624 A Comprehensive Review on Different Mixed Data Clustering Ensemble Methods

Authors: S. Sarumathi, N. Shanthi, S. Vidhya, M. Sharmila

Abstract:

An extensive amount of work has been done in data clustering research under the unsupervised learning technique in Data Mining during the past two decades. Moreover, several approaches and methods have been emerged focusing on clustering diverse data types, features of cluster models and similarity rates of clusters. However, none of the single clustering algorithm exemplifies its best nature in extracting efficient clusters. Consequently, in order to rectify this issue, a new challenging technique called Cluster Ensemble method was bloomed. This new approach tends to be the alternative method for the cluster analysis problem. The main objective of the Cluster Ensemble is to aggregate the diverse clustering solutions in such a way to attain accuracy and also to improve the eminence the individual clustering algorithms. Due to the massive and rapid development of new methods in the globe of data mining, it is highly mandatory to scrutinize a vital analysis of existing techniques and the future novelty. This paper shows the comparative analysis of different cluster ensemble methods along with their methodologies and salient features. Henceforth this unambiguous analysis will be very useful for the society of clustering experts and also helps in deciding the most appropriate one to resolve the problem in hand.

Keywords: Clustering, Cluster Ensemble Methods, Coassociation matrix, Consensus Function, Median Partition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2069
7623 Anomaly Based On Frequent-Outlier for Outbreak Detection in Public Health Surveillance

Authors: Zalizah Awang Long, Abdul Razak Hamdan, Azuraliza Abu Bakar

Abstract:

Public health surveillance system focuses on outbreak detection and data sources used. Variation or aberration in the frequency distribution of health data, compared to historical data is often used to detect outbreaks. It is important that new techniques be developed to improve the detection rate, thereby reducing wastage of resources in public health. Thus, the objective is to developed technique by applying frequent mining and outlier mining techniques in outbreak detection. 14 datasets from the UCI were tested on the proposed technique. The performance of the effectiveness for each technique was measured by t-test. The overall performance shows that DTK can be used to detect outlier within frequent dataset. In conclusion the outbreak detection technique using anomaly-based on frequent-outlier technique can be used to identify the outlier within frequent dataset.

Keywords: Outlier detection, frequent-outlier, outbreak, anomaly, surveillance, public health

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2231
7622 An Ontology for Spatial Relevant Objects in a Location-aware System: Case Study: A Tourist Guide System

Authors: N. Neysani Samany, M.R. Delavar, N. Chrisman, M.R. Malek

Abstract:

Location-aware computing is a type of pervasive computing that utilizes user-s location as a dominant factor for providing urban services and application-related usages. One of the important urban services is navigation instruction for wayfinders in a city especially when the user is a tourist. The services which are presented to the tourists should provide adapted location aware instructions. In order to achieve this goal, the main challenge is to find spatial relevant objects and location-dependent information. The aim of this paper is the development of a reusable location-aware model to handle spatial relevancy parameters in urban location-aware systems. In this way we utilized ontology as an approach which could manage spatial relevancy by defining a generic model. Our contribution is the introduction of an ontological model based on the directed interval algebra principles. Indeed, it is assumed that the basic elements of our ontology are the spatial intervals for the user and his/her related contexts. The relationships between them would model the spatial relevancy parameters. The implementation language for the model is OWLs, a web ontology language. The achieved results show that our proposed location-aware model and the application adaptation strategies provide appropriate services for the user.

Keywords: Spatial relevancy, Context-aware, Ontology, Modeling

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1600
7621 A Review: Comparative Analysis of Different Categorical Data Clustering Ensemble Methods

Authors: S. Sarumathi, N. Shanthi, M. Sharmila

Abstract:

Over the past epoch a rampant amount of work has been done in the data clustering research under the unsupervised learning technique in Data mining. Furthermore several algorithms and methods have been proposed focusing on clustering different data types, representation of cluster models, and accuracy rates of the clusters. However no single clustering algorithm proves to be the most efficient in providing best results. Accordingly in order to find the solution to this issue a new technique, called Cluster ensemble method was bloomed. This cluster ensemble is a good alternative approach for facing the cluster analysis problem. The main hope of the cluster ensemble is to merge different clustering solutions in such a way to achieve accuracy and to improve the quality of individual data clustering. Due to the substantial and unremitting development of new methods in the sphere of data mining and also the incessant interest in inventing new algorithms, makes obligatory to scrutinize a critical analysis of the existing techniques and the future novelty. This paper exposes the comparative study of different cluster ensemble methods along with their features, systematic working process and the average accuracy and error rates of each ensemble methods. Consequently this speculative and comprehensive analysis will be very useful for the community of clustering practitioners and also helps in deciding the most suitable one to rectify the problem in hand.

Keywords: Clustering, Cluster Ensemble methods, Co-association matrix, Consensus function, Median partition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2555
7620 Improving University Operations with Data Mining: Predicting Student Performance

Authors: Mladen Dragičević, Mirjana Pejić Bach, Vanja Šimičević

Abstract:

The purpose of this paper is to develop models that would enable predicting student success. These models could improve allocation of students among colleges and optimize the newly introduced model of government subsidies for higher education. For the purpose of collecting data, an anonymous survey was carried out in the last year of undergraduate degree student population using random sampling method. Decision trees were created of which two have been chosen that were most successful in predicting student success based on two criteria: Grade Point Average (GPA) and time that a student needs to finish the undergraduate program (time-to-degree). Decision trees have been shown as a good method of classification student success and they could be even more improved by increasing survey sample and developing specialized decision trees for each type of college. These types of methods have a big potential for use in decision support systems.

Keywords: Data mining, knowledge discovery in databases, prediction models, student success.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2455
7619 Roller Guide Design and Manufacturing for Spatial Cylindrical Cams

Authors: Yuan L. Lai, Jui P. Hung, Jian H. Chen

Abstract:

This paper was aimed at developing a computer aided design and manufacturing system for spatial cylindrical cams. In the proposed system, a milling tool with a diameter smaller than that of the roller, instead of the standard cutter for traditional machining process, was used to generate the tool path for spatial cams. To verify the feasibility of the proposed method, a multi-axis machining simulation software was further used to simulate the practical milling operation of spatial cams. It was observed from computer simulation that the tool path of small-sized cutter were within the motion range of a standard cutter, no occurrence of overcutting. Examination of a finished cam component clearly verifies the accuracy of the tool path generated for small-sized milling tool. It is believed that the use of small-sized cutter for the machining of the spatial cylindrical cams can generate a better surface morphology with higher accuracy. The improvement in efficiency and cost for the manufacturing of the spatial cylindrical cam can be expected through the proposed method.

Keywords: Cylindrical cams, Computer-aided manufacturing, Tool path.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3396
7618 The Role of People and Data in Complex Spatial-Related Long-Term Decisions: A Case Study of Capital Project Management Groups

Authors: Peter Boyes, Sarah Sharples, Paul Tennent, Gary Priestnall, Jeremy Morley

Abstract:

Significant long-term investment projects can involve complex decisions. These are often described as capital projects and the factors that contribute to their complexity include budgets, motivating reasons for investment, stakeholder involvement, interdependent projects, and the delivery phases required. The complexity of these projects often requires management groups to be established involving stakeholder representatives, these teams are inherently multidisciplinary. This study uses two university campus capital projects as case studies for this type of management group. Due to the interaction of projects with wider campus infrastructure and users, decisions are made at varying spatial granularity throughout the project lifespan. This spatial-related context brings complexity to the group decisions. Sensemaking is the process used to achieve group situational awareness of a complex situation, enabling the team to arrive at a consensus and make a decision. The purpose of this study is to understand the role of people and data in complex spatial related long-term decision and sensemaking processes. The paper aims to identify and present issues experienced in practical settings of these types of decision. A series of exploratory semi-structured interviews with members of the two projects elicit an understanding of their operation. From two stages of thematic analysis, inductive and deductive, emergent themes are identified around the group structure, the data usage, and the decision making within these groups. When data were made available to the group, there were commonly issues with perception of veracity and validity of the data presented; this impacted the ability of the group to reach consensus and therefore for decision to be made. Similarly, there were different responses to forecasted or modelled data, shaped by the experience and occupation of the individuals within the multidisciplinary management group. This paper provides an understanding of further support required for team sensemaking and decision making in complex capital projects. The paper also discusses the barriers found to effective decision making in this setting and suggests opportunities to develop decision support systems in this team strategic decision-making process. Recommendations are made for further research into the sensemaking and decision-making process of this complex spatial-related setting.

Keywords: decision making, decisions under uncertainty, real decisions, sensemaking, spatial, team decision making

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 411
7617 Optimization of Air Pollution Control Model for Mining

Authors: Zunaira Asif, Zhi Chen

Abstract:

The sustainable measures on air quality management are recognized as one of the most serious environmental concerns in the mining region. The mining operations emit various types of pollutants which have significant impacts on the environment. This study presents a stochastic control strategy by developing the air pollution control model to achieve a cost-effective solution. The optimization method is formulated to predict the cost of treatment using linear programming with an objective function and multi-constraints. The constraints mainly focus on two factors which are: production of metal should not exceed the available resources, and air quality should meet the standard criteria of the pollutant. The applicability of this model is explored through a case study of an open pit metal mine, Utah, USA. This method simultaneously uses meteorological data as a dispersion transfer function to support the practical local conditions. The probabilistic analysis and the uncertainties in the meteorological conditions are accomplished by Monte Carlo simulation. Reasonable results have been obtained to select the optimized treatment technology for PM2.5, PM10, NOx, and SO2. Additional comparison analysis shows that baghouse is the least cost option as compared to electrostatic precipitator and wet scrubbers for particulate matter, whereas non-selective catalytical reduction and dry-flue gas desulfurization are suitable for NOx and SO2 reduction respectively. Thus, this model can aid planners to reduce these pollutants at a marginal cost by suggesting control pollution devices, while accounting for dynamic meteorological conditions and mining activities.

Keywords: Air pollution, linear programming, mining, optimization, treatment technologies.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1534
7616 An Analysis on the Appropriateness and Effectiveness of CCTV Location for Crime Prevention

Authors: Tae-Heon Moon, Sun-Young Heo, Sang-Ho Lee, Youn-Taik Leem, Kwang-Woo Nam

Abstract:

This study aims to investigate the possibility of crime prevention through CCTV by analyzing the appropriateness of the CCTV location, whether it is installed in the hotspot of crime-prone areas, and exploring the crime prevention effect and transition effect. The real crime and CCTV locations of case city were converted into the spatial data by using GIS. The data was analyzed by hotspot analysis and weighted displacement quotient (WDQ). As study methods, it analyzed existing relevant studies for identifying the trends of CCTV and crime studies based on big data from 1800 to 2014 and understanding the relation between CCTV and crime. Second, it investigated the current situation of nationwide CCTVs and analyzed the guidelines of CCTV installation and operation to draw attention to the problems and indicating points of CCTV use. Third, it investigated the crime occurrence in case areas and the current situation of CCTV installation in the spatial aspects, and analyzed the appropriateness and effectiveness of CCTV installation to suggest a rational installation of CCTV and the strategic direction of crime prevention. The results demonstrate that there was no significant effect in the installation of CCTV on crime prevention in the case area. This indicates that CCTV should be installed and managed in a more scientific way reflecting local crime situations. In terms of CCTV, the methods of spatial analysis such as GIS, which can evaluate the installation effect, and the methods of economic analysis like cost-benefit analysis should be developed. In addition, these methods should be distributed to local governments across the nation for the appropriate installation of CCTV and operation. This study intended to find a design guideline of the optimum CCTV installation. In this regard, this study is meaningful in that it will contribute to the creation of a safe city.

Keywords: CCTV, Safe City, Crime Prevention, Spatial Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2612
7615 Spatial Distribution of Local Sheep Breeds in Antalya Province

Authors: Serife Gulden Yilmaz, Suleyman Karaman

Abstract:

Sheep breeding is important in terms of meeting both the demand of red meat consumption and the availability of industrial raw materials and the employment of the rural sector in Turkey. It is also very important to ensure the selection and continuity of the breeds that are raised in order to increase quality and productive products related to sheep breeding. The protection of local breeds and crossbreds also enables the development of the sector in the region and the reduction of imports. In this study, the data were obtained from the records of the Turkish Statistical Institute and Antalya Sheep & Goat Breeders' Association. Spatial distribution of sheep breeds in Antalya is reviewed statistically in terms of concentration at the local level for 2015 period spatially. For this reason; mapping, box plot, linear regression are used in this study. Concentration is introduced by means of studbook data on sheep breeding as locals and total sheep farm by mapping. It is observed that Pırlak breed (17.5%) and Merinos crossbreed (16.3%) have the highest concentration in the region. These breeds are respectively followed by Akkaraman breed (11%), Pirlak crossbreed (8%), Merinos breed (7.9%) Akkaraman crossbreed (7.9%) and Ivesi breed (7.2%).

Keywords: Antalya, sheep breeds, spatial distribution, local.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1177
7614 A Novel Approach to Optimal Cutting Tool Replacement

Authors: Cem Karacal, Sohyung Cho, William Yu

Abstract:

In metal cutting industries, mathematical/statistical models are typically used to predict tool replacement time. These off-line methods usually result in less than optimum replacement time thereby either wasting resources or causing quality problems. The few online real-time methods proposed use indirect measurement techniques and are prone to similar errors. Our idea is based on identifying the optimal replacement time using an electronic nose to detect the airborne compounds released when the tool wear reaches to a chemical substrate doped into tool material during the fabrication. The study investigates the feasibility of the idea, possible doping materials and methods along with data stream mining techniques for detection and monitoring different phases of tool wear.

Keywords: Tool condition monitoring, cutting tool replacement, data stream mining, e-Nose.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1849
7613 Analysis of Impact of Land Use Regulations against Urban Spatial Structure - Centering around Shiheung City

Authors: Chang-il Kang, Yoon-Hong Park, Tae-Hyun Kim, Yu Wann

Abstract:

In this paper, we analyzed the pattern of urban spatial structure of Siheung City that had been divided into two parts and presented alternative plans in order to get rid of these phenomena. Concerning patterns of urban spatial structure, we examined it through means of analyzing status of land use, population density and distribution of residence, status of distribution of main facilities, medical facilities, status of distribution of cultural facilities, distribution of land prices and traffic volume trends. The results of study revealed that status of facilities distribution and distribution of land prices, etc. were bisected by the surrounding area of former municipal office and the district of Sihwa, which were both regarded as one apex of the city divide, forming a duo-centric city. In order to get rid of this problem concerned with urban spatial structure that has been bisected, it is required that measures in order to expand facilities in Siheung City should be taken.

Keywords: Urban Spatial Structure, Duo-centric City, Siheung City.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1197
7612 A Recommender System Fusing Collaborative Filtering and User’s Review Mining

Authors: Seulbi Choi, Hyunchul Ahn

Abstract:

Collaborative filtering (CF) algorithm has been popularly used for recommender systems in both academic and practical applications. It basically generates recommendation results using users’ numeric ratings. However, the additional use of the information other than user ratings may lead to better accuracy of CF. Considering that a lot of people are likely to share their honest opinion on the items they purchased recently due to the advent of the Web 2.0, user's review can be regarded as the new informative source for identifying user's preference with accuracy. Under this background, this study presents a hybrid recommender system that fuses CF and user's review mining. Our system adopts conventional memory-based CF, but it is designed to use both user’s numeric ratings and his/her text reviews on the items when calculating similarities between users.

Keywords: Recommender system, collaborative filtering, text mining, review mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1531
7611 Design of Personal Job Recommendation Framework on Smartphone Platform

Authors: Chayaporn Kaensar

Abstract:

Recently, Job Recommender Systems have gained much attention in industries since they solve the problem of information overload on the recruiting website. Therefore, we proposed Extended Personalized Job System that has the capability of providing the appropriate jobs for job seeker and recommending some suitable information for them using Data Mining Techniques and Dynamic User Profile. On the other hands, company can also interact to the system for publishing and updating job information. This system have emerged and supported various platforms such as web application and android mobile application. In this paper, User profiles, Implicit User Action, User Feedback, and Clustering Techniques in WEKA libraries were applied and implemented. In additions, open source tools like Yii Web Application Framework, Bootstrap Front End Framework and Android Mobile Technology were also applied.

Keywords: Recommendation, user profile, data mining, web technology, mobile technology.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2107
7610 An Improved K-Means Algorithm for Gene Expression Data Clustering

Authors: Billel Kenidra, Mohamed Benmohammed

Abstract:

Data mining technique used in the field of clustering is a subject of active research and assists in biological pattern recognition and extraction of new knowledge from raw data. Clustering means the act of partitioning an unlabeled dataset into groups of similar objects. Each group, called a cluster, consists of objects that are similar between themselves and dissimilar to objects of other groups. Several clustering methods are based on partitional clustering. This category attempts to directly decompose the dataset into a set of disjoint clusters leading to an integer number of clusters that optimizes a given criterion function. The criterion function may emphasize a local or a global structure of the data, and its optimization is an iterative relocation procedure. The K-Means algorithm is one of the most widely used partitional clustering techniques. Since K-Means is extremely sensitive to the initial choice of centers and a poor choice of centers may lead to a local optimum that is quite inferior to the global optimum, we propose a strategy to initiate K-Means centers. The improved K-Means algorithm is compared with the original K-Means, and the results prove how the efficiency has been significantly improved.

Keywords: Microarray data mining, biological pattern recognition, partitional clustering, k-means algorithm, centroid initialization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1225
7609 Clustering Categorical Data Using the K-Means Algorithm and the Attribute’s Relative Frequency

Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami

Abstract:

Clustering is a well known data mining technique used in pattern recognition and information retrieval. The initial dataset to be clustered can either contain categorical or numeric data. Each type of data has its own specific clustering algorithm. In this context, two algorithms are proposed: the k-means for clustering numeric datasets and the k-modes for categorical datasets. The main encountered problem in data mining applications is clustering categorical dataset so relevant in the datasets. One main issue to achieve the clustering process on categorical values is to transform the categorical attributes into numeric measures and directly apply the k-means algorithm instead the k-modes. In this paper, it is proposed to experiment an approach based on the previous issue by transforming the categorical values into numeric ones using the relative frequency of each modality in the attributes. The proposed approach is compared with a previously method based on transforming the categorical datasets into binary values. The scalability and accuracy of the two methods are experimented. The obtained results show that our proposed method outperforms the binary method in all cases.

Keywords: Clustering, k-means, categorical datasets, pattern recognition, unsupervised learning, knowledge discovery.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3489
7608 Feature Selection Approaches with Missing Values Handling for Data Mining - A Case Study of Heart Failure Dataset

Authors: N.Poolsawad, C.Kambhampati, J. G. F. Cleland

Abstract:

In this paper, we investigated the characteristic of a clinical dataseton the feature selection and classification measurements which deal with missing values problem.And also posed the appropriated techniques to achieve the aim of the activity; in this research aims to find features that have high effect to mortality and mortality time frame. We quantify the complexity of a clinical dataset. According to the complexity of the dataset, we proposed the data mining processto cope their complexity; missing values, high dimensionality, and the prediction problem by using the methods of missing value replacement, feature selection, and classification.The experimental results will extend to develop the prediction model for cardiology.

Keywords: feature selection, missing values, classification, clinical dataset, heart failure.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3164
7607 Integration of Educational Data Mining Models to a Web-Based Support System for Predicting High School Student Performance

Authors: Sokkhey Phauk, Takeo Okazaki

Abstract:

The challenging task in educational institutions is to maximize the high performance of students and minimize the failure rate of poor-performing students. An effective method to leverage this task is to know student learning patterns with highly influencing factors and get an early prediction of student learning outcomes at the timely stage for setting up policies for improvement. Educational data mining (EDM) is an emerging disciplinary field of data mining, statistics, and machine learning concerned with extracting useful knowledge and information for the sake of improvement and development in the education environment. The study is of this work is to propose techniques in EDM and integrate it into a web-based system for predicting poor-performing students. A comparative study of prediction models is conducted. Subsequently, high performing models are developed to get higher performance. The hybrid random forest (Hybrid RF) produces the most successful classification. For the context of intervention and improving the learning outcomes, a feature selection method MICHI, which is the combination of mutual information (MI) and chi-square (CHI) algorithms based on the ranked feature scores, is introduced to select a dominant feature set that improves the performance of prediction and uses the obtained dominant set as information for intervention. By using the proposed techniques of EDM, an academic performance prediction system (APPS) is subsequently developed for educational stockholders to get an early prediction of student learning outcomes for timely intervention. Experimental outcomes and evaluation surveys report the effectiveness and usefulness of the developed system. The system is used to help educational stakeholders and related individuals for intervening and improving student performance.

Keywords: Academic performance prediction system, prediction model, educational data mining, dominant factors, feature selection methods, student performance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 909
7606 Hybrid Intelligent Intrusion Detection System

Authors: Norbik Bashah, Idris Bharanidharan Shanmugam, Abdul Manan Ahmed

Abstract:

Intrusion Detection Systems are increasingly a key part of systems defense. Various approaches to Intrusion Detection are currently being used, but they are relatively ineffective. Artificial Intelligence plays a driving role in security services. This paper proposes a dynamic model Intelligent Intrusion Detection System, based on specific AI approach for intrusion detection. The techniques that are being investigated includes neural networks and fuzzy logic with network profiling, that uses simple data mining techniques to process the network data. The proposed system is a hybrid system that combines anomaly, misuse and host based detection. Simple Fuzzy rules allow us to construct if-then rules that reflect common ways of describing security attacks. For host based intrusion detection we use neural-networks along with self organizing maps. Suspicious intrusions can be traced back to its original source path and any traffic from that particular source will be redirected back to them in future. Both network traffic and system audit data are used as inputs for both.

Keywords: Intrusion Detection, Network Security, Data mining, Fuzzy Logic.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2085
7605 Space-Time Variation in Rainfall and Runoff: Upper Betwa Catchment

Authors: Ritu Ahlawat

Abstract:

Among all geo-hydrological relationships, rainfallrunoff relationship is of utmost importance in any hydrological investigation and water resource planning. Spatial variation, lag time involved in obtaining areal estimates for the basin as a whole can affect the parameterization in design stage as well as in planning stage. In conventional hydrological processing of data, spatial aspect is either ignored or interpolated at sub-basin level. Temporal variation when analysed for different stages can provide clues for its spatial effectiveness. The interplay of space-time variation at pixel level can provide better understanding of basin parameters. Sustenance of design structures for different return periods and their spatial auto-correlations should be studied at different geographical scales for better management and planning of water resources. In order to understand the relative effect of spatio-temporal variation in hydrological data network, a detailed geo-hydrological analysis of Betwa river catchment falling in Lower Yamuna Basin is presented in this paper. Moreover, the exact estimates about the availability of water in the Betwa river catchment, especially in the wake of recent Betwa-Ken linkage project, need thorough scientific investigation for better planning. Therefore, an attempt in this direction is made here to analyse the existing hydrological and meteorological data with the help of SPSS, GIS and MS-EXCEL software. A comparison of spatial and temporal correlations at subcatchment level in case of upper Betwa reaches has been made to demonstrate the representativeness of rain gauges. First, flows at different locations are used to derive correlation and regression coefficients. Then, long-term normal water yield estimates based on pixel-wise regression coefficients of rainfall-runoff relationship have been mapped. The areal values obtained from these maps can definitely improve upon estimates based on point-based extrapolations or areal interpolations.

Keywords: Catchment's runoff estimates, influence area regional regression coefficients, runoff yield series,

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2059
7604 Spatial Data Science for Data Driven Urban Planning: The Youth Economic Discomfort Index for Rome

Authors: Iacopo Testi, Diego Pajarito, Nicoletta Roberto, Carmen Greco

Abstract:

Today, a consistent segment of the world’s population lives in urban areas, and this proportion will vastly increase in the next decades. Therefore, understanding the key trends in urbanization, likely to unfold over the coming years, is crucial to the implementation of sustainable urban strategies. In parallel, the daily amount of digital data produced will be expanding at an exponential rate during the following years. The analysis of various types of data sets and its derived applications have incredible potential across different crucial sectors such as healthcare, housing, transportation, energy, and education. Nevertheless, in city development, architects and urban planners appear to rely mostly on traditional and analogical techniques of data collection. This paper investigates the prospective of the data science field, appearing to be a formidable resource to assist city managers in identifying strategies to enhance the social, economic, and environmental sustainability of our urban areas. The collection of different new layers of information would definitely enhance planners' capabilities to comprehend more in-depth urban phenomena such as gentrification, land use definition, mobility, or critical infrastructural issues. Specifically, the research results correlate economic, commercial, demographic, and housing data with the purpose of defining the youth economic discomfort index. The statistical composite index provides insights regarding the economic disadvantage of citizens aged between 18 years and 29 years, and results clearly display that central urban zones and more disadvantaged than peripheral ones. The experimental set up selected the city of Rome as the testing ground of the whole investigation. The methodology aims at applying statistical and spatial analysis to construct a composite index supporting informed data-driven decisions for urban planning.

Keywords: Data science, spatial analysis, composite index, Rome, urban planning, youth economic discomfort index.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 814