Search results for: spatial data mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25857

25497 Collective Strategies Dominate in Spatial Iterated Prisoners Dilemma

Authors: Jiawei Li

Abstract:

How cooperation emerges and persists in a population of selfish agents is a fundamental question in evolutionary game theory. Our research shows that Collective Strategies with Master-Slave Mechanism (CSMSM) defeat Tit-for-Tat and other well-known strategies in the spatial iterated prisoner’s dilemma (IPD). A CSMSM identifies kin members by means of a handshaking mechanism. If the opponent is identified as non-kin, a CSMSM will always defect. Once two CSMSMs meet, they play master and slave roles. A master defects and a slave cooperates in order to maximize the master’s payoff. CSMSM outperforms non-collective strategies in spatial IPD even if there is only a small cluster of CSMSMs in the population. The existence and performance of CSMSM in the spatial iterated prisoner’s dilemma suggest that cooperation first appears and persists in a group of collective agents.

Keywords: Evolutionary game theory, spatial prisoners dilemma, collective strategy, master-slave mechanism

Procedia PDF Downloads 121
25496 A Near-Optimal Domain Independent Approach for Detecting Approximate Duplicates

Authors: Abdelaziz Fellah, Allaoua Maamir

Abstract:

We propose a domain-independent merging-cluster filter approach, complemented with a set of algorithms, for identifying approximate duplicate entities efficiently and accurately within a single data source and across multiple data sources. The near-optimal merging-cluster filter (MCF) approach is based on the well-tuned Monge-Elkan algorithm and extended with an affine variant of the Smith-Waterman similarity measure. We then present constant, variable, and function threshold algorithms that work conceptually in a divide-merge filtering fashion to detect near duplicates as hierarchical clusters along with their corresponding representatives. The algorithms take recursive refinement approaches, in the spirit of filtering, merging, and updating cluster representatives, to detect approximate duplicates at each level of the cluster tree. Experiments show the high effectiveness and accuracy of the MCF approach in detecting approximate duplicates: it outperforms the seminal Monge-Elkan algorithm on several real-world benchmarks and generated datasets.
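The Monge-Elkan measure named in this abstract can be illustrated with a short sketch. In its usual form, it scores two multi-token strings by averaging, over the tokens of the first string, the best match found among the tokens of the second. The inner token similarity below is the `difflib.SequenceMatcher` ratio from the Python standard library, a stand-in chosen for brevity and not the affine Smith-Waterman variant the authors describe; the function names are hypothetical.

```python
from difflib import SequenceMatcher


def token_similarity(a, b):
    """Stand-in inner similarity in [0, 1]; NOT the affine Smith-Waterman variant."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def monge_elkan(s, t):
    """Average, over the tokens of s, of the best token match found in t."""
    tokens_s = s.split()
    tokens_t = t.split()
    if not tokens_s or not tokens_t:
        return 0.0
    total = sum(max(token_similarity(a, b) for b in tokens_t) for a in tokens_s)
    return total / len(tokens_s)
```

Note that the measure is not symmetric: `monge_elkan(s, t)` and `monge_elkan(t, s)` can differ, so a duplicate detector built on it would need to pick a direction or combine both.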

Keywords: data mining, data cleaning, approximate duplicates, near-duplicates detection, data mining applications and discovery

Procedia PDF Downloads 361
25495 Improving University Operations with Data Mining: Predicting Student Performance

Authors: Mladen Dragičević, Mirjana Pejić Bach, Vanja Šimičević

Abstract:

The purpose of this paper is to develop models that enable the prediction of student success. These models could improve the allocation of students among colleges and optimize the newly introduced model of government subsidies for higher education. To collect data, an anonymous survey was carried out among final-year undergraduate students using a random sampling method. Several decision trees were created, and the two that were most successful in predicting student success were chosen based on two criteria: Grade Point Average (GPA) and the time that a student needs to finish the undergraduate program (time-to-degree). Decision trees proved to be a good method for classifying student success, and they could be improved further by increasing the survey sample and developing specialized decision trees for each type of college. These types of methods have great potential for use in decision support systems.

Keywords: data mining, knowledge discovery in databases, prediction models, student success

Procedia PDF Downloads 384
25494 Sexting Phenomenon in Educational Settings: A Data Mining Approach

Authors: Koutsopoulou Ioanna, Gkintoni Evgenia, Halkiopoulos Constantinos, Antonopoulou Hera

Abstract:

Recent advances in Internet Computer Technology (ICT) and the ever-increasing use of technological equipment among adolescents and young adults, along with unattended access to the internet and social media and uncontrolled use of smartphones and PCs, have caused social problems like sexting to emerge. The main purpose of the present article is, first, to present an analytic theoretical framework of sexting as a recent social phenomenon, based on studies conducted over the last decade or so, and second, to investigate the sexting perceptions of Greek students and social network users, to record how often social media users exchange sexual messages, and to trace demographic predictor variables. Data from 1,000 students were collected and analyzed, and all statistical analysis was done with the software package WEKA. The results indicate, among other things, that data mining methods are an important tool for drawing conclusions that could affect decision and policy making, especially in educational psychology and related social topics. To sum up, sexting poses many risks for adolescent and young adult students in Greece and needs to be better addressed by stakeholders as well as by society in general. Furthermore, policy makers, legislators and authorities will have to take action to protect minors. Prevention strategies based on Greek cultural specificities are proposed. This social problem has raised concerns in recent years and will most likely raise further concerns in global communities in the future.

Keywords: educational ethics, sexting, Greek sexters, sex education, data mining

Procedia PDF Downloads 160
25493 Optimization of Air Pollution Control Model for Mining

Authors: Zunaira Asif, Zhi Chen

Abstract:

Sustainable air quality management is recognized as one of the most serious environmental concerns in mining regions. Mining operations emit various types of pollutants that have significant impacts on the environment. This study presents a stochastic control strategy by developing an air pollution control model to achieve a cost-effective solution. The optimization method is formulated to predict the cost of treatment using linear programming with an objective function and multiple constraints. The constraints mainly focus on two factors: the production of metal should not exceed the available resources, and air quality should meet the standard criteria for each pollutant. The applicability of this model is explored through a case study of an open-pit metal mine in Utah, USA. The method simultaneously uses meteorological data as a dispersion transfer function to reflect practical local conditions. The probabilistic analysis and the uncertainties in the meteorological conditions are handled by Monte Carlo simulation. Reasonable results have been obtained for selecting the optimized treatment technology for PM2.5, PM10, NOx, and SO2. Additional comparison analysis shows that a baghouse is the least-cost option compared to electrostatic precipitators and wet scrubbers for particulate matter, whereas non-selective catalytic reduction and dry flue gas desulfurization are suitable for NOx and SO2 reduction, respectively. Thus, this model can help planners reduce these pollutants at a marginal cost by suggesting pollution control devices, while accounting for dynamic meteorological conditions and mining activities.

Keywords: air pollution, linear programming, mining, optimization, treatment technologies

Procedia PDF Downloads 172
25492 An Analysis on the Appropriateness and Effectiveness of CCTV Location for Crime Prevention

Authors: Tae-Heon Moon, Sun-Young Heo, Sang-Ho Lee, Youn-Taik Leem, Kwang-Woo Nam

Abstract:

This study aims to investigate the possibility of crime prevention through CCTV by analyzing the appropriateness of CCTV locations, that is, whether cameras are installed in the hotspots of crime-prone areas, and by exploring the crime prevention effect and the transition effect. The real crime and CCTV locations of the case city were converted into spatial data using GIS. The data were analyzed by hotspot analysis and the weighted displacement quotient (WDQ). First, the study analyzed existing relevant studies to identify the trends of CCTV and crime studies based on big data from 1800 to 2014 and to understand the relation between CCTV and crime. Second, it investigated the current situation of nationwide CCTVs and analyzed the guidelines for CCTV installation and operation to draw attention to the problems and points to be addressed in domestic CCTV use. Third, it investigated crime occurrence in the case areas and the current spatial situation of CCTV installation, and analyzed the appropriateness and effectiveness of CCTV installation to suggest a rational installation of CCTV and a strategic direction for crime prevention. The results demonstrate that the installation of CCTV had no significant effect on crime prevention. This indicates that CCTV should be installed and managed in a more scientific way that reflects local crime situations. For CCTV, methods of spatial analysis such as GIS, which can evaluate the installation effect, and methods of economic analysis such as cost-benefit analysis should be developed. In addition, these methods should be distributed to local governments across the nation to support the appropriate installation and operation of CCTV. This study intended to find a design guideline for optimum CCTV installation. In this regard, the study is meaningful in that it will contribute to the creation of a safe city.
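For readers unfamiliar with the weighted displacement quotient, the sketch below shows the ratio as it is commonly formulated in the crime displacement literature: the change in the buffer area's crime share relative to a control area, divided by the corresponding change for the target (CCTV) area. This is an assumption about the formulation, since the abstract does not spell it out, and the function and argument names are hypothetical.

```python
def weighted_displacement_quotient(a0, a1, b0, b1, c0, c1):
    """WDQ sketch from crime counts before (0) and after (1) an intervention.

    a: target area, b: surrounding buffer area, c: control area.
    Returns None when the target area shows no change relative to the control,
    because the quotient is undefined in that case.
    """
    success = a1 / c1 - a0 / c0        # negative when the target area improved
    displacement = b1 / c1 - b0 / c0   # positive when the buffer area worsened
    if success == 0:
        return None
    return displacement / success
```

Under this formulation, a negative quotient alongside a negative success measure is usually read as displacement into the buffer, and a positive quotient as a diffusion of benefits.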

Keywords: CCTV, safe city, crime prevention, spatial analysis

Procedia PDF Downloads 412
25491 Design of a Small and Medium Enterprise Growth Prediction Model Based on Web Mining

Authors: Yiea Funk Te, Daniel Mueller, Irena Pletikosa Cvijikj

Abstract:

Small and medium enterprises (SMEs) play an important role in the economy of many countries. When the overall world economy is considered, SMEs represent 95% of all businesses in the world, accounting for 66% of the total employment. Existing studies show that the current business environment is characterized as highly turbulent and strongly influenced by modern information and communication technologies, thus forcing SMEs to experience more severe challenges in maintaining their existence and expanding their business. To support SMEs in improving their competitiveness, researchers have recently turned their focus to applying data mining techniques to build risk and growth prediction models. However, data used to assess risk and growth indicators is primarily obtained via questionnaires, which is very laborious and time-consuming, or is provided by financial institutions and is thus highly sensitive to privacy issues. Recently, web mining (WM) has emerged as a new approach towards obtaining valuable insights in the business world. WM enables automatic and large-scale collection and analysis of potentially valuable data from various online platforms, including companies’ websites. While WM methods have been frequently studied to anticipate growth of sales volume for e-commerce platforms, their application to the assessment of SME risk and growth indicators is still scarce. Considering that a vast proportion of SMEs own a website, WM has great potential for revealing valuable information hidden in SME websites, which can further be used to understand SME risk and growth indicators, as well as to enhance current SME risk and growth prediction models. This study aims to develop an automated system to collect business-relevant data from the Web and predict future growth trends of SMEs by means of WM and data mining techniques. The envisioned system should serve as an 'early recognition system' for future growth opportunities.
In an initial step, we examine how structured and semi-structured Web data in governmental or SME websites can be used to explain the success of SMEs. WM methods are applied to extract Web data in the form of additional input features for the growth prediction model. The data on SMEs provided by a large Swiss insurance company is used as ground truth data (i.e. growth-labeled data) to train the growth prediction model. Different machine learning classification algorithms such as the Support Vector Machine, Random Forest and Artificial Neural Network are applied and compared, with the goal of optimizing the prediction performance. The results are compared to those from previous studies, in order to assess the contribution of growth indicators retrieved from the Web to increasing the predictive power of the model.

Keywords: data mining, SME growth, success factors, web mining

Procedia PDF Downloads 239
25490 Architectural Design Strategies and Visual Perception of Contemporary Spatial Design

Authors: Nora Geczy

Abstract:

In today’s architectural practice, during the process of designing public, educational, healthcare and cultural spaces, human-centered architectural designs helping spatial orientation, safe space usage and the appropriate spatial sequence of actions are gaining increasing importance. Related to the methodology of designing public buildings, several scientific experiments in spatial recognition, spatial analysis and spatial psychology with regard to the components of space producing mental and physiological effects have been going on at the Department of Architectural Design and the Interdisciplinary Student Workshop (IDM) at the Széchenyi István University, Győr since 2013. Defining the creation of preventive, anticipated spatial design and the architectural tools of spatial comfort of public buildings and their practical usability are in the limelight of our research. In the experiments applying eye-tracking cameras, we studied the way public spaces are used, especially concentrating on the characteristics of spatial behaviour, orientation, recognition, the sequence of actions, and space usage. Along with the role of mental maps, human perception, and interaction problems in public spaces (at railway stations, galleries, and educational institutions), we analyzed the spatial situations influencing psychological and ergonomic factors. We also analyzed the eye movements of the experimental subjects in dynamic situations, in spatial procession, using stairs and corridors. We monitored both the consequences and the distorting effects of the ocular dominance of the right eye on spatial orientation; we analyzed the gender-based differences of women and men’s orientation, stress-inducing spaces, spaces affecting concentration and the spatial situation influencing territorial behaviour. Based on these observations, we collected the components of creating public interior spaces, which, according to our theory, contribute to the optimal usability of public spaces.
We summarized our research in a set of 10 design criteria. Our further goals are testing design principles needed for optimizing orientation and space usage, their discussion, refinement, and practical usage.

Keywords: architecture, eye-tracking, human-centered spatial design, public interior spaces, visual perception

Procedia PDF Downloads 86
25489 The Role of People and Data in Complex Spatial-Related Long-Term Decisions: A Case Study of Capital Project Management Groups

Authors: Peter Boyes, Sarah Sharples, Paul Tennent, Gary Priestnall, Jeremy Morley

Abstract:

Significant long-term investment projects can involve complex decisions. These are often described as capital projects, and the factors that contribute to their complexity include budgets, motivating reasons for investment, stakeholder involvement, interdependent projects, and the delivery phases required. The complexity of these projects often requires management groups to be established involving stakeholder representatives; these teams are inherently multidisciplinary. This study uses two university campus capital projects as case studies for this type of management group. Due to the interaction of projects with wider campus infrastructure and users, decisions are made at varying spatial granularity throughout the project lifespan. This spatial-related context brings complexity to the group decisions. Sensemaking is the process used to achieve group situational awareness of a complex situation, enabling the team to arrive at a consensus and make a decision. The purpose of this study is to understand the role of people and data in complex spatial-related long-term decision and sensemaking processes. The paper aims to identify and present issues experienced in practical settings of these types of decision. A series of exploratory semi-structured interviews with members of the two projects elicits an understanding of their operation. From two stages of thematic analysis, inductive and deductive, emergent themes are identified around the group structure, the data usage, and the decision making within these groups. When data were made available to the group, there were commonly issues with the perception of veracity and validity of the data presented; this impacted the ability of the group to reach consensus and, therefore, for decisions to be made. Similarly, there were different responses to forecasted or modelled data, shaped by the experience and occupation of the individuals within the multidisciplinary management group.
This paper provides an understanding of further support required for team sensemaking and decision making in complex capital projects. The paper also discusses the barriers found to effective decision making in this setting and suggests opportunities to develop decision support systems in this team strategic decision-making process. Recommendations are made for further research into the sensemaking and decision-making process of this complex spatial-related setting.

Keywords: decision making, decisions under uncertainty, real decisions, sensemaking, spatial, team decision making

Procedia PDF Downloads 99
25488 Survey of Methods for Solutions of Spatial Covariance Structures and Their Limitations

Authors: Joseph Thomas Eghwerido, Julian I. Mbegbu

Abstract:

In modelling environmental processes, we apply multidisciplinary knowledge to explain, explore and predict the Earth's response to natural and human-induced environmental changes. In space-time ecological and environmental studies, the spatial parameters of interest are always heterogeneous, which often violates the assumption of stationarity. Hence, modelling the dispersion and transport of atmospheric pollutants, landscape or topographic effects, and weather patterns depends on a good estimate of spatial covariance. Although the generalized linear mixed model is linear in the expected value parameters, its likelihood varies nonlinearly as a function of the covariance parameters. As a consequence, computing estimates for a linear mixed model requires the iterative solution of a system of simultaneous nonlinear equations. In order to predict the variables at unsampled locations, we need to know the estimates of the currently sampled variables. The geostatistical methods for solving this spatial problem assume covariance stationarity (locally defined covariance) and uniformity in space, which is not generally valid because spatial processes often exhibit nonstationary covariance. Hence, they have globally defined covariance. We consider different existing methods for solving the spatial covariance of space-time processes at unsampled locations. This stationary covariance changes with location over multiple time sets, with some asymptotic properties.

Keywords: parametric, nonstationary, Kernel, Kriging

Procedia PDF Downloads 231
25487 Spatial Interpolation of Intermediate Soil Properties to Enhance Geotechnical Surveying for Foundation Design

Authors: Yelbek B. Utepov, Assel T. Mukhamejanova, Aliya K. Aldungarova, Aida G. Nazarova, Sabit A. Karaulov, Nurgul T. Alibekova, Aigul K. Kozhas, Dias Kazhimkanuly, Akmaral K. Tleubayeva

Abstract:

This research focuses on enhancing geotechnical surveying for foundation design through the spatial interpolation of intermediate soil properties. Traditional geotechnical practices rely on discrete data from borehole drilling, soil sampling, and laboratory analyses, often neglecting the continuous nature of soil properties and disregarding values in intermediate locations. This study challenges these omissions by emphasizing interpolation techniques such as Kriging, Inverse Distance Weighting, and Spline interpolation to capture the nuanced spatial variations in soil properties. The methodology is applied to geotechnical survey data from two construction sites in Astana, Kazakhstan, revealing continuous representations of Young's Modulus, Cohesion, and Friction Angle. The spatial heatmaps generated through interpolation offered valuable insights into the subsurface environment, highlighting heterogeneity and aiding more informed foundation design decisions for the sites considered. Moreover, intriguing patterns of heterogeneity, as well as visual clusters and transitions between soil classes, were explored within seemingly uniform layers. The study bridges the gap between discrete borehole samples and the continuous subsurface, contributing to the evolution of geotechnical engineering practices. The proposed approach, utilizing open-source geographic information system software, provides a practical tool for visualizing soil characteristics and may pave the way for future advancements in geotechnical surveying and foundation design.
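Of the interpolation techniques listed in this abstract, Inverse Distance Weighting is the simplest to show in code. The sketch below estimates a soil property at an intermediate location as a weighted mean of borehole values, with weights that fall off as a power of distance. The sample layout, the default power of 2, and the function name are assumptions for illustration; the abstract does not report the parameters the authors used.

```python
import math


def idw_estimate(samples, x, y, power=2.0):
    """Inverse Distance Weighting estimate at (x, y).

    samples: iterable of (sx, sy, value) tuples, e.g. borehole coordinates
    with a measured soil property. An exact hit on a sample returns its value.
    """
    samples = list(samples)
    if not samples:
        raise ValueError("at least one sample is required")
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            return value  # avoid division by zero at a sampled location
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total
```

Evaluating `idw_estimate` over a regular grid of (x, y) points yields the kind of continuous surface that can be rendered as a heatmap.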

Keywords: soil mechanical properties, spatial interpolation, inverse distance weighting, heatmaps

Procedia PDF Downloads 31
25486 A Recommender System Fusing Collaborative Filtering and User’s Review Mining

Authors: Seulbi Choi, Hyunchul Ahn

Abstract:

The collaborative filtering (CF) algorithm has been widely used for recommender systems in both academic and practical applications. It generates recommendation results using users’ numeric ratings. However, using information beyond user ratings may lead to better CF accuracy. Considering that, since the advent of Web 2.0, many people are likely to share their honest opinions on the items they purchased recently, user reviews can be regarded as a new informative source for identifying user preferences accurately. Against this background, this study presents a hybrid recommender system that fuses CF and user review mining. Our system adopts conventional memory-based CF, but it is designed to use both a user’s numeric ratings and his/her text reviews on the items when calculating similarities between users.
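One way to picture a user similarity that draws on both ratings and reviews is sketched below. It blends a cosine similarity over co-rated items with a Jaccard overlap of review vocabularies, using a mixing weight `alpha`. All of these choices (cosine, Jaccard, the linear blend, the data layout, and every name) are assumptions for illustration; the abstract does not say how the authors combine the two sources.

```python
import math


def rating_similarity(ratings_u, ratings_v):
    """Cosine similarity over items rated by both users (dicts: item -> rating)."""
    common = set(ratings_u) & set(ratings_v)
    if not common:
        return 0.0
    dot = sum(ratings_u[i] * ratings_v[i] for i in common)
    norm_u = math.sqrt(sum(ratings_u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(ratings_v[i] ** 2 for i in common))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def review_similarity(reviews_u, reviews_v):
    """Jaccard overlap of the words used in each user's reviews (lists of str)."""
    words_u = set(" ".join(reviews_u).lower().split())
    words_v = set(" ".join(reviews_v).lower().split())
    if not words_u or not words_v:
        return 0.0
    return len(words_u & words_v) / len(words_u | words_v)


def hybrid_similarity(user_u, user_v, alpha=0.5):
    """Hypothetical blend: alpha weights ratings, (1 - alpha) weights reviews."""
    return (alpha * rating_similarity(user_u["ratings"], user_v["ratings"])
            + (1.0 - alpha) * review_similarity(user_u["reviews"], user_v["reviews"]))
```

In a memory-based CF pipeline, a score like this would replace the ratings-only similarity when selecting neighbours for a target user.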

Keywords: Recommender system, Collaborative filtering, Text mining, Review mining

Procedia PDF Downloads 310
25485 Intermediate-Term Impact of Taiwan High-Speed Rail (HSR) and Land Use on Spatial Patterns of HSR Travel

Authors: Tsai Yu-hsin, Chung Yi-Hsin

Abstract:

The operation of an HSR system, by raising inter-city and inter-region accessibility, is likely to promote spatial interaction between places in the HSR territory and beyond. Inter-city and inter-region travel via HSR could be affected by, among other things, the land use, transportation, and location of the HSR station at both the trip origin and destination ends. However, relatively little light has been shed on these impacts and on the spatial patterns of HSR travel. As phase one of a series of HSR-related research, this study has three purposes: to analyze the general spatial patterns of HSR trips, such as the spatial distribution of trip origins and destinations; to analyze whether specific land use, transportation characteristics, and trip characteristics affect HSR trips in terms of the use of HSR and the distribution of trip origins and destinations; and to analyze the socio-economic characteristics of HSR travelers. With the Taiwan HSR having started operation in 2007, this study focuses on the intermediate-term impact of HSR, which is made possible by population and housing census data, industry and commercial census data, and a station-area intercept survey conducted in the summer of 2014. The analysis will be conducted at the city, inter-city, and inter-region spatial levels, as required. The analysis tools include descriptive statistics and multivariate analysis with the assistance of SPSS, HLM and ArcGIS. On the one hand, the findings can provide policy implications for associated land use, transportation planning and the site selection of HSR stations. On the other hand, the findings on travel are expected to provide insights that can help explain how land use and real estate values could be affected by HSR in the following phases of this series of research.

Keywords: high speed rail, land use, travel, spatial pattern

Procedia PDF Downloads 434
25484 An Exploration of the Dimensions of Place-Making: A South African Case Study

Authors: W. J. Strydom, K. Puren

Abstract:

Place-making is viewed here as an empowering process in which people represent, improve and maintain their spatial (natural or built) environment. With this in mind, place-making is multi-dimensional and includes a spatial dimension (visual properties or the end product/plan), a procedural dimension (negotiation/discussion of ideas with all relevant stakeholders in terms of the end product/plan) and a psychological dimension (inclusion of intrinsic values and meanings related to a place in the end product/plan). These three represent the dimensions of place-making. The purpose of this paper is to explore these dimensions of place-making in a case study of a local community in Ikageng, Potchefstroom, North-West Province, South Africa. This case study represents an inclusive process that strives to empower a local community (forcefully relocated due to Apartheid legislation in South Africa). It focused on the inclusion of participants in the decision-making process regarding their daily environment. By means of focus group discussions and a collaborative design workshop, data are generated and ultimately linked with the theoretical dimensions of place-making. This paper contributes to the field of spatial planning by exploring the dimensions of place-making and the relevance of this process to spatial planning (especially in a South African setting).

Keywords: community engagement, place-making, planning theory, spatial planning

Procedia PDF Downloads 370
25483 Derivation of Bathymetry from High-Resolution Satellite Images: Comparison of Empirical Methods through Geographical Error Analysis

Authors: Anusha P. Wijesundara, Dulap I. Rathnayake, Nihal D. Perera

Abstract:

Bathymetric information is of fundamental importance to coastal and marine planning and management, nautical navigation, and scientific studies of marine environments. Satellite-derived bathymetry data provide detailed information in areas where conventional sounding data are lacking and conventional surveys are not feasible. Two empirical approaches, the log-linear bathymetric inversion model and the non-linear bathymetric inversion model, are applied to derive bathymetry from high-resolution multispectral satellite imagery. This study compares these two approaches by means of geographical error analysis for the site Kankesanturai using WorldView-2 satellite imagery. The parameters of the non-linear inversion model were calibrated with the Levenberg-Marquardt method, and multiple linear regression was applied to calibrate the log-linear inversion model. To calibrate both models, Single Beam Echo Sounding (SBES) data in the study area were used as reference points. Residuals were calculated as the difference between the derived depth values and the validation echo sounder bathymetry data, and the geographical distribution of model residuals was mapped. Spatial autocorrelation was calculated to compare the performance of the bathymetric models, and the results show the geographic errors for both models. A spatial error model was constructed from the initial bathymetry estimates and the estimates of autocorrelation. This spatial error model is used to generate more reliable estimates of bathymetry by quantifying the autocorrelation of model error and incorporating it into an improved regression model. The log-linear model (R²=0.846) performs better than the non-linear model (R²=0.692). Finally, the spatial error models improved the bathymetric estimates derived from the log-linear and non-linear models to R²=0.854 and R²=0.704, respectively. The Root Mean Square Error (RMSE) was calculated for all reference points in various depth ranges.
The magnitude of the prediction error increases with depth for both the log-linear and the non-linear inversion models. Overall RMSE for log-linear and the non-linear inversion models were ±1.532 m and ±2.089 m, respectively.
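The residual and RMSE computations described in this abstract can be sketched in a few lines. Residuals follow the abstract's definition (derived depth minus reference depth); the grouping of reference points into depth ranges uses hypothetical range edges, since the abstract does not list the ranges the authors used, and the function names are likewise illustrative.

```python
import math


def residuals(derived, reference):
    """Derived depth minus reference (echo sounder) depth, point by point."""
    return [d - r for d, r in zip(derived, reference)]


def rmse(derived, reference):
    """Root Mean Square Error of derived depths against reference depths."""
    errors = residuals(derived, reference)
    if not errors:
        raise ValueError("at least one reference point is required")
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def rmse_by_depth_range(derived, reference, edges):
    """RMSE per half-open reference-depth range [edges[i], edges[i + 1])."""
    result = {}
    for low, high in zip(edges, edges[1:]):
        pairs = [(d, r) for d, r in zip(derived, reference) if low <= r < high]
        if pairs:  # skip ranges that contain no reference points
            result[(low, high)] = rmse([d for d, _ in pairs], [r for _, r in pairs])
    return result
```

Comparing the per-range values is one way to check the abstract's observation that prediction error grows with depth.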

Keywords: log-linear model, multi spectral, residuals, spatial error model

Procedia PDF Downloads 273
25482 The Relationship between Metropolitan Space and Spatial Distribution of Main Innovative Actors: The Case of Yangtze Delta Metropolitan in China

Authors: Jun Zhou, Xingping Wang, Paul Milbourne

Abstract:

Evidence from around the world shows that industry and population have become greatly concentrated in metropolitan regions, which are becoming the most important areas for economic power and living standards. Meanwhile, the relevant innovation theories of Agglomeration, New Industrial Geography and Modern Evolutionary innovation explain why agglomeration occurs in world-class cities and metropolitan areas and also confirm that innovation is key to the development of a metropolis. The primary purpose of this paper is to analyze the geographical spatial characteristics of innovative subjects, comprising firms, universities, research institutions, government and intermediary organizations, in a metropolis through quantitative data analysis of the Yangtze River Metropolis in China. The results lead to three main conclusions. First, different subjects in different regions have different spatial characteristics. Second, different structures and patterns among the subjects can also produce different innovative effects. Last but not least, the agglomeration of innovative subjects is influenced not only by the innovative network or local policies but also by localized industry characteristics and culture, which are becoming the most important factors.

Keywords: metropolitan development, innovative subject, spatial, Yangtze River Metropolis, China

Procedia PDF Downloads 352
25481 Detecting of Crime Hot Spots for Crime Mapping

Authors: Somayeh Nezami

Abstract:

Managing the financial and human resources of police in metropolitan areas requires a great deal of information and precise plans to reduce the crime rate and increase the safety of society. Geographical Information Systems play an important role in providing crime maps and in their analysis. By using them to identify crime hot spots and to present the results spatially, it is possible to allocate resources optimally while offering effective methods for decision making and preventive solutions. In this paper, we explain and compare several methods of hot spot analysis: Mode, Fuzzy Mode and Nearest Neighbour Hierarchical spatial clustering (NNH). We then identify the spots with the highest rates of drug smuggling in one Iranian province bordering Afghanistan. We show that, among these three methods, NNH leads to the best result.
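The simplest of the three methods can be sketched briefly, under the assumption that Mode here means ranking locations by the number of incidents recorded at exactly the same coordinates, and that Fuzzy Mode relaxes this by counting incidents within a search radius of each location. The abstract does not define either method, so both readings, the radius parameter, and the function names are hypothetical.

```python
import math
from collections import Counter


def mode_hot_spots(points, top_n=3):
    """Rank locations by the number of incidents at identical coordinates."""
    return Counter(points).most_common(top_n)


def fuzzy_mode_counts(points, radius):
    """For each distinct location, count incidents within `radius` of it."""
    counts = {}
    for cx, cy in set(points):
        counts[(cx, cy)] = sum(
            1 for px, py in points if math.hypot(px - cx, py - cy) <= radius
        )
    return counts
```

The fuzzy variant is quadratic in the number of incidents, which is acceptable for a sketch but would call for a spatial index on a real crime dataset.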

Keywords: GIS, Hot spots, nearest neighbor hierarchical spatial clustering, NNH, spatial analysis of crime

Procedia PDF Downloads 299
25480 Spatial Distribution of Local Sheep Breeds in Antalya Province

Authors: Serife Gulden Yilmaz, Suleyman Karaman

Abstract:

Sheep breeding is important in Turkey for meeting the demand for red meat, supplying industrial raw materials, and providing employment in the rural sector. It is also very important to ensure the selection and continuity of the breeds that are raised in order to increase the quality and productivity of sheep-breeding products. The protection of local breeds and crossbreds also enables the development of the sector in the region and the reduction of imports. In this study, the data were obtained from the records of the Turkish Statistical Institute and Antalya Sheep & Goat Breeders' Association. The spatial distribution of sheep breeds in Antalya is reviewed statistically in terms of concentration at the local level for 2015. For this purpose, mapping, box plots, and linear regression are used. Concentration is presented by mapping studbook data on sheep breeding by locality and total sheep farms. It is observed that the Pırlak breed (17.5%) and the Merinos crossbreed (16.3%) have the highest concentration in the region. These breeds are followed, in order, by the Akkaraman breed (11%), Pırlak crossbreed (8%), Merinos breed (7.9%), Akkaraman crossbreed (7.9%), and Ivesi breed (7.2%).

Keywords: sheep breeds, local, spatial distribution, agglomeration, Antalya

Procedia PDF Downloads 258
25479 VR in the Middle School Classroom-An Experimental Study on Spatial Relations and Immersive Virtual Reality

Authors: Danielle Schneider, Ying Xie

Abstract:

Middle school science, technology, engineering, and math (STEM) teachers face an exceptional challenge in the expectation to incorporate curricula that build strong spatial reasoning skills on rudimentary geometry concepts. Because spatial ability is so closely tied to STEM students’ success, researchers are tasked with determining effective instructional practices that create an authentic learning environment within the immersive virtual reality learning environment (IVRLE). This study investigated the effect of the IVRLE on middle school STEM students’ spatial reasoning skills. The experimental study comprised thirty 7th-grade STEM students divided into a treatment group, which built an object in an immersive VR platform by applying spatial processing and visualizing its dimensions, and a control group, which built the identical object using a desktop computer-aided design (CAD) program. Before and after the students participated in the respective “3D modeling” environment, their spatial reasoning abilities were assessed using the Middle Grades Mathematics Project Spatial Visualization Test (MGMP-SVT). Additionally, both groups created a physical 3D model as a secondary measure of the effectiveness of the IVRLE. The results of a one-way ANOVA identified a negative effect on those in the IVRLE. These findings suggest that, with middle school students, virtual reality (VR) proved an inadequate tool to benefit spatial relation skills as compared to desktop-based CAD.

Keywords: virtual reality, spatial reasoning, CAD, middle school STEM

Procedia PDF Downloads 53
25478 An Improved K-Means Algorithm for Gene Expression Data Clustering

Authors: Billel Kenidra, Mohamed Benmohammed

Abstract:

Data mining techniques used for clustering are a subject of active research and assist in biological pattern recognition and the extraction of new knowledge from raw data. Clustering means partitioning an unlabeled dataset into groups of similar objects. Each group, called a cluster, consists of objects that are similar to one another and dissimilar to objects of other groups. Several clustering methods are based on partitional clustering. This category attempts to directly decompose the dataset into a set of disjoint clusters, leading to an integer number of clusters that optimizes a given criterion function. The criterion function may emphasize a local or a global structure of the data, and its optimization is an iterative relocation procedure. The K-Means algorithm is one of the most widely used partitional clustering techniques. Since K-Means is extremely sensitive to the initial choice of centers, and a poor choice may lead to a local optimum that is quite inferior to the global optimum, we propose a strategy for initializing the K-Means centers. The improved K-Means algorithm is compared with the original K-Means, and the results show that efficiency is significantly improved.
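The abstract does not say which initialization strategy the authors propose. Purely to illustrate why the choice of starting centers matters, the sketch below uses a different, well-known heuristic, farthest-first traversal, which spreads the initial centers across the data instead of drawing them at random. The function name and the sample points are hypothetical.

```python
import math


def farthest_first_centers(points, k):
    """Pick k initial centers by farthest-first traversal.

    This is a stand-in heuristic, not the strategy proposed in the paper.
    The first point seeds the list; each later center is the point whose
    distance to its nearest already-chosen center is largest.
    """
    centers = [points[0]]
    while len(centers) < k:
        next_center = max(
            points,
            key=lambda p: min(math.dist(p, c) for c in centers),
        )
        centers.append(next_center)
    return centers


# Hypothetical 2-D data with three visible groups.
points = [(0, 0), (0.1, 0), (5, 5), (5.1, 5), (10, 0)]
initial_centers = farthest_first_centers(points, k=3)
```

With these points each chosen center lands in a different group, so the subsequent K-Means iterations start from a reasonable partition rather than from several centers crowded into one cluster.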

Keywords: microarray data mining, biological pattern recognition, partitional clustering, k-means algorithm, centroid initialization

Procedia PDF Downloads 163
25477 Multi-Source Data Fusion for Urban Comprehensive Management

Authors: Bolin Hua

Abstract:

City governance involves various data, including city component data, demographic data, housing data, and all kinds of business data. These data reflect different aspects of people, events, and activities. Data generated by various systems differ in form and in source because they may come from different sectors. To reflect one or several facets of an event or rule, data from multiple sources need to be fused. Data from different sources, collected in different ways, raise several issues that need to be resolved. Data fusion problems include data update and synchronization, data exchange and sharing, file parsing and entry, duplicate data and its comparison, and resource catalogue construction. Governments adopt statistical analysis, time series analysis, extrapolation, monitoring analysis, value mining, and scenario prediction in order to achieve pattern discovery, law verification, root cause analysis, and public opinion monitoring. The result of multi-source data fusion is a uniform central database, which includes people data, location data, object data, institution data, business data, and space data. Metadata must be available to be referenced and read when an application needs to access, manipulate, and display the data. Uniform metadata management ensures the effectiveness and consistency of data in the process of data exchange, data modeling, data cleansing, data loading, data storing, data analysis, data search, and data delivery.

Keywords: multi-source data fusion, urban comprehensive management, information fusion, government data

Procedia PDF Downloads 353
25476 Clustering Categorical Data Using the K-Means Algorithm and the Attribute’s Relative Frequency

Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami

Abstract:

Clustering is a well-known data mining technique used in pattern recognition and information retrieval. The initial dataset to be clustered can contain either categorical or numeric data, and each type of data has its own specific clustering algorithm. In this context, two algorithms have been proposed: k-means for clustering numeric datasets and k-modes for categorical datasets. A main problem encountered in data mining applications is clustering categorical data, which is prevalent in real datasets. One way to cluster categorical values is to transform the categorical attributes into numeric measures and apply the k-means algorithm directly instead of k-modes. In this paper, we experiment with such an approach by transforming the categorical values into numeric ones using the relative frequency of each modality in the attributes. The proposed approach is compared with a previous method based on transforming the categorical datasets into binary values. The scalability and accuracy of the two methods are evaluated. The obtained results show that our proposed method outperforms the binary method in all cases.
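A minimal sketch of the transformation described above, assuming that "relative frequency of each modality" means the share of rows in which a value appears within its own attribute. The function name and the toy rows are hypothetical; the paper's exact encoding may differ.

```python
from collections import Counter


def relative_frequency_encode(rows):
    """Replace each categorical value with its relative frequency.

    Frequencies are computed per attribute (column), so the same label
    in two different columns is encoded independently.
    """
    n = len(rows)
    columns = list(zip(*rows))
    frequencies = [Counter(column) for column in columns]
    return [
        [frequencies[j][value] / n for j, value in enumerate(row)]
        for row in rows
    ]


# Hypothetical categorical dataset: (colour, size).
rows = [
    ("red", "small"),
    ("red", "large"),
    ("blue", "small"),
    ("red", "small"),
]
encoded = relative_frequency_encode(rows)
```

The encoded rows are plain numeric vectors, so they can be passed to any k-means implementation. One property worth keeping in mind is that two different modalities with the same frequency receive the same number.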

Keywords: clustering, unsupervised learning, pattern recognition, categorical datasets, knowledge discovery, k-means

Procedia PDF Downloads 235
25475 Study on the Spatial Evolution Characteristics of Urban Agglomeration Integration in China: The Case of Chengdu-Chongqing Urban Agglomeration

Authors: Guoqin Ge, Minhui Huang, Yazhou Zhou

Abstract:

The growth of the Chengdu-Chongqing urban agglomeration has been designated as a national strategy in China. Analyzing its spatial evolution characteristics is crucial for devising relevant development strategies. This paper enhances the gravitational model by using temporal distance as a factor. It applies this improved model to assess the economic interconnection and concentration level of each geographical unit within the Chengdu-Chongqing urban agglomeration between 2011 and 2019. On this basis, this paper examines the spatial correlation characteristics of economic agglomeration intensity and urban-rural development equalization by employing spatial autocorrelation analysis. The study findings indicate that the spatial integration in the Chengdu-Chongqing urban agglomeration is currently in the "point-axis" development stage. The spatial organization structure is becoming more flattened, and there is a stronger economic connection between the core of the urban agglomeration and the peripheral areas. The integration of the Chengdu-Chongqing urban agglomeration is currently hindered by conflicting interests and institutional heterogeneity between Chengdu and Chongqing. Additionally, the connections between the relatively secondary spatial units are largely loose and weak. The strength and scale of economic ties and the level of urban-rural equilibrium among spatial units within the Chengdu-Chongqing urban agglomeration have increased, but regional imbalances have continued to widen, and such positive and negative changes have been characterized by the spatial and temporal synergistic evolution of the "core-periphery". Ultimately, this paper presents planning ideas for the future integration development of the Chengdu-Chongqing urban agglomeration, drawing from the findings.
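To make the idea of a gravity model with temporal distance concrete, the sketch below computes a link strength of the form mass_i * mass_j / travel_time ** exponent and sums it per unit. The abstract does not give the formula, so the mass measure, the exponent of 2, and all figures are assumptions for illustration only.

```python
def gravity_link(mass_i, mass_j, travel_time, exponent=2.0):
    """Economic link strength between two units.

    Travel time stands in for geographic distance; the exponent is an
    assumed friction parameter, not a value from the paper.
    """
    if travel_time <= 0:
        raise ValueError("travel_time must be positive")
    return mass_i * mass_j / travel_time ** exponent


def total_link_strength(masses, travel_times):
    """Sum each unit's links to every other unit."""
    names = list(masses)
    return {
        i: sum(
            gravity_link(masses[i], masses[j], travel_times[frozenset((i, j))])
            for j in names
            if j != i
        )
        for i in names
    }


# Hypothetical economic masses and pairwise travel times in hours.
masses = {"A": 100.0, "B": 50.0, "C": 10.0}
travel_times = {
    frozenset(("A", "B")): 2.0,
    frozenset(("A", "C")): 5.0,
    frozenset(("B", "C")): 1.0,
}
strength = total_link_strength(masses, travel_times)
```

Because travel time changes as transport links improve, recomputing the same table for successive years gives a simple way to track how connections between units tighten or loosen.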

Keywords: integration, planning strategy, space organization, space evolution, urban agglomeration

Procedia PDF Downloads 26
25474 Discerning Divergent Nodes in Social Networks

Authors: Mehran Asadi, Afrand Agah

Abstract:

In data mining, partitioning is used as a fundamental tool for classification. With the help of partitioning, we study the structure of data, which allows us to envision decision rules that can be applied to classification trees. In this research, we used an online social network dataset and all of its attributes (e.g., node features, labels, etc.) to determine what constitutes an above-average chance of being a divergent node. We used the R statistical computing language to conduct the analyses in this report. The data were found on the UC Irvine Machine Learning Repository. This research introduces the basic concepts of classification in online social networks. In this work, we consider overfitting and describe different approaches for the evaluation and performance comparison of different classification methods. In classification, the main objective is to categorize different items and assign them to different groups based on their properties and similarities. In data mining, recursive partitioning is used to probe the structure of a data set, which allows us to envision decision rules and apply them to classify data into several groups. Estimating densities is hard, especially in high dimensions and with limited data. Of course, we do not know the densities, but we could estimate them using classical techniques. First, we calculated the correlation matrix of the dataset to see whether any predictors are highly correlated with one another. By calculating the correlation coefficients for the predictor variables, we see that density is strongly correlated with transitivity. We initialized a data frame to easily compare the quality of the resulting classification methods and utilized decision trees (with k-fold cross-validation to prune the tree). The method applied to this dataset is decision trees. A decision tree is a non-parametric classification method that uses a set of rules to predict that each observation belongs to the most commonly occurring class label of the training data. Our method aggregates many decision trees to create an optimized model that is not susceptible to overfitting. When using a decision tree, however, it is important to use cross-validation to prune the tree in order to narrow it down to the most important variables.
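The correlation check described above can be sketched as follows. The authors worked in R; this is a small Python equivalent using Pearson's coefficient, and the predictor values are invented for illustration, with density and transitivity made perfectly correlated on purpose.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def correlation_matrix(columns):
    """Pairwise correlations for a dict of named predictor columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical node-level predictors.
predictors = {
    "density": [0.1, 0.2, 0.3, 0.4],
    "transitivity": [0.2, 0.4, 0.6, 0.8],
    "degree": [4, 1, 3, 2],
}
matrix = correlation_matrix(predictors)
```

A pair with a coefficient near 1 or -1 carries largely redundant information, which is a common reason to drop one of the two predictors before fitting a tree.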

Keywords: online social networks, data mining, social cloud computing, interaction and collaboration

Procedia PDF Downloads 121
25473 Design of Personal Job Recommendation Framework on Smartphone Platform

Authors: Chayaporn Kaensar

Abstract:

Recently, job recommender systems have gained much attention in industry because they address the problem of information overload on recruiting websites. We therefore propose an Extended Personalized Job System that can provide appropriate jobs to job seekers and recommend suitable information to them using data mining techniques and a dynamic user profile. On the other hand, companies can also interact with the system to publish and update job information. The system supports various platforms, such as a web application and an Android mobile application. In this paper, user profiles, implicit user actions, user feedback, and clustering techniques from the WEKA libraries are implemented for this application. In addition, open source tools such as the Yii Web Application Framework, the Bootstrap Front End Framework, and Android mobile technology were also applied.

Keywords: recommendation, user profile, data mining, web and mobile technology

Procedia PDF Downloads 295
25472 Defining Processes of Gender Restructuring: The Case of Displaced Tribal Communities of North East India

Authors: Bitopi Dutta

Abstract:

Development Induced Displacement (DID) of subaltern groups has been an issue of intense debate in India. This research will do a gender analysis of displacement induced by the mining projects in tribal indigenous societies of North East India, centering on the primary research question, which is 'How does DID reorder gendered relationships in tribal matrilineal societies?' This paper will not focus primarily on the impacts of the displacement induced by coal mining on indigenous tribal women in North East India; it will rather study 'what' the processes are that lead to these transformations and 'how' they operate. In doing so, the paper will locate the cracks in traditional social systems that the discourse of displacement manipulates for its own benefit. DID in this sense will be understood not only as physical displacement, but also as social and cultural displacement. The study will cover one matrilineal tribe in the state of Meghalaya in North East India affected by several coal mining projects in the last 30 years. In-depth unstructured interviews used to collect life narratives will be the primary mode of data collection because the indigenous culture of the tribes in Meghalaya, including the matrilineal tribes, is based on oral history, where knowledge and experiences produced under a tradition of oral history exist in a continuum. This is unlike modern societies, which produce knowledge in a compartmentalized system. An interview guide designed around specific themes will be used rather than specific questions to ensure the flow of narratives from the interviewee. In addition to this, a number of focus groups will be held. The data collected through the life narratives will be supplemented and contextualized through documentary research using government data and local media sources of the region.

Keywords: displacement, gender-relations, matriliny, mining

Procedia PDF Downloads 167
25471 Urban Sustainability and Sustainable Mobility, Lessons Learned from the Case of Chile

Authors: Jorge Urrutia-Mosquera, Luz Flórez-Calderón, Yasna Cortés

Abstract:

We assessed the state of progress in terms of urban sustainability indicators and studied the impact of current land use conditions and the level of spatial accessibility to basic urban amenities on travel patterns and sustainable mobility in Santiago de Chile. We determined the spatial impact of urban facilities on sustainable travel patterns through statistical analysis, data visualisation, and weighted regression models. The results show a need to diversify land use in more than 60% of the communes, although in 85% of the communes, accessibility to public spaces is guaranteed. The findings also suggest improving access to early education facilities, as only 26% of the communes meet the sustainability standard, which negatively impacts travel in sustainable modes. It is also observed that the level of access to urban facilities generates spatial heterogeneity in the city, which negatively affects travel patterns in terms of travel times over 60 minutes and travel by private vehicle. The results obtained allow us to identify opportunities for public policy intervention to promote and adopt sustainable mobility.

Keywords: land use, urban sustainability, travel patterns, spatial heterogeneity, GWR model, sustainable mobility

Procedia PDF Downloads 44
25470 The Significance of Picture Mining in the Fashion and Design as a New Research Method

Authors: Katsue Edo, Yu Hiroi

Abstract:

Increasing attention has been paid to using pictures and photographs in social science research since the beginning of the 21st century. Meanwhile, we have been studying the usefulness of Picture Mining, which is one of the new approaches to such picture-based research. Picture Mining is an explorative research analysis method that extracts useful information from pictures, photographs, and static or moving images. It is often compared with the methods of text mining. The Picture Mining concept includes observational research in the broad sense, because it also aims to analyze moving images (Ochihara and Edo 2013). In the recent literature, studies and reports using pictures are increasing due to environmental changes, which are identified as technological and social changes (Edo et al. 2013). Low-priced digital cameras and iPhones, high information transmission speeds, low costs for transferring information, and the high performance and resolution of mobile phone cameras have changed people's photographing behavior. Consequently, there is less resistance to taking and processing photographs for most people in the developing countries. In these studies, this method of collecting data from respondents is often called ‘participant-generated photography’ or ‘respondent-generated visual imagery’, which focuses on the collection of data and its analysis (Pauwels 2011, Snyder 2012). But there are few systematic and conceptual studies that support the significance of these methods. In recent years, we have worked to conceptualize these picture-based research methods and formalize theoretical findings (Edo et al. 2014). Inductively and through case studies, we have identified the following areas as the most effective fields for Picture Mining: 1) research in consumer and customer lifestyles, 2) new product development, and 3) research in fashion and design. Though we have found that it will be useful in these fields and areas, we must verify these assumptions. In this study, we focus on the field of fashion and design to determine whether Picture Mining methods are really reliable in this area. To do so, we conducted empirical research on respondents’ attitudes and behavior concerning pictures and photographs. We compared attitudes and behavior toward pictures of fashion with those toward pictures of meals and found that taking pictures of fashion is not as easy as taking pictures of meals and food. Compared to meals and food, respondents do not often take pictures of fashion and upload them to online services such as Facebook and Instagram, because of the difficulty of taking them. We concluded that we should be more careful in analyzing pictures in the fashion area, for there still might be some kind of bias even though the environment surrounding pictures has changed drastically in recent years.

Keywords: empirical research, fashion and design, Picture Mining, qualitative research

Procedia PDF Downloads 338
25469 Integration of Educational Data Mining Models to a Web-Based Support System for Predicting High School Student Performance

Authors: Sokkhey Phauk, Takeo Okazaki

Abstract:

A challenging task for educational institutions is to maximize the performance of students and minimize the failure rate of poor-performing students. An effective way to approach this task is to identify student learning patterns and their most influential factors and to obtain an early prediction of student learning outcomes in time to set up policies for improvement. Educational data mining (EDM) is an emerging field that draws on data mining, statistics, and machine learning and is concerned with extracting useful knowledge and information for improvement and development in the education environment. The aim of this work is to propose techniques in EDM and integrate them into a web-based system for predicting poor-performing students. A comparative study of prediction models is conducted. Subsequently, the high-performing models are developed further to obtain higher performance. The hybrid random forest (Hybrid RF) produces the most successful classification. To support intervention and improve learning outcomes, a feature selection method, MICHI, which combines the mutual information (MI) and chi-square (CHI) algorithms based on ranked feature scores, is introduced to select a dominant feature set that improves prediction performance and serves as information for intervention. Using the proposed EDM techniques, an academic performance prediction system (APPS) is subsequently developed for educational stakeholders to obtain an early prediction of student learning outcomes for timely intervention. Experimental outcomes and evaluation surveys report the effectiveness and usefulness of the developed system. The system is used to help educational stakeholders and related individuals intervene and improve student performance.
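The abstract states only that MICHI combines mutual information and chi-square through ranked feature scores. One plausible reading, shown below as an assumption and not as the authors' algorithm, is to rank the features under each score separately and order them by the sum of the two ranks. The feature names and records are hypothetical.

```python
import math
from collections import Counter


def chi_square(feature, labels):
    """Chi-square statistic of the feature/label contingency table."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    f_counts, l_counts = Counter(feature), Counter(labels)
    statistic = 0.0
    for f in f_counts:
        for l in l_counts:
            expected = f_counts[f] * l_counts[l] / n
            statistic += (joint[(f, l)] - expected) ** 2 / expected
    return statistic


def mutual_information(feature, labels):
    """Mutual information (in nats) between a discrete feature and the labels."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    f_counts, l_counts = Counter(feature), Counter(labels)
    return sum(
        (count / n) * math.log(count * n / (f_counts[f] * l_counts[l]))
        for (f, l), count in joint.items()
    )


def rank_sum_selection(features, labels):
    """Order features by the sum of their MI rank and chi-square rank (assumed rule)."""
    def ranks(score):
        ordered = sorted(
            features, key=lambda name: score(features[name], labels), reverse=True
        )
        return {name: position for position, name in enumerate(ordered)}

    mi_rank, chi_rank = ranks(mutual_information), ranks(chi_square)
    return sorted(features, key=lambda name: mi_rank[name] + chi_rank[name])


# Hypothetical student records: outcome labels and three discrete features.
labels = ["pass", "pass", "pass", "pass", "fail", "fail", "fail", "fail"]
features = {
    "seat_row": ["a", "b", "a", "b", "a", "b", "a", "b"],
    "study_hours": ["hi", "hi", "hi", "lo", "lo", "lo", "lo", "hi"],
    "attendance_level": ["hi", "hi", "hi", "hi", "lo", "lo", "lo", "lo"],
}
ranking = rank_sum_selection(features, labels)
```

Keeping the first few names in the ranking gives a dominant feature set in the sense used above; how many to keep, and how ties are broken, are choices the abstract leaves open.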

Keywords: academic performance prediction system, educational data mining, dominant factors, feature selection method, prediction model, student performance

Procedia PDF Downloads 86
25468 Hidden Hot Spots: Identifying and Understanding the Spatial Distribution of Crime

Authors: Lauren C. Porter, Andrew Curtis, Eric Jefferis, Susanne Mitchell

Abstract:

A wealth of research has been generated examining the variation in crime across neighborhoods. However, there is also a striking degree of crime concentration within neighborhoods. A number of studies show that a small percentage of street segments, intersections, or addresses account for a large portion of crime. Not surprisingly, a focus on these crime hot spots can be an effective strategy for reducing community level crime and related ills, such as health problems. However, research is also limited in an important respect. Studies tend to use official data to identify hot spots, such as 911 calls or calls for service. While the use of call data may be more representative of the actual level and distribution of crime than some other official measures (e.g. arrest data), call data still suffer from the 'dark figure of crime.' That is, there is most certainly a degree of error between crimes that occur versus crimes that are reported to the police. In this study, we present an alternative method of identifying crime hot spots, that does not rely on official data. In doing so, we highlight the potential utility of neighborhood-insiders to identify and understand crime dynamics within geographic spaces. Specifically, we use spatial video and geo-narratives to record the crime insights of 36 police, ex-offenders, and residents of a high crime neighborhood in northeast Ohio. Spatial mentions of crime are mapped to identify participant-identified hot spots, and these are juxtaposed with calls for service (CFS) data. While there are bound to be differences between these two sources of data, we find that one location, in particular, a corner store, emerges as a hot spot for all three groups of participants. Yet it does not emerge when we examine CFS data. 
A closer examination of the space around this corner store and a qualitative analysis of narrative data reveal important clues as to why this store may indeed be a hot spot, but not generate disproportionate calls to the police. In short, our results suggest that researchers who rely solely on official data to study crime hot spots may risk missing some of the most dangerous places.

Keywords: crime, narrative, video, neighborhood

Procedia PDF Downloads 213