Search results for: educational data mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 26313

26073 Abandoned Mine Methane Mitigation in the United States

Authors: Jerome Blackman, Pamela Franklin, Volha Roshchanka

Abstract:

The US coal mining sector accounts for 6% of total US methane emissions (2021). Of these coal mining methane emissions, 60% come from active underground mine ventilation systems, and abandoned mines contribute about 13%. While there are thousands of abandoned underground coal mines in the US, the Environmental Protection Agency (EPA) estimates that fewer than 100 have sufficient methane resources for viable methane recovery and use projects. Many abandoned mines are in remote areas far from potential energy customers and may be flooded, further complicating methane recovery. On the other hand, because these mines are no longer active, recovery projects can be simpler to implement.

Keywords: abandoned mines, coal mine methane, coal mining, methane emissions, methane mitigation, recovery and use

26072 Comparison Of Data Mining Models To Predict Future Bridge Conditions

Authors: Pablo Martinez, Emad Mohamed, Osama Mohsen, Yasser Mohamed

Abstract:

Highway and bridge agencies, such as the Ministry of Transportation in Ontario, use the Bridge Condition Index (BCI), defined as the weighted condition of all bridge elements, to determine rehabilitation priorities for their bridges. Accurate forecasting of the BCI is therefore essential for bridge rehabilitation budget planning. The large amount of bridge condition data accumulated over several years makes traditional mathematical models infeasible as analysis methods. This research study investigates different classification models developed to predict the bridge condition index in the province of Ontario, Canada, based on publicly available data for 2800 bridges over a period of more than 10 years. Data preparation is a key factor in developing acceptable classification models, even for the simplest one, the k-NN model. All the models were tested, compared, and statistically validated via cross-validation and t-test. A simple k-NN model showed reasonable results (within 0.5% relative error) when predicting the bridge condition in an upcoming year.

Keywords: asset management, bridge condition index, data mining, forecasting, infrastructure, knowledge discovery in databases, maintenance, predictive models

26071 Hierarchical Piecewise Linear Representation of Time Series Data

Authors: Vineetha Bettaiah, Heggere S. Ranganath

Abstract:

This paper presents a Hierarchical Piecewise Linear Approximation (HPLA) for the representation of time series data in which the time series is treated as a curve in the time-amplitude image space. The curve is partitioned into segments by choosing perceptually important points as break points. Each segment between adjacent break points is recursively partitioned into two segments at the best point or midpoint until the error between the approximating line and the original curve becomes less than a pre-specified threshold. The HPLA representation achieves dimensionality reduction while preserving prominent local features and the general shape of the time series. The representation permits coarse-to-fine processing at different levels of detail, allows flexible definition of similarity based on mathematical measures or general time series shape, and supports time series data mining operations including query by content, clustering, and classification based on whole or subsequence similarity.
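
The recursive partitioning step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' HPLA implementation: the function name `segment`, the use of vertical (amplitude) distance as the error, and the choice of the point of maximum deviation as the "best point" are all assumptions made for the example.

```python
def segment(series, lo, hi, threshold):
    """Return break-point indices that approximate series[lo..hi] by line segments.

    Assumption: the error is the largest vertical distance between the series
    and the straight line joining the two end points, and the split point is
    the sample where that distance is largest.
    """
    if hi - lo < 2:
        return [lo, hi]  # no interior samples left to split on

    y0, y1 = series[lo], series[hi]
    worst_idx, worst_err = lo, 0.0
    for i in range(lo + 1, hi):
        # Value of the approximating line at sample i
        interpolated = y0 + (y1 - y0) * (i - lo) / (hi - lo)
        err = abs(series[i] - interpolated)
        if err > worst_err:
            worst_idx, worst_err = i, err

    if worst_err <= threshold:
        return [lo, hi]  # this single segment is accurate enough

    # Otherwise split into two segments and refine each one recursively
    left = segment(series, lo, worst_idx, threshold)
    right = segment(series, worst_idx, hi, threshold)
    return left[:-1] + right  # drop the duplicated split index


breaks = segment([0, 1, 2, 3, 2, 1, 0], 0, 6, threshold=0.1)
```

Lowering `threshold` yields more break points and a finer level of detail, which is one way to picture the coarse-to-fine hierarchy the abstract mentions.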

Keywords: data mining, dimensionality reduction, piecewise linear representation, time series representation

26070 Data Analysis to Uncover Terrorist Attacks Using Data Mining Techniques

Authors: Saima Nazir, Mustansar Ali Ghazanfar, Sanay Muhammad Umar Saeed, Muhammad Awais Azam, Saad Ali Alahmari

Abstract:

Terrorism is an important and challenging concern. The entire world is threatened by only a few sophisticated terrorist groups, and in the Gulf Region and Pakistan in particular, terrorism has become an extremely destructive phenomenon in recent years. Predicting the pattern of attack type, attack group, and target type is an intricate task. This study offers new insight into terrorist groups' attack types and chosen targets. It proposes a framework for predicting terrorist attacks using historical data and for establishing associations between terrorist groups, their attack types, and their targets. Analysis shows that the number of attacks per year will keep increasing, and that Al-Harmayan in Saudi Arabia, Al-Qai’da in the Gulf Region, and Tehreek-e-Taliban in Pakistan will remain responsible for many future terrorist attacks. Under constant circumstances, the main targets of each group will be private citizens and property, police, government, and the military sector.

Keywords: data mining, counter terrorism, machine learning, SVM

26069 Exploring Gaming-Learning Interaction in MMOG Using Data Mining Methods

Authors: Meng-Tzu Cheng, Louisa Rosenheck, Chen-Yen Lin, Eric Klopfer

Abstract:

The purpose of the research is to explore some of the ways in which gameplay data can be analyzed to yield results that feed back into the learning ecosystem. Back-end data for all users as they played an MMOG, The Radix Endeavor, was collected, and this study reports the analyses of a specific genetics quest using data mining techniques, including the decision tree method. The study revealed different reasons for quest failure between participants who eventually succeeded and those who never succeeded. Regarding the use of in-game tools, the trait examiner was a key tool in the quest completion process. The decision tree results further showed that a lack of trait examiner usage can be made up for with additional Punnett square uses, revealing multiple pathways to success in this quest. The methods of analysis used in this study and the resulting usage patterns indicate some useful ways that gameplay data can provide insights in two main areas. The first is for game designers to know how players are interacting with and learning from their game. The second is for players themselves, as well as their teachers, to get information on how they are progressing through the game, and to provide help they may need based on strategies and misconceptions identified in the data.

Keywords: MMOG, decision tree, genetics, gaming-learning interaction

26068 Dietary Risk Assessment of Green Leafy Vegetables (GLV) Due to Heavy Metals from Selected Mining Areas

Authors: Simon Mensah Ofosu

Abstract:

Illicit surface mining activities pollute agricultural lands and water bodies and result in the accumulation of heavy metals in vegetables cultivated in such areas. Heavy metal (HM) accumulation in vegetables is a serious food safety issue due to the adverse effects of metal toxicity, hence the need to investigate the levels of these metals in cultivated vegetables in the eastern region. Cocoyam leaves, cabbage, and cucumber were sampled from selected farms in mining areas (Atiwa District) and non-mining areas (Yilo Krobo and East Akim District) of the region for the study. Levels of cadmium, lead, mercury, and arsenic in the vegetables were measured with an atomic absorption spectrometer, and the results were statistically analyzed with a Microsoft Office Excel (2013) spreadsheet and ANOVA. Cadmium (Cd) and arsenic (As) were the most and least concentrated HMs in the vegetables sampled, respectively. The mean concentrations of Cd and Pb in cabbage (0.564 mg/kg, 0.470 mg/kg), cucumber (0.389 mg/kg, 0.190 mg/kg), and cocoyam leaves (0.410 mg/kg, 0.256 mg/kg), respectively, from the mining areas exceeded the permissible limits set by the Joint FAO/WHO. The mean concentrations of the metals in vegetables from the mining and non-mining areas varied significantly (P<0.05). The Target Hazard Quotient (THQ) was used to assess the health risk posed to the human population via vegetable consumption. The THQ values of cadmium, mercury, and lead for adults and children through vegetable consumption in the mining areas were greater than 1 (THQ>1), indicating a potential health risk for both children and adults. The THQ values for adults and children in the non-mining areas were less than the safe limit of 1 (THQ<1), hence no significant health risk is posed to the population in such areas.
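
The abstract does not state how the THQ was computed. A commonly used simplified form divides the estimated daily intake of a metal by its oral reference dose, and the sketch below assumes that form. The function name and every input other than the cabbage cadmium concentration (0.564 mg/kg, from the abstract) are hypothetical values chosen only to illustrate the arithmetic, not figures from the study.

```python
def target_hazard_quotient(concentration_mg_per_kg, intake_kg_per_day,
                           body_weight_kg, reference_dose_mg_per_kg_day):
    """Simplified THQ: estimated daily intake divided by the oral reference dose.

    Assumption: THQ = (C * IR / BW) / RfD. Fuller formulations also include
    exposure frequency, exposure duration, and averaging time.
    """
    estimated_daily_intake = concentration_mg_per_kg * intake_kg_per_day / body_weight_kg
    return estimated_daily_intake / reference_dose_mg_per_kg_day


# Cd in cabbage from the mining areas (0.564 mg/kg, as reported in the abstract).
# The intake rate, body weight, and reference dose are illustrative assumptions.
thq_cd = target_hazard_quotient(
    concentration_mg_per_kg=0.564,
    intake_kg_per_day=0.2,               # hypothetical vegetable intake
    body_weight_kg=60.0,                 # hypothetical adult body weight
    reference_dose_mg_per_kg_day=0.001,  # assumed oral reference dose for Cd
)
exceeds_safe_limit = thq_cd > 1  # THQ > 1 is read as a potential health risk
```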

Keywords: food safety, risk assessment, illicit mining, public health, contaminated vegetables

26067 Interoperability Standard for Data Exchange in Educational Documents in Professional and Technological Education: A Comparative Study and Feasibility Analysis for the Brazilian Context

Authors: Giovana Nunes Inocêncio

Abstract:

Professional and technological education (EPT) plays a pivotal role in equipping students for specialized careers, and it is imperative to establish a framework for efficient data exchange among educational institutions. The primary focus of this article is to address the pressing need for document interoperability within the context of EPT. The challenges, motivations, and benefits of implementing interoperability standards for digital educational documents are thoroughly explored. These documents include EPT completion certificates, academic records, and curricula. In conjunction with the prior abstract, it is evident that the intersection of IT governance and interoperability standards holds the key to transforming the landscape of technical education in Brazil. IT governance provides the strategic framework for effective data management, aligning with educational objectives, ensuring compliance, and managing risks. By adopting interoperability standards, the technical education sector in Brazil can facilitate data exchange, enhance data security, and promote international recognition of qualifications. The utilization of the XML (Extensible Markup Language) standard further strengthens the foundation for structured data exchange, fostering efficient communication, standardization of curricula, and enhancement of educational materials. IT governance, interoperability standards, and data management play a critical role in driving the quality, efficiency, and security of technical education. The adoption of these standards fosters transparency, stakeholder coordination, and regulatory compliance, ultimately empowering the technical education sector to meet the dynamic demands of the 21st century.

Keywords: interoperability, education, standards, governance

26066 Exploring the Correlation between Population Distribution and Urban Heat Island under Urban Data: Taking Shenzhen Urban Heat Island as an Example

Authors: Wang Yang

Abstract:

Shenzhen is a modern city shaped by China's reform and opening-up policy, and its urban morphology has developed under the administration of the Chinese government. The city's planning paradigm is primarily affected by spatial structure and human behavior. The subjective urban agglomeration center is divided into several groups and centers. In comparison with this effect, the law of city development tends to be neglected. With the continuous development of the internet, extensive data technology has been introduced in China. Data mining and data analysis have become important tools in municipal research. Data mining has been utilized to improve data cleaning for sources such as business data, traffic data, and population data. Prior to data mining, government data were collected by traditional means and then analyzed using city-relationship research, delaying the timeliness of urban development research, especially for the contemporary city. Data are now updated very quickly via the Internet. The city's points of interest (POI) obtained through mining serve as a data source affecting city design, while satellite remote sensing is used as a reference object. City analysis is conducted in both directions, so that the administrative paradigm of government is broken and urban research is restored. Therefore, the use of data mining in urban analysis is very important. The satellite remote sensing data of Shenzhen in July 2018 were measured by the MODIS satellite sensor and can be used to perform land surface temperature inversion and to analyze the urban heat island distribution of Shenzhen. This article acquired and classified data from Shenzhen using data crawler technology. Data on the Shenzhen heat island and points of interest were simulated and analyzed on the GIS platform to discover the main features of functional equivalent distribution influence. Shenzhen is located in the east-west area of China.
The city’s main streets are also determined according to the direction of city development. Therefore, it is determined that the functional areas of the city are also distributed in the east-west direction. The urban heat island can be expressed as a heat map according to the functional urban areas, and regional POIs correspond to it. The research results clearly show that the distribution of the urban heat island and the distribution of urban POIs are in one-to-one correspondence. The urban heat island is primarily influenced by the properties of the underlying surface, avoiding the impact of urban climate. Using urban POIs as the object of analysis, the distribution of municipal POIs and population aggregation are closely connected, so the distribution of the population corresponds with the distribution of the urban heat island.

Keywords: POI, satellite remote sensing, the population distribution, urban heat island thermal map

26065 Educase–Intelligent System for Pedagogical Advising Using Case-Based Reasoning

Authors: Elionai Moura, José A. Cunha, César Analide

Abstract:

This work introduces a proposed scheme for an intelligent system applied to pedagogical advising using case-based reasoning. The system retrieves consolidated solutions used previously and applies them to new problems, making the task of advising students easier for the pedagogical staff. Through this work, we intend to present the motivation behind the choices made for the system structure, justifying the development of an incremental, smart web system that learns the best solutions for new cases as it is used, and to show the techniques and technology involved.

Keywords: case-based reasoning, pedagogical advising, educational data-mining (EDM), machine learning

26064 Short Text Classification Using Part of Speech Feature to Analyze Students' Feedback of Assessment Components

Authors: Zainab Mutlaq Ibrahim, Mohamed Bader-El-Den, Mihaela Cocea

Abstract:

Students' textual feedback can hold unique patterns and useful information about the learning process, including the advantages and disadvantages of teaching methods, assessment components, facilities, and other aspects of teaching. The results of analysing such feedback can form a key point for institutions’ decision makers to advance and update their systems accordingly. This paper proposes a data mining framework for analysing end-of-unit general textual feedback using the part-of-speech (PoS) feature with four machine learning algorithms: support vector machines, decision tree, random forest, and naive Bayes. The proposed framework has two tasks. The first is to use the above algorithms to build an optimal model that automatically classifies the whole data set into two subsets: one tailored to assessment practices (assessment related) and the other containing the non-assessment-related data. The second is to use the same algorithms to build an optimal model that automatically detects the sentiment of the whole data set and of the new data subsets. The significance of this paper is that it compares the performance of the above four algorithms using the part-of-speech feature with the performance of the same algorithms using the n-grams feature. The paper follows the Knowledge Discovery and Data Mining (KDDM) framework to construct the classification and sentiment analysis models, which involves understanding the assessment domain, cleaning and pre-processing the data set, selecting and running the data mining algorithm, interpreting mined patterns, and consolidating the discovered knowledge. The experimental results show that the models using either feature performed very well on the first task. On the second task, however, models that used the part-of-speech feature underperformed in comparison with models that used unigrams and bigrams.

Keywords: assessment, part of speech, sentiment analysis, student feedback

26063 Concept Drifts Detection and Localisation in Process Mining

Authors: M. V. Manoj Kumar, Likewin Thomas, Annappa

Abstract:

Process mining provides methods and techniques for analyzing event logs recorded in modern information systems that support real-world operations. When analyzing an event log, state-of-the-art process mining techniques assume that the operational process is a static (stationary) entity. This is often not the case because of a phenomenon called concept drift. During execution, a process can experience concept drift and evolve with respect to any of its associated perspectives, exhibiting various patterns of change at different paces. The work presented in this paper discusses the main aspects to consider when addressing the concept drift phenomenon and proposes a method for detecting and localizing sudden concept drifts in the control-flow perspective of the process, using features extracted by processing the traces in the process log. Our experimental results are promising with respect to efficiently detecting and localizing concept drift in the context of the process mining research discipline.

Keywords: abrupt drift, concept drift, sudden drift, control-flow perspective, detection and localization, process mining

26062 Reclamation of Mining Using Vegetation - A Comparative Study of Open Pit Mining

Authors: G. Surendra Babu

Abstract:

We all know the importance of mineral wealth, which has long been buried inside the layers of the earth. These natural resources are used in our day-to-day life for fuel, electricity, construction, and more. However, the process of extraction causes irreversible damage to nature, and the sites left over after mining is completed remain barren, unused, degraded land for decades. Most of these sites were covered with vegetation before mining began; mining damages the native vegetation of the region, disturbs its watershed boundaries, and disturbs its biodiversity. The main aim of the study is to understand the various issues found; to understand the reclamation methods that are suitable for revegetation and widely practiced, as carried out in different case studies; to review the government guidelines and procedures for lease licenses, including environmental clearances; and to study the vegetation pattern according to the major issues identified. Finally, new guidelines are suggested with respect to the old ones to help revegetate mine sites and establish self-sustaining ecosystems in the future.

Keywords: reclamation, open-pit mining, revegetation, reclamation methods

26061 Multiscale Connected Component Labelling and Applications to Scientific Microscopy Image Processing

Authors: Yayun Hsu, Henry Horng-Shing Lu

Abstract:

In this paper, a new method is proposed to extend connected component labeling from the processing of binary images to the multi-scale modeling of images. By using an adaptive threshold of multi-scale attributes, this approach minimizes the possibility of missing important components with weak intensities. In addition, the computational cost of this approach remains similar to that of the typical approach to component labeling. The methodology is then applied to grain boundary detection and Drosophila Brainbow neuron segmentation. These applications demonstrate the feasibility of the proposed approach in the analysis of challenging microscopy images for scientific discovery.
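
For context, the typical binary-image labeling that this abstract takes as its starting point can be sketched as below. This is only the baseline technique, assuming 4-connectivity and a non-empty image given as a list of rows of 0/1 values; it does not implement the multi-scale, adaptive-threshold extension the paper proposes, and the function name is hypothetical.

```python
from collections import deque


def label_components(image):
    """Label 4-connected foreground regions of a binary image.

    `image` is a non-empty list of equal-length rows containing 0 or 1.
    Returns (labels, count), where labels has the same shape as image and
    holds 0 for background and 1..count for each connected component.
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] and not labels[r][c]:
                # Found an unlabeled foreground pixel: flood-fill its component
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and image[ny][nx] and not labels[ny][nx]:
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


labels, count = label_components([
    [1, 1, 0],
    [0, 0, 0],
    [0, 1, 1],
])
```

Each pixel is visited a constant number of times, so the cost is linear in the image size, which is the baseline cost the abstract compares against.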

Keywords: microscopic image processing, scientific data mining, multi-scale modeling, data mining

26060 The Perception of Teacher Candidates' on History in Non-Educational TV Series: The Magnificent Century

Authors: Evren Şar İşbilen

Abstract:

As is well known, movies and TV series occupy a large part of the daily lives of adults and children in our era. In this connection, the present study selected the most popular historical TV series of recent years in Turkey, “Muhteşem Yüzyıl” (The Magnificent Century), as the sample for data collection in order to explore university students' perception of history. The data collected was analyzed both qualitatively and quantitatively. The findings are discussed in relation to the possible educative effects of historical non-educational TV series and movies on students' perceptions of history. Additionally, suggestions are made regarding the positive use of non-educational TV series or movies in education.

Keywords: education, history, movies, teacher candidates

26059 A Near-Optimal Domain Independent Approach for Detecting Approximate Duplicates

Authors: Abdelaziz Fellah, Allaoua Maamir

Abstract:

We propose a domain-independent merging-cluster filter approach complemented with a set of algorithms for identifying approximate duplicate entities efficiently and accurately within a single data source and across multiple data sources. The near-optimal merging-cluster filter (MCF) approach is based on the well-tuned Monge-Elkan algorithm and extended with an affine variant of the Smith-Waterman similarity measure. We then present constant, variable, and function threshold algorithms that work conceptually in a divide-merge filtering fashion for detecting near duplicates as hierarchical clusters along with their corresponding representatives. The algorithms take recursive refinement approaches in the spirit of filtering, merging, and updating cluster representatives to detect approximate duplicates at each level of the cluster tree. Experiments show the high effectiveness and accuracy of the MCF approach in detecting approximate duplicates, outperforming the seminal Monge-Elkan algorithm on several real-world benchmarks and generated datasets.
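
As background, the basic Monge-Elkan record similarity that the MCF approach builds on averages, over the tokens of one record, the best match each token finds in the other record. The sketch below illustrates only that baseline idea. It swaps in `difflib.SequenceMatcher` as the inner token measure rather than the affine Smith-Waterman variant the abstract names, and it does not reproduce the clustering or threshold algorithms; the function names are hypothetical.

```python
from difflib import SequenceMatcher


def token_similarity(a, b):
    """Inner similarity between two tokens, in [0, 1].

    Assumption: difflib's ratio stands in for the Smith-Waterman-style
    measure used in the literature.
    """
    return SequenceMatcher(None, a, b).ratio()


def monge_elkan(record_a, record_b):
    """Average, over tokens of record_a, of the best match found in record_b.

    Note that the measure is asymmetric: monge_elkan(a, b) may differ
    from monge_elkan(b, a).
    """
    tokens_a = record_a.lower().split()
    tokens_b = record_b.lower().split()
    if not tokens_a or not tokens_b:
        return 0.0
    best_matches = [max(token_similarity(a, b) for b in tokens_b) for a in tokens_a]
    return sum(best_matches) / len(tokens_a)


same_entity = monge_elkan("John Smith", "smith john")
different_entity = monge_elkan("John Smith", "Acme Corporation")
```

Because token order does not matter, reordered names score as exact matches, which is part of why this family of measures suits duplicate detection.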

Keywords: data mining, data cleaning, approximate duplicates, near-duplicates detection, data mining applications and discovery

26058 Improving University Operations with Data Mining: Predicting Student Performance

Authors: Mladen Dragičević, Mirjana Pejić Bach, Vanja Šimičević

Abstract:

The purpose of this paper is to develop models that would enable predicting student success. These models could improve the allocation of students among colleges and optimize the newly introduced model of government subsidies for higher education. To collect data, an anonymous survey was carried out among final-year undergraduate students using a random sampling method. Several decision trees were created, and the two most successful in predicting student success were chosen based on two criteria: Grade Point Average (GPA) and the time that a student needs to finish the undergraduate program (time-to-degree). Decision trees have been shown to be a good method for classifying student success, and they could be improved further by increasing the survey sample and developing specialized decision trees for each type of college. These types of methods have great potential for use in decision support systems.

Keywords: data mining, knowledge discovery in databases, prediction models, student success

26057 Optimization of Air Pollution Control Model for Mining

Authors: Zunaira Asif, Zhi Chen

Abstract:

Sustainable air quality management is recognized as one of the most serious environmental concerns in mining regions. Mining operations emit various types of pollutants that have significant impacts on the environment. This study presents a stochastic control strategy by developing an air pollution control model to achieve a cost-effective solution. The optimization method is formulated to predict the cost of treatment using linear programming with an objective function and multiple constraints. The constraints mainly focus on two factors: the production of metal should not exceed the available resources, and air quality should meet the standard criteria for each pollutant. The applicability of this model is explored through a case study of an open pit metal mine in Utah, USA. The method simultaneously uses meteorological data as a dispersion transfer function to reflect practical local conditions. The probabilistic analysis and the uncertainties in the meteorological conditions are handled by Monte Carlo simulation. Reasonable results have been obtained for selecting the optimized treatment technology for PM2.5, PM10, NOx, and SO2. Additional comparative analysis shows that a baghouse is the least-cost option compared with an electrostatic precipitator and wet scrubbers for particulate matter, whereas non-selective catalytic reduction and dry flue gas desulfurization are suitable for NOx and SO2 reduction, respectively. Thus, this model can help planners reduce these pollutants at a marginal cost by suggesting pollution control devices, while accounting for dynamic meteorological conditions and mining activities.

Keywords: air pollution, linear programming, mining, optimization, treatment technologies

26056 Design of a Small and Medium Enterprise Growth Prediction Model Based on Web Mining

Authors: Yiea Funk Te, Daniel Mueller, Irena Pletikosa Cvijikj

Abstract:

Small and medium enterprises (SMEs) play an important role in the economy of many countries. When the overall world economy is considered, SMEs represent 95% of all businesses in the world, accounting for 66% of the total employment. Existing studies show that the current business environment is characterized as highly turbulent and strongly influenced by modern information and communication technologies, thus forcing SMEs to experience more severe challenges in maintaining their existence and expanding their business. To support SMEs in improving their competitiveness, researchers recently turned their focus to applying data mining techniques to build risk and growth prediction models. However, data used to assess risk and growth indicators is primarily obtained via questionnaires, which is very laborious and time-consuming, or is provided by financial institutions and is thus highly sensitive to privacy issues. Recently, web mining (WM) has emerged as a new approach towards obtaining valuable insights in the business world. WM enables automatic and large-scale collection and analysis of potentially valuable data from various online platforms, including companies’ websites. While WM methods have been frequently studied to anticipate growth of sales volume for e-commerce platforms, their application to the assessment of SME risk and growth indicators is still scarce. Considering that a vast proportion of SMEs own a website, WM has great potential for revealing valuable information hidden in SME websites, which can further be used to understand SME risk and growth indicators, as well as to enhance current SME risk and growth prediction models. This study aims at developing an automated system to collect business-relevant data from the Web and predict future growth trends of SMEs by means of WM and data mining techniques. The envisioned system should serve as an 'early recognition system' for future growth opportunities.
In an initial step, we examine how structured and semi-structured Web data on governmental or SME websites can be used to explain the success of SMEs. WM methods are applied to extract Web data in the form of additional input features for the growth prediction model. The data on SMEs provided by a large Swiss insurance company is used as ground truth data (i.e. growth-labeled data) to train the growth prediction model. Different machine learning classification algorithms such as the Support Vector Machine, Random Forest, and Artificial Neural Network are applied and compared, with the goal of optimizing the prediction performance. The results are compared to those from previous studies in order to assess the contribution of growth indicators retrieved from the Web to increasing the predictive power of the model.

Keywords: data mining, SME growth, success factors, web mining

26055 Reimagine and Redesign: Augmented Reality Digital Technologies and 21st Century Education

Authors: Jasmin Cowin

Abstract:

Augmented reality digital technologies, big data, and the need for a teacher workforce able to meet the demands of a knowledge-based society are poised to lead to major changes in the field of education. This paper explores applications and educational use cases of augmented reality digital technologies for educational organizations during the Fourth Industrial Revolution. The Fourth Industrial Revolution requires vision, flexibility, and innovative educational conduits by governments and educational institutions to remain competitive in a global economy. Educational organizations will need to focus on teaching in and for a digital age to continue offering academic knowledge relevant to 21st-century markets and changing labor force needs. Implementation of contemporary disciplines will need to be embodied through learners’ active knowledge-making experiences while embracing ubiquitous accessibility. The power of distributed ledger technology promises major streamlining for educational record-keeping, degree conferrals, and authenticity guarantees. Augmented reality digital technologies hold the potential to restructure educational philosophies and their underpinning pedagogies thereby transforming modes of delivery. Structural changes in education and governmental planning are already increasing through intelligent systems and big data. Reimagining and redesigning education on a broad scale is required to plan and implement governmental and institutional changes to harness innovative technologies while moving away from the big schooling machine.

Keywords: fourth industrial revolution, artificial intelligence, big data, education, augmented reality digital technologies, distributed ledger technology

26054 A Recommender System Fusing Collaborative Filtering and User’s Review Mining

Authors: Seulbi Choi, Hyunchul Ahn

Abstract:

Collaborative filtering (CF) algorithms have been widely used for recommender systems in both academic and practical applications. CF basically generates recommendation results using users’ numeric ratings. However, the additional use of information other than user ratings may lead to better CF accuracy. Considering that, since the advent of Web 2.0, many people are likely to share their honest opinions on the items they purchased recently, user reviews can be regarded as a new informative source for accurately identifying user preferences. Against this background, this study presents a hybrid recommender system that fuses CF and user review mining. Our system adopts conventional memory-based CF, but it is designed to use both users’ numeric ratings and their text reviews of the items when calculating similarities between users.
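
One way to picture a user similarity that draws on both ratings and review text is a weighted blend of two scores, sketched below. The abstract does not say how the two sources are combined, so the blending weight `alpha`, the cosine measure over co-rated items, the Jaccard word overlap for reviews, and all function and field names are assumptions made for this illustration.

```python
import math


def rating_similarity(ratings_a, ratings_b):
    """Cosine similarity over the items both users have rated."""
    common = set(ratings_a) & set(ratings_b)
    if not common:
        return 0.0
    dot = sum(ratings_a[i] * ratings_b[i] for i in common)
    norm_a = math.sqrt(sum(ratings_a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(ratings_b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def review_similarity(text_a, text_b):
    """Jaccard overlap of the word sets of two users' review text."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def user_similarity(user_a, user_b, alpha=0.5):
    """Blend rating-based and review-based similarity (alpha is hypothetical)."""
    by_rating = rating_similarity(user_a["ratings"], user_b["ratings"])
    by_review = review_similarity(user_a["reviews"], user_b["reviews"])
    return alpha * by_rating + (1 - alpha) * by_review


alice = {"ratings": {"phone": 5, "laptop": 3}, "reviews": "great battery and screen"}
bob = {"ratings": {"phone": 4, "laptop": 2}, "reviews": "great screen poor speaker"}
similarity = user_similarity(alice, bob)
```

In a memory-based CF setting, a score like this would replace the rating-only similarity when selecting the nearest neighbors whose ratings drive the recommendation.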

Keywords: Recommender system, Collaborative filtering, Text mining, Review mining

Procedia PDF Downloads 311
26053 The Influence of Educational Board Games on Chinese Learning Motivation and Flow Experience

Authors: Ju May Wen, Chun Hung Lin, Eric Zhi Feng Liu

Abstract:

Flow theory implies that people are persuaded by happiness. By focusing on an activity, people turn a blind eye to external factors. This study explores the influence of educational board games and fundamental Chinese language teaching on students’ learning motivation and flow experience. Fifty-three students taking fundamental Chinese language courses participated in the study. These students were divided into three groups: (1) flash card teaching group; (2) educational original board game teaching group; and (3) educational Chinese board game teaching group. Chinese language teaching was integrated with the educational board game titled ‘Transportation GO.’ The students were observed playing this game as the teacher collected quantitative and qualitative data. Quantitative data were collected from the learning motivation scale and flow experience scale. Qualitative data were collected through observing, recording, and visiting. First, all three groups integrated with Chinese language teaching maintained students’ high learning motivation and high flow experience. Second, there was no significant difference between the flow experience of the flash card group and the educational original board game group. Third, there was a significant difference in the flow experience and learning motivation of the educational Chinese board game group vs. the other groups. This study suggests that the experimental model can be applied to advanced Chinese language teaching. Apart from oral and literacy skills, the study of educational board games integrated with Chinese language teaching to strengthen student writing skills will be continued.

Keywords: Chinese language instruction, educational board game, learning motivation, flow experience

Procedia PDF Downloads 151
26052 Integrating of Multi-Criteria Decision Making and Spatial Data Warehouse in Geographic Information System

Authors: Zohra Mekranfar, Ahmed Saidi, Abdellah Mebrek

Abstract:

This work aims to develop multi-criteria decision making (MCDM) and spatial data warehouse (SDW) methods, which will be integrated into a GIS according to a ‘GIS dominant’ approach. The GIS tools will be used to operate the SDW. MCDM methods can provide many solutions to a set of problems with multiple and varied criteria. When the problem is complex and includes a spatial dimension, it makes sense to combine the MCDM process with other approaches such as data mining and ascending analyses. In this paper, we present an experiment showing a geo-decisional methodology for SDW construction. On-line analytical processing (OLAP) technology, which combines basic multidimensional analysis with data mining concepts, provides powerful tools for highlighting inductions and information that traditional tools do not make obvious. However, these OLAP tools become more complex in the presence of the spatial dimension. The integration of OLAP with a GIS is the future geographic and spatial information solution. GIS offers advanced functions for the acquisition, storage, analysis, and display of geographic information. However, their effectiveness for complex spatial analysis is questionable due to their determinism and their decisional rigor. Any analysis or exploration of spatial data first requires the construction and structuring of a spatial data warehouse (SDW). This SDW must be easily usable by the GIS and by the tools offered by an OLAP system.

Keywords: data warehouse, GIS, MCDM, SOLAP

Procedia PDF Downloads 150
26051 An Improved K-Means Algorithm for Gene Expression Data Clustering

Authors: Billel Kenidra, Mohamed Benmohammed

Abstract:

Data mining techniques used for clustering are a subject of active research and assist in biological pattern recognition and the extraction of new knowledge from raw data. Clustering means partitioning an unlabeled dataset into groups of similar objects. Each group, called a cluster, consists of objects that are similar to one another and dissimilar to objects of other groups. Several clustering methods are based on partitional clustering. This category attempts to directly decompose the dataset into a set of disjoint clusters, leading to an integer number of clusters that optimizes a given criterion function. The criterion function may emphasize a local or a global structure of the data, and its optimization is an iterative relocation procedure. The K-Means algorithm is one of the most widely used partitional clustering techniques. Since K-Means is extremely sensitive to the initial choice of centers, and a poor choice of centers may lead to a local optimum that is quite inferior to the global optimum, we propose a strategy for initializing K-Means centers. The improved K-Means algorithm is compared with the original K-Means, and the results show that efficiency is significantly improved.
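The abstract does not describe the proposed initialization strategy, so the sketch below substitutes the well-known k-means++ seeding purely to illustrate how a careful choice of initial centers plugs into the iterative relocation loop. The function names, the `max_iter` and `seed` parameters, and the toy data in the usage note are assumptions for illustration only.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_init(points, k, rng):
    """k-means++ seeding (an assumed stand-in for the paper's strategy)."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Weight each point by its squared distance to the nearest chosen center.
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        if sum(weights) == 0:
            # Every point coincides with a chosen center; fall back to a uniform pick.
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [
        min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
        for p in points
    ]


def kmeans(points, k, max_iter=100, seed=0):
    """Iterative relocation (Lloyd's algorithm) starting from k-means++ seeds."""
    rng = random.Random(seed)
    centers = kmeanspp_init(points, k, rng)
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                new_centers.append(tuple(sum(d) / len(members) for d in zip(*members)))
            else:
                new_centers.append(centers[i])  # keep an empty cluster's center in place
        if new_centers == centers:
            break  # no center moved, so the partition is stable
        centers = new_centers
    return centers, assign(points, centers)
```

For example, `kmeans([(0.0,), (1.0,), (100.0,), (101.0,)], 2)` separates the two low values from the two high values.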

Keywords: microarray data mining, biological pattern recognition, partitional clustering, k-means algorithm, centroid initialization

Procedia PDF Downloads 164
26050 Clustering Categorical Data Using the K-Means Algorithm and the Attribute’s Relative Frequency

Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami

Abstract:

Clustering is a well-known data mining technique used in pattern recognition and information retrieval. The initial dataset to be clustered can contain either categorical or numeric data. Each type of data has its own specific clustering algorithm. In this context, two algorithms are proposed: k-means for clustering numeric datasets and k-modes for categorical datasets. The main problem encountered in data mining applications is clustering categorical data, which is so prevalent in datasets. One way to cluster categorical values is to transform the categorical attributes into numeric measures and apply the k-means algorithm directly instead of k-modes. In this paper, we experiment with an approach based on this idea, transforming the categorical values into numeric ones using the relative frequency of each modality in the attributes. The proposed approach is compared with a previous method based on transforming the categorical datasets into binary values. The scalability and accuracy of the two methods are evaluated experimentally. The obtained results show that our proposed method outperforms the binary method in all cases.
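The transformation described above can be sketched as follows: each categorical value is replaced by the relative frequency of that modality within its attribute, so the rows become numeric vectors that k-means can consume. The function name and the toy rows in the usage note are hypothetical, and the authors' exact preprocessing may differ.

```python
from collections import Counter


def relative_frequency_encode(rows):
    """Replace every categorical value by its relative frequency in its column.

    rows is a list of equal-length tuples of categorical values.
    """
    n = len(rows)
    # One frequency table per attribute (column).
    freq = [Counter(column) for column in zip(*rows)]
    return [
        [freq[j][value] / n for j, value in enumerate(row)]
        for row in rows
    ]
```

With the rows `("red", "S")`, `("red", "M")`, `("blue", "S")`, `("red", "S")`, the value "red" becomes 0.75 and "blue" becomes 0.25. One caveat of this encoding is that two different modalities with the same frequency map to the same number.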

Keywords: clustering, unsupervised learning, pattern recognition, categorical datasets, knowledge discovery, k-means

Procedia PDF Downloads 236
26049 Multi-Source Data Fusion for Urban Comprehensive Management

Authors: Bolin Hua

Abstract:

In city governance, various data are involved, including city component data, demographic data, housing data and all kinds of business data. These data reflect different aspects of people, events and activities. Data generated by various systems differ in form, and their sources differ because they may come from different sectors. In order to reflect one or several facets of an event or rule, data from multiple sources need to be fused together. Data from different sources, collected in different ways, raise several issues that need to be resolved. Problems of data fusion include data update and synchronization, data exchange and sharing, file parsing and entry, duplicate data and its comparison, and resource catalogue construction. Governments adopt statistical analysis, time series analysis, extrapolation, monitoring analysis, value mining, and scenario prediction in order to achieve pattern discovery, law verification, root cause analysis and public opinion monitoring. The result of multi-source data fusion is a uniform central database, which includes people data, location data, object data, institution data, business data and space data. Metadata must be available to be referenced and read whenever an application needs to access, manipulate and display the data. Uniform metadata management ensures the effectiveness and consistency of data in the process of data exchange, data modeling, data cleansing, data loading, data storing, data analysis, data search and data delivery.

Keywords: multi-source data fusion, urban comprehensive management, information fusion, government data

Procedia PDF Downloads 353
26048 Discerning Divergent Nodes in Social Networks

Authors: Mehran Asadi, Afrand Agah

Abstract:

In data mining, partitioning is used as a fundamental tool for classification. With the help of partitioning, we study the structure of data, which allows us to envision decision rules that can be applied to classification trees. In this research, we used an online social network dataset and all of its attributes (e.g., node features, labels, etc.) to determine what constitutes an above-average chance of being a divergent node. We used the R statistical computing language to conduct the analyses in this report. The data were found on the UC Irvine Machine Learning Repository. This research introduces the basic concepts of classification in online social networks. In this work, we consider overfitting and describe different approaches for evaluating and comparing the performance of different classification methods. In classification, the main objective is to categorize different items and assign them to different groups based on their properties and similarities. In data mining, recursive partitioning is used to probe the structure of a data set, which allows us to envision decision rules and apply them to classify data into several groups. Estimating densities is hard, especially in high dimensions with limited data. Of course, we do not know the densities, but we could estimate them using classical techniques. First, we calculated the correlation matrix of the dataset to see if any predictors are highly correlated with one another. By calculating the correlation coefficients for the predictor variables, we see that density is strongly correlated with transitivity. We initialized a data frame to easily compare the quality of the resulting classification methods and used decision trees (with k-fold cross-validation to prune the tree). The method applied to this dataset is decision trees. A decision tree is a non-parametric classification method that uses a set of rules to predict that each observation belongs to the most commonly occurring class label of the training data. Our method aggregates many decision trees to create an optimized model that is not susceptible to overfitting. When using a decision tree, however, it is important to use cross-validation to prune the tree in order to narrow it down to the most important variables.
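The correlation screening step mentioned above can be sketched as follows. The authors worked in R; this Python version is only an illustration, and the function names, the 0.8 threshold, and the toy values in the usage note are assumptions that do not come from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant column
    return cov / math.sqrt(var_x * var_y)


def highly_correlated(columns, threshold=0.8):
    """List predictor pairs whose absolute correlation reaches the threshold.

    columns maps a predictor name to its list of values; the threshold
    is a hypothetical cut-off chosen for illustration.
    """
    names = list(columns)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson(columns[first], columns[second])
            if abs(r) >= threshold:
                pairs.append((first, second, r))
    return pairs
```

Calling `highly_correlated` on a dict with made-up columns named `density`, `transitivity`, and `degree` would return only the pairs whose coefficients clear the threshold, which is the kind of check that surfaces a density and transitivity relationship.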

Keywords: online social networks, data mining, social cloud computing, interaction and collaboration

Procedia PDF Downloads 122
26047 Design of Personal Job Recommendation Framework on Smartphone Platform

Authors: Chayaporn Kaensar

Abstract:

Recently, job recommender systems have gained much attention in industry since they address the problem of information overload on recruiting websites. Therefore, we propose an Extended Personalized Job System that can provide appropriate jobs for job seekers and recommend suitable information to them using data mining techniques and a dynamic user profile. On the other hand, companies can also interact with the system to publish and update job information. The system supports various platforms, such as a web application and an Android mobile application. In this paper, user profiles, implicit user actions, user feedback, and clustering techniques from the WEKA libraries are implemented for this application. In addition, open source tools such as the Yii Web Application Framework, the Bootstrap Front End Framework and Android mobile technology were also applied.

Keywords: recommendation, user profile, data mining, web and mobile technology

Procedia PDF Downloads 296
26046 Defining Processes of Gender Restructuring: The Case of Displaced Tribal Communities of North East India

Authors: Bitopi Dutta

Abstract:

Development Induced Displacement (DID) of subaltern groups has been an issue of intense debate in India. This research will do a gender analysis of displacement induced by the mining projects in tribal indigenous societies of North East India, centering on the primary research question which is 'How does DID reorder gendered relationship in tribal matrilineal societies?' This paper will not focus primarily on the impacts of the displacement induced by coal mining on indigenous tribal women in the North East India; it will rather study 'what' are the processes that lead to these transformations and 'how' do they operate. In doing so, the paper will locate the cracks in traditional social systems that the discourse of displacement manipulates for its own benefit. DID in this sense will not only be understood as only physical displacement, but also as social and cultural displacement. The study will cover one matrilineal tribe in the state of Meghalaya in the North East India affected by several coal mining projects in the last 30 years. In-depth unstructured interviews used to collect life narratives will be the primary mode of data collection because the indigenous culture of the tribes in Meghalaya, including the matrilineal tribes, is based on oral history where knowledge and experiences produced under a tradition of oral history exist in a continuum. This is unlike modern societies which produce knowledge in a compartmentalized system. An interview guide designed around specific themes will be used rather than specific questions to ensure the flow of narratives from the interviewee. In addition to this, a number of focus groups will be held. The data collected through the life narrative will be supplemented and contextualized through documentary research using government data, and local media sources of the region.

Keywords: displacement, gender-relations, matriliny, mining

Procedia PDF Downloads 167
26045 The Significance of Picture Mining in the Fashion and Design as a New Research Method

Authors: Katsue Edo, Yu Hiroi

Abstract:

Increasing attention has been paid to using pictures and photographs in social science research since the beginning of the 21st century. Meanwhile, we have been studying the usefulness of Picture Mining, which is one of the new approaches to such picture-based research. Picture Mining is an explorative research analysis method that takes useful information from pictures, photographs and static or moving images. It is often compared with the methods of text mining. The Picture Mining concept includes observational research in the broad sense, because it also aims to analyze moving images (Ochihara and Edo 2013). In the recent literature, studies and reports using pictures are increasing due to environmental changes. These are identified as technological and social changes (Edo et al. 2013). Low-priced digital cameras and iPhones, high information transmission speeds, low costs for information transfer, and the high performance and resolution of mobile phone cameras have changed people’s photographing behavior. Consequently, there is less resistance to taking and processing photographs for most people in the developing countries. In these studies, this method of collecting data from respondents is often called ‘participant-generated photography’ or ‘respondent-generated visual imagery’, which focuses on the collection of data and its analysis (Pauwels 2011, Snyder 2012). But there are few systematic and conceptual studies that support the significance of these methods. In recent years, we have worked to conceptualize these picture-based research methods and formalize theoretical findings (Edo et al. 2014). We have identified, inductively and through case studies, the fields in which Picture Mining is most efficient: 1) Research in Consumer and Customer Lifestyles. 2) New Product Development. 3) Research in Fashion and Design. Though we have found that it will be useful in these fields and areas, we must verify these assumptions. In this study we focus on the field of fashion and design, to determine whether picture mining methods are really reliable in this area. To do so, we conducted an empirical study of respondents’ attitudes and behavior concerning pictures and photographs. We compared attitudes and behavior regarding pictures of fashion with those regarding pictures of meals, and found that taking pictures of fashion is not as easy as taking pictures of meals and food. Respondents do not take pictures of fashion and upload them online, to sites such as Facebook and Instagram, as often as they do for meals and food, because of the difficulty of taking them. We concluded that we should be more careful in analyzing pictures in the fashion area, for some kind of bias might still exist even though the environment surrounding pictures has changed drastically in recent years.

Keywords: empirical research, fashion and design, Picture Mining, qualitative research

Procedia PDF Downloads 339
26044 Bankruptcy Prediction Analysis on Mining Sector Companies in Indonesia

Authors: Devina Aprilia Gunawan, Tasya Aspiranti, Inugrah Ratia Pratiwi

Abstract:

This research aims to classify mining sector companies based on Altman’s Z-score model, to analyze the model’s financial ratios in order to provide a picture of the financial condition of mining sector companies in Indonesia and their future viability, and to determine the partial and simultaneous impact of each financial ratio variable in Altman’s Z-score model, namely (WC/TA), (RE/TA), (EBIT/TA), (MVE/TL), and (S/TA), on the financial condition represented by the Z-score itself. Among 38 mining sector companies listed on the Indonesia Stock Exchange (IDX), 28 companies were selected as the research sample according to the purposive sampling criteria. The results showed that during the three-year research period, 2010-2012, the number of companies predicted to be healthy in each year was less than half of the total sample. The multiple regression analysis showed that all of the research hypotheses are accepted, which means that (WC/TA), (RE/TA), (EBIT/TA), (MVE/TL), and (S/TA), both partially and simultaneously, had an impact on companies’ financial condition.
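For readers unfamiliar with the model, the sketch below computes a Z-score from the five ratios named above. It assumes the coefficients and cut-offs of Altman's original 1968 model for publicly traded firms; the abstract does not state which variant or thresholds the authors applied, and the function and parameter names are illustrative.

```python
def altman_z(working_capital, retained_earnings, ebit,
             market_value_equity, sales, total_assets, total_liabilities):
    """Altman Z-score with the original 1968 coefficients (assumed variant)."""
    x1 = working_capital / total_assets            # WC/TA
    x2 = retained_earnings / total_assets          # RE/TA
    x3 = ebit / total_assets                       # EBIT/TA
    x4 = market_value_equity / total_liabilities   # MVE/TL
    x5 = sales / total_assets                      # S/TA
    return 1.2 * x1 + 1.4 * x2 + 3.3 * x3 + 0.6 * x4 + 1.0 * x5


def classify(z):
    """Map a Z-score to a zone using the original model's cut-offs."""
    if z > 2.99:
        return "safe"
    if z < 1.81:
        return "distress"
    return "grey"
```

A company's score for a given year can then be passed to `classify` to obtain the kind of healthy versus at-risk grouping the study reports.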

Keywords: Altman’s Z-score model, financial condition, mining companies, Indonesia

Procedia PDF Downloads 506