Search results for: mining big data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25524

Search results for: mining big data

25254 A Fuzzy Kernel K-Medoids Algorithm for Clustering Uncertain Data Objects

Authors: Behnam Tavakkol

Abstract:

Uncertain data mining algorithms use different ways to consider uncertainty in data such as by representing a data object as a sample of points or a probability distribution. Fuzzy methods have long been used for clustering traditional (certain) data objects. They are used to produce non-crisp cluster labels. For uncertain data, however, besides some uncertain fuzzy k-medoids algorithms, not many other fuzzy clustering methods have been developed. In this work, we develop a fuzzy kernel k-medoids algorithm for clustering uncertain data objects. The developed fuzzy kernel k-medoids algorithm is superior to existing fuzzy k-medoids algorithms in clustering data sets with non-linearly separable clusters.

Keywords: clustering algorithm, fuzzy methods, kernel k-medoids, uncertain data

Procedia PDF Downloads 215
25253 Research on the Correlation between College Students' Physical Fitness and Running Habits: Data Mining of Smart Phone Sports App

Authors: Mingming Guo, Xiaozan Wang

Abstract:

Introduction: The purpose of this study is to examine the correlation between the physical fitness of Chinese college students and their daily running habits (RH). Methods: A total of 718 college students from East China Normal University participated in this study (385 boys and 333 girls). Each participant participated in the Chinese Students’ Physical Fitness Test during the 2018-2019 school year. In addition, each student is also required to use the app to record all their running results during each run during the 2018-2019 school year. Researchers can query and export all running records through the app's management platform. Results: (1) The total number of kilometers run by the students showed a significant negative correlation with their vital capacity (VC), sitting body flexion (SBF), and long jump (LJ) (rᵥ

Keywords: college students, physical fitness, running habits, data mining

Procedia PDF Downloads 139
25252 Grid and Market Integration of Large Scale Wind Farms using Advanced Predictive Data Mining Techniques

Authors: Umit Cali

Abstract:

The integration of intermittent energy sources like wind farms into the electricity grid has become an important challenge for the utilization and control of electric power systems, because of the fluctuating behaviour of wind power generation. Wind power predictions improve the economic and technical integration of large amounts of wind energy into the existing electricity grid. Trading, balancing, grid operation, controllability and safety issues increase the importance of predicting power output from wind power operators. Therefore, wind power forecasting systems have to be integrated into the monitoring and control systems of the transmission system operator (TSO) and wind farm operators/traders. The wind forecasts are relatively precise for the time period of only a few hours, and, therefore, relevant with regard to Spot and Intraday markets. In this work predictive data mining techniques are applied to identify a statistical and neural network model or set of models that can be used to predict wind power output of large onshore and offshore wind farms. These advanced data analytic methods helps us to amalgamate the information in very large meteorological, oceanographic and SCADA data sets into useful information and manageable systems. Accurate wind power forecasts are beneficial for wind plant operators, utility operators, and utility customers. An accurate forecast allows grid operators to schedule economically efficient generation to meet the demand of electrical customers. This study is also dedicated to an in-depth consideration of issues such as the comparison of day ahead and the short-term wind power forecasting results, determination of the accuracy of the wind power prediction and the evaluation of the energy economic and technical benefits of wind power forecasting.

Keywords: renewable energy sources, wind power, forecasting, data mining, big data, artificial intelligence, energy economics, power trading, power grids

Procedia PDF Downloads 518
25251 Determination of the Bank's Customer Risk Profile: Data Mining Applications

Authors: Taner Ersoz, Filiz Ersoz, Seyma Ozbilge

Abstract:

In this study, the clients who applied to a bank branch for loan were analyzed through data mining. The study was composed of the information such as amounts of loans received by personal and SME clients working with the bank branch, installment numbers, number of delays in loan installments, payments available in other banks and number of banks to which they are in debt between 2010 and 2013. The client risk profile was examined through Classification and Regression Tree (CART) analysis, one of the decision tree classification methods. At the end of the study, 5 different types of customers have been determined on the decision tree. The classification of these types of customers has been created with the rating of those posing a risk for the bank branch and the customers have been classified according to the risk ratings.

Keywords: client classification, loan suitability, risk rating, CART analysis

Procedia PDF Downloads 338
25250 Sustainable and Responsible Mining - Lundin Mining’s Subsidiary in Portugal, Sociedade Mineira de Neves-Corvo Case

Authors: Jose Daniel Braga Alves, Joaquim Gois, Alexandre Leite

Abstract:

This abstract presents the responsible and sustainable mining case study of a Portuguese mine operation, highlighting how mine exploitation can sustainably exist in balance with the environment, aligned with all stakeholders. The mining operation is remotely located in a United Nations (UN) biodiversity reserve, away from major industrial centers or logistical ports, and presents an interesting investigation to assess the balanced mine operation in alignment with all key stakeholders, which presents unique opportunities as well as challenges. Based on the sustainable mining framework, it is intended to detail examples of best practices from Sociedade Mineira de Neves-Corvo (SOMINCOR), demonstrating social acceptance by the local community, health, and safety at work, reduction of environmental impacts and management of mining waste, which directly influence the acceptance and recognition of a sustainable operation. The case study aims to present the SOMINCOR approach to sustainable mining, focusing on social responsibility, considering materials provided by Lundin Mining Corporation (LMC) and SOMINCOR and the socially responsible approach of the mining operations., referencing related international guidelines, UN Sustainable Development Goals. The researchers reviewed LMC's annual Sustainability Reports (2019, 2020 and 2021) and updated information regarding material topics of the most significant interest to internal and external stakeholders. These material topics formed the basis of the corporation-wide sustainability strategy. LMC's Responsible Mining Policy (RMP) was reviewed, focusing on the commitment that guides the approach to responsible operation and management of the Company's business. Social performance, compliance, environmental management, governance, human rights, and economic contribution are principles of the RMP. The Human Rights Risk Impact Assessment (HRRIA), based on frameworks including UN Guiding Principles (UNGP), Voluntary Principles on Security and Human Rights, and a community engagement program implemented (SLO index), was part of this research. The program consists of ongoing surveys and perceptions studies using behavioural science insights, data from which was not available within the timeframe of completing this research. LMC stakeholder engagement standards and grievance mechanisms were also reviewed. Stakeholder engagement and the community's perception are key to this operation to ensure social license to operate (SLO). Preliminary surveys with local communities provided input data for the local development strategy. After the implementation of several initiatives, subsequent surveys were performed to assess acceptance and trust from the local communities and changes to the SLO index. SOMINCOR's operation contributes to 12 out of 17 sustainable development goals. From the assessed and available data, local communities and social engagement are priorities to SOMINCOR. Experience to date shows that the continual engagement with local communities and the grievance mechanisms in place are respected and followed for all concerns presented by any stakeholder. It can be concluded that this underground mine in Portugal complies with applicable regulations and goes beyond them with regard to sustainable development and engagement with key stakeholders.

Keywords: sustainable mining, development goals, portuguese mining, zinc copper

Procedia PDF Downloads 76
25249 Educational Data Mining: The Case of the Department of Mathematics and Computing in the Period 2009-2018

Authors: Mário Ernesto Sitoe, Orlando Zacarias

Abstract:

University education is influenced by several factors that range from the adoption of strategies to strengthen the whole process to the academic performance improvement of the students themselves. This work uses data mining techniques to develop a predictive model to identify students with a tendency to evasion and retention. To this end, a database of real students’ data from the Department of University Admission (DAU) and the Department of Mathematics and Informatics (DMI) was used. The data comprised 388 undergraduate students admitted in the years 2009 to 2014. The Weka tool was used for model building, using three different techniques, namely: K-nearest neighbor, random forest, and logistic regression. To allow for training on multiple train-test splits, a cross-validation approach was employed with a varying number of folds. To reduce bias variance and improve the performance of the models, ensemble methods of Bagging and Stacking were used. After comparing the results obtained by the three classifiers, Logistic Regression using Bagging with seven folds obtained the best performance, showing results above 90% in all evaluated metrics: accuracy, rate of true positives, and precision. Retention is the most common tendency.

Keywords: evasion and retention, cross-validation, bagging, stacking

Procedia PDF Downloads 82
25248 Parallel Genetic Algorithms Clustering for Handling Recruitment Problem

Authors: Walid Moudani, Ahmad Shahin

Abstract:

This research presents a study to handle the recruitment services system. It aims to enhance a business intelligence system by embedding data mining in its core engine and to facilitate the link between job searchers and recruiters companies. The purpose of this study is to present an intelligent management system for supporting recruitment services based on data mining methods. It consists to apply segmentation on the extracted job postings offered by the different recruiters. The details of the job postings are associated to a set of relevant features that are extracted from the web and which are based on critical criterion in order to define consistent clusters. Thereafter, we assign the job searchers to the best cluster while providing a ranking according to the job postings of the selected cluster. The performance of the proposed model used is analyzed, based on a real case study, with the clustered job postings dataset and classified job searchers dataset by using some metrics.

Keywords: job postings, job searchers, clustering, genetic algorithms, business intelligence

Procedia PDF Downloads 329
25247 Gravity and Magnetic Survey, Modeling and Interpretation in the Blötberget Iron-Oxide Mining Area of Central Sweden

Authors: Ezra Yehuwalashet, Alireza Malehmir

Abstract:

Blötberget mining area in central Sweden, part of the Bergslagen mineral district, is well known for its various type of mineralization particularly iron-oxide deposits since the 1600. To shed lights on the knowledge of the host rock structures, depth extent and tonnage of the mineral deposits and support deep mineral exploration potential in the study area, new ground gravity and existing aeromagnetic data (from the Geological Survey of Sweden) were used for interpretations and modelling. A major boundary separating a gravity low from a gravity high in the southern part of the study area is noticeable and likely representing a fault boundary separating two different lithological units. Gravity data and modeling offers a possible new target area in the southeast of the known mineralization while suggesting an excess high-density region down to 800 m depth.

Keywords: gravity, magnetics, ore deposit, geophysics

Procedia PDF Downloads 65
25246 Analysis Mechanized Boring (TBM) of Tehran Subway Line 7

Authors: Shahin Shabani, Pouya Pourmadadi

Abstract:

Tunnel boring machines (TBMs) have been used for the construction of various tunnels for mining projects for the purpose of access, conveyance of ore and waste, drainage, exploration, water supply and water diversion. Several mining projects have seen the successful and economic beneficial use of TBMs, and there is an increasing awareness of the benefits of TBMs for mining projects. Key technical considerations for the use of TBMs for the construction of tunnels for mining projects include geological issues (rock type, rock alteration, rock strength, rock abrasivity, durability, ground water inflows), depth of cover and the potential for overstressing/rockbursts, site access and terrain, portal locations, TBM constraints, minimum tunnel size, tunnel support requirements, contractor and labor experience, and project schedule demands. This study focuses on tunnelling mining, with the goal to develop methods and tools to be used to gain understanding of these processes, and to analyze metro of Tehran. The Metro Line 7 of Tehran is one of the Longest (26 Km) and deepest (27m) of projects that’s under implementation. Because of major differences like passing under all geotechnical layers of the town and encountering part of it with underground water table and also using mechanized excavation system, is one of special metro projects.

Keywords: TBM, tunnel boring machines economic, metro, line 7

Procedia PDF Downloads 384
25245 CoP-Networks: Virtual Spaces for New Faculty’s Professional Development in the 21st Higher Education

Authors: Eman AbuKhousa, Marwan Z. Bataineh

Abstract:

The 21st century higher education and globalization challenge new faculty members to build effective professional networks and partnership with industry in order to accelerate their growth and success. This creates the need for community of practice (CoP)-oriented development approaches that focus on cognitive apprenticeship while considering individual predisposition and future career needs. This work adopts data mining, clustering analysis, and social networking technologies to present the CoP-Network as a virtual space that connects together similar career-aspiration individuals who are socially influenced to join and engage in a process for domain-related knowledge and practice acquisitions. The CoP-Network model can be integrated into higher education to extend traditional graduate and professional development programs.

Keywords: clustering analysis, community of practice, data mining, higher education, new faculty challenges, social network, social influence, professional development

Procedia PDF Downloads 183
25244 Predication Model for Leukemia Diseases Based on Data Mining Classification Algorithms with Best Accuracy

Authors: Fahd Sabry Esmail, M. Badr Senousy, Mohamed Ragaie

Abstract:

In recent years, there has been an explosion in the rate of using technology that help discovering the diseases. For example, DNA microarrays allow us for the first time to obtain a "global" view of the cell. It has great potential to provide accurate medical diagnosis, to help in finding the right treatment and cure for many diseases. Various classification algorithms can be applied on such micro-array datasets to devise methods that can predict the occurrence of Leukemia disease. In this study, we compared the classification accuracy and response time among eleven decision tree methods and six rule classifier methods using five performance criteria. The experiment results show that the performance of Random Tree is producing better result. Also it takes lowest time to build model in tree classifier. The classification rules algorithms such as nearest- neighbor-like algorithm (NNge) is the best algorithm due to the high accuracy and it takes lowest time to build model in classification.

Keywords: data mining, classification techniques, decision tree, classification rule, leukemia diseases, microarray data

Procedia PDF Downloads 320
25243 Analysis and Forecasting of Bitcoin Price Using Exogenous Data

Authors: J-C. Leneveu, A. Chereau, L. Mansart, T. Mesbah, M. Wyka

Abstract:

Extracting and interpreting information from Big Data represent a stake for years to come in several sectors such as finance. Currently, numerous methods are used (such as Technical Analysis) to try to understand and to anticipate market behavior, with mixed results because it still seems impossible to exactly predict a financial trend. The increase of available data on Internet and their diversity represent a great opportunity for the financial world. Indeed, it is possible, along with these standard financial data, to focus on exogenous data to take into account more macroeconomic factors. Coupling the interpretation of these data with standard methods could allow obtaining more precise trend predictions. In this paper, in order to observe the influence of exogenous data price independent of other usual effects occurring in classical markets, behaviors of Bitcoin users are introduced in a model reconstituting Bitcoin value, which is elaborated and tested for prediction purposes.

Keywords: big data, bitcoin, data mining, social network, financial trends, exogenous data, global economy, behavioral finance

Procedia PDF Downloads 355
25242 Fuzzy Expert Approach for Risk Mitigation on Functional Urban Areas Affected by Anthropogenic Ground Movements

Authors: Agnieszka A. Malinowska, R. Hejmanowski

Abstract:

A number of European cities are strongly affected by ground movements caused by anthropogenic activities or post-anthropogenic metamorphosis. Those are mainly water pumping, current mining operation, the collapse of post-mining underground voids or mining-induced earthquakes. These activities lead to large and small-scale ground displacements and a ground ruptures. The ground movements occurring in urban areas could considerably affect stability and safety of structures and infrastructures. The complexity of the ground deformation phenomenon in relation to the structures and infrastructures vulnerability leads to considerable constraints in assessing the threat of those objects. However, the increase of access to the free software and satellite data could pave the way for developing new methods and strategies for environmental risk mitigation and management. Open source geographical information systems (OS GIS), may support data integration, management, and risk analysis. Lately, developed methods based on fuzzy logic and experts methods for buildings and infrastructure damage risk assessment could be integrated into OS GIS. Those methods were verified base on back analysis proving their accuracy. Moreover, those methods could be supported by ground displacement observation. Based on freely available data from European Space Agency and free software, ground deformation could be estimated. The main innovation presented in the paper is the application of open source software (OS GIS) for integration developed models and assessment of the threat of urban areas. Those approaches will be reinforced by analysis of ground movement based on free satellite data. Those data would support the verification of ground movement prediction models. Moreover, satellite data will enable our mapping of ground deformation in urbanized areas. Developed models and methods have been implemented in one of the urban areas hazarded by underground mining activity. Vulnerability maps supported by satellite ground movement observation would mitigate the hazards of land displacements in urban areas close to mines.

Keywords: fuzzy logic, open source geographic information science (OS GIS), risk assessment on urbanized areas, satellite interferometry (InSAR)

Procedia PDF Downloads 159
25241 Customer Data Analysis Model Using Business Intelligence Tools in Telecommunication Companies

Authors: Monica Lia

Abstract:

This article presents a customer data analysis model using business intelligence tools for data modelling, transforming, data visualization and dynamic reports building. Economic organizational customer’s analysis is made based on the information from the transactional systems of the organization. The paper presents how to develop the data model starting for the data that companies have inside their own operational systems. The owned data can be transformed into useful information about customers using business intelligence tool. For a mature market, knowing the information inside the data and making forecast for strategic decision become more important. Business Intelligence tools are used in business organization as support for decision-making.

Keywords: customer analysis, business intelligence, data warehouse, data mining, decisions, self-service reports, interactive visual analysis, and dynamic dashboards, use cases diagram, process modelling, logical data model, data mart, ETL, star schema, OLAP, data universes

Procedia PDF Downloads 430
25240 Resource Framework Descriptors for Interestingness in Data

Authors: C. B. Abhilash, Kavi Mahesh

Abstract:

Human beings are the most advanced species on earth; it's all because of the ability to communicate and share information via human language. In today's world, a huge amount of data is available on the web in text format. This has also resulted in the generation of big data in structured and unstructured formats. In general, the data is in the textual form, which is highly unstructured. To get insights and actionable content from this data, we need to incorporate the concepts of text mining and natural language processing. In our study, we mainly focus on Interesting data through which interesting facts are generated for the knowledge base. The approach is to derive the analytics from the text via the application of natural language processing. Using semantic web Resource framework descriptors (RDF), we generate the triple from the given data and derive the interesting patterns. The methodology also illustrates data integration using the RDF for reliable, interesting patterns.

Keywords: RDF, interestingness, knowledge base, semantic data

Procedia PDF Downloads 162
25239 Evaluating 8D Reports Using Text-Mining

Authors: Benjamin Kuester, Bjoern Eilert, Malte Stonis, Ludger Overmeyer

Abstract:

Increasing quality requirements make reliable and effective quality management indispensable. This includes the complaint handling in which the 8D method is widely used. The 8D report as a written documentation of the 8D method is one of the key quality documents as it internally secures the quality standards and acts as a communication medium to the customer. In practice, however, the 8D report is mostly faulty and of poor quality. There is no quality control of 8D reports today. This paper describes the use of natural language processing for the automated evaluation of 8D reports. Based on semantic analysis and text-mining algorithms the presented system is able to uncover content and formal quality deficiencies and thus increases the quality of the complaint processing in the long term.

Keywords: 8D report, complaint management, evaluation system, text-mining

Procedia PDF Downloads 315
25238 Representation Data without Lost Compression Properties in Time Series: A Review

Authors: Nabilah Filzah Mohd Radzuan, Zalinda Othman, Azuraliza Abu Bakar, Abdul Razak Hamdan

Abstract:

Uncertain data is believed to be an important issue in building up a prediction model. The main objective in the time series uncertainty analysis is to formulate uncertain data in order to gain knowledge and fit low dimensional model prior to a prediction task. This paper discusses the performance of a number of techniques in dealing with uncertain data specifically those which solve uncertain data condition by minimizing the loss of compression properties.

Keywords: compression properties, uncertainty, uncertain time series, mining technique, weather prediction

Procedia PDF Downloads 428
25237 Analysis of Causality between Defect Causes Using Association Rule Mining

Authors: Sangdeok Lee, Sangwon Han, Changtaek Hyun

Abstract:

Construction defects are major components that result in negative impacts on project performance including schedule delays and cost overruns. Since construction defects generally occur when a few associated causes combine, a thorough understanding of defect causality is required in order to more systematically prevent construction defects. To address this issue, this paper uses association rule mining (ARM) to quantify the causality between defect causes, and social network analysis (SNA) to find indirect causality among them. The suggested approach is validated with 350 defect instances from concrete works in 32 projects in Korea. The results show that the interrelationships revealed by the approach reflect the characteristics of the concrete task and the important causes that should be prevented.

Keywords: causality, defect causes, social network analysis, association rule mining

Procedia PDF Downloads 367
25236 Accountant Strategists Challenge the Dominant Business Model: A Strategy-as-Practice Perspective

Authors: Lindie Grebe

Abstract:

This paper reports on a study that explored the strategizing practices of professional accountants in the mining industry, based on Jarratt and Stiles’ dominant strategizing practice models framework. Drawing on a strategy-as-practice perspective, the paper recognises qualified professional accountants in strategic management such as Chief Executive Officers, as strategy practitioners that perform their strategizing practices and praxis within a specific context. The main findings of this paper were produced through semi-structured individual interviews with accountants that perform strategy on a business level in the South African mining industry. Qualitative data were analysed through conversation analysis over two coding-cycles. Findings describe accountant strategists as practitioners who challenge the dominant business model when a disconnect seems to exist between international corporate level strategy and business level strategy in the South African mining industry. Accountant strategy practitioners described their dominant strategizing practice model as incremental change during strategic planning and as a lived experience during strategy implementation. Findings portrayed these strategists as taking initiative as strategy leaders in a dynamic and volatile environment to combine their accounting background with strategic management and challenge the dominant business model. Understanding how accountant strategists perform strategizing offers insight into the social practice of strategic management. This understanding contributes to the body of knowledge on strategizing in the South African mining industry. In addition, knowledge on the transformation of accountants as strategists could provide valuable practice relevant insights for accounting educators and the accounting profession alike.

Keywords: accountant strategists, dominant strategizing practice models framework, mining industry, strategy-as-practice

Procedia PDF Downloads 175
25235 Exploring Twitter Data on Human Rights Activism on Olympics Stage through Social Network Analysis and Mining

Authors: Teklu Urgessa, Joong Seek Lee

Abstract:

Social media is becoming the primary choice of activists to make their voices heard. This fact is coupled by two main reasons. The first reason is the emergence web 2.0, which gave the users opportunity to become content creators than passive recipients. Secondly the control of the mainstream mass media outlets by the governments and individuals with their political and economic interests. This paper aimed at exploring twitter data of network actors talking about the marathon silver medalists on Rio2016, who showed solidarity with the Oromo protesters in Ethiopia on the marathon race finish line when he won silver. The aim is to discover important insight using social network analysis and mining. The hashtag #FeyisaLelisa was used for Twitter network search. The actors’ network was visualized and analyzed. It showed the central influencers during first 10 days in August, were international media outlets while it was changed to individual activist in September. The degree distribution of the network is scale free where the frequency of degrees decay by power low. Text mining was also used to arrive at meaningful themes from tweet corpus about the event selected for analysis. The semantic network indicated important clusters of concepts (15) that provided different insight regarding the why, who, where, how of the situation related to the event. The sentiments of the words in the tweets were also analyzed and indicated that 95% of the opinions in the tweets were either positive or neutral. Overall, the finding showed that Olympic stage protest of the marathoner brought the issue of Oromo protest to the global stage. The new research framework is proposed based for event-based social network analysis and mining based on the practical procedures followed in this research for event-based social media sense making.

Keywords: human rights, Olympics, social media, network analysis, social network ming

Procedia PDF Downloads 257
25234 Social Media Data Analysis for Personality Modelling and Learning Styles Prediction Using Educational Data Mining

Authors: Srushti Patil, Preethi Baligar, Gopalkrishna Joshi, Gururaj N. Bhadri

Abstract:

In designing learning environments, the instructional strategies can be tailored to suit the learning style of an individual to ensure effective learning. In this study, the information shared on social media like Facebook is being used to predict learning style of a learner. Previous research studies have shown that Facebook data can be used to predict user personality. Users with a particular personality exhibit an inherent pattern in their digital footprint on Facebook. The proposed work aims to correlate the user's’ personality, predicted from Facebook data to the learning styles, predicted through questionnaires. For Millennial learners, Facebook has become a primary means for information sharing and interaction with peers. Thus, it can serve as a rich bed for research and direct the design of learning environments. The authors have conducted this study in an undergraduate freshman engineering course. Data from 320 freshmen Facebook users was collected. The same users also participated in the learning style and personality prediction survey. The Kolb’s Learning style questionnaires and Big 5 personality Inventory were adopted for the survey. The users have agreed to participate in this research and have signed individual consent forms. A specific page was created on Facebook to collect user data like personal details, status updates, comments, demographic characteristics and egocentric network parameters. This data was captured by an application created using Python program. The data captured from Facebook was subjected to text analysis process using the Linguistic Inquiry and Word Count dictionary. An analysis of the data collected from the questionnaires performed reveals individual student personality and learning style. The results obtained from analysis of Facebook, learning style and personality data were then fed into an automatic classifier that was trained by using the data mining techniques like Rule-based classifiers and Decision trees. This helps to predict the user personality and learning styles by analysing the common patterns. Rule-based classifiers applied for text analysis helps to categorize Facebook data into positive, negative and neutral. There were totally two models trained, one to predict the personality from Facebook data; another one to predict the learning styles from the personalities. The results show that the classifier model has high accuracy which makes the proposed method to be a reliable one for predicting the user personality and learning styles.

Keywords: educational data mining, Facebook, learning styles, personality traits

Procedia PDF Downloads 231
25233 From Electroencephalogram to Epileptic Seizures Detection by Using Artificial Neural Networks

Authors: Gaetano Zazzaro, Angelo Martone, Roberto V. Montaquila, Luigi Pavone

Abstract:

Seizure is the main factor that affects the quality of life of epileptic patients. The diagnosis of epilepsy, and hence the identification of epileptogenic zone, is commonly made by using continuous Electroencephalogram (EEG) signal monitoring. Seizure identification on EEG signals is made manually by epileptologists and this process is usually very long and error prone. The aim of this paper is to describe an automated method able to detect seizures in EEG signals, using knowledge discovery in database process and data mining methods and algorithms, which can support physicians during the seizure detection process. Our detection method is based on Artificial Neural Network classifier, trained by applying the multilayer perceptron algorithm, and by using a software application, called Training Builder that has been developed for the massive extraction of features from EEG signals. This tool is able to cover all the data preparation steps ranging from signal processing to data analysis techniques, including the sliding window paradigm, the dimensionality reduction algorithms, information theory, and feature selection measures. The final model shows excellent performances, reaching an accuracy of over 99% during tests on data of a single patient retrieved from a publicly available EEG dataset.

Keywords: artificial neural network, data mining, electroencephalogram, epilepsy, feature extraction, seizure detection, signal processing

Procedia PDF Downloads 188
25232 Exploring Social Impact of Emerging Technologies from Futuristic Data

Authors: Heeyeul Kwon, Yongtae Park

Abstract:

Despite the highly touted benefits, emerging technologies have unleashed pervasive concerns regarding unintended and unforeseen social impacts. Thus, those wishing to create safe and socially acceptable products need to identify such side effects and mitigate them prior to the market proliferation. Various methodologies in the field of technology assessment (TA), namely Delphi, impact assessment, and scenario planning, have been widely incorporated in such a circumstance. However, literatures face a major limitation in terms of sole reliance on participatory workshop activities. They unfortunately missed out the availability of a massive untapped data source of futuristic information flooding through the Internet. This research thus seeks to gain insights into utilization of futuristic data, future-oriented documents from the Internet, as a supplementary method to generate social impact scenarios whilst capturing perspectives of experts from a wide variety of disciplines. To this end, network analysis is conducted based on the social keywords extracted from the futuristic documents by text mining, which is then used as a guide to produce a comprehensive set of detailed scenarios. Our proposed approach facilitates harmonized depictions of possible hazardous consequences of emerging technologies and thereby makes decision makers more aware of, and responsive to, broad qualitative uncertainties.

Keywords: emerging technologies, futuristic data, scenario, text mining

Procedia PDF Downloads 491
25231 Incremental Learning of Independent Topic Analysis

Authors: Takahiro Nishigaki, Katsumi Nitta, Takashi Onoda

Abstract:

In this paper, we present a method of applying Independent Topic Analysis (ITA) to increasing the number of document data. The number of document data has been increasing since the spread of the Internet. ITA was presented as one method to analyze the document data. ITA is a method for extracting the independent topics from the document data by using the Independent Component Analysis (ICA). ICA is a technique in the signal processing; however, it is difficult to apply the ITA to increasing number of document data. Because ITA must use the all document data so temporal and spatial cost is very high. Therefore, we present Incremental ITA which extracts the independent topics from increasing number of document data. Incremental ITA is a method of updating the independent topics when the document data is added after extracted the independent topics from a just previous the data. In addition, Incremental ITA updates the independent topics when the document data is added. And we show the result applied Incremental ITA to benchmark datasets.

Keywords: text mining, topic extraction, independent, incremental, independent component analysis

Procedia PDF Downloads 309
25230 Poultry in Motion: Text Mining Social Media Data for Avian Influenza Surveillance in the UK

Authors: Samuel Munaf, Kevin Swingler, Franz Brülisauer, Anthony O’Hare, George Gunn, Aaron Reeves

Abstract:

Background: Avian influenza, more commonly known as Bird flu, is a viral zoonotic respiratory disease stemming from various species of poultry, including pets and migratory birds. Researchers have purported that the accessibility of health information online, in addition to the low-cost data collection methods the internet provides, has revolutionized the methods in which epidemiological and disease surveillance data is utilized. This paper examines the feasibility of using internet data sources, such as Twitter and livestock forums, for the early detection of the avian flu outbreak, through the use of text mining algorithms and social network analysis. Methods: Social media mining was conducted on Twitter between the period of 01/01/2021 to 31/12/2021 via the Twitter API in Python. The results were filtered firstly by hashtags (#avianflu, #birdflu), word occurrences (avian flu, bird flu, H5N1), and then refined further by location to include only those results from within the UK. Analysis was conducted on this text in a time-series manner to determine keyword frequencies and topic modeling to uncover insights in the text prior to a confirmed outbreak. Further analysis was performed by examining clinical signs (e.g., swollen head, blue comb, dullness) within the time series prior to the confirmed avian flu outbreak by the Animal and Plant Health Agency (APHA). Results: The increased search results in Google and avian flu-related tweets showed a correlation in time with the confirmed cases. Topic modeling uncovered clusters of word occurrences relating to livestock biosecurity, disposal of dead birds, and prevention measures. Conclusions: Text mining social media data can prove to be useful in relation to analysing discussed topics for epidemiological surveillance purposes, especially given the lack of applied research in the veterinary domain. The small sample size of tweets for certain weekly time periods makes it difficult to provide statistically plausible results, in addition to a great amount of textual noise in the data.

Keywords: veterinary epidemiology, disease surveillance, infodemiology, infoveillance, avian influenza, social media

Procedia PDF Downloads 105
25229 Feature-Based Summarizing and Ranking from Customer Reviews

Authors: Dim En Nyaung, Thin Lai Lai Thein

Abstract:

Due to the rapid increase of Internet, web opinion sources dynamically emerge which is useful for both potential customers and product manufacturers for prediction and decision purposes. These are the user generated contents written in natural languages and are unstructured-free-texts scheme. Therefore, opinion mining techniques become popular to automatically process customer reviews for extracting product features and user opinions expressed over them. Since customer reviews may contain both opinionated and factual sentences, a supervised machine learning technique applies for subjectivity classification to improve the mining performance. In this paper, we dedicate our work is the task of opinion summarization. Therefore, product feature and opinion extraction is critical to opinion summarization, because its effectiveness significantly affects the identification of semantic relationships. The polarity and numeric score of all the features are determined by Senti-WordNet Lexicon. The problem of opinion summarization refers how to relate the opinion words with respect to a certain feature. Probabilistic based model of supervised learning will improve the result that is more flexible and effective.

Keywords: opinion mining, opinion summarization, sentiment analysis, text mining

Procedia PDF Downloads 332
25228 Constraining the Potential Nickel Laterite Area Using Geographic Information System-Based Multi-Criteria Rating in Surigao Del Sur

Authors: Reiner-Ace P. Mateo, Vince Paolo F. Obille

Abstract:

The traditional method of classifying the potential mineral resources requires a significant amount of time and money. In this paper, an alternative way to classify potential mineral resources with GIS application in Surigao del Sur. The three (3) analog map data inputs integrated to GIS are geologic map, topographic map, and land cover/vegetation map. The indicators used in the classification of potential nickel laterite integrated from the analog map data inputs are a geologic indicator, which is the presence of ultramafic rock from the geologic map; slope indicator and the presence of plateau edges from the topographic map; areas of forest land, grassland, and shrublands from the land cover/vegetation map. The potential mineral of the area was classified from low up to very high potential. The produced mineral potential classification map of Surigao del Sur has an estimated 4.63% low nickel laterite potential, 42.15% medium nickel laterite potential, 43.34% high nickel laterite potential, and 9.88% very high nickel laterite from its ultramafic terrains. For the validation of the produced map, it was compared with known occurrences of nickel laterite in the area using a nickel mining tenement map from the area with the application of remote sensing. Three (3) prominent nickel mining companies were delineated in the study area. The generated potential classification map of nickel-laterite in Surigao Del Sur may be of aid to the mining companies which are currently in the exploration phase in the study area. Also, the currently operating nickel mines in the study area can help to validate the reliability of the mineral classification map produced.

Keywords: mineral potential classification, nickel laterites, GIS, remote sensing, Surigao del Sur

Procedia PDF Downloads 123
25227 Applying Sequential Pattern Mining to Generate Block for Scheduling Problems

Authors: Meng-Hui Chen, Chen-Yu Kao, Chia-Yu Hsu, Pei-Chann Chang

Abstract:

The main idea in this paper is using sequential pattern mining to find the information which is helpful for finding high performance solutions. By combining this information, it is defined as blocks. Using the blocks to generate artificial chromosomes (ACs) could improve the structure of solutions. Estimation of Distribution Algorithms (EDAs) is adapted to solve the combinatorial problems. Nevertheless many of these approaches are advantageous for this application, but only some of them are used to enhance the efficiency of application. Generating ACs uses patterns and EDAs could increase the diversity. According to the experimental result, the algorithm which we proposed has a better performance to solve the permutation flow-shop problems.

Keywords: combinatorial problems, sequential pattern mining, estimationof distribution algorithms, artificial chromosomes

Procedia PDF Downloads 611
25226 QoS-CBMG: A Model for e-Commerce Customer Behavior

Authors: Hoda Ghavamipoor, S. Alireza Hashemi Golpayegani

Abstract:

An approach to model the customer interaction with e-commerce websites is presented. Considering the service quality level as a predictive feature, we offer an improved method based on the Customer Behavior Model Graph (CBMG), a state-transition graph model. To derive the Quality of Service sensitive-CBMG (QoS-CBMG) model, process-mining techniques is applied to pre-processed website server logs which are categorized as ‘buy’ or ‘visit’. Experimental results on an e-commerce website data confirmed that the proposed method outperforms CBMG based method.

Keywords: customer behavior model, electronic commerce, quality of service, customer behavior model graph, process mining

Procedia PDF Downloads 416
25225 Text Mining Techniques for Prioritizing Pathogenic Mutations in Protein Families Known to Misfold or Aggregate

Authors: Khaleel Saleh Al-Rababah

Abstract:

Amyloid fibril forming regions, which are known as protein aggregates, in sequences of some protein families are associated with a number of diseases known as amyloidosis. Mutations play a role in forming fibrils by accelerating the fibril formation process. In this paper we want to extract diseases that caused by those mutations as a result of the impact of the mutations on structural and functional properties of the aggregated protein. We propose a text mining system, to automatically extract mutations, diseases and relations between mutations and diseases. We presented an algorithm based on finite state to cluster mutations found in the same sentence as a sentence could contain different mutation cause different diseases. Also, we presented a co reference algorithm that enables cross-link sentences.

Keywords: amyloid, amyloidosis, co reference, protein, text mining

Procedia PDF Downloads 524