Search results for: Web Usage Mining.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1156

Search results for: Web Usage Mining.

946 Parallel and Distributed Mining of Association Rule on Knowledge Grid

Authors: U. Sakthi, R. Hemalatha, R. S. Bhuvaneswaran

Abstract:

In Virtual organization, Knowledge Discovery (KD) service contains distributed data resources and computing grid nodes. Computational grid is integrated with data grid to form Knowledge Grid, which implements Apriori algorithm for mining association rule on grid network. This paper describes development of parallel and distributed version of Apriori algorithm on Globus Toolkit using Message Passing Interface extended with Grid Services (MPICHG2). The creation of Knowledge Grid on top of data and computational grid is to support decision making in real time applications. In this paper, the case study describes design and implementation of local and global mining of frequent item sets. The experiments were conducted on different configurations of grid network and computation time was recorded for each operation. We analyzed our result with various grid configurations and it shows speedup of computation time is almost superlinear.

Keywords: Association rule, Grid computing, Knowledge grid, Mobility prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2182
945 Analysis of Sequence Moves in Successful Chess Openings Using Data Mining with Association Rules

Authors: R.M.Rani

Abstract:

Chess is one of the indoor games, which improves the level of human confidence, concentration, planning skills and knowledge. The main objective of this paper is to help the chess players to improve their chess openings using data mining techniques. Budding Chess Players usually do practices by analyzing various existing openings. When they analyze and correlate thousands of openings it becomes tedious and complex for them. The work done in this paper is to analyze the best lines of Blackmar- Diemer Gambit(BDG) which opens with White D4... using data mining analysis. It is carried out on the collection of winning games by applying association rules. The first step of this analysis is assigning variables to each different sequence moves. In the second step, the sequence association rules were generated to calculate support and confidence factor which help us to find the best subsequence chess moves that may lead to winning position.

Keywords: Blackmar-Diemer Gambit(BDG), Confidence, sequence Association Rules, Support.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3094
944 An Educational Data Mining System for Advising Higher Education Students

Authors: Heba Mohammed Nagy, Walid Mohamed Aly, Osama Fathy Hegazy

Abstract:

Educational  data mining  is  a  specific  data   mining field applied to data originating from educational environments, it relies on different  approaches to discover hidden knowledge  from  the  available   data. Among these approaches are   machine   learning techniques which are used to build a system that acquires learning from previous data. Machine learning can be applied to solve different regression, classification, clustering and optimization problems.

In  our  research, we propose  a “Student  Advisory  Framework” that  utilizes  classification  and  clustering  to  build  an  intelligent system. This system can be used to provide pieces of consultations to a first year  university  student to  pursue a  certain   education   track   where  he/she  will  likely  succeed  in, aiming  to  decrease   the  high  rate   of  academic  failure   among these  students.  A real case study  in Cairo  Higher  Institute  for Engineering, Computer  Science  and  Management  is  presented using  real  dataset   collected  from  2000−2012.The dataset has two main components: pre-higher education dataset and first year courses results dataset. Results have proved the efficiency of the suggested framework.

Keywords: Classification, Clustering, Educational Data Mining (EDM), Machine Learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5215
943 Using Data Mining Technique for Scholarship Disbursement

Authors: J. K. Alhassan, S. A. Lawal

Abstract:

This work is on decision tree-based classification for the disbursement of scholarship. Tree-based data mining classification technique is used in other to determine the generic rule to be used to disburse the scholarship. The system based on the defined rules from the tree is able to determine the class (status) to which an applicant shall belong whether Granted or Not Granted. The applicants that fall to the class of granted denote a successful acquirement of scholarship while those in not granted class are unsuccessful in the scheme. An algorithm that can be used to classify the applicants based on the rules from tree-based classification was also developed. The tree-based classification is adopted because of its efficiency, effectiveness, and easy to comprehend features. The system was tested with the data of National Information Technology Development Agency (NITDA) Abuja, a Parastatal of Federal Ministry of Communication Technology that is mandated to develop and regulate information technology in Nigeria. The system was found working according to the specification. It is therefore recommended for all scholarship disbursement organizations.

Keywords: Decision tree, classification, data mining, scholarship.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2160
942 Post ERP Feral System and use of ‘Feral System as Coping Mechanism

Authors: Tajul Urus, S., Molla, A., Teoh, S.Y.

Abstract:

A number of studies highlighted problems related to ERP systems, yet, most of these studies focus on the problems during the project and implementation stages but not during the postimplementation use process. Problems encountered in the process of using ERP would hinder the effective exploitation and the extended and continued use of ERP systems and their value to organisations. This paper investigates the different types of problems users (operational, supervisory and managerial) faced in using ERP and how 'feral system' is used as the coping mechanism. The paper adopts a qualitative method and uses data collected from two cases and 26 interviews, to inductively develop a casual network model of ERP usage problem and its coping mechanism. This model classified post ERP usage problems as data quality, system quality, interface and infrastructure. The model is also categorised the different coping mechanism through use of 'feral system' inclusive of feral information system, feral data and feral use of technology.

Keywords: Case Studies, Coping Mechanism, Post Implementation ERP system, Usage Problem

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1510
941 Text Mining of Twitter Data Using a Latent Dirichlet Allocation Topic Model and Sentiment Analysis

Authors: Sidi Yang, Haiyi Zhang

Abstract:

Twitter is a microblogging platform, where millions of users daily share their attitudes, views, and opinions. Using a probabilistic Latent Dirichlet Allocation (LDA) topic model to discern the most popular topics in the Twitter data is an effective way to analyze a large set of tweets to find a set of topics in a computationally efficient manner. Sentiment analysis provides an effective method to show the emotions and sentiments found in each tweet and an efficient way to summarize the results in a manner that is clearly understood. The primary goal of this paper is to explore text mining, extract and analyze useful information from unstructured text using two approaches: LDA topic modelling and sentiment analysis by examining Twitter plain text data in English. These two methods allow people to dig data more effectively and efficiently. LDA topic model and sentiment analysis can also be applied to provide insight views in business and scientific fields.

Keywords: Text mining, Twitter, topic model, sentiment analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1811
940 Using Textual Pre-Processing and Text Mining to Create Semantic Links

Authors: Ricardo Avila, Gabriel Lopes, Vania Vidal, Jose Macedo

Abstract:

This article offers a approach to the automatic discovery of semantic concepts and links in the domain of Oil Exploration and Production (E&P). Machine learning methods combined with textual pre-processing techniques were used to detect local patterns in texts and, thus, generate new concepts and new semantic links. Even using more specific vocabularies within the oil domain, our approach has achieved satisfactory results, suggesting that the proposal can be applied in other domains and languages, requiring only minor adjustments.

Keywords: Semantic links, data mining, linked data, SKOS.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1064
939 The Relevance of Data Warehousing and Data Mining in the Field of Evidence-based Medicine to Support Healthcare Decision Making

Authors: Nevena Stolba, A Min Tjoa

Abstract:

Evidence-based medicine is a new direction in modern healthcare. Its task is to prevent, diagnose and medicate diseases using medical evidence. Medical data about a large patient population is analyzed to perform healthcare management and medical research. In order to obtain the best evidence for a given disease, external clinical expertise as well as internal clinical experience must be available to the healthcare practitioners at right time and in the right manner. External evidence-based knowledge can not be applied directly to the patient without adjusting it to the patient-s health condition. We propose a data warehouse based approach as a suitable solution for the integration of external evidence-based data sources into the existing clinical information system and data mining techniques for finding appropriate therapy for a given patient and a given disease. Through integration of data warehousing, OLAP and data mining techniques in the healthcare area, an easy to use decision support platform, which supports decision making process of care givers and clinical managers, is built. We present three case studies, which show, that a clinical data warehouse that facilitates evidence-based medicine is a reliable, powerful and user-friendly platform for strategic decision making, which has a great relevance for the practice and acceptance of evidence-based medicine.

Keywords: data mining, data warehousing, decision-support systems, evidence-based medicine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3813
938 Application of Advanced Remote Sensing Data in Mineral Exploration in the Vicinity of Heavy Dense Forest Cover Area of Jharkhand and Odisha State Mining Area

Authors: Hemant Kumar, R. N. K. Sharma, A. P. Krishna

Abstract:

The study has been carried out on the Saranda in Jharkhand and a part of Odisha state. Geospatial data of Hyperion, a remote sensing satellite, have been used. This study has used a wide variety of patterns related to image processing to enhance and extract the mining class of Fe and Mn ores.Landsat-8, OLI sensor data have also been used to correctly explore related minerals. In this way, various processes have been applied to increase the mineralogy class and comparative evaluation with related frequency done. The Hyperion dataset for hyperspectral remote sensing has been specifically verified as an effective tool for mineral or rock information extraction within the band range of shortwave infrared used. The abundant spatial and spectral information contained in hyperspectral images enables the differentiation of different objects of any object into targeted applications for exploration such as exploration detection, mining.

Keywords: Hyperion, hyperspectral, sensor, Landsat-8.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 624
937 Chances and Challenges of Intelligent Technologies in the Production and Retail Sector

Authors: Carsten Röcker

Abstract:

This paper provides an introduction into the evolution of information and communication technology and illustrates its usage in the work domain. The paper is sub-divided into two parts. The first part gives an overview over the different phases of information processing in the work domain. It starts by charting the past and present usage of computers in work environments and shows current technological trends, which are likely to influence future business applications. The second part starts by briefly describing, how the usage of computers changed business processes in the past, and presents first Ambient Intelligence applications based on identification and localization information, which are already used in the production and retail sector. Based on current systems and prototype applications, the paper gives an outlook of how Ambient Intelligence technologies could change business processes in the future.

Keywords: Ambient Intelligence, Ubiquitous Computing, Business Applications, Radio Frequency Identification (RFID).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1919
936 Ambient Intelligence in the Production and Retail Sector: Emerging Opportunities and Potential Pitfalls

Authors: Carsten Röcker

Abstract:

This paper provides an introduction into the evolution of information and communication technology and illustrates its usage in the work domain. The paper is sub-divided into two parts. The first part gives an overview over the different phases of information processing in the work domain. It starts by charting the past and present usage of computers in work environments and shows current technological trends, which are likely to influence future business applications. The second part starts by briefly describing, how the usage of computers changed business processes in the past, and presents first Ambient Intelligence applications based on identification and localization information, which are already used in the production and retail sector. Based on current systems and prototype applications, the paper gives an outlook of how Ambient Intelligence technologies could change business processes in the future.

Keywords: Ambient Intelligence, Ubiquitous Computing, Business Applications, Radio Frequency Identification (RFID)

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1864
935 Multidimensional and Data Mining Analysis for Property Investment Risk Analysis

Authors: Nur Atiqah Rochin Demong, Jie Lu, Farookh Khadeer Hussain

Abstract:

Property investment in the real estate industry has a high risk due to the uncertainty factors that will affect the decisions made and high cost. Analytic hierarchy process has existed for some time in which referred to an expert-s opinion to measure the uncertainty of the risk factors for the risk analysis. Therefore, different level of experts- experiences will create different opinion and lead to the conflict among the experts in the field. The objective of this paper is to propose a new technique to measure the uncertainty of the risk factors based on multidimensional data model and data mining techniques as deterministic approach. The propose technique consist of a basic framework which includes four modules: user, technology, end-user access tools and applications. The property investment risk analysis defines as a micro level analysis as the features of the property will be considered in the analysis in this paper.

Keywords: Uncertainty factors, data mining, multidimensional data model, risk analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2923
934 What the Future Holds for Social Media Data Analysis

Authors: P. Wlodarczak, J. Soar, M. Ally

Abstract:

The dramatic rise in the use of Social Media (SM) platforms such as Facebook and Twitter provide access to an unprecedented amount of user data. Users may post reviews on products and services they bought, write about their interests, share ideas or give their opinions and views on political issues. There is a growing interest in the analysis of SM data from organisations for detecting new trends, obtaining user opinions on their products and services or finding out about their online reputations. A recent research trend in SM analysis is making predictions based on sentiment analysis of SM. Often indicators of historic SM data are represented as time series and correlated with a variety of real world phenomena like the outcome of elections, the development of financial indicators, box office revenue and disease outbreaks. This paper examines the current state of research in the area of SM mining and predictive analysis and gives an overview of the analysis methods using opinion mining and machine learning techniques.

Keywords: Social Media, text mining, knowledge discovery, predictive analysis, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3849
933 Customer Need Type Classification Model using Data Mining Techniques for Recommender Systems

Authors: Kyoung-jae Kim

Abstract:

Recommender systems are usually regarded as an important marketing tool in the e-commerce. They use important information about users to facilitate accurate recommendation. The information includes user context such as location, time and interest for personalization of mobile users. We can easily collect information about location and time because mobile devices communicate with the base station of the service provider. However, information about user interest can-t be easily collected because user interest can not be captured automatically without user-s approval process. User interest usually represented as a need. In this study, we classify needs into two types according to prior research. This study investigates the usefulness of data mining techniques for classifying user need type for recommendation systems. We employ several data mining techniques including artificial neural networks, decision trees, case-based reasoning, and multivariate discriminant analysis. Experimental results show that CHAID algorithm outperforms other models for classifying user need type. This study performs McNemar test to examine the statistical significance of the differences of classification results. The results of McNemar test also show that CHAID performs better than the other models with statistical significance.

Keywords: Customer need type, Data mining techniques, Recommender system, Personalization, Mobile user.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2146
932 Idiopathic Constipation can be Subdivided in Clinical Subtypes: Data Mining by Cluster Analysis on a Population based Study

Authors: Mauro Giacomini, Stefania Bertone, Carlo Mansi, Pietro Dulbecco, Vincenzo Savarino

Abstract:

The prevalence of non organic constipation differs from country to country and the reliability of the estimate rates is uncertain. Moreover, the clinical relevance of subdividing the heterogeneous functional constipation disorders into pre-defined subgroups is largely unknown.. Aim: to estimate the prevalence of constipation in a population-based sample and determine whether clinical subgroups can be identified. An age and gender stratified sample population from 5 Italian cities was evaluated using a previously validated questionnaire. Data mining by cluster analysis was used to determine constipation subgroups. Results: 1,500 complete interviews were obtained from 2,083 contacted households (72%). Self-reported constipation correlated poorly with symptombased constipation found in 496 subjects (33.1%). Cluster analysis identified four constipation subgroups which correlated to subgroups identified according to pre-defined symptom criteria. Significant differences in socio-demographics and lifestyle were observed among subgroups.

Keywords: Cluster analysis, constipation, data mining, statistical analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1298
931 Developing an Advanced Algorithm Capable of Classifying News, Articles and Other Textual Documents Using Text Mining Techniques

Authors: R. B. Knudsen, O. T. Rasmussen, R. A. Alphinas

Abstract:

The reason for conducting this research is to develop an algorithm that is capable of classifying news articles from the automobile industry, according to the competitive actions that they entail, with the use of Text Mining (TM) methods. It is needed to test how to properly preprocess the data for this research by preparing pipelines which fits each algorithm the best. The pipelines are tested along with nine different classification algorithms in the realm of regression, support vector machines, and neural networks. Preliminary testing for identifying the optimal pipelines and algorithms resulted in the selection of two algorithms with two different pipelines. The two algorithms are Logistic Regression (LR) and Artificial Neural Network (ANN). These algorithms are optimized further, where several parameters of each algorithm are tested. The best result is achieved with the ANN. The final model yields an accuracy of 0.79, a precision of 0.80, a recall of 0.78, and an F1 score of 0.76. By removing three of the classes that created noise, the final algorithm is capable of reaching an accuracy of 94%.

Keywords: Artificial neural network, competitive dynamics, logistic regression, text classification, text mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 537
930 Probabilistic Approach as a Method Used in the Solution of Engineering Design for Biomechanics and Mining

Authors: Karel Frydrýšek

Abstract:

This paper focuses on the probabilistic numerical solution of the problems in biomechanics and mining. Applications of Simulation-Based Reliability Assessment (SBRA) Method are presented in the solution of designing of the external fixators applied in traumatology and orthopaedics (these fixators can be applied for the treatment of open and unstable fractures etc.) and in the solution of a hard rock (ore) disintegration process (i.e. the bit moves into the ore and subsequently disintegrates it, the results are compared with experiments, new design of excavation tool is proposed.

Keywords: probabilistic approach, engineering design, traumatology, rock mechanics

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1480
929 Knowledge Discovery Techniques for Talent Forecasting in Human Resource Application

Authors: Hamidah Jantan, Abdul Razak Hamdan, Zulaiha Ali Othman

Abstract:

Human Resource (HR) applications can be used to provide fair and consistent decisions, and to improve the effectiveness of decision making processes. Besides that, among the challenge for HR professionals is to manage organization talents, especially to ensure the right person for the right job at the right time. For that reason, in this article, we attempt to describe the potential to implement one of the talent management tasks i.e. identifying existing talent by predicting their performance as one of HR application for talent management. This study suggests the potential HR system architecture for talent forecasting by using past experience knowledge known as Knowledge Discovery in Database (KDD) or Data Mining. This article consists of three main parts; the first part deals with the overview of HR applications, the prediction techniques and application, the general view of Data mining and the basic concept of talent management in HRM. The second part is to understand the use of Data Mining technique in order to solve one of the talent management tasks, and the third part is to propose the potential HR system architecture for talent forecasting.

Keywords: HR Application, Knowledge Discovery inDatabase (KDD), Talent Forecasting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4482
928 Building an Integrated Relational Database from Swiss Nutrition National Survey and Swiss Health Datasets for Data Mining Purposes

Authors: Ilona Mewes, Helena Jenzer, Farshideh Einsele

Abstract:

Objective: The objective of the study was to integrate two big databases from Swiss nutrition national survey (menuCH) and Swiss health national survey 2012 for data mining purposes. Each database has a demographic base data. An integrated Swiss database is built to later discover critical food consumption patterns linked with lifestyle diseases known to be strongly tied with food consumption. Design: Swiss nutrition national survey (menuCH) with approx. 2000 respondents from two different surveys, one by Phone and the other by questionnaire along with Swiss health national survey 2012 with 21500 respondents were pre-processed, cleaned and finally integrated to a unique relational database. Results: The result of this study is an integrated relational database from the Swiss nutritional and health databases.

Keywords: Health informatics, data mining, nutritional and health databases, nutritional and chronical databases.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1728
927 Semantic Web as an Enabling Technology for Better e-Services Addoption

Authors: Luka Pavlič, Marjan Heričko

Abstract:

E-services have significantly changed the way of doing business in recent years. We can, however, observe poor use of these services. There is a large gap between supply and actual eservices usage. This is why we started a project to provide an environment that will encourage the use of e-services. We believe that only providing e-service does not automatically mean consumers would use them. This paper shows the origins of our project and its current position. We discuss the decision of using semantic web technologies and their potential to improve e-services usage. We also present current knowledge base and its real-world classification. In the paper, we discuss further work to be done in the project. Current state of the project is promising.

Keywords: E-Services, E-Services Repository, Ontologies, Semantic Web

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1400
926 The Relationship between Depression Interpersonal Communication and Media Using Among International Students

Authors: Birol Gülnar, Hacer Aker

Abstract:

Student-s movements have been going increasing in last decades. International students can have different psychological and sociological problems in their adaptation process. Depression is one of the most important problems in this procedure. This research purposed to reveal level of foreign students- depression, kinds of interpersonal communication networks (host/ethnic interpersonal communication) and media usage (host/ethnic media usage). Additionally study aimed to display the relationship between depression and communication (host/ethnic interpersonal communication and host/ethnic media usage) among foreign university students. A field research was performed among 283 foreign university students who have been attending 8 different universities in Turkey. A purposeful sampling technique was used in this research cause of data collect facilities. Results indicated that 58.3% of foreign students- depression stage was “intermediate" while 33.2% of foreign students- depression level was “low". Add to this, host interpersonal communication behaviors and Turkish web sites usages were negatively and significantly correlated with depression.

Keywords: International students, depression, interpersonal communication behaviors, media using.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2888
925 Attribute Selection Methods Comparison for Classification of Diffuse Large B-Cell Lymphoma

Authors: Helyane Bronoski Borges, Júlio Cesar Nievola

Abstract:

The most important subtype of non-Hodgkin-s lymphoma is the Diffuse Large B-Cell Lymphoma. Approximately 40% of the patients suffering from it respond well to therapy, whereas the remainder needs a more aggressive treatment, in order to better their chances of survival. Data Mining techniques have helped to identify the class of the lymphoma in an efficient manner. Despite that, thousands of genes should be processed to obtain the results. This paper presents a comparison of the use of various attribute selection methods aiming to reduce the number of genes to be searched, looking for a more effective procedure as a whole.

Keywords: Attribute selection, data mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1417
924 Online Forums Hotspot Detection and Analysis Using Aging Theory

Authors: K. Nirmala Devi, V. Murali Bhaskaran

Abstract:

The exponential growth of social media arouses much attention on public opinion information. The online forums, blogs, micro blogs are proving to be extremely valuable resources and are having bulk volume of information. However, most of the social media data is unstructured and semi structured form. So that it is more difficult to decipher automatically. Therefore, it is very much essential to understand and analyze those data for making a right decision. The online forums hotspot detection is a promising research field in the web mining and it guides to motivate the user to take right decision in right time. The proposed system consist of a novel approach to detect a hotspot forum for any given time period. It uses aging theory to find the hot terms and E-K-means for detecting the hotspot forum. Experimental results demonstrate that the proposed approach outperforms k-means for detecting the hotspot forums with the improved accuracy.

Keywords: Hotspot forums, Micro blog, Blog, Sentiment Analysis, Opinion Mining, Social media, Twitter, Web mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2185
923 Enhance the Power of Sentiment Analysis

Authors: Yu Zhang, Pedro Desouza

Abstract:

Since big data has become substantially more accessible and manageable due to the development of powerful tools for dealing with unstructured data, people are eager to mine information from social media resources that could not be handled in the past. Sentiment analysis, as a novel branch of text mining, has in the last decade become increasingly important in marketing analysis, customer risk prediction and other fields. Scientists and researchers have undertaken significant work in creating and improving their sentiment models. In this paper, we present a concept of selecting appropriate classifiers based on the features and qualities of data sources by comparing the performances of five classifiers with three popular social media data sources: Twitter, Amazon Customer Reviews, and Movie Reviews. We introduced a couple of innovative models that outperform traditional sentiment classifiers for these data sources, and provide insights on how to further improve the predictive power of sentiment analysis. The modeling and testing work was done in R and Greenplum in-database analytic tools.

Keywords: Sentiment Analysis, Social Media, Twitter, Amazon, Data Mining, Machine Learning, Text Mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3518
922 A Novel Approach to Avoid Billing Attack on VOIP System

Authors: Narendra M. Shekokar, Satish R. Devane

Abstract:

In a recent year usage of VoIP subscription has increased tremendously as compare to Public Switching Telephone System(PSTN). A VoIP subscriber would like to know the exact tariffs of the calls made using VoIP. As the usage increases, the rate of fraud is also increases, causing users complain about excess billing. This in turn hampers the growth of VoIP .This paper describe the common frauds and attack on VoIP based system and make an attempt to solve the billing attack by creating secured channel between caller and callee.

Keywords: VoIP, Billing-fraud, SSL/TLS, MITM, Replay-attack.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1653
921 Total and Leachable Concentration of Trace Elements in Soil towards Human Health Risk, Related with Coal Mine in Jorong, South Kalimantan, Indonesia

Authors: Arie Pujiwati, Kengo Nakamura, Noriaki Watanabe, Takeshi Komai

Abstract:

Coal mining is well known to cause considerable environmental impacts, including trace element contamination of soil. This study aimed to assess the trace element (As, Cd, Co, Cu, Ni, Pb, Sb, and Zn) contamination of soil in the vicinity of coal mining activities, using the case study of Asam-asam River basin, South Kalimantan, Indonesia, and to assess the human health risk, incorporating total and bioavailable (water-leachable and acid-leachable) concentrations. The results show the enrichment of As and Co in soil, surpassing the background soil value. Contamination was evaluated based on the index of geo-accumulation, Igeo and the pollution index, PI. Igeo values showed that the soil was generally uncontaminated (Igeo ≤ 0), except for elevated As and Co. Mean PI for Ni and Cu indicated slight contamination. Regarding the assessment of health risks, the Hazard Index, HI showed adverse risks (HI > 1) for Ni, Co, and As. Further, Ni and As were found to pose unacceptable carcinogenic risk (risk > 1.10-5). Farming, settlement, and plantation were found to present greater risk than coal mines. These results show that coal mining activity in the study area contaminates the soils by particular elements and may pose potential human health risk in its surrounding area. This study is important for setting appropriate countermeasure actions and improving basic coal mining management in Indonesia.

Keywords: Coal mine, risk, soil, trace elements.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1175
920 The Power of Indigenous Peoples in Decision-Making Processes of Mining Projects: The Pilbara Region

Authors: K. N. Penna, J. P. English

Abstract:

The destruction of the Juukan Gorge rock shelters in 2020 has catalysed impetus within Australian society for a significant change in engagement with Indigenous Peoples, and the approach to Indigenous cultural heritage, both within the Pilbara region and more broadly across Australia. Culture-based and people-centred approaches are inherent to inclusive sustainable development and Free, Prior, Informed Consent, outcomes encouraged by international and local recommendations on the human rights and cultural heritage preservation of Indigenous peoples. In this paper, we present an interpretive model of an evolved process for mining project development, incorporating culture-based and people-centred approaches, based on the Theory U system change method. The evolved process advocates a change in organisational mindset and culture, and a comprehensive understanding of Indigenous Peoples’ culture and values, as the foundations for increasing their influence and achieving mutually beneficial developments.

Keywords: Indigenous Engagement, mining industry, culture-based approach, people-centred approach, Theory U.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 440
919 Comparative Study of Universities’ Web Structure Mining

Authors: Z. Abdullah, A. R. Hamdan

Abstract:

This paper is meant to analyze the ranking of University of Malaysia Terengganu, UMT’s website in the World Wide Web. There are only few researches have been done on comparing the ranking of universities’ websites so this research will be able to determine whether the existing UMT’s website is serving its purpose which is to introduce UMT to the world. The ranking is based on hub and authority values which are accordance to the structure of the website. These values are computed using two websearching algorithms, HITS and SALSA. Three other universities’ websites are used as the benchmarks which are UM, Harvard and Stanford. The result is clearly showing that more work has to be done on the existing UMT’s website where important pages according to the benchmarks, do not exist in UMT’s pages. The ranking of UMT’s website will act as a guideline for the web-developer to develop a more efficient website.

Keywords: Algorithm, ranking, website, web structure mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1667
918 A Testbed for the Experiments Performed in Missing Value Treatments

Authors: Dias de J. C. Lilian, Lobato M. F. Fábio, de Santana L. Ádamo

Abstract:

The occurrence of missing values in database is a serious problem for Data Mining tasks, responsible for degrading data quality and accuracy of analyses. In this context, the area has shown a lack of standardization for experiments to treat missing values, introducing difficulties to the evaluation process among different researches due to the absence in the use of common parameters. This paper proposes a testbed intended to facilitate the experiments implementation and provide unbiased parameters using available datasets and suited performance metrics in order to optimize the evaluation and comparison between the state of art missing values treatments.

Keywords: Data imputation, data mining, missing values treatment, testbed.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1513
917 Data Mining Techniques in Computer-Aided Diagnosis: Non-Invasive Cancer Detection

Authors: Florin Gorunescu

Abstract:

Diagnosis can be achieved by building a model of a certain organ under surveillance and comparing it with the real time physiological measurements taken from the patient. This paper deals with the presentation of the benefits of using Data Mining techniques in the computer-aided diagnosis (CAD), focusing on the cancer detection, in order to help doctors to make optimal decisions quickly and accurately. In the field of the noninvasive diagnosis techniques, the endoscopic ultrasound elastography (EUSE) is a recent elasticity imaging technique, allowing characterizing the difference between malignant and benign tumors. Digitalizing and summarizing the main EUSE sample movies features in a vector form concern with the use of the exploratory data analysis (EDA). Neural networks are then trained on the corresponding EUSE sample movies vector input in such a way that these intelligent systems are able to offer a very precise and objective diagnosis, discriminating between benign and malignant tumors. A concrete application of these Data Mining techniques illustrates the suitability and the reliability of this methodology in CAD.

Keywords: Endoscopic ultrasound elastography, exploratorydata analysis, neural networks, non-invasive cancer detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1868