Search results for: Text mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1054

Search results for: Text mining

664 Feature Selection Approaches with Missing Values Handling for Data Mining - A Case Study of Heart Failure Dataset

Authors: N.Poolsawad, C.Kambhampati, J. G. F. Cleland

Abstract:

In this paper, we investigated the characteristic of a clinical dataseton the feature selection and classification measurements which deal with missing values problem.And also posed the appropriated techniques to achieve the aim of the activity; in this research aims to find features that have high effect to mortality and mortality time frame. We quantify the complexity of a clinical dataset. According to the complexity of the dataset, we proposed the data mining processto cope their complexity; missing values, high dimensionality, and the prediction problem by using the methods of missing value replacement, feature selection, and classification.The experimental results will extend to develop the prediction model for cardiology.

Keywords: feature selection, missing values, classification, clinical dataset, heart failure.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3173
663 Seasonal Variation of the Impact of Mining Activities on Ga-Selati River in Limpopo Province, South Africa

Authors: Joshua N. Edokpayi, John O. Odiyo, Patience P. Shikwambana

Abstract:

Water is a very rare natural resource in South Africa. Ga-Selati River is used for both domestic and industrial purposes. This study was carried out in order to assess the quality of Ga-Selati River in a mining area of Limpopo Province-Phalaborwa. The pH, Electrical Conductivity (EC) and Total Dissolved Solids (TDS) were determined using a Crinson multimeter while turbidity was measured using a Labcon Turbidimeter. The concentrations of Al, Ca, Cd, Cr, Fe, K, Mg, Mn, Na and Pb were analysed in triplicate using a Varian 520 flame atomic absorption spectrometer (AAS) supplied by PerkinElmer, after acid digestion with nitric acid in a fume cupboard. The average pH of the river from eight different sampling sites was 8.00 and 9.38 in wet and dry season respectively. Higher EC values were determined in the dry season (138.7 mS/m) than in the wet season (96.93 mS/m). Similarly, TDS values were higher in dry (929.29 mg/L) than in the wet season (640.72 mg/L) season. These values exceeded the recommended guideline of South Africa Department of Water Affairs and Forestry (DWAF) for domestic water use (70 mS/m) and that of the World Health Organization (WHO) (600 mS/m), respectively. Turbidity varied between 1.78-5.20 and 0.95-2.37 NTU in both wet and dry seasons. Total hardness of 312.50 mg/L and 297.75 mg/L as the concentration of CaCO3 was computed for the river in both the wet and the dry seasons and the river water was categorised as very hard. Mean concentration of the metals studied in both the wet and the dry seasons are: Na (94.06 mg/L and 196.3 mg/L), K (11.79 mg/L and 13.62 mg/L), Ca (45.60 mg/L and 41.30 mg/L), Mg (48.41 mg/L and 44.71 mg/L), Al (0.31 mg/L and 0.38 mg/L), Cd (0.01 mg/L and 0.01 mg/L), Cr (0.02 mg/L and 0.09 mg/L), Pb (0.05 mg/L and 0.06 mg/L), Mn (0.31 mg/L and 0.11 mg/L) and Fe (0.76 mg/L and 0.69 mg/L). Results from this study reveal that most of the metals were present in concentrations higher than the recommended guidelines of DWAF and WHO for domestic use and the protection of aquatic life.

Keywords: Contamination, mining activities, surface water, trace metals.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1925
662 Evaluation of Energy and Environmental Aspects of Reduced Tillage Systems Applied in Maize Cultivation

Authors: E. Sarauskis, L. Masilionyte, Z. Kriauciuniene, K. Romaneckas, S. Buragiene

Abstract:

In maize growing technologies, tillage technological operations are the most time-consuming and require the greatest fuel input. Substitution of conventional tillage, involving deep ploughing, by other reduced tillage methods can reduce technological production costs, diminish soil degradation and environmental pollution from greenhouse gas emissions, as well as improve economic competitiveness of agricultural produce.

Experiments designed to assess energy and environmental aspects associated with different reduced tillage systems, applied in maize cultivation were conducted at Aleksandras Stulginskis University taking into account Lithuania’s economic and climate conditions. The study involved 5 tillage treatments: deep ploughing (DP, control), shallow ploughing (SP), deep cultivation (DC), shallow cultivation (SC) and no-tillage (NT).

Our experimental evidence suggests that with the application of reduced tillage systems it is feasible to reduce fuel consumption by 13-58% and working time input by 8.4% to nearly 3-fold, to reduce the cost price of maize cultivation technological operations, decrease environmental pollution with CO2 gas by 30 to 146 kg ha-1, compared with the deep ploughing.

Keywords: Reduced tillage, energy and environmental assessment, fuel consumption, CO2 emission, maize.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2048
661 Improved Safety Science: Utilizing a Design Hierarchy

Authors: Ulrica Pettersson

Abstract:

Collection of information on incidents is regularly done through pre-printed incident report forms. These tend to be incomplete and frequently lack essential information. ne consequence is that reports with inadequate information, that do not fulfil analysts’ requirements, are transferred into the analysis process. To improve an incident reporting form, theory in design science, witness psychology and interview and questionnaire research has been used. Previously three experiments have been conducted to evaluate the form and shown significant improved results. The form has proved to capture knowledge, regardless of the incidents’ character or context. The aim in this paper is to describe how design science, in more detail a design hierarchy can be used to construct a collection form for improvements in safety science.

Keywords: Design science, data collection, form, incident report, safety science.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 798
660 Efficient Implementation of Serial and Parallel Support Vector Machine Training with a Multi-Parameter Kernel for Large-Scale Data Mining

Authors: Tatjana Eitrich, Bruno Lang

Abstract:

This work deals with aspects of support vector learning for large-scale data mining tasks. Based on a decomposition algorithm that can be run in serial and parallel mode we introduce a data transformation that allows for the usage of an expensive generalized kernel without additional costs. In order to speed up the decomposition algorithm we analyze the problem of working set selection for large data sets and analyze the influence of the working set sizes onto the scalability of the parallel decomposition scheme. Our modifications and settings lead to improvement of support vector learning performance and thus allow using extensive parameter search methods to optimize classification accuracy.

Keywords: Support Vector Machines, Shared Memory Parallel Computing, Large Data

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1544
659 Non-Homogeneous Layered Fiber Reinforced Concrete

Authors: Vitalijs Lusis, Andrejs Krasnikovs

Abstract:

Fiber reinforced concrete is important material for load bearing structural elements. Usually fibers are homogeneously distributed in a concrete body having arbitrary spatial orientations. At the same time, in many situations, fiber concrete with oriented fibers is more optimal. Is obvious, that is possible to create constructions with oriented short fibers in them, in different ways. Present research is devoted to one of such approaches- fiber reinforced concrete prisms having dimensions 100mm ×100mm ×400mmwith layers of non-homogeneously distributed fibers inside them were fabricated.

Simultaneously prisms with homogeneously dispersed fibers were produced for reference as well. Prisms were tested under four point bending conditions. During the tests vertical deflection at the center of every prism and crack opening were measured (using linear displacements transducers in real timescale). Prediction results were discussed.

Keywords: Fiber reinforced concrete, 4-point bending, steel fiber.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2917
658 An Improved Method to Watermark Images Sensitive to Blocking Artifacts

Authors: Afzel Noore

Abstract:

A new digital watermarking technique for images that are sensitive to blocking artifacts is presented. Experimental results show that the proposed MDCT based approach produces highly imperceptible watermarked images and is robust to attacks such as compression, noise, filtering and geometric transformations. The proposed MDCT watermarking technique is applied to fingerprints for ensuring security. The face image and demographic text data of an individual are used as multiple watermarks. An AFIS system was used to quantitatively evaluate the matching performance of the MDCT-based watermarked fingerprint. The high fingerprint matching scores show that the MDCT approach is resilient to blocking artifacts. The quality of the extracted face and extracted text images was computed using two human visual system metrics and the results show that the image quality was high.

Keywords: Digital watermarking, data hiding, modified discretecosine transformation (MDCT).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1570
657 Measuring Text-Based Semantics Relatedness Using WordNet

Authors: Madiha Khan, Sidrah Ramzan, Seemab Khan, Shahzad Hassan, Kamran Saeed

Abstract:

Measuring semantic similarity between texts is calculating semantic relatedness between texts using various techniques. Our web application (Measuring Relatedness of Concepts-MRC) allows user to input two text corpuses and get semantic similarity percentage between both using WordNet. Our application goes through five stages for the computation of semantic relatedness. Those stages are: Preprocessing (extracts keywords from content), Feature Extraction (classification of words into Parts-of-Speech), Synonyms Extraction (retrieves synonyms against each keyword), Measuring Similarity (using keywords and synonyms, similarity is measured) and Visualization (graphical representation of similarity measure). Hence the user can measure similarity on basis of features as well. The end result is a percentage score and the word(s) which form the basis of similarity between both texts with use of different tools on same platform. In future work we look forward for a Web as a live corpus application that provides a simpler and user friendly tool to compare documents and extract useful information.

Keywords: GraphViz representation, semantic relatedness, similarity measurement, WordNet similarity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 779
656 Integration of Educational Data Mining Models to a Web-Based Support System for Predicting High School Student Performance

Authors: Sokkhey Phauk, Takeo Okazaki

Abstract:

The challenging task in educational institutions is to maximize the high performance of students and minimize the failure rate of poor-performing students. An effective method to leverage this task is to know student learning patterns with highly influencing factors and get an early prediction of student learning outcomes at the timely stage for setting up policies for improvement. Educational data mining (EDM) is an emerging disciplinary field of data mining, statistics, and machine learning concerned with extracting useful knowledge and information for the sake of improvement and development in the education environment. The study is of this work is to propose techniques in EDM and integrate it into a web-based system for predicting poor-performing students. A comparative study of prediction models is conducted. Subsequently, high performing models are developed to get higher performance. The hybrid random forest (Hybrid RF) produces the most successful classification. For the context of intervention and improving the learning outcomes, a feature selection method MICHI, which is the combination of mutual information (MI) and chi-square (CHI) algorithms based on the ranked feature scores, is introduced to select a dominant feature set that improves the performance of prediction and uses the obtained dominant set as information for intervention. By using the proposed techniques of EDM, an academic performance prediction system (APPS) is subsequently developed for educational stockholders to get an early prediction of student learning outcomes for timely intervention. Experimental outcomes and evaluation surveys report the effectiveness and usefulness of the developed system. The system is used to help educational stakeholders and related individuals for intervening and improving student performance.

Keywords: Academic performance prediction system, prediction model, educational data mining, dominant factors, feature selection methods, student performance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 918
655 Feature Based Unsupervised Intrusion Detection

Authors: Deeman Yousif Mahmood, Mohammed Abdullah Hussein

Abstract:

The goal of a network-based intrusion detection system is to classify activities of network traffics into two major categories: normal and attack (intrusive) activities. Nowadays, data mining and machine learning plays an important role in many sciences; including intrusion detection system (IDS) using both supervised and unsupervised techniques. However, one of the essential steps of data mining is feature selection that helps in improving the efficiency, performance and prediction rate of proposed approach. This paper applies unsupervised K-means clustering algorithm with information gain (IG) for feature selection and reduction to build a network intrusion detection system. For our experimental analysis, we have used the new NSL-KDD dataset, which is a modified dataset for KDDCup 1999 intrusion detection benchmark dataset. With a split of 60.0% for the training set and the remainder for the testing set, a 2 class classifications have been implemented (Normal, Attack). Weka framework which is a java based open source software consists of a collection of machine learning algorithms for data mining tasks has been used in the testing process. The experimental results show that the proposed approach is very accurate with low false positive rate and high true positive rate and it takes less learning time in comparison with using the full features of the dataset with the same algorithm.

Keywords: Information Gain (IG), Intrusion Detection System (IDS), K-means Clustering, Weka.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2719
654 Investigation of Constant Transconductance Circuit for Low Power Low-Noise Amplifier

Authors: Wei Yi Lim, M. Annamalai Arasu, M. Kumarasamy Raja, Minkyu Je

Abstract:

In this paper, the design of wide-swing constant transconductance (gm) bias circuit that generates bias voltage for low-noise amplifier (LNA) circuit design by using an off-chip resistor is demonstrated. The overall transconductance (Gm) generated by the constant gm bias circuit is important to maintain the overall gain and noise figure of the LNA circuit. Therefore, investigation is performed to study the variation in Gm with process, temperature and supply voltage (PVT).  Temperature and supply voltage are swept from -10 °C to 85 °C and 1.425 V to 1.575 V respectively, while the process conditions are also varied to the extreme and the gm variation is eventually concluded at between -3 % to 7 %. With the slight variation in the gm value, through simulation, at worst condition of state SS, we are able to attain a conversion gain (S21) variation of -3.10 % and a noise figure (NF) variation of 18.71 %. The whole constant gm circuit draws approximately 100 µA from a 1.5V supply and is designed based on 0.13 µm CMOS process. 

Keywords: Transconductance, LNA, temperature, process.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4089
653 AI Tutor: A Computer Science Domain Knowledge Graph-Based QA System on JADE platform

Authors: Yingqi Cui, Changran Huang, Raymond Lee

Abstract:

In this paper, we proposed an AI Tutor using ontology and natural language process techniques to generate a computer science domain knowledge graph and answer users’ questions based on the knowledge graph. We define eight types of relation to extract relationships between entities according to the computer science domain text. The AI tutor is separated into two agents: learning agent and Question-Answer (QA) agent and developed on JADE (a multi-agent system) platform. The learning agent is responsible for reading text to extract information and generate a corresponding knowledge graph by defined patterns. The QA agent can understand the users’ questions and answer humans’ questions based on the knowledge graph generated by the learning agent.

Keywords: Artificial intelligence, natural language process, knowledge graph, agent, QA system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 822
652 A Supervised Learning Data Mining Approach for Object Recognition and Classification in High Resolution Satellite Data

Authors: Mais Nijim, Rama Devi Chennuboyina, Waseem Al Aqqad

Abstract:

Advances in spatial and spectral resolution of satellite images have led to tremendous growth in large image databases. The data we acquire through satellites, radars, and sensors consists of important geographical information that can be used for remote sensing applications such as region planning, disaster management. Spatial data classification and object recognition are important tasks for many applications. However, classifying objects and identifying them manually from images is a difficult task. Object recognition is often considered as a classification problem, this task can be performed using machine-learning techniques. Despite of many machine-learning algorithms, the classification is done using supervised classifiers such as Support Vector Machines (SVM) as the area of interest is known. We proposed a classification method, which considers neighboring pixels in a region for feature extraction and it evaluates classifications precisely according to neighboring classes for semantic interpretation of region of interest (ROI). A dataset has been created for training and testing purpose; we generated the attributes by considering pixel intensity values and mean values of reflectance. We demonstrated the benefits of using knowledge discovery and data-mining techniques, which can be on image data for accurate information extraction and classification from high spatial resolution remote sensing imagery.

Keywords: Remote sensing, object recognition, classification, data mining, waterbody identification, feature extraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2018
651 Post Mining- Discovering Valid Rules from Different Sized Data Sources

Authors: R. Nedunchezhian, K. Anbumani

Abstract:

A big organization may have multiple branches spread across different locations. Processing of data from these branches becomes a huge task when innumerable transactions take place. Also, branches may be reluctant to forward their data for centralized processing but are ready to pass their association rules. Local mining may also generate a large amount of rules. Further, it is not practically possible for all local data sources to be of the same size. A model is proposed for discovering valid rules from different sized data sources where the valid rules are high weighted rules. These rules can be obtained from the high frequency rules generated from each of the data sources. A data source selection procedure is considered in order to efficiently synthesize rules. Support Equalization is another method proposed which focuses on eliminating low frequency rules at the local sites itself thus reducing the rules by a significant amount.

Keywords: Association rules, multiple data stores, synthesizing, valid rules.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1368
650 Profile Analysis and Experiences on the Tourism of Jalapão State Park, Tocantins, Brazil

Authors: Veruska C. Dutra, Mary L. G. S. Senna, Felipe S. Spindler

Abstract:

The State Park Jalapão - PEJ proved to be one of the protected areas that has attracted tourists from all over the world with its unique scenic landscapes. Although the region already has a considerable tourist flow, to our knowledge there is a lack of continuity of studies in the region capable of drawing a plan of activities, such as the profile of the tourist and analysis of their experiences in the region, carried out from 2006-2007. Therefore, this study was proposed to know the profile and experiences of tourists visiting the park today, making a connection with the earlier study, in order to generate subsidies to trace improvement actions. We conducted interviews with tourists in the main tourism season 2015. The results show that after eight years of carrying out the first study, there were no changes, highlighting the lack of a tourism plan for the park.

Keywords: Jalapão, profile tourist, level of satisfaction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1003
649 Degraded Document Analysis and Extraction of Original Text Document: An Approach without Optical Character Recognition

Authors: L. Hamsaveni, Navya Prakash, Suresha

Abstract:

Document Image Analysis recognizes text and graphics in documents acquired as images. An approach without Optical Character Recognition (OCR) for degraded document image analysis has been adopted in this paper. The technique involves document imaging methods such as Image Fusing and Speeded Up Robust Features (SURF) Detection to identify and extract the degraded regions from a set of document images to obtain an original document with complete information. In case, degraded document image captured is skewed, it has to be straightened (deskew) to perform further process. A special format of image storing known as YCbCr is used as a tool to convert the Grayscale image to RGB image format. The presented algorithm is tested on various types of degraded documents such as printed documents, handwritten documents, old script documents and handwritten image sketches in documents. The purpose of this research is to obtain an original document for a given set of degraded documents of the same source.

Keywords: Grayscale image format, image fusing, SURF detection, YCbCr image format.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1118
648 High Secure Data Hiding Using Cropping Image and Least Significant Bit Steganography

Authors: Khalid A. Al-Afandy, El-Sayyed El-Rabaie, Osama Salah, Ahmed El-Mhalaway

Abstract:

This paper presents a high secure data hiding technique using image cropping and Least Significant Bit (LSB) steganography. The predefined certain secret coordinate crops will be extracted from the cover image. The secret text message will be divided into sections. These sections quantity is equal the image crops quantity. Each section from the secret text message will embed into an image crop with a secret sequence using LSB technique. The embedding is done using the cover image color channels. Stego image is given by reassembling the image and the stego crops. The results of the technique will be compared to the other state of art techniques. Evaluation is based on visualization to detect any degradation of stego image, the difficulty of extracting the embedded data by any unauthorized viewer, Peak Signal-to-Noise Ratio of stego image (PSNR), and the embedding algorithm CPU time. Experimental results ensure that the proposed technique is more secure compared with the other traditional techniques.

Keywords: Steganography, stego, LSB, crop.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1507
647 Content Based Sampling over Transactional Data Streams

Authors: Mansour Tarafdar, Mohammad Saniee Abade

Abstract:

This paper investigates the problem of sampling from transactional data streams. We introduce CFISDS as a content based sampling algorithm that works on a landmark window model of data streams and preserve more informed sample in sample space. This algorithm that work based on closed frequent itemset mining tasks, first initiate a concept lattice using initial data, then update lattice structure using an incremental mechanism.Incremental mechanism insert, update and delete nodes in/from concept lattice in batch manner. Presented algorithm extracts the final samples on demand of user. Experimental results show the accuracy of CFISDS on synthetic and real datasets, despite on CFISDS algorithm is not faster than exist sampling algorithms such as Z and DSS.

Keywords: Sampling, data streams, closed frequent item set mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1671
646 Data Embedding Based on Better Use of Bits in Image Pixels

Authors: Rehab H. Alwan, Fadhil J. Kadhim, Ahmad T. Al-Taani

Abstract:

In this study, a novel approach of image embedding is introduced. The proposed method consists of three main steps. First, the edge of the image is detected using Sobel mask filters. Second, the least significant bit LSB of each pixel is used. Finally, a gray level connectivity is applied using a fuzzy approach and the ASCII code is used for information hiding. The prior bit of the LSB represents the edged image after gray level connectivity, and the remaining six bits represent the original image with very little difference in contrast. The proposed method embeds three images in one image and includes, as a special case of data embedding, information hiding, identifying and authenticating text embedded within the digital images. Image embedding method is considered to be one of the good compression methods, in terms of reserving memory space. Moreover, information hiding within digital image can be used for security information transfer. The creation and extraction of three embedded images, and hiding text information is discussed and illustrated, in the following sections.

Keywords: Image embedding, Edge detection, gray level connectivity, information hiding, digital image compression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2104
645 Skew Detection Technique for Binary Document Images based on Hough Transform

Authors: Manjunath Aradhya V N, Hemantha Kumar G, Shivakumara P

Abstract:

Document image processing has become an increasingly important technology in the automation of office documentation tasks. During document scanning, skew is inevitably introduced into the incoming document image. Since the algorithm for layout analysis and character recognition are generally very sensitive to the page skew. Hence, skew detection and correction in document images are the critical steps before layout analysis. In this paper, a novel skew detection method is presented for binary document images. The method considered the some selected characters of the text which may be subjected to thinning and Hough transform to estimate skew angle accurately. Several experiments have been conducted on various types of documents such as documents containing English Documents, Journals, Text-Book, Different Languages and Document with different fonts, Documents with different resolutions, to reveal the robustness of the proposed method. The experimental results revealed that the proposed method is accurate compared to the results of well-known existing methods.

Keywords: Optical Character Recognition, Skew angle, Thinning, Hough transform, Document processing

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2057
644 Effects of Computer–Based Instructional Designs among Pupils of Different Music Intelligence Levels

Authors: Aldalalah, M. Osamah, Soon Fook Fong

Abstract:

The purpose of this study was to investigate the effects of computer–based instructional designs, namely modality and redundancy principles on the attitude and learning of music theory among primary pupils of different Music Intelligence levels. The lesson of music theory was developed in three different modes, audio and image (AI), text with image (TI) and audio with image and text (AIT). The independent variables were the three modes of courseware. The moderator variable was music intelligence. The dependent variables were the post test score. ANOVA was used to determine the significant differences of the pretest scores among the three groups. Analyses of covariance (ANCOVA) and Post hoc were carried out to examine the main effects as well as the interaction effects of the independent variables on the dependent variables. High music intelligence pupils performed significantly better than low music intelligence pupils in all the three treatment modes. The AI mode was found to help pupils with low music intelligence significantly more than the TI and AIT modes.

Keywords: Modality, Redundancy, Music theory, Cognitivetheory of multimedia learning, Cognitive load theory, Musicintelligence.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1632
643 Mining Implicit Knowledge to Predict Political Risk by Providing Novel Framework with Using Bayesian Network

Authors: Siavash Asadi Ghajarloo

Abstract:

Nowadays predicting political risk level of country has become a critical issue for investors who intend to achieve accurate information concerning stability of the business environments. Since, most of the times investors are layman and nonprofessional IT personnel; this paper aims to propose a framework named GECR in order to help nonexpert persons to discover political risk stability across time based on the political news and events. To achieve this goal, the Bayesian Networks approach was utilized for 186 political news of Pakistan as sample dataset. Bayesian Networks as an artificial intelligence approach has been employed in presented framework, since this is a powerful technique that can be applied to model uncertain domains. The results showed that our framework along with Bayesian Networks as decision support tool, predicted the political risk level with a high degree of accuracy.

Keywords: Bayesian Networks, Data mining, GECRframework, Predicting political risk.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2137
642 Appraisal of Methods for Identifying, Mapping, and Modelling of Fluvial Erosion in a Mining Environment

Authors: F. F. Howard, I. Yakubu, C. B. Boye, J. S. Y. Kuma

Abstract:

Natural and human activities, such as mining operations, expose the natural soil to adverse environmental conditions, leading to contamination of soil, groundwater, and surface water, which has negative effects on humans, flora, and fauna. Bare or partly exposed soil is most liable to fluvial erosion. This paper enumerates various methods used to identify, map, and model fluvial erosion in a mining environment. Classical, Artificial Intelligence (AI), and GIS methods have been reviewed. One of the many classical methods used to estimate river erosion is the Revised Universal Soil Loss Equation (RUSLE) model. The RUSLE model is easy to use. Its reliance on empirical relationships that may not always be applicable to specific circumstances or locations is a flaw. Other classical models for estimating fluvial erosion are the Soil and Water Assessment Tool (SWAT) and the Universal Soil Loss Equation (USLE). These models offer a more complete understanding of the underlying physical processes and encompass a wider range of situations. Although more difficult to utilise, they depend on the availability and dependability of input data for correctness. AI can help deal with multivariate and complex difficulties and predict soil loss with higher accuracy than traditional methods, and also be used to build unique models for identifying degraded areas. AI techniques have become popular as an alternative predictor for degraded environments. However, this research proposed a hybrid of classical, AI, and GIS methods for efficient and effective modelling of fluvial erosion.

Keywords: Fluvial erosion, classical methods, Artificial Intelligence, Geographic Information System.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 122
641 Modality and Redundancy Effects on Music Theory Learning Among Pupils of Different Anxiety Levels

Authors: Soon Fook Fong, Aldalalah, M. Osamah

Abstract:

The purpose of this study was to investigate effects of modality and redundancy principles on music theory learning among pupils of different anxiety levels. The lesson of music theory was developed in three different modes, audio and image (AI), text with image (TI) and audio with image and text (AIT). The independent variables were the three modes of courseware. The moderator variable was the anxiety level, while the dependent variable was the post test score. The study sample consisted of 405 third-grade pupils. Descriptive and inferential statistics were conducted to analyze the collected data. Analyses of covariance (ANCOVA) and Post hoc were carried out to examine the main effects as well as the interaction effects of the independent variables on the dependent variable. The findings of this study showed that medium anxiety pupils performed significantly better than low and high anxiety pupils in all the three treatment modes. The AI mode was found to help pupils with high anxiety significantly more than the TI and AIT modes.

Keywords: Modality, Redundancy, Music theory, Cognitivetheory of multimedia learning, Cognitive load theory, Anxiety.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2091
640 A Hybrid Approach for Thread Recommendation in MOOC Forums

Authors: Ahmad. A. Kardan, Amir Narimani, Foozhan Ataiefard

Abstract:

Recommender Systems have been developed to provide contents and services compatible to users based on their behaviors and interests. Due to information overload in online discussion forums and users diverse interests, recommending relative topics and threads is considered to be helpful for improving the ease of forum usage. In order to lead learners to find relevant information in educational forums, recommendations are even more needed. We present a hybrid thread recommender system for MOOC forums by applying social network analysis and association rule mining techniques. Initial results indicate that the proposed recommender system performs comparatively well with regard to limited available data from users' previous posts in the forum.

Keywords: Association rule mining, hybrid recommender system, massive open online courses, MOOCs, social network analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1221
639 Segmentation Problems and Solutions in Printed Degraded Gurmukhi Script

Authors: M. K. Jindal, G. S. Lehal, R. K. Sharma

Abstract:

Character segmentation is an important preprocessing step for text recognition. In degraded documents, existence of touching characters decreases recognition rate drastically, for any optical character recognition (OCR) system. In this paper we have proposed a complete solution for segmenting touching characters in all the three zones of printed Gurmukhi script. A study of touching Gurmukhi characters is carried out and these characters have been divided into various categories after a careful analysis. Structural properties of the Gurmukhi characters are used for defining the categories. New algorithms have been proposed to segment the touching characters in middle zone, upper zone and lower zone. These algorithms have shown a reasonable improvement in segmenting the touching characters in degraded printed Gurmukhi script. The algorithms proposed in this paper are applicable only to machine printed text. We have also discussed a new and useful technique to segment the horizontally overlapping lines.

Keywords: Character Segmentation, Middle Zone, Upper Zone, Lower Zone, Touching Characters, Horizontally Overlapping Lines.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1657
638 The Use of Classifiers in Image Analysis of Oil Wells Profiling Process and the Automatic Identification of Events

Authors: Jaqueline M. R. Vieira

Abstract:

Different strategies and tools are available at the oil and gas industry for detecting and analyzing tension and possible fractures in borehole walls. Most of these techniques are based on manual observation of the captured borehole images. While this strategy may be possible and convenient with small images and few data, it may become difficult and suitable to errors when big databases of images must be treated. While the patterns may differ among the image area, depending on many characteristics (drilling strategy, rock components, rock strength, etc.). In this work we propose the inclusion of data-mining classification strategies in order to create a knowledge database of the segmented curves. These classifiers allow that, after some time using and manually pointing parts of borehole images that correspond to tension regions and breakout areas, the system will indicate and suggest automatically new candidate regions, with higher accuracy. We suggest the use of different classifiers methods, in order to achieve different knowledge dataset configurations.

Keywords: Brazil, classifiers, data-mining, Image Segmentation, oil well visualization, classifiers.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2514
637 Q-Map: Clinical Concept Mining from Clinical Documents

Authors: Sheikh Shams Azam, Manoj Raju, Venkatesh Pagidimarri, Vamsi Kasivajjala

Abstract:

Over the past decade, there has been a steep rise in the data-driven analysis in major areas of medicine, such as clinical decision support system, survival analysis, patient similarity analysis, image analytics etc. Most of the data in the field are well-structured and available in numerical or categorical formats which can be used for experiments directly. But on the opposite end of the spectrum, there exists a wide expanse of data that is intractable for direct analysis owing to its unstructured nature which can be found in the form of discharge summaries, clinical notes, procedural notes which are in human written narrative format and neither have any relational model nor any standard grammatical structure. An important step in the utilization of these texts for such studies is to transform and process the data to retrieve structured information from the haystack of irrelevant data using information retrieval and data mining techniques. To address this problem, the authors present Q-Map in this paper, which is a simple yet robust system that can sift through massive datasets with unregulated formats to retrieve structured information aggressively and efficiently. It is backed by an effective mining technique which is based on a string matching algorithm that is indexed on curated knowledge sources, that is both fast and configurable. The authors also briefly examine its comparative performance with MetaMap, one of the most reputed tools for medical concepts retrieval and present the advantages the former displays over the latter.

Keywords: Information retrieval (IR), unified medical language system (UMLS), Syntax Based Analysis, natural language processing (NLP), medical informatics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 733
636 Secure Text Steganography for Microsoft Word Document

Authors: Khan Farhan Rafat, M. Junaid Hussain

Abstract:

Seamless modification of an entity for the purpose of hiding a message of significance inside its substance in a manner that the embedding remains oblivious to an observer is known as steganography. Together with today's pervasive registering frameworks, steganography has developed into a science that offers an assortment of strategies for stealth correspondence over the globe that must, however, need a critical appraisal from security breach standpoint. Microsoft Word is amongst the preferably used word processing software, which comes as a part of the Microsoft Office suite. With a user-friendly graphical interface, the richness of text editing, and formatting topographies, the documents produced through this software are also most suitable for stealth communication. This research aimed not only to epitomize the fundamental concepts of steganography but also to expound on the utilization of Microsoft Word document as a carrier for furtive message exchange. The exertion is to examine contemporary message hiding schemes from security aspect so as to present the explorative discoveries and suggest enhancements which may serve a wellspring of information to encourage such futuristic research endeavors.

Keywords: Hiding information in plain sight, stealth communication, oblivious information exchange, conceal, steganography.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1567
635 Development of a Process to Manufacture High Quality Refined Salt from Crude Solar Salt

Authors: Rathnayaka D. D. T., Vidanage P. W., Wasalathilake K. C., Wickramasingha H. W., Wijayarathne U. P. L, Perera S. A. S.

Abstract:

This paper describes the research carried out to develop a process to increase the NaCl percentage of crude salt which is obtained from the conventional solar evaporation process. In this study refined salt was produced from crude solar salt by a chemico-physical method which consists of coagulation, precipitation and filtration. Initially crude salt crystals were crushed and dissolved in water. Optimum amounts of calcium hydroxide, sodium carbonate and Polyaluminiumchloride (PAC) were added to the solution respectively.

Refined NaCl solution was separated out by a filtration process. The solution was tested for Total Suspended Solids, SO42-, Mg2+, Ca2+. With optimum dosage of reagents, the results showed that a level of 99.60% NaCl could be achieved. Further this paper discusses the economic viability of the proposed process. A 83% profit margin can be achieved by this process and it is an increase of 112.3% compared to the traditional process.

Keywords: Chemico-physical, Economic, Optimum, Refined, Solar Salt.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4604