Search results for: Sequential pattern mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1595

1175 Pattern Recognition Techniques Applied to Biomedical Patterns

Authors: Giovanni Luca Masala

Abstract:

Pattern recognition is the research area of Artificial Intelligence that studies the operation and design of systems that recognize patterns in data. Important application areas are image analysis, character recognition, fingerprint classification, speech analysis, DNA sequence identification, man and machine diagnostics, person identification and industrial inspection. The interest in improving the classification systems of data analysis is independent of the application context. In fact, many studies need to recognize and distinguish groups of various objects, which requires valid instruments capable of performing this task. The objective of this article is to show several methodologies of Artificial Intelligence for data classification applied to biomedical patterns. In particular, this work deals with the realization of a Computer-Aided Detection system (CADe) that is able to assist the radiologist in identifying types of mammary tumor lesions. As an additional biomedical application of the classification systems, we present a study conducted on blood samples which shows how these methods may help to distinguish between carriers of Thalassemia (or Mediterranean Anaemia) and healthy subjects.

Keywords: Computer Aided Detection, mammary tumor, pattern recognition, dissimilarity

Downloads 2316
1174 Using Data Mining Techniques for Finding Cardiac Outlier Patients

Authors: Farhan Ismaeel Dakheel, Raoof Smko, K. Negrat, Abdelsalam Almarimi

Abstract:

In this paper, we used data mining techniques to identify outlier patients who use large amounts of drugs over a long period of time. Any healthcare or health insurance system should deal with the quantities of drugs utilized by chronic disease patients. In the Kingdom of Bahrain, about 20% of the health budget is spent on medications. The managers of healthcare systems do not have enough information about how chronic disease patients utilize drugs, whether there is any misuse, or whether there are outlier patients. In this work, which was done in cooperation with the information department of the Bahrain Defence Force hospital, we selected the data for cardiac patients in the period from 1/1/2008 to 31/12/2008 as the data for the model. We used three techniques for finding the drug utilization of cardiac patients. First, we applied a clustering technique, followed by a measurement of clustering validity, and finally we applied a decision tree as a classification algorithm. The clustering divides the 1603 patients, who received 15,806 prescriptions during this period, into three groups according to drug utilization; 23 patients (2.59%), who received 1316 prescriptions (8.32%), are classified as outliers. The classification algorithm shows that the average drug utilization, the age, and the gender of the patient can be considered the main predictive factors in the induced model.

Keywords: Data Mining, Clustering, Classification, Drug Utilization.

Downloads 1862
1173 Implementation of an IoT Sensor Data Collection and Analysis Library

Authors: Jihyun Song, Kyeongjoo Kim, Minsoo Lee

Abstract:

Due to the development of information technology and wireless Internet technology, various data are being generated in various fields. These data are advantageous in that they provide real-time information to the users themselves. However, when the data are accumulated and analyzed, more varied information can be extracted. In addition, the development and dissemination of boards such as Arduino and Raspberry Pi have made it possible to easily test various sensors, and sensor data can be collected directly by using database application tools such as MySQL. These directly collected data can be used for various research and can be useful as data for data mining. However, using a board to collect data is difficult, especially when the user is not a computer programmer or is using the board for the first time. Even if data are collected, a lack of expert knowledge or experience may cause difficulties in data analysis and visualization. In this paper, we aim to construct a library for sensor data collection and analysis to overcome these problems.

Keywords: Clustering, data mining, DBSCAN, k-means, k-medoids, sensor data.

Downloads 1968
1172 A Novel Approach to Optimal Cutting Tool Replacement

Authors: Cem Karacal, Sohyung Cho, William Yu

Abstract:

In metal cutting industries, mathematical/statistical models are typically used to predict tool replacement time. These off-line methods usually result in a less than optimum replacement time, thereby either wasting resources or causing quality problems. The few online real-time methods proposed use indirect measurement techniques and are prone to similar errors. Our idea is based on identifying the optimal replacement time using an electronic nose to detect the airborne compounds released when the tool wear reaches a chemical substrate doped into the tool material during fabrication. The study investigates the feasibility of the idea, possible doping materials and methods, along with data stream mining techniques for detecting and monitoring different phases of tool wear.

Keywords: Tool condition monitoring, cutting tool replacement, data stream mining, e-Nose.

Downloads 1852
1171 PUMA 560 Optimal Trajectory Control using Genetic Algorithm, Simulated Annealing and Generalized Pattern Search Techniques

Authors: Sufian Ashraf Mazhari, Surendra Kumar

Abstract:

Robot manipulators are highly coupled nonlinear systems; therefore, the real system and the mathematical model of the dynamics used for control system design are not the same. Hence, fine-tuning of the controller is always needed. For better tuning, fast simulation speed is desired. Since Matlab incorporates LAPACK to increase the speed and complexity of matrix computation, the dynamics and the forward and inverse kinematics of the PUMA 560 are modeled in Matlab/Simulink in such a way that all operations are matrix based, which gives a very short simulation time. This paper compares PID parameter tuning using Genetic Algorithm, Simulated Annealing, Generalized Pattern Search (GPS) and Hybrid Search techniques. Controller performances for all these methods are compared in terms of joint space ITSE and Cartesian space ISE for tracking circular and butterfly trajectories. A disturbance signal is added to check the robustness of the controller. The GAGPS hybrid search technique shows the best results for tuning PID controller parameters in terms of ITSE and robustness.

Keywords: Controller Tuning, Genetic Algorithm, Pattern Search, Robotic Controller, Simulated Annealing.

Downloads 3669
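
Illustrative note on entry 1171: the abstract compares controllers by ITSE (integral of time-weighted squared error) and ISE (integral of squared error). The Python sketch below shows one plausible way to evaluate both criteria from a sampled tracking error using the trapezoidal rule. The function names and the sample values are assumptions made for illustration; this is not the authors' Matlab/Simulink model.

```python
def ise(times, errors):
    """Integral of squared error, approximated with the trapezoidal rule."""
    total = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        total += 0.5 * (errors[i - 1] ** 2 + errors[i] ** 2) * dt
    return total


def itse(times, errors):
    """Integral of time-weighted squared error (trapezoidal rule).

    Weighting by t penalises errors that persist late in the response
    more heavily than the initial transient.
    """
    total = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        f_prev = times[i - 1] * errors[i - 1] ** 2
        f_curr = times[i] * errors[i] ** 2
        total += 0.5 * (f_prev + f_curr) * dt
    return total


if __name__ == "__main__":
    # Hypothetical sampled tracking error of one joint (made-up values).
    t = [0.0, 0.5, 1.0, 1.5, 2.0]
    e = [1.0, 0.4, 0.1, 0.05, 0.0]
    print("ISE :", ise(t, e))
    print("ITSE:", itse(t, e))
```

A tuning method such as a genetic algorithm would typically call a cost function like `itse` once per candidate set of PID gains and keep the gains with the lowest value.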
1170 Road Traffic Accidents Analysis in Mexico City through Crowdsourcing Data and Data Mining Techniques

Authors: Gabriela V. Angeles Perez, Jose Castillejos Lopez, Araceli L. Reyes Cabello, Emilio Bravo Grajales, Adriana Perez Espinosa, Jose L. Quiroz Fabian

Abstract:

Road traffic accidents are among the principal causes of traffic congestion, causing human losses, damage to health and the environment, economic losses and material damage. Traditional studies of road traffic accidents in urban zones require a very high investment of time and money; additionally, the results are not current. However, in many countries, crowdsourced GPS-based traffic and navigation apps have emerged as an important low-cost source of information for studies of road traffic accidents and the urban congestion caused by them. In this article, we identified the zones, roads and specific times in CDMX in which the largest number of road traffic accidents were concentrated during 2016. We built a database compiling information obtained from the social network known as Waze. The methodology employed was Knowledge Discovery in Databases (KDD) for the discovery of patterns in the accident reports, using data mining techniques with the help of Weka. The selected algorithms were Expectation-Maximization (EM), to obtain the ideal number of clusters for the data, and k-means as a grouping method. Finally, the results were visualized with the Geographic Information System QGIS.

Keywords: Data mining, K-means, road traffic accidents, Waze, Weka.

Downloads 1150
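
Illustrative note on entry 1170: the abstract uses k-means as its grouping method. The sketch below is a minimal pure-Python version of the standard k-means (Lloyd) iteration on 2-D points, assuming the number of clusters and the starting centroids are already chosen (the study obtained the cluster count with EM in Weka). It is not Weka's implementation, and the coordinates are made up.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then
    move each centroid to the mean of its members, until stable."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: mean of the members; empty clusters keep their centroid.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[k])
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids


if __name__ == "__main__":
    # Hypothetical accident locations in arbitrary planar units.
    reports = [(0, 0), (0, 1), (10, 10), (10, 11)]
    labels, centers = kmeans(reports, centroids=[(0, 0), (10, 10)])
    print(labels, centers)
```

Passing the starting centroids explicitly keeps the sketch deterministic; library implementations usually pick them randomly or with a seeding heuristic.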
1169 Scheduling Method for Electric Heater in HEMS Considering User’s Comfort

Authors: Yong-Sung Kim, Je-Seok Shin, Ho-Jun Jo, Jin-O Kim

Abstract:

Home Energy Management System (HEMS), which enables residential consumers to contribute to demand response, has been attracting attention in recent years. An aim of HEMS is to minimize consumers’ electricity cost by controlling the use of their appliances according to the electricity price. The use of appliances in HEMS may be affected by conditions such as external temperature and electricity price. Therefore, the user’s usage pattern of appliances should be modeled according to the external conditions, and the resultant usage pattern is related to the user’s comfort in using each appliance. This paper proposes a methodology to model the usage pattern based on historical data with the copula function. Through the copula function, the usage range of each appliance can be obtained such that it satisfies the user’s comfort according to the external conditions for the next day. Within the usage range, an optimal scheduling for appliances is conducted so as to minimize the electricity cost while considering the user’s comfort. Among home appliances, the electric heater (EH) is a representative appliance that is affected by the external temperature. In this paper, an optimal scheduling algorithm for an EH is addressed based on the branch and bound method. As a result, scenarios for the EH usage are obtained according to the user’s comfort levels, and the residential consumer then selects the best scenario. The case study shows the effects of the proposed algorithm compared with the traditional operation of the EH, and it presents the impacts of the comfort level on the scheduling result.

Keywords: Load scheduling, usage pattern, user’s comfort, copula function, branch and bound, electric heater.

Downloads 2042
1168 Design of Personal Job Recommendation Framework on Smartphone Platform

Authors: Chayaporn Kaensar

Abstract:

Recently, Job Recommender Systems have gained much attention in industry since they solve the problem of information overload on recruiting websites. Therefore, we proposed an Extended Personalized Job System that is capable of providing appropriate jobs for job seekers and recommending suitable information to them using Data Mining Techniques and a Dynamic User Profile. On the other hand, companies can also interact with the system to publish and update job information. The system supports various platforms such as a web application and an Android mobile application. In this paper, User Profiles, Implicit User Action, User Feedback, and Clustering Techniques in WEKA libraries were applied and implemented. In addition, open source tools like Yii Web Application Framework, Bootstrap Front End Framework and Android Mobile Technology were also applied.

Keywords: Recommendation, user profile, data mining, web technology, mobile technology.

Downloads 2112
1167 Enhancing the Performance of H.264/AVC in Adaptive Group of Pictures Mode Using Octagon and Square Search Pattern

Authors: S. Sowmyayani, P. Arockia Jansi Rani

Abstract:

This paper integrates the Octagon and Square Search pattern (OCTSS) motion estimation algorithm into the H.264/AVC (Advanced Video Coding) video codec in Adaptive Group of Pictures (AGOP) mode. The AGOP structure is computed based on scene changes in the video sequence. The octagon and square search pattern block-based motion estimation method is implemented in the inter-prediction process of H.264/AVC. These two methods reduce bit rate and computational complexity, respectively, while maintaining the quality of the video sequence. Experiments are conducted for different types of video sequence. The results show that the bit rate, computation time and PSNR gain achieved by the proposed method are better than those of the existing H.264/AVC with fixed GOP and AGOP. With a marginal gain in quality of 0.28dB and an average gain in bitrate of 132.87kbps, the proposed method reduces the average computation time by 27.31 minutes when compared to the existing state-of-art H.264/AVC video codec.

Keywords: Block Distortion Measure, Block Matching Algorithms, H.264/AVC, Motion estimation, Search patterns, Shot cut detection.

Downloads 1695
1166 Effect of Stitching Pattern on Composite Tubular Structures Subjected to Quasi-Static Crushing

Authors: Ali Rabiee, Hessam Ghasemnejad

Abstract:

An extensive experimental investigation of the effect of stitching pattern on tubular composite structures was conducted. The effect of through-thickness stitching reinforcement using glass flux yarn on the energy absorption of fiber-reinforced polymer (FRP) was investigated under high speed loading conditions at axial loading. Keeping the mass of the structure at 125 grams and applying different stitching patterns at various locations in theory enables better energy absorption and control over the behaviour of the force-crush distance curve. The study compares a simple non-stitched absorber with single and multi-location stitching and examines the effect on energy absorption capabilities. The locations of reinforcements are 10 mm, 20 mm, 30 mm, 10-20 mm, 10-30 mm, 20-30 mm, 10-20-30 mm and 10-15-20-25-30-35 mm from the top of the specimen. The through-thickness reinforcements increased the energy absorption capabilities and the crushing load. The significance of this is that as the stitching locations become closer, the crushing load increases and consequently the energy absorption capabilities also increase. The implementation of this idea would improve the mean force by applying stitching and controlling the behaviour of the force-crush distance curve.

Keywords: Through-thickness, stitching, reinforcement, Tubular composite structures, energy absorption.

Downloads 1380
1165 AniMoveMineR: Animal Behavior Exploratory Analysis Using Association Rules Mining

Authors: Suelane Garcia Fontes, Silvio Luiz Stanzani, Pedro L. Pizzigatti Corrêa, Ronaldo G. Morato

Abstract:

Environmental changes and major natural disasters are increasingly prevalent in the world due to the damage that humanity has caused to nature, and this damage directly affects the lives of animals. Thus, the study of animal behavior and their interactions with the environment can provide knowledge that guides researchers and public agencies in preservation and conservation actions. Exploratory analysis of animal movement can determine the patterns of animal behavior, and with technological advances the ability to track animals and, consequently, behavioral studies have expanded. There is a lot of research on animal movement and behavior, but we note that a proposal that combines resources, allows for exploratory analysis of animal movement and provides statistical measures on individual animal behavior and its interaction with the environment is missing. The contribution of this paper is to present the framework AniMoveMineR, a unified solution that aggregates trajectory analysis and data mining techniques to explore animal movement data and provide a first step in answering questions about individual animal behavior and interactions with other animals over time and space. We evaluated the framework using data from monitored jaguars in the city of Miranda, Pantanal, Brazil, in order to verify whether AniMoveMineR makes it possible to identify the interaction level between these jaguars. The results were positive and provided indications about the individual behavior of jaguars and about which jaguars have the highest or lowest correlation.

Keywords: Data mining, data science, trajectory, animal behavior.

Downloads 835
1164 RB-Matcher: String Matching Technique

Authors: Rajender Singh Chillar, Barjesh Kochar

Abstract:

All text processing systems allow their users to search for a string pattern in a given text. String matching is fundamental to database and text processing applications. Every text editor must contain a mechanism to search the current document for arbitrary strings. Spelling checkers scan an input text for words in the dictionary and reject any strings that do not match. We store our information in databases so that we can retrieve it later, and this retrieval can be done by using various string matching algorithms. This paper describes a new string matching algorithm for various applications. The new algorithm has been designed with the help of the Rabin-Karp matcher to improve the string matching process.

Keywords: Algorithm, Complexity, Matching-patterns, Pattern, Rabin-Karp, String, text-processing.

Downloads 1729
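
Illustrative note on entry 1164: the abstract builds on the Rabin-Karp matcher. The sketch below shows the classic Rabin-Karp rolling-hash search in Python, as background only. It is not the RB-Matcher algorithm proposed in the paper, whose details are not given in the abstract; the `base` and `modulus` defaults are arbitrary choices made for this example.

```python
def rabin_karp(text, pattern, base=256, modulus=1_000_003):
    """Return every index at which `pattern` occurs in `text`.

    A rolling hash of each length-m window is compared with the hash of
    the pattern; characters are compared only when the hashes agree.
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []

    # Weight of the leading character in a window: base^(m-1) mod modulus.
    high = pow(base, m - 1, modulus)

    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % modulus
        t_hash = (t_hash * base + ord(text[i])) % modulus

    matches = []
    for s in range(n - m + 1):
        # Equal hashes may be a collision, so confirm with a direct comparison.
        if p_hash == t_hash and text[s:s + m] == pattern:
            matches.append(s)
        if s < n - m:
            # Slide the window: drop text[s], append text[s + m].
            t_hash = ((t_hash - ord(text[s]) * high) * base
                      + ord(text[s + m])) % modulus
    return matches


if __name__ == "__main__":
    print(rabin_karp("abracadabra", "abra"))
```

The direct comparison after a hash match is what keeps the result correct even when two different windows happen to share a hash value.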
1163 Feature Selection Approaches with Missing Values Handling for Data Mining - A Case Study of Heart Failure Dataset

Authors: N. Poolsawad, C. Kambhampati, J. G. F. Cleland

Abstract:

In this paper, we investigated the characteristics of a clinical dataset in terms of feature selection and classification measurements that deal with the missing values problem. We also proposed appropriate techniques to achieve the aim of the activity; this research aims to find features that have a high effect on mortality and the mortality time frame. We quantify the complexity of a clinical dataset. According to the complexity of the dataset, we proposed a data mining process to cope with that complexity (missing values, high dimensionality, and the prediction problem) by using the methods of missing value replacement, feature selection, and classification. The experimental results will be extended to develop a prediction model for cardiology.

Keywords: feature selection, missing values, classification, clinical dataset, heart failure.

Downloads 3174
1162 Seasonal Variation of the Impact of Mining Activities on Ga-Selati River in Limpopo Province, South Africa

Authors: Joshua N. Edokpayi, John O. Odiyo, Patience P. Shikwambana

Abstract:

Water is a very rare natural resource in South Africa. Ga-Selati River is used for both domestic and industrial purposes. This study was carried out in order to assess the quality of Ga-Selati River in a mining area of Limpopo Province-Phalaborwa. The pH, Electrical Conductivity (EC) and Total Dissolved Solids (TDS) were determined using a Crinson multimeter while turbidity was measured using a Labcon Turbidimeter. The concentrations of Al, Ca, Cd, Cr, Fe, K, Mg, Mn, Na and Pb were analysed in triplicate using a Varian 520 flame atomic absorption spectrometer (AAS) supplied by PerkinElmer, after acid digestion with nitric acid in a fume cupboard. The average pH of the river from eight different sampling sites was 8.00 and 9.38 in the wet and dry seasons, respectively. Higher EC values were determined in the dry season (138.7 mS/m) than in the wet season (96.93 mS/m). Similarly, TDS values were higher in the dry season (929.29 mg/L) than in the wet season (640.72 mg/L). These values exceeded the recommended guideline of South Africa Department of Water Affairs and Forestry (DWAF) for domestic water use (70 mS/m) and that of the World Health Organization (WHO) (600 mS/m), respectively. Turbidity varied between 1.78-5.20 and 0.95-2.37 NTU in the wet and dry seasons, respectively. Total hardness of 312.50 mg/L and 297.75 mg/L as the concentration of CaCO3 was computed for the river in the wet and the dry seasons, respectively, and the river water was categorised as very hard. Mean concentrations of the metals studied in the wet and the dry seasons, respectively, are: Na (94.06 mg/L and 196.3 mg/L), K (11.79 mg/L and 13.62 mg/L), Ca (45.60 mg/L and 41.30 mg/L), Mg (48.41 mg/L and 44.71 mg/L), Al (0.31 mg/L and 0.38 mg/L), Cd (0.01 mg/L and 0.01 mg/L), Cr (0.02 mg/L and 0.09 mg/L), Pb (0.05 mg/L and 0.06 mg/L), Mn (0.31 mg/L and 0.11 mg/L) and Fe (0.76 mg/L and 0.69 mg/L). Results from this study reveal that most of the metals were present in concentrations higher than the recommended guidelines of DWAF and WHO for domestic use and the protection of aquatic life.

Keywords: Contamination, mining activities, surface water, trace metals.

Downloads 1928
1161 Efficient Implementation of Serial and Parallel Support Vector Machine Training with a Multi-Parameter Kernel for Large-Scale Data Mining

Authors: Tatjana Eitrich, Bruno Lang

Abstract:

This work deals with aspects of support vector learning for large-scale data mining tasks. Based on a decomposition algorithm that can be run in serial and parallel mode, we introduce a data transformation that allows for the usage of an expensive generalized kernel without additional costs. In order to speed up the decomposition algorithm, we analyze the problem of working set selection for large data sets and the influence of the working set sizes on the scalability of the parallel decomposition scheme. Our modifications and settings lead to an improvement of support vector learning performance and thus allow using extensive parameter search methods to optimize classification accuracy.

Keywords: Support Vector Machines, Shared Memory Parallel Computing, Large Data

Downloads 1544
1160 Consumption Pattern and Dietary Practices of Pregnant Women in Odeda Local Government Area of Ogun State

Authors: Ademuyiwa, M. O., Sanni, S. A.

Abstract:

The importance of maternal nutritional practices during pregnancy cannot be overemphasized. This paper assessed the consumption pattern and dietary practices of 50 pregnant women selected using a purposive sampling technique from three health care centres (Primary Health Care Centre, Obantoko; Primary Health Care Centre, Alabata; and the General Hospital, Odeda) in Odeda Local Government Area of Ogun State, Nigeria. A structured questionnaire was used to elicit information on socioeconomic status, consumption pattern and dietary practices. Data were analyzed using the Statistical Package for Social Sciences (SPSS, 17). The results indicated that about 58% of the pregnant women were below the age of 30 while 42% were aged 28-40 years. Only 16% had tertiary education while 38% had secondary education; 52% earned income through petty trading. On food intake, 52% got their energy from rice on a daily basis, followed by pap (38%) and eko (34%). For protein intake, 36% consumed bean cake on a daily basis while 66% consumed moinmoin 2-3 times a week. Orange (48%) and green leafy vegetables (40%) were the most consumed fruit and vegetable on a daily basis. In terms of foods of animal origin, fish (76%), meat (58%) and eggs (30%) were consumed daily, while chicken and snail were consumed occasionally by 54% and 42%, respectively. Forty-six percent (46%) of the pregnant women ate more than three times daily, while 60% of the women ate outside their homes, with 42% of respondents eating lunch out and only two percent eating dinner out. It is important to increase awareness campaigns to sensitize pregnant women to the importance of good nutrition, especially fruits, vegetables and dairy products.

Keywords: Consumption Pattern, Dietary Practices, Pregnant, Women, Nigeria.

Downloads 4859
1159 Identifying Karst Pattern to Prevent Bell Spring from Being Submerged in Daryan Dam Reservoir

Authors: H. Shafaattalab Dehghani, H. R. Zarei

Abstract:

The large karstic Bell spring, with a discharge ranging between 250 and 5300 L/s, is one of the most important springs of Kermanshah Province. This spring supplies the drinking water of Nodsheh City and its surrounding villages. The spring is located in the reservoir of Daryan Dam, and after impounding its mouth would be submerged under a water column about 110 m high. This paper aims to give an account of the karstification pattern around the spring, with the intention of preventing Bell Spring from being submerged in Daryan Dam Reservoir. The studies comprise engineering geology and hydrogeology investigations. The geotechnical activities in these studies include geophysical studies, drilling, excavation of an exploratory gallery and shaft, and diving. The results show that Bell is a single-conduit siphon spring with a 4 m diameter and 85 m height, of which 32 m of the conduit is located below the spring outlet. To preserve the spring, it was decided to plug the outlet and convey the water to higher elevations under the natural pressure of the aquifer. After plugging, water was successfully conveyed to an elevation of 837 meters above sea level (about 120 m from the outlet) under the natural pressure of the aquifer. This signifies the accuracy of the studies done and the proper recognition of the karstification pattern of Bell Spring. This is a unique experience in karst problems in Iran.

Keywords: Bell spring, karst, Daryan Dam, submerged.

Downloads 1177
1158 Grooved Linear Microstrip Patch Antenna Array

Authors: Ayesha Aslam, F A Bhatti

Abstract:

A simple impedance matching technique for inset feed grooved microstrip patch antenna based on the concept of coplanar waveguide feed line has been developed and investigated for a printed antenna at X-Band frequency of 10GHz. The proposed technique has been used in the design of Linear Grooved Microstrip patch antenna array. The characteristics of the antenna are determined in terms of Return loss, VSWR, gain, radiation pattern etc. The measured and simulated results presented are found to be in good agreement.

Keywords: Gain, Microstrip patch, return loss, VSWR, Radiation pattern, CPW Feed, Inset feed.

Downloads 2763
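
Illustrative note on entry 1158: the abstract characterises the antenna by return loss and VSWR, two quantities linked through the magnitude of the reflection coefficient. The Python sketch below applies the standard textbook relations |Γ| = 10^(-RL/20) and VSWR = (1 + |Γ|) / (1 - |Γ|), with return loss given as a positive number of decibels. The function names and sample values are assumptions for illustration and are not measurements from the paper.

```python
def reflection_magnitude(return_loss_db):
    """|Gamma| from return loss in dB (positive value, e.g. 20 for 20 dB)."""
    return 10 ** (-return_loss_db / 20)


def vswr(return_loss_db):
    """Voltage standing wave ratio for a given return loss.

    Requires return_loss_db > 0; a return loss of 0 dB means total
    reflection, for which VSWR is unbounded.
    """
    gamma = reflection_magnitude(return_loss_db)
    return (1 + gamma) / (1 - gamma)


if __name__ == "__main__":
    # Hypothetical return-loss values in dB.
    for rl in (10, 15, 20, 30):
        print(f"RL = {rl:2d} dB  ->  VSWR = {vswr(rl):.3f}")
```

A higher return loss means less reflected power, so the VSWR approaches 1 as the impedance match improves.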
1157 CFD Simulation and Validation of Flow Pattern Transition Boundaries during Moderately Viscous Oil-Water Two-Phase Flow through Horizontal Pipeline

Authors: Anand B. Desamala, Anjali Dasari, Vinayak Vijayan, Bharath K. Goshika, Ashok K. Dasmahapatra, Tapas K. Mandal

Abstract:

In the present study, computational fluid dynamics (CFD) simulation has been executed to investigate the transition boundaries of different flow patterns for moderately viscous oil-water (viscosity ratio 107, density ratio 0.89 and interfacial tension of 0.032 N/m) two-phase flow through a horizontal pipeline with internal diameter and length of 0.025 m and 7.16 m, respectively. The Volume of Fluid (VOF) approach, including the effect of surface tension, has been employed to predict the flow pattern. The geometry and mesh of the present problem have been created using GAMBIT, and ANSYS FLUENT has been used for simulation. A total of 47037 quadrilateral elements are chosen for the geometry of the horizontal pipeline. The computation has been performed by assuming unsteady flow, an immiscible liquid pair, constant liquid properties, co-axial flow and a T-junction as the entry section. The simulation correctly predicts the transition boundaries of wavy stratified to stratified mixed flow. Other transition boundaries are yet to be simulated. Simulated data have been validated with our own experimental results.

Keywords: CFD simulation, flow pattern transition, moderately viscous oil-water flow, prediction of flow transition boundary, VOF technique.

Downloads 4199
1156 Integration of Educational Data Mining Models to a Web-Based Support System for Predicting High School Student Performance

Authors: Sokkhey Phauk, Takeo Okazaki

Abstract:

The challenging task in educational institutions is to maximize the performance of students and minimize the failure rate of poor-performing students. An effective way to approach this task is to understand student learning patterns and their most influential factors and to obtain an early prediction of student learning outcomes in time to set up policies for improvement. Educational data mining (EDM) is an emerging field combining data mining, statistics, and machine learning, concerned with extracting useful knowledge and information for the sake of improvement and development in the education environment. The aim of this work is to propose techniques in EDM and integrate them into a web-based system for predicting poor-performing students. A comparative study of prediction models is conducted. Subsequently, high-performing models are developed to get higher performance. The hybrid random forest (Hybrid RF) produces the most successful classification. For the context of intervention and improving the learning outcomes, a feature selection method, MICHI, which combines the mutual information (MI) and chi-square (CHI) algorithms based on the ranked feature scores, is introduced to select a dominant feature set that improves the performance of prediction; the obtained dominant set is also used as information for intervention. By using the proposed techniques of EDM, an academic performance prediction system (APPS) is subsequently developed for educational stakeholders to get an early prediction of student learning outcomes for timely intervention. Experimental outcomes and evaluation surveys report the effectiveness and usefulness of the developed system. The system is used to help educational stakeholders and related individuals intervene and improve student performance.

Keywords: Academic performance prediction system, prediction model, educational data mining, dominant factors, feature selection methods, student performance.

Downloads 919
1155 Mining User-Generated Contents to Detect Service Failures with Topic Model

Authors: Kyung Bae Park, Sung Ho Ha

Abstract:

Online user-generated contents (UGC) significantly change the way customers behave (e.g., shop, travel), and the pressing need to handle the overwhelming amount of various UGC is one of the paramount issues for management. However, current approaches (e.g., sentiment analysis) are often ineffective for leveraging textual information to detect the problems or issues that a given management suffers from. In this paper, we apply text mining with Latent Dirichlet Allocation (LDA) to a popular online review site dedicated to user complaints. We find that the employed LDA efficiently detects customer complaints, and that a further inspection with a visualization technique is effective for categorizing the problems or issues. As such, management can identify the issues at stake and prioritize them accordingly in a timely manner given the limited amount of resources. The findings provide managerial insights into how analytics on social media can help maintain and improve reputation management. Our interdisciplinary approach also highlights several insights from applying machine learning techniques in the marketing research domain. On a broader technical note, this paper illustrates the details of how to implement LDA in R from beginning (data collection in R) to end (LDA analysis in R), since such instruction is still largely undocumented. In this regard, it will help lower the barrier for interdisciplinary researchers to conduct related research.

Keywords: Latent Dirichlet allocation, R program, text mining, topic model, user generated contents, visualization.

Downloads 1182
1154 Feature Based Unsupervised Intrusion Detection

Authors: Deeman Yousif Mahmood, Mohammed Abdullah Hussein

Abstract:

The goal of a network-based intrusion detection system is to classify network traffic activities into two major categories: normal and attack (intrusive) activities. Nowadays, data mining and machine learning play an important role in many sciences, including intrusion detection systems (IDS), using both supervised and unsupervised techniques. One of the essential steps of data mining is feature selection, which helps improve the efficiency, performance and prediction rate of the proposed approach. This paper applies the unsupervised K-means clustering algorithm with information gain (IG) for feature selection and reduction to build a network intrusion detection system. For our experimental analysis, we used the NSL-KDD dataset, a modified version of the KDDCup 1999 intrusion detection benchmark dataset. With a split of 60.0% for the training set and the remainder for the testing set, a two-class classification (Normal, Attack) was implemented. The Weka framework, a Java-based open-source software package consisting of a collection of machine learning algorithms for data mining tasks, was used in the testing process. The experimental results show that the proposed approach is very accurate, with a low false positive rate and a high true positive rate, and that it takes less learning time than the same algorithm using the full feature set.
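
As a rough illustration of the information gain step mentioned above, the sketch below scores discrete features by IG = H(Y) - H(Y | X) and keeps the top k. It is not the Weka implementation the authors used; the feature names and values are hypothetical, and continuous features would need to be discretized first.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """IG = H(labels) - H(labels | feature) for a discrete feature."""
    groups = defaultdict(list)
    for v, y in zip(feature_values, labels):
        groups[v].append(y)
    n = len(labels)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def select_top_k(features, labels, k):
    """Return the k feature names with the highest information gain."""
    scored = {name: information_gain(vals, labels) for name, vals in features.items()}
    return sorted(scored, key=lambda name: (-scored[name], name))[:k]

# Hypothetical toy records, not taken from NSL-KDD.
labels = ["normal", "attack", "normal", "attack"]
features = {
    "flag": ["SF", "S0", "SF", "S0"],  # separates the two classes perfectly
    "noise": ["a", "a", "b", "b"],     # carries no information about the class
}
selected = select_top_k(features, labels, k=1)
```

The reduced feature set would then be passed to the clustering step, which is where the learning-time saving reported in the abstract would come from.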

Keywords: Information Gain (IG), Intrusion Detection System (IDS), K-means Clustering, Weka.

Downloads 2722
1153 Satellite Sensing for Evaluation of an Irrigation System in Cotton - Wheat Zone

Authors: Sadia Iqbal, Faheem Iqbal, Furqan Iqbal

Abstract:

Efficient utilization of existing water is a pressing need for Pakistan because of its rising population, the reduction in present storage capacity, and the poor delivery efficiency of 30 to 40% from canals. A study was conducted to evaluate an irrigation system in the cotton-wheat zone of Pakistan after watercourse lining. The system was evaluated on the basis of cropping pattern and salinity. The study employed an index-based approach that combines a geographic information system with field data. Satellite images from different years were used to examine the effective area. Several combinations of the ratios of signals received in different spectral bands were used to develop this index. The near-infrared and thermal IR spectral bands proved most effective, as this combination allowed easy detection of the salt-affected area and the cropping pattern of the study area. Results showed that 9.97% of the area was under salinity in 1992 and 9.17% in 2000, falling to 2.29% in 2005. Similarly, 45% of the area was under vegetation in 1992, improving to 56% in 2000 and 65% in 2005. On the basis of these results, performance is evaluated to have increased by 30% after the watercourse improvement.

Keywords: Salinity, remote sensing index, salinity index, cropping pattern.

Downloads 1643
1152 Landscape Pattern Evolution and Optimization Strategy in Wuhan Urban Development Zone, China

Authors: Feng Yue, Fei Dai

Abstract:

With the rapid urbanization of China, environmental protection is under severe pressure. Analyzing and optimizing the landscape pattern is therefore an important measure to ease the pressure on the ecological environment. This paper takes the Wuhan Urban Development Zone as the research object and studies its landscape pattern evolution and a quantitative optimization strategy. First, remote sensing image data from 1990 to 2015 were interpreted using Erdas software. Next, landscape pattern indices at the landscape level, class level, and patch level were studied with Fragstats. Then five ecological environment indicators based on the National Environmental Protection Standard of China were selected to evaluate the impact of landscape pattern evolution on the ecological environment. In addition, the cost distance analysis of ArcGIS was applied to simulate wildlife migration, thus indirectly measuring the improvement of ecological environment quality. The results show that the area of construction land increased by 491%, whereas bare land, sparse grassland, forest, farmland and water decreased by 82%, 47%, 36%, 25% and 11% respectively; they were mainly converted into construction land. At the landscape level, all landscape indices showed a downward trend. Number of patches (NP), Landscape shape index (LSI), Connection index (CONNECT), Shannon's diversity index (SHDI) and Aggregation index (AI) decreased by 2778, 25.7, 0.042, 0.6 and 29.2% respectively, indicating that the NP, the degree of aggregation and the landscape connectivity declined. At the class level, for construction land and forest, CPLAND, TCA, AI and LSI rose, but the Distribution Statistics Core Area (CORE_AM) decreased. For farmland, water, sparse grassland and bare land, CPLAND, TCA, DIVISION, Patch Density (PD) and LSI declined, yet patch fragmentation and CORE_AM increased. At the patch level, the patch area, patch perimeter and shape index of water, farmland and bare land continued to decline. The three indices increased overall for forest patches, decreased as a whole for sparse grassland, and increased for construction land. It is evident that urbanization greatly influenced the landscape evolution. The ecological diversity and landscape heterogeneity of ecological patches clearly dropped. The Habitat Quality Index declined continuously, by 14%. Therefore, an optimization strategy based on greenway network planning is proposed for discussion. This paper contributes to the study of landscape pattern evolution in planning and design and to research on the spatial layout of urbanization.
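
Of the landscape-level metrics listed above, Shannon's diversity index has a compact standard definition, SHDI = -Σ p_i ln p_i, where p_i is the proportion of the landscape occupied by class i. The sketch below computes it from per-class areas; the area figures are hypothetical and are not the Wuhan data.

```python
import math

def shannon_diversity(class_areas):
    """SHDI = -sum(p_i * ln p_i) over land-cover classes with non-zero area."""
    total = sum(class_areas.values())
    proportions = [a / total for a in class_areas.values() if a > 0]
    return -sum(p * math.log(p) for p in proportions)

# Hypothetical areas (any consistent unit) for two landscape states.
even_landscape = {"construction": 25, "forest": 25, "farmland": 25, "water": 25}
urbanized_landscape = {"construction": 85, "forest": 5, "farmland": 5, "water": 5}

shdi_even = shannon_diversity(even_landscape)
shdi_urbanized = shannon_diversity(urbanized_landscape)
```

The index is largest when classes share the landscape evenly and falls as one class dominates, which is consistent with the reported SHDI decline as other land types were converted into construction land.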

Keywords: Landscape pattern, optimization strategy, ArcGIS, Erdas, landscape metrics, landscape architecture.

Downloads 789
1151 A Supervised Learning Data Mining Approach for Object Recognition and Classification in High Resolution Satellite Data

Authors: Mais Nijim, Rama Devi Chennuboyina, Waseem Al Aqqad

Abstract:

Advances in the spatial and spectral resolution of satellite images have led to tremendous growth in large image databases. The data we acquire through satellites, radars, and sensors contain important geographical information that can be used for remote sensing applications such as regional planning and disaster management. Spatial data classification and object recognition are important tasks for many applications. However, classifying and identifying objects manually from images is difficult. Object recognition is often treated as a classification problem, so the task can be performed using machine-learning techniques. Although many machine-learning algorithms exist, the classification here is done using supervised classifiers such as Support Vector Machines (SVM), because the area of interest is known. We propose a classification method that considers neighboring pixels in a region for feature extraction and evaluates classifications according to neighboring classes for semantic interpretation of the region of interest (ROI). A dataset was created for training and testing; we generated the attributes from pixel intensity values and mean reflectance values. We demonstrate the benefits of knowledge discovery and data-mining techniques, which can be applied to image data for accurate information extraction and classification from high spatial resolution remote sensing imagery.

Keywords: Remote sensing, object recognition, classification, data mining, waterbody identification, feature extraction.

Downloads 2018
1150 Numerical Simulation of Heating Characteristics in a Microwave T-Prong Antenna for Cancer Therapy

Authors: M. Chaichanyut, S. Tungjitkusolmun

Abstract:

This research presents microwave (MW) ablation using T-Prong monopole antennas. In the study, three-dimensional (3D) finite-element methods (FEM) were used to analyse the tissue heat flux, temperature distribution (heating pattern) and volume of destruction during MW ablation in liver cancer tissue. Three configurations of T-Prong monopole antennas were considered: the Three T-prong antenna, the Expand T-Prong antenna and the Arrow T-Prong antenna. The 3D FEM solutions were based on the Maxwell and bio-heat equations. The delivered microwave power was 10 W, and the duration of ablation in all cases was 300 s. Our numerical results show that the heat flux and the hotspot occurred at the tip of the T-prong antenna in all cases. The temperature distribution pattern of all antennas was teardrop-shaped. The Arrow T-Prong antenna induced the highest temperature within the cancer tissue. Ablation was considered successful in the region where the temperature exceeded 50°C (i.e., complete destruction). The Expand T-Prong antenna produced the largest volume of completely destroyed liver cancer tissue (6.05 cm³). The ablation pattern, or axial ratio (widest width/length), of the Expand T-Prong antenna and the Arrow T-Prong antenna was 1, whereas the axial ratio of the Three T-prong antenna was about 1.15.

Keywords: Liver cancer, T-Prong antenna, Finite element, Microwave ablation.

Downloads 1386
1149 Post Mining- Discovering Valid Rules from Different Sized Data Sources

Authors: R. Nedunchezhian, K. Anbumani

Abstract:

A big organization may have multiple branches spread across different locations. Processing data from these branches becomes a huge task when innumerable transactions take place. Also, branches may be reluctant to forward their data for centralized processing but willing to pass on their association rules. Local mining may also generate a large number of rules. Further, it is not practical to expect all local data sources to be of the same size. A model is proposed for discovering valid rules from data sources of different sizes, where the valid rules are highly weighted rules. These rules can be obtained from the high-frequency rules generated from each of the data sources. A data source selection procedure is considered in order to synthesize rules efficiently. Support Equalization is another proposed method, which focuses on eliminating low-frequency rules at the local sites themselves, thus reducing the number of rules by a significant amount.
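
The abstract does not give the weighting formula, so the sketch below only illustrates the general idea under an assumption: each source is weighted by its share of the total transaction count, a rule's synthesized support is the weighted sum of its local supports (a rule a source does not report contributes 0), and rules below a threshold are dropped. All names, rules and numbers are hypothetical.

```python
def synthesize_rules(sources, min_weighted_support):
    """Combine local rule supports from sources of different sizes.

    sources: list of (transaction_count, {rule: local_support}) pairs.
    ASSUMPTION: a source's weight is its share of all transactions.
    """
    total = sum(count for count, _ in sources)
    weighted = {}
    for count, rules in sources:
        weight = count / total
        for rule, support in rules.items():
            weighted[rule] = weighted.get(rule, 0.0) + weight * support
    return {r: s for r, s in weighted.items() if s >= min_weighted_support}

# Hypothetical branches: one large, one small.
sources = [
    (1000, {"bread->butter": 0.30, "tea->sugar": 0.05}),
    (100, {"bread->butter": 0.10, "tea->sugar": 0.40}),
]
valid_rules = synthesize_rules(sources, min_weighted_support=0.2)
```

With size-based weights, a rule that is frequent only in the small branch ("tea->sugar") does not survive, whereas an unweighted average of the two branches would have kept it.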

Keywords: Association rules, multiple data stores, synthesizing, valid rules.

Downloads 1368
1148 Economic Effects of Maritime Environmental Legislation in the North and Baltic Sea Area: An Exploratory Sequential Mixed Methods Approach

Authors: Thea Freese

Abstract:

Environmental legislation to protect the North and Baltic Sea areas from harmful vessel-source emissions has received increased political attention in recent years. Legislative measures are expected to show positive effects on the health of the marine environment and society. At the same time, compliance might increase costs to industry and affect freight rates and volumes shipped, with potential negative repercussions on the environment. Building on an exploratory sequential mixed methods approach, this research project will study the economic effects of maritime environmental legislation in two phases. In Phase I, exploratory in-depth interviews were conducted with 12 experts from various stakeholder groups to identify variables influencing the relationship between environmental legislation, freight rates and volumes shipped. Influencing factors such as compliance, enforcement and modal shift were identified and studied. Phase II will comprise a quantitative study conducted to verify the theory built in Phase I and to quantify the economic effects of rules on shipping pollution. Research in this field might inform policy-makers about determinants of the behaviour of ship operators in the face of the law and might further the development of a comprehensive legal system for marine environmental protection. At the present stage of research, first tentative results from the qualitative phase may be examined, and open research questions to be addressed in the quantitative phase, as well as possible research designs for Phase II, may be discussed. Input from other researchers will be highly valuable at this point.

Keywords: Clean shipping operations, compliance, maritime environmental legislation, maritime law and economics, mixed methods research, North and Baltic Sea area.

Downloads 1034
1147 Learning of Class Membership Values by Ellipsoidal Decision Regions

Authors: Leehter Yao, Chin-Chin Lin

Abstract:

A novel method of learning complex fuzzy decision regions in the n-dimensional feature space is proposed. Through the fuzzy decision regions, a given pattern's membership value in every class is determined, instead of the conventional crisp class to which the pattern belongs. The n-dimensional fuzzy decision region is approximated by a union of hyperellipsoids. By explicitly parameterizing these hyperellipsoids, the decision regions are determined by estimating the parameters of each hyperellipsoid. A genetic algorithm (GA) is applied to estimate the parameters of each region component. With the global optimization ability of the GA, the learned decision region can be arbitrarily complex.
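
To make the idea of a decision region built from a union of hyperellipsoids concrete, here is a minimal sketch. It assumes axis-aligned ellipsoids given by a center and semi-axis lengths, and it uses an illustrative membership function (1 minus the smallest quadratic form, clipped at 0). Neither the parameterization nor the membership function is stated in the abstract, and the GA fitting step is not shown.

```python
def quadratic_form(x, center, semi_axes):
    """Sum of ((x_i - c_i) / a_i)^2; a value <= 1 means x lies inside the ellipsoid."""
    return sum(((xi - ci) / ai) ** 2 for xi, ci, ai in zip(x, center, semi_axes))

def in_union(x, ellipsoids):
    """Crisp test: is x inside at least one ellipsoid of the region?"""
    return any(quadratic_form(x, c, a) <= 1.0 for c, a in ellipsoids)

def membership(x, ellipsoids):
    """ASSUMED fuzzy membership: 1 at an ellipsoid center, 0 on or outside every boundary."""
    best = min(quadratic_form(x, c, a) for c, a in ellipsoids)
    return max(0.0, 1.0 - best)

# Hypothetical 2-D region for one class: two ellipsoids as (center, semi_axes).
region = [((0.0, 0.0), (2.0, 1.0)), ((5.0, 0.0), (1.0, 1.0))]

inside = in_union((1.0, 0.0), region)
outside = in_union((3.0, 0.0), region)
value = membership((1.0, 0.0), region)
```

In a setup like this, a genetic algorithm would search over the centers and semi-axes of each ellipsoid, and adding more ellipsoids to the union lets the region take more complex shapes.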

Keywords: Ellipsoid, genetic algorithm, decision regions, classification.

Downloads 1398
1146 Efficient Single Relay Selection Scheme for Cooperative Communication

Authors: Sung-Bok Choi, Hyun-Jun Shin, Hyoung-Kyu Song

Abstract:

This paper proposes a single relay selection scheme for cooperative communication. A decode-and-forward scheme is considered in which a source node wants to cooperate with a single relay for data transmission. To use the proposed single relay selection scheme, the source node generates a slightly different pattern signal, which is not a complex pattern, and broadcasts it. The proposed scheme does not require channel state information between the source node and the relay candidates during relay selection. Therefore, it can be used in many fields.

Keywords: Relay selection, cooperative communication, df, channel codes.

Downloads 1782