Search results for: data mining techniques
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 28987

Search results for: data mining techniques

28267 Application of a Modified Crank-Nicolson Method in Metallurgy

Authors: Kobamelo Mashaba

Abstract:

The molten slag has a high substantial temperatures range between 1723-1923, carrying a huge amount of useful energy for reducing energy consumption and CO₂ emissions under the heat recovery process. Therefore in this study, we investigated the performance of the modified crank Nicolson method for a delayed partial differential equation on the heat recovery of molten slag in the metallurgical mining environment. It was proved that the proposed method converges quickly compared to the classic method with the existence of a unique solution. It was inferred from numerical result that the proposed methodology is more viable and profitable for the mining industry.

Keywords: delayed partial differential equation, modified Crank-Nicolson Method, molten slag, heat recovery, parabolic equation

Procedia PDF Downloads 86
28266 The Quality of the Presentation Influence the Audience Perceptions

Authors: Gilang Maulana, Dhika Rahma Qomariah, Yasin Fadil

Abstract:

Purpose: This research meant to measure the magnitude of the influence of the quality of the presentation to the targeted audience perception in catching information presentation. Design/Methodology/Approach: This research uses a quantitative research method. The kind of data that uses in this research is the primary data. The population in this research are students the economics faculty of Semarang State University. The sampling techniques uses in this research is purposive sampling. The retrieving data uses questionnaire on 30 respondents. The data analysis uses descriptive analysis. Result: The quality of presentation influential positive against perception of the audience. This proved that the more qualified presentation will increase the perception of the audience. Limitation: Respondents were limited to only 30 people.

Keywords: quality of presentation, presentation, audience, perception, semarang state university

Procedia PDF Downloads 366
28265 Effective Supply Chain Coordination with Hybrid Demand Forecasting Techniques

Authors: Gurmail Singh

Abstract:

Effective supply chain is the main priority of every organization which is the outcome of strategic corporate investments with deliberate management action. Value-driven supply chain is defined through development, procurement and by configuring the appropriate resources, metrics and processes. However, responsiveness of the supply chain can be improved by proper coordination. So the Bullwhip effect (BWE) and Net stock amplification (NSAmp) values were anticipated and used for the control of inventory in organizations by both discrete wavelet transform-Artificial neural network (DWT-ANN) and Adaptive Network-based fuzzy inference system (ANFIS). This work presents a comparative methodology of forecasting for the customers demand which is non linear in nature for a multilevel supply chain structure using hybrid techniques such as Artificial intelligence techniques including Artificial neural networks (ANN) and Adaptive Network-based fuzzy inference system (ANFIS) and Discrete wavelet theory (DWT). The productiveness of these forecasting models are shown by computing the data from real world problems for Bullwhip effect and Net stock amplification. The results showed that these parameters were comparatively less in case of discrete wavelet transform-Artificial neural network (DWT-ANN) model and using Adaptive network-based fuzzy inference system (ANFIS).

Keywords: bullwhip effect, hybrid techniques, net stock amplification, supply chain flexibility

Procedia PDF Downloads 105
28264 Simulation of a Cost Model Response Requests for Replication in Data Grid Environment

Authors: Kaddi Mohammed, A. Benatiallah, D. Benatiallah

Abstract:

Data grid is a technology that has full emergence of new challenges, such as the heterogeneity and availability of various resources and geographically distributed, fast data access, minimizing latency and fault tolerance. Researchers interested in this technology address the problems of the various systems related to the industry such as task scheduling, load balancing and replication. The latter is an effective solution to achieve good performance in terms of data access and grid resources and better availability of data cost. In a system with duplication, a coherence protocol is used to impose some degree of synchronization between the various copies and impose some order on updates. In this project, we present an approach for placing replicas to minimize the cost of response of requests to read or write, and we implement our model in a simulation environment. The placement techniques are based on a cost model which depends on several factors, such as bandwidth, data size and storage nodes.

Keywords: response time, query, consistency, bandwidth, storage capacity, CERN

Procedia PDF Downloads 253
28263 Evaluation of Ultrasonic Techniques for the Estimation of Air Voids in Asphalt Concrete

Authors: Majid Zargar, Frank Bullen, Ron Ayers

Abstract:

One of the important factors in the design of asphalt concrete mixes is the accurate measurement of air voids and their variable distribution. Both can have significant impact on long and short term fatigue and creep behaviour under traffic. While some simple methods exist for overall evaluation of air voids, measuring air void distribution in asphalt concrete is very complex, involving expensive techniques such as X-ray methodologies. The research reported in the paper investigated the use of non-destructive ultrasonic techniques as an alternative to estimate the amount of air voids and their distribution within asphalt samples. Seventy-four Standard AC–14 asphalt samples made with three types of bitumen; Multigrade, PMB and C320 were analysed using ultrasonic techniques. The results have illustrated that ultrasonic testing has the potential of being a rapid, accurate and cost-effective method of estimating air void distribution in asphalt.

Keywords: asphalt concrete, air voids, ultrasonic, mechanical behaviour

Procedia PDF Downloads 331
28262 A Comparative Study on Sampling Techniques of Polynomial Regression Model Based Stochastic Free Vibration of Composite Plates

Authors: S. Dey, T. Mukhopadhyay, S. Adhikari

Abstract:

This paper presents an exhaustive comparative investigation on sampling techniques of polynomial regression model based stochastic natural frequency of composite plates. Both individual and combined variations of input parameters are considered to map the computational time and accuracy of each modelling techniques. The finite element formulation of composites is capable to deal with both correlated and uncorrelated random input variables such as fibre parameters and material properties. The results obtained by Polynomial regression (PR) using different sampling techniques are compared. Depending on the suitability of sampling techniques such as 2k Factorial designs, Central composite design, A-Optimal design, I-Optimal, D-Optimal, Taguchi’s orthogonal array design, Box-Behnken design, Latin hypercube sampling, sobol sequence are illustrated. Statistical analysis of the first three natural frequencies is presented to compare the results and its performance.

Keywords: composite plate, natural frequency, polynomial regression model, sampling technique, uncertainty quantification

Procedia PDF Downloads 495
28261 Hybridized Approach for Distance Estimation Using K-Means Clustering

Authors: Ritu Vashistha, Jitender Kumar

Abstract:

Clustering using the K-means algorithm is a very common way to understand and analyze the obtained output data. When a similar object is grouped, this is called the basis of Clustering. There is K number of objects and C number of cluster in to single cluster in which k is always supposed to be less than C having each cluster to be its own centroid but the major problem is how is identify the cluster is correct based on the data. Formulation of the cluster is not a regular task for every tuple of row record or entity but it is done by an iterative process. Each and every record, tuple, entity is checked and examined and similarity dissimilarity is examined. So this iterative process seems to be very lengthy and unable to give optimal output for the cluster and time taken to find the cluster. To overcome the drawback challenge, we are proposing a formula to find the clusters at the run time, so this approach can give us optimal results. The proposed approach uses the Euclidian distance formula as well melanosis to find the minimum distance between slots as technically we called clusters and the same approach we have also applied to Ant Colony Optimization(ACO) algorithm, which results in the production of two and multi-dimensional matrix.

Keywords: ant colony optimization, data clustering, centroids, data mining, k-means

Procedia PDF Downloads 112
28260 COVID_ICU_BERT: A Fine-Tuned Language Model for COVID-19 Intensive Care Unit Clinical Notes

Authors: Shahad Nagoor, Lucy Hederman, Kevin Koidl, Annalina Caputo

Abstract:

Doctors’ notes reflect their impressions, attitudes, clinical sense, and opinions about patients’ conditions and progress, and other information that is essential for doctors’ daily clinical decisions. Despite their value, clinical notes are insufficiently researched within the language processing community. Automatically extracting information from unstructured text data is known to be a difficult task as opposed to dealing with structured information such as vital physiological signs, images, and laboratory results. The aim of this research is to investigate how Natural Language Processing (NLP) techniques and machine learning techniques applied to clinician notes can assist in doctors’ decision-making in Intensive Care Unit (ICU) for coronavirus disease 2019 (COVID-19) patients. The hypothesis is that clinical outcomes like survival or mortality can be useful in influencing the judgement of clinical sentiment in ICU clinical notes. This paper introduces two contributions: first, we introduce COVID_ICU_BERT, a fine-tuned version of clinical transformer models that can reliably predict clinical sentiment for notes of COVID patients in the ICU. We train the model on clinical notes for COVID-19 patients, a type of notes that were not previously seen by clinicalBERT, and Bio_Discharge_Summary_BERT. The model, which was based on clinicalBERT achieves higher predictive accuracy (Acc 93.33%, AUC 0.98, and precision 0.96 ). Second, we perform data augmentation using clinical contextual word embedding that is based on a pre-trained clinical model to balance the samples in each class in the data (survived vs. deceased patients). Data augmentation improves the accuracy of prediction slightly (Acc 96.67%, AUC 0.98, and precision 0.92 ).

Keywords: BERT fine-tuning, clinical sentiment, COVID-19, data augmentation

Procedia PDF Downloads 183
28259 Exploring Influence Range of Tainan City Using Electronic Toll Collection Big Data

Authors: Chen Chou, Feng-Tyan Lin

Abstract:

Big Data has been attracted a lot of attentions in many fields for analyzing research issues based on a large number of maternal data. Electronic Toll Collection (ETC) is one of Intelligent Transportation System (ITS) applications in Taiwan, used to record starting point, end point, distance and travel time of vehicle on the national freeway. This study, taking advantage of ETC big data, combined with urban planning theory, attempts to explore various phenomena of inter-city transportation activities. ETC, one of government's open data, is numerous, complete and quick-update. One may recall that living area has been delimited with location, population, area and subjective consciousness. However, these factors cannot appropriately reflect what people’s movement path is in daily life. In this study, the concept of "Living Area" is replaced by "Influence Range" to show dynamic and variation with time and purposes of activities. This study uses data mining with Python and Excel, and visualizes the number of trips with GIS to explore influence range of Tainan city and the purpose of trips, and discuss living area delimited in current. It dialogues between the concepts of "Central Place Theory" and "Living Area", presents the new point of view, integrates the application of big data, urban planning and transportation. The finding will be valuable for resource allocation and land apportionment of spatial planning.

Keywords: Big Data, ITS, influence range, living area, central place theory, visualization

Procedia PDF Downloads 263
28258 Information Needs and Information Usage of the Older Person Club’s Members in Bangkok

Authors: Siriporn Poolsuwan

Abstract:

This research aims to explore the information needs, information usages, and problems of information usage of the older people club’s members in Dusit District, Bangkok. There are 12 clubs and 746 club’s members in this district. The research results use for older person service in this district. Data is gathered from 252 club’s members by using questionnaires. The quantitative approach uses in research by percentage, means and standard deviation. The results are as follows (1) The older people need Information for entertainment, occupation and academic in the field of short story, computer work, and religion and morality. (2) The participants use Information from various sources. (3) The Problem of information usage is their language skills because of the older people’s literacy problem.

Keywords: information behavior, older person, information seeking, knowledge discovery and data mining

Procedia PDF Downloads 254
28257 A Methodology for Developing New Technology Ideas to Avoid Patent Infringement: F-Term Based Patent Analysis

Authors: Kisik Song, Sungjoo Lee

Abstract:

With the growing importance of intangible assets recently, the impact of patent infringement on the business of a company has become more evident. Accordingly, it is essential for firms to estimate the risk of patent infringement risk before developing a technology and create new technology ideas to avoid the risk. Recognizing the needs, several attempts have been made to help develop new technology opportunities and most of them have focused on identifying emerging vacant technologies from patent analysis. In these studies, the IPC (International Patent Classification) system or keywords from text-mining application to patent documents was generally used to define vacant technologies. Unlike those studies, this study adopted F-term, which classifies patent documents according to the technical features of the inventions described in them. Since the technical features are analyzed by various perspectives by F-term, F-term provides more detailed information about technologies compared to IPC while more systematic information compared to keywords. Therefore, if well utilized, it can be a useful guideline to create a new technology idea. Recognizing the potential of F-term, this paper aims to suggest a novel approach to developing new technology ideas to avoid patent infringement based on F-term. For this purpose, we firstly collected data about F-term and then applied text-mining to the descriptions about classification criteria and attributes. From the text-mining results, we could identify other technologies with similar technical features of the existing one, the patented technology. Finally, we compare the technologies and extract the technical features that are commonly used in other technologies but have not been used in the existing one. These features are presented in terms of “purpose”, “function”, “structure”, “material”, “method”, “processing and operation procedure” and “control means” and so are useful for creating new technology ideas that help avoid infringing patent rights of other companies. Theoretically, this is one of the earliest attempts to adopt F-term to patent analysis; the proposed methodology can show how to best take advantage of F-term with the wealth of technical information. In practice, the proposed methodology can be valuable in the ideation process for successful product and service innovation without infringing the patents of other companies.

Keywords: patent infringement, new technology ideas, patent analysis, F-term

Procedia PDF Downloads 254
28256 Clay Palm Press: A Technique of Hand Building in Ceramics for Developing Conceptual Forms

Authors: Okewu E. Jonathan

Abstract:

There are several techniques of production in the field of ceramics. These different techniques overtime have been categorised under three methods of production which includes; casting, throwing and hand building. Hand building method of production is further broken down into other techniques and they include coiling, slabbing and pinching. Ceramic artists find the different hand building techniques to be very interesting, practicable and rewarding. This has encouraged ceramic artist in their various studios at different levels to experiment for further hand building techniques that could be unique and unusual. The art of “Clay Palm Press” is a development from studio experiment in a quest for uniqueness in conceptual ceramic practise. Clay palm press is a technique that requires no formal tutelage but at the same time, it is not easily comprehensible when viewed. It is a practice of putting semi-solid clay in the palm and inserting a closed fist pressure so as to take the imprint of the human palm. This clay production from the palm when dried, fired and explored into an art, work reveals an absolute awesomeness of what the palm imprint could result in.

Keywords: ceramics, clay palm press, conceptual forms, hand building, technique

Procedia PDF Downloads 257
28255 Characterization of Gamma Irradiated PVDF and PVDF/Graphene Oxide Composites by Spectroscopic Techniques

Authors: Juliana V. Pereira, Adriana S. M. Batista, Jefferson P. Nascimento, Clascídia A. Furtado, Luiz O. Faria

Abstract:

The combination of the properties of graphene oxide (OG) and PVDF homopolymer makes their combined composite materials as multifunctional systems with great potential. Knowledge of the molecular structure is essential for better use. In this work, the degradation of PVDF polymer exposed to gamma irradiation in oxygen atmosphere in high dose rate has been studied and compared to degradation of PVDF/OG composites. The samples were irradiated with a Co-60 source at constant dose rate, with doses ranging from 100 kGy to 1,000 kGy. In FTIR data shown that the formation of oxidation products was at the both samples with formation of carbonyl and hydroxyl groups amongst the most prevalent products in the pure PVDF samples. In the other hand, the composites samples exhibit less presence of degradation products with predominant formation of carbonyl groups, these results also seen in the UV-Vis analysis. The results show that the samples of composites may have greater resistance to the irradiation process, since they have less degradation products than pure PVDF samples seen by spectroscopic techniques.

Keywords: gamma irradiation, PVDF, PVDF/OG composites, spectroscopic techniques

Procedia PDF Downloads 557
28254 Credit Card Fraud Detection with Ensemble Model: A Meta-Heuristic Approach

Authors: Gong Zhilin, Jing Yang, Jian Yin

Abstract:

The purpose of this paper is to develop a novel system for credit card fraud detection based on sequential modeling of data using hybrid deep learning models. The projected model encapsulates five major phases are pre-processing, imbalance-data handling, feature extraction, optimal feature selection, and fraud detection with an ensemble classifier. The collected raw data (input) is pre-processed to enhance the quality of the data through alleviation of the missing data, noisy data as well as null values. The pre-processed data are class imbalanced in nature, and therefore they are handled effectively with the K-means clustering-based SMOTE model. From the balanced class data, the most relevant features like improved Principal Component Analysis (PCA), statistical features (mean, median, standard deviation) and higher-order statistical features (skewness and kurtosis). Among the extracted features, the most optimal features are selected with the Self-improved Arithmetic Optimization Algorithm (SI-AOA). This SI-AOA model is the conceptual improvement of the standard Arithmetic Optimization Algorithm. The deep learning models like Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), and optimized Quantum Deep Neural Network (QDNN). The LSTM and CNN are trained with the extracted optimal features. The outcomes from LSTM and CNN will enter as input to optimized QDNN that provides the final detection outcome. Since the QDNN is the ultimate detector, its weight function is fine-tuned with the Self-improved Arithmetic Optimization Algorithm (SI-AOA).

Keywords: credit card, data mining, fraud detection, money transactions

Procedia PDF Downloads 113
28253 The Use of Methods and Techniques of Drama Education with Kindergarten Teachers

Authors: Vladimira Hornackova, Jana Kottasova, Zuzana Vanova, Anna Jungrova

Abstract:

Present study deals with drama education in preschool education. The research made in this field brings a qualitative comparative survey with the aim to find out the use of methods and techniques of drama education in preschool education at university or secondary school graduate preschool teachers. The research uses a content analysis and an unstandardized questionnaire for preschool teachers and obtained data are processed with the help of descriptive methods and correlations. The results allow a comparison of aspects applied through drama in preschool education. The research brings impulses for education improvement in kindergartens and inspiration for university study programs of drama education in the professional training of preschool teachers.

Keywords: drama education, preschool education, preschool teacher, research

Procedia PDF Downloads 349
28252 Destruction of Coastal Wetlands in Harper City-Liberia: Setting Nature against the Future Society

Authors: Richard Adu Antwako

Abstract:

Coastal wetland destruction and its consequences have recently taken the center stage of global discussions. This phenomenon is no gray area to humanity as coastal wetland-human interaction seems inevitably ingrained in the earliest civilizations, amidst the demanding use of its resources to meet their necessities. The severity of coastal wetland destruction parallels with growing civilizations, and it is against this backdrop that, this paper interrogated the causes of coastal wetland destruction in Harper City in Liberia, compared the degree of coastal wetland stressors to the non-equilibrium thermodynamic scale as well as suggested an integrated coastal zone management to address the problems. Literature complemented the primary data gleaned via global positioning system devices, field observation, questionnaire, and interviews. Multi-sampling techniques were used to generate data from the sand miners, institutional heads, fisherfolk, community-based groups, and other stakeholders. Non-equilibrium thermodynamic theory remains vibrant in discerning the ecological stability, and it would be employed to further understand the coastal wetland destruction in Harper City, Liberia and to measure the coastal wetland stresses-amplitude and elasticity. The non-equilibrium thermodynamics postulates that the coastal wetlands are capable of assimilating resources (inputs), as well as discharging products (outputs). However, the input-output relationship exceedingly stretches beyond the thresholds of the coastal wetlands, leading to coastal wetland disequilibrium. Findings revealed that the sand mining, mangrove removal, and crude dumping have transformed the coastal wetlands, resulting in water pollution, flooding, habitat loss and disfigured beaches in Harper City in Liberia. This paper demonstrates that the coastal wetlands are converted into developmental projects and agricultural fields, thus, endangering the future society against nature.

Keywords: amplitude, crude dumping, elasticity, non-equilibrium thermodynamics, wetland destruction

Procedia PDF Downloads 124
28251 Development of Typical Meteorological Year for Passive Cooling Applications Using World Weather Data

Authors: Nasser A. Al-Azri

Abstract:

The effectiveness of passive cooling techniques is assessed based on bioclimatic charts that require the typical meteorological year (TMY) for a specified location for their development. However, TMYs are not always available; mainly due to the scarcity of records of solar radiation which is an essential component used in developing common TMYs intended for general uses. Since solar radiation is not required in the development of the bioclimatic chart, this work suggests developing TMYs based solely on the relevant parameters. This approach improves the accuracy of the developed TMY since only the relevant parameters are considered and it also makes the development of the TMY more accessible since solar radiation data are not used. The presented paper will also discuss the development of the TMY from the raw data available at the NOAA-NCDC archive of world weather data and the construction of the bioclimatic charts for some randomly selected locations around the world.

Keywords: bioclimatic charts, passive cooling, TMY, weather data

Procedia PDF Downloads 222
28250 Timely Detection and Identification of Abnormalities for Process Monitoring

Authors: Hyun-Woo Cho

Abstract:

The detection and identification of multivariate manufacturing processes are quite important in order to maintain good product quality. Unusual behaviors or events encountered during its operation can have a serious impact on the process and product quality. Thus they should be detected and identified as soon as possible. This paper focused on the efficient representation of process measurement data in detecting and identifying abnormalities. This qualitative method is effective in representing fault patterns of process data. In addition, it is quite sensitive to measurement noise so that reliable outcomes can be obtained. To evaluate its performance a simulation process was utilized, and the effect of adopting linear and nonlinear methods in the detection and identification was tested with different simulation data. It has shown that the use of a nonlinear technique produced more satisfactory and more robust results for the simulation data sets. This monitoring framework can help operating personnel to detect the occurrence of process abnormalities and identify their assignable causes in an on-line or real-time basis.

Keywords: detection, monitoring, identification, measurement data, multivariate techniques

Procedia PDF Downloads 220
28249 Analysis of Erosion Quantity on Application of Conservation Techniques in Ci Liwung Hulu Watershed

Authors: Zaenal Mutaqin

Abstract:

The level of erosion that occurs in the upsteam watersheed will lead to limited infiltrattion, land degradation and river trivialisation and estuaries in the body. One of the watesheed that has been degraded caused by using land is the DA Ci Liwung Upstream. The high degradation that occurs in the DA Ci Liwung upstream is indicated by the hugher rate of erosion on the region, especially in the area of agriculture. In this case, agriculture cultivation intent to the agricultural land that has been applied conservation techniques. This study is applied to determine the quantity of erosion by reviewing Hidrologic Response Unit (HRU) in agricuktural cultivation land which is contained in DA Ci Liwung upstream by using the Soil and Water Assessmen Tool (SWAT). Conservation techniques applied are terracing, agroforestry and gulud terrace. It was concluded that agroforestry conservation techniques show the best value of erosion (lowest) compared with other conservation techniques with the contribution of erosion of 25.22 tonnes/ha/year. The results of the calibration between the discharge flow models with the observation that R²=0.9014 and NS=0.79 indicates that this model is acceptable and feasible applied to the Ci Liwung Hulu watershed.

Keywords: conservation, erosion, SWAT analysis, watersheed

Procedia PDF Downloads 273
28248 An Information-Based Approach for Preference Method in Multi-Attribute Decision Making

Authors: Serhat Tuzun, Tufan Demirel

Abstract:

Multi-Criteria Decision Making (MCDM) is the modelling of real-life to solve problems we encounter. It is a discipline that aids decision makers who are faced with conflicting alternatives to make an optimal decision. MCDM problems can be classified into two main categories: Multi-Attribute Decision Making (MADM) and Multi-Objective Decision Making (MODM), based on the different purposes and different data types. Although various MADM techniques were developed for the problems encountered, their methodology is limited in modelling real-life. Moreover, objective results are hard to obtain, and the findings are generally derived from subjective data. Although, new and modified techniques are developed by presenting new approaches such as fuzzy logic; comprehensive techniques, even though they are better in modelling real-life, could not find a place in real world applications for being hard to apply due to its complex structure. These constraints restrict the development of MADM. This study aims to conduct a comprehensive analysis of preference methods in MADM and propose an approach based on information. For this purpose, a detailed literature review has been conducted, current approaches with their advantages and disadvantages have been analyzed. Then, the approach has been introduced. In this approach, performance values of the criteria are calculated in two steps: first by determining the distribution of each attribute and standardizing them, then calculating the information of each attribute as informational energy.

Keywords: literature review, multi-attribute decision making, operations research, preference method, informational energy

Procedia PDF Downloads 203
28247 Information Communication Technology Based Road Traffic Accidents’ Identification, and Related Smart Solution Utilizing Big Data

Authors: Ghulam Haider Haidaree, Nsenda Lukumwena

Abstract:

Today the world of research enjoys abundant data, available in virtually any field, technology, science, and business, politics, etc. This is commonly referred to as big data. This offers a great deal of precision and accuracy, supportive of an in-depth look at any decision-making process. When and if well used, Big Data affords its users with the opportunity to produce substantially well supported and good results. This paper leans extensively on big data to investigate possible smart solutions to urban mobility and related issues, namely road traffic accidents, its casualties, and fatalities based on multiple factors, including age, gender, location occurrences of accidents, etc. Multiple technologies were used in combination to produce an Information Communication Technology (ICT) based solution with embedded technology. Those technologies include principally Geographic Information System (GIS), Orange Data Mining Software, Bayesian Statistics, to name a few. The study uses the Leeds accident 2016 to illustrate the thinking process and extracts thereof a model that can be tested, evaluated, and replicated. The authors optimistically believe that the proposed model will significantly and smartly help to flatten the curve of road traffic accidents in the fast-growing population densities, which increases considerably motor-based mobility.

Keywords: accident factors, geographic information system, information communication technology, mobility

Procedia PDF Downloads 196
28246 Improved Classification Procedure for Imbalanced and Overlapped Situations

Authors: Hankyu Lee, Seoung Bum Kim

Abstract:

The issue with imbalance and overlapping in the class distribution becomes important in various applications of data mining. The imbalanced dataset is a special case in classification problems in which the number of observations of one class (i.e., major class) heavily exceeds the number of observations of the other class (i.e., minor class). Overlapped dataset is the case where many observations are shared together between the two classes. Imbalanced and overlapped data can be frequently found in many real examples including fraud and abuse patients in healthcare, quality prediction in manufacturing, text classification, oil spill detection, remote sensing, and so on. The class imbalance and overlap problem is the challenging issue because this situation degrades the performance of most of the standard classification algorithms. In this study, we propose a classification procedure that can effectively handle imbalanced and overlapped datasets by splitting data space into three parts: nonoverlapping, light overlapping, and severe overlapping and applying the classification algorithm in each part. These three parts were determined based on the Hausdorff distance and the margin of the modified support vector machine. An experiments study was conducted to examine the properties of the proposed method and compared it with other classification algorithms. The results showed that the proposed method outperformed the competitors under various imbalanced and overlapped situations. Moreover, the applicability of the proposed method was demonstrated through the experiment with real data.

Keywords: classification, imbalanced data with class overlap, split data space, support vector machine

Procedia PDF Downloads 293
28245 Hybrid Approach for Software Defect Prediction Using Machine Learning with Optimization Technique

Authors: C. Manjula, Lilly Florence

Abstract:

Software technology is developing rapidly which leads to the growth of various industries. Now-a-days, software-based applications have been adopted widely for business purposes. For any software industry, development of reliable software is becoming a challenging task because a faulty software module may be harmful for the growth of industry and business. Hence there is a need to develop techniques which can be used for early prediction of software defects. Due to complexities in manual prediction, automated software defect prediction techniques have been introduced. These techniques are based on the pattern learning from the previous software versions and finding the defects in the current version. These techniques have attracted researchers due to their significant impact on industrial growth by identifying the bugs in software. Based on this, several researches have been carried out but achieving desirable defect prediction performance is still a challenging task. To address this issue, here we present a machine learning based hybrid technique for software defect prediction. First of all, Genetic Algorithm (GA) is presented where an improved fitness function is used for better optimization of features in data sets. Later, these features are processed through Decision Tree (DT) classification model. Finally, an experimental study is presented where results from the proposed GA-DT based hybrid approach is compared with those from the DT classification technique. The results show that the proposed hybrid approach achieves better classification accuracy.

Keywords: decision tree, genetic algorithm, machine learning, software defect prediction

Procedia PDF Downloads 314
28244 Semi-Automatic Method to Assist Expert for Association Rules Validation

Authors: Amdouni Hamida, Gammoudi Mohamed Mohsen

Abstract:

In order to help the expert to validate association rules extracted from data, some quality measures are proposed in the literature. We distinguish two categories: objective and subjective measures. The first one depends on a fixed threshold and on data quality from which the rules are extracted. The second one consists on providing to the expert some tools in the objective to explore and visualize rules during the evaluation step. However, the number of extracted rules to validate remains high. Thus, the manually mining rules task is very hard. To solve this problem, we propose, in this paper, a semi-automatic method to assist the expert during the association rule's validation. Our method uses rule-based classification as follow: (i) We transform association rules into classification rules (classifiers), (ii) We use the generated classifiers for data classification. (iii) We visualize association rules with their quality classification to give an idea to the expert and to assist him during validation process.

Keywords: association rules, rule-based classification, classification quality, validation

Procedia PDF Downloads 418
28243 The Reduction of Post-Blast Fumes to Improve Productivity and Safety: A Review Paper

Authors: Nhleko Monique Chiloane

Abstract:

The gold mining industry has predominantly used ammonium nitrate fuel oil (ANFO) explosives for decades, although these are known to be “gassier” and their detonation results in toxic fumes, for example, carbon monoxide (CO), nitrogen oxides (NOx) and ammonia. Re-entry into underground workings too soon after blasting can lead to fatal exposure to toxic fumes. It is, therefore, required that the polluted air be removed from the affected areas within a reasonable period before employees' re-entry into the working area. Post-blast re-entry times have therefore been described as a productivity bottleneck. The known causes of post-blast fumes are water ingress, incorrect fuel to oxygen ratio, confinement, explosive additives etc. To prevent or minimize post-blast fumes, some researchers have used neutralization, re-burning technique and non-explosive products or different oxidizing agents. The use of commercial explosives without nitrate oxidizing agents can also minimize the production of blasting fumes and thereby reduce the time needed for the clearance of these fumes to allow workers to re-enter the underground workings safely. The reduction in non-production time directly contributes to an increase in the available time per shift for productive work, thus leading to continuous mining. However, owing to its low cost and ease of use, ANFO is still widely used in South African underground blasting operations.

Keywords: post-blast fumes, continuous mining, ammonium nitrate explosive, non-explosive blasting, re-entry period

Procedia PDF Downloads 167
28242 Water End-Use Classification with Contemporaneous Water-Energy Data and Deep Learning Network

Authors: Khoi A. Nguyen, Rodney A. Stewart, Hong Zhang

Abstract:

‘Water-related energy’ is energy use which is directly or indirectly influenced by changes to water use. Informatics applying a range of mathematical, statistical and rule-based approaches can be used to reveal important information on demand from the available data provided at second, minute or hourly intervals. This study aims to combine these two concepts to improve the current water end use disaggregation problem through applying a wide range of most advanced pattern recognition techniques to analyse the concurrent high-resolution water-energy consumption data. The obtained results have shown that recognition accuracies of all end-uses have significantly increased, especially for mechanised categories, including clothes washer, dishwasher and evaporative air cooler where over 95% of events were correctly classified.

Keywords: deep learning network, smart metering, water end use, water-energy data

Procedia PDF Downloads 287
28241 Improved Network Construction Methods Based on Virtual Rails for Mobile Sensor Network

Authors: Noritaka Shigei, Kazuto Matsumoto, Yoshiki Nakashima, Hiromi Miyajima

Abstract:

Although Mobile Wireless Sensor Networks (MWSNs), which consist of mobile sensor nodes (MSNs), can cover a wide range of observation region by using a small number of sensor nodes, they need to construct a network to collect the sensing data on the base station by moving the MSNs. As an effective method, the network construction method based on Virtual Rails (VRs), which is referred to as VR method, has been proposed. In this paper, we propose two types of effective techniques for the VR method. They can prolong the operation time of the network, which is limited by the battery capabilities of MSNs and the energy consumption of MSNs. The first technique, an effective arrangement of VRs, almost equalizes the number of MSNs belonging to each VR. The second technique, an adaptive movement method of MSNs, takes into account the residual energy of battery. In the simulation, we demonstrate that each technique can improve the network lifetime and the combination of both techniques is the most effective.

Keywords: mobile sensor node, relay of sensing data, residual energy, virtual rail, wireless sensor network

Procedia PDF Downloads 313
28240 Cross-Validation of the Data Obtained for ω-6 Linoleic and ω-3 α-Linolenic Acids Concentration of Hemp Oil Using Jackknife and Bootstrap Resampling

Authors: Vibha Devi, Shabina Khanam

Abstract:

Hemp (Cannabis sativa) possesses a rich content of ω-6 linoleic and ω-3 linolenic essential fatty acid in the ratio of 3:1, which is a rare and most desired ratio that enhances the quality of hemp oil. These components are beneficial for the development of cell and body growth, strengthen the immune system, possess anti-inflammatory action, lowering the risk of heart problem owing to its anti-clotting property and a remedy for arthritis and various disorders. The present study employs supercritical fluid extraction (SFE) approach on hemp seed at various conditions of parameters; temperature (40 - 80) °C, pressure (200 - 350) bar, flow rate (5 - 15) g/min, particle size (0.430 - 1.015) mm and amount of co-solvent (0 - 10) % of solvent flow rate through central composite design (CCD). CCD suggested 32 sets of experiments, which was carried out. As SFE process includes large number of variables, the present study recommends the application of resampling techniques for cross-validation of the obtained data. Cross-validation refits the model on each data to achieve the information regarding the error, variability, deviation etc. Bootstrap and jackknife are the most popular resampling techniques, which create a large number of data through resampling from the original dataset and analyze these data to check the validity of the obtained data. Jackknife resampling is based on the eliminating one observation from the original sample of size N without replacement. For jackknife resampling, the sample size is 31 (eliminating one observation), which is repeated by 32 times. Bootstrap is the frequently used statistical approach for estimating the sampling distribution of an estimator by resampling with replacement from the original sample. For bootstrap resampling, the sample size is 32, which was repeated by 100 times. Estimands for these resampling techniques are considered as mean, standard deviation, variation coefficient and standard error of the mean. For ω-6 linoleic acid concentration, mean value was approx. 58.5 for both resampling methods, which is the average (central value) of the sample mean of all data points. Similarly, for ω-3 linoleic acid concentration, mean was observed as 22.5 through both resampling. Variance exhibits the spread out of the data from its mean. Greater value of variance exhibits the large range of output data, which is 18 for ω-6 linoleic acid (ranging from 48.85 to 63.66 %) and 6 for ω-3 linoleic acid (ranging from 16.71 to 26.2 %). Further, low value of standard deviation (approx. 1 %), low standard error of the mean (< 0.8) and low variance coefficient (< 0.2) reflect the accuracy of the sample for prediction. All the estimator value of variance coefficients, standard deviation and standard error of the mean are found within the 95 % of confidence interval.

Keywords: resampling, supercritical fluid extraction, hemp oil, cross-validation

Procedia PDF Downloads 126
28239 Comparing ITV Definitions From 4D CT-PET and Breath-Hold Technique with Abdominal Compression

Authors: R. D. Esposito, P. Dorado Rodriguez, D. Planes Meseguer

Abstract:

In this work, we compare the contour of Internal Target Volume (ITV), for Stereotactic Body Radiation Therapy (SBRT) of a patient affected by a single liver metastasis, obtained from two different patient data acquisition techniques. The first technique consists in a free breathing Computer Tomography (CT) scan acquisition, followed by exhalation breath-hold and inhalation breath-hold CT scans, all of them applying abdominal compression while the second technique consists in a free breathing 4D CT-PET (Positron Emission Tomography) scan. Results obtained with these two methods are consistent, which demonstrate that at least for this specific case, both techniques are adequate for ITV contouring in SBRT treatments.

Keywords: 4D CT-PET, abdominal compression, ITV, SBRT

Procedia PDF Downloads 429
28238 A Survey on Compression Methods for Table Constraints

Authors: N. Gharbi

Abstract:

Constraint Satisfaction problems are mathematical problems that are often used to model many real-world problems for which we look if there exists a solution satisfying all its constraints. Table constraints are important for modeling parts of many problems since they list all combinations of allowed or forbidden values. However, they admit practical limitations because they are sometimes too large to be represented in a direct way. In this paper, we present a survey of the different categories of the proposed approaches to compress table constraints in order to reduce both space and time complexities.

Keywords: constraint programming, compression, data mining, table constraints

Procedia PDF Downloads 307