Search results for: Data mining and Information Extraction

7681 Probabilistic Approach as a Method Used in the Solution of Engineering Design for Biomechanics and Mining

Abstract:

This paper focuses on the probabilistic numerical solution of problems in biomechanics and mining. Applications of the Simulation-Based Reliability Assessment (SBRA) method are presented for the design of external fixators used in traumatology and orthopaedics (these fixators can be applied to the treatment of open and unstable fractures, etc.) and for the solution of a hard rock (ore) disintegration process (i.e. the bit moves into the ore and subsequently disintegrates it). The results are compared with experiments, and a new design of the excavation tool is proposed.

Keywords: probabilistic approach, engineering design, traumatology, rock mechanics

Downloads: 1479

7680 Study on Extraction of Lanthanum Oxide from Monazite Concentrate

Authors: Nwe Nwe Soe, Lwin Thuzar Shwe, Kay Thi Lwin

Abstract:

Lanthanum oxide is to be recovered from monazite, which contains about 13.44% lanthanum oxide. The principal objective of this study is to extract lanthanum oxide from monazite of the Moemeik Myitsone Area. The treatment of monazite in this study involves three main steps: extraction of lanthanum hydroxide from monazite using caustic soda, digestion with nitric acid and precipitation with ammonium hydroxide, and calcination of lanthanum oxalate to lanthanum oxide.

Keywords: Calcination, Digestion, Precipitation.

Downloads: 4034

7679 Nanofluid-Based Emulsion Liquid Membrane for Selective Extraction and Separation of Dysprosium

Authors: Maliheh Raji, Hossein Abolghasemi, Jaber Safdari, Ali Kargari

Abstract:

Dysprosium is a rare earth element which is essential for many growing high-technology applications. Dysprosium, along with neodymium, plays a significant role in applications such as metal halide lamps, permanent magnets, and the preparation of nuclear reactor control rods. The purification and separation of rare earth elements are challenging because of their similar chemical and physical properties. Among the various methods, membrane processes provide many advantages over conventional separation processes such as ion exchange and solvent extraction. In this work, the selective extraction and separation of dysprosium from aqueous solutions containing an equimolar mixture of dysprosium and neodymium by emulsion liquid membrane (ELM) was investigated. The organic membrane phase of the ELM was a nanofluid consisting of multiwalled carbon nanotubes (MWCNT), Span80 as surfactant, Cyanex 272 as carrier, kerosene as base fluid, and nitric acid solution as internal aqueous phase. Factors affecting the separation of dysprosium, such as carrier concentration, MWCNT concentration, feed phase pH and stripping phase concentration, were analyzed using the Taguchi method. The optimal experimental conditions were obtained using analysis of variance (ANOVA) after 10 min of extraction. Based on the results, using the MWCNT nanofluid in the ELM process increases extraction owing to higher membrane stability and enhanced mass transfer, and a separation factor of 6 for dysprosium over neodymium can be achieved under the optimum conditions. Additionally, demulsification was performed successfully and the membrane phase was reused effectively under the optimum conditions.

Keywords: Emulsion liquid membrane, MWCNT nanofluid, separation, Taguchi Method.

Downloads: 988

7678 Phenolic Compounds and Antimicrobial Properties of Pomegranate (Punica granatum) Peel Extracts

Authors: P. Rahnemoon, M. Sarabi Jamab, M. Javanmard Dakheli, A. Bostan

Abstract:

In recent years, the tendency to use natural antimicrobial agents in the food industry has increased. Pomegranate peels, which contain phenolic compounds and antimicrobial agents, are considered a valuable source for the extraction of these compounds. In this study, pomegranate peel extract was obtained at different ethanol/water ratios (40:60, 60:40, and 80:20), temperatures (25, 40, and 55 °C), and durations (20, 24, and 28 h). The extraction yield, phenolic compounds, flavonoids, and anthocyanins were measured. The antimicrobial activity of the pomegranate peel extracts was determined against food-borne microorganisms such as Salmonella enteritidis, Escherichia coli, Listeria monocytogenes, Staphylococcus aureus, Aspergillus niger, and Saccharomyces cerevisiae by agar diffusion and MIC methods. Results showed that at an ethanol/water ratio of 60:40, 25 °C and 24 h, the maximum amounts of phenolic compounds (349.518 mg gallic acid/g dried extract), flavonoids (250.124 mg rutin/g dried extract), and anthocyanins (252.047 mg cyanidin-3-glucoside/100 g dried extract), as well as the strongest antimicrobial activity, were obtained. All extracts showed antimicrobial activity against every tested microorganism. Staphylococcus aureus showed the highest sensitivity among the tested microorganisms.

Keywords: Antimicrobial agents, phenolic compounds, pomegranate peel, solvent extraction.

Downloads: 1952

7677 Application of Fuzzy Neural Network for Image Tumor Description

Authors: Nahla Ibraheem Jabbar, Monica Mehrotra

Abstract:

This paper uses a fuzzy Kohonen neural network for medical image segmentation. Image segmentation plays an important role in many medical imaging applications by automating or facilitating diagnosis. The paper analyses the tumor by extracting four features (area, entropy, mean and standard deviation). These measurements give a description of the tumor.
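
The four features named in the abstract can be illustrated with a minimal Python sketch. It assumes the segmentation has already produced a binary mask and that entropy means the Shannon entropy of the grey levels inside the region; the function name `region_features` and the nested-list image layout are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def region_features(image, mask):
    """Area, mean, standard deviation and Shannon entropy of the masked pixels.

    `image` and `mask` are equally sized nested lists (assumed layout);
    a truthy mask value marks a pixel as part of the segmented region.
    """
    pixels = [
        image[r][c]
        for r in range(len(image))
        for c in range(len(image[0]))
        if mask[r][c]
    ]
    area = len(pixels)
    if area == 0:
        raise ValueError("mask selects no pixels")

    mean = sum(pixels) / area
    # Population standard deviation of the grey levels in the region.
    std = math.sqrt(sum((p - mean) ** 2 for p in pixels) / area)
    # Shannon entropy (in bits) of the grey-level distribution in the region.
    counts = Counter(pixels)
    entropy = -sum((n / area) * math.log2(n / area) for n in counts.values())
    return {"area": area, "mean": mean, "std": std, "entropy": entropy}
```

How the paper itself defines or normalises these measurements is not stated in the abstract, so the sketch shows only the conventional definitions.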

Keywords: FCM, features extraction, medical image processing, neural network, segmentation.

Downloads: 2109

7676 Region-Based Segmentation of Generic Video Scenes Indexing

Authors: Aree A. Mohammed

Abstract:

In this work we develop an object extraction method and propose efficient algorithms for object motion characterization. The set of proposed tools serves as a basis for the development of object-based functionalities for manipulation of video content. The estimators produced by the different algorithms are compared in terms of quality and performance and tested on real video sequences. The proposed method will be useful for the latest standards for encoding and describing multimedia content: MPEG-4 and MPEG-7.

Keywords: Object extraction, Video indexing, Segmentation, Optical flow, Motion estimators.

Downloads: 1352

7675 Agile Methodology for Modeling and Design of Data Warehouses -AM4DW-

Authors: Nieto Bernal Wilson, Carmona Suarez Edgar

Abstract:

Organizations hold structured and unstructured information in different formats, sources, and systems. Part of this information comes from ERP systems under the OLTP processing that supports the information system; at the OLAP processing level, however, these organizations show some deficiencies. Part of the problem is a lack of interest in extracting knowledge from their data sources, together with the absence of the operational capabilities needed to tackle this kind of project. Data warehouses and their applications are considered non-proprietary tools that are of great interest to business intelligence, since they are the base repositories for creating models or patterns (behavior of customers, suppliers, products, social networks and genomics) and facilitate corporate decision making and research. This paper presents a simple, structured methodology inspired by agile development models such as Scrum, XP and AUP, as well as by object-relational models, spatial data models, and the baseline of data modeling under UML and big data. In this way it seeks to deliver an agile methodology for developing data warehouses that is simple and easy to apply. The methodology naturally takes into account the application of processes for the corresponding information analysis, visualization and data mining, particularly for generating patterns and models derived from the structured fact objects.

Keywords: Data warehouse, model data, big data, object fact, object relational fact, process developed data warehouse.

Downloads: 1478

7674 Hypoglycemic Activity of Water Soluble Polysaccharides of Yam (Dioscorea hispida Dents) Prepared by Aqueous, Papain, and Tempeh Inoculum Assisted Extractions

Authors: Teti Estiasih, Harijono, Weny Bekti Sunarharum, Atina Rahmawati

Abstract:

This research studied the hypoglycemic effect of water soluble polysaccharide (WSP) extracted from yam (Dioscorea hispida) tuber by three different methods: aqueous extraction, papain-assisted extraction, and tempeh inoculum-assisted extraction. The two latter extraction methods were aimed at removing WSP-binding protein to obtain purer WSP. The hypoglycemic activities were evaluated by means of an in vivo test on alloxan-induced hyperglycemic rats, a glucose response test (GRT), an in situ glucose absorption test using everted sac, and short chain fatty acid (SCFA) analysis. All yam WSP extracts exhibited the ability to decrease blood glucose level in the hyperglycemic condition as well as inhibited glucose absorption and SCFA formation. The order of hypoglycemic activity was tempeh inoculum-assisted > papain-assisted > aqueous WSP extracts. GRT and the in situ glucose absorption test showed that the order of inhibition was papain-assisted > tempeh inoculum-assisted > aqueous WSP extracts. The caecal digesta of rats orally fed yam WSP extracts had more SCFA than the control. The tempeh inoculum-assisted WSP extract exhibited the most significant hypoglycemic activity.

Keywords: hypoglycemic activity, papain, tempeh inoculums, water soluble polysaccharides, yam (Dioscorea hispida)

Downloads: 3054

7673 Yield Prediction Using Support Vectors Based Under-Sampling in Semiconductor Process

Authors: Sae-Rom Pak, Seung Hwan Park, Jeong Ho Cho, Daewoong An, Cheong-Sool Park, Jun Seok Kim, Jun-Geol Baek

Abstract:

It is important to predict yield in the semiconductor test process in order to increase yield. In this study, yield prediction means effectively identifying defective dies, wafers or lots. The semiconductor test process consists of several test steps, and each test includes various test items. In other words, the test data are large and complex. They are also imbalanced, as the number of observations belonging to the FAIL class is extremely low. For yield prediction, general data mining techniques are limited without data preprocessing because of these inherent properties of the test data. Therefore, this study proposes an under-sampling method using a support vector machine (SVM) to eliminate the imbalance. To evaluate performance, a random under-sampling method is compared with the proposed method using actual semiconductor test data. The results show that the sampling method using SVM is effective in generating a robust model for yield prediction.
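
The random under-sampling baseline mentioned in the abstract can be sketched in a few lines of Python. This is only the comparison method, not the proposed SVM-based selection, whose details the abstract does not give; the function name `random_undersample`, the `majority_label` parameter and the fixed seed are assumptions made for illustration.

```python
import random


def random_undersample(samples, labels, majority_label, seed=0):
    """Randomly drop majority-class rows until both classes have equal size.

    Baseline only: the majority class is thinned at random, without the
    SVM-guided selection that the abstract proposes.
    """
    rng = random.Random(seed)
    majority = [i for i, y in enumerate(labels) if y == majority_label]
    minority = [i for i, y in enumerate(labels) if y != majority_label]

    # Keep as many majority rows as there are minority rows.
    kept = rng.sample(majority, min(len(minority), len(majority)))
    chosen = sorted(kept + minority)
    return [samples[i] for i in chosen], [labels[i] for i in chosen]
```

With eight PASS rows and two FAIL rows, the call returns two rows of each class, and every FAIL row is retained.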

Keywords: Yield Prediction, Semiconductor Test Process, Support Vector Machine, Under Sampling

Downloads: 2397

7672 Efficient STAKCERT KDD Processes in Worm Detection

Authors: Madihah Mohd Saudi, Andrea J Cullen, Mike E Woodward

Abstract:

This paper presents new STAKCERT KDD processes for worm detection. The enhancement introduced in data preprocessing resulted in the formation of a new STAKCERT model for worm detection. In this paper we explain in detail how all the processes involved in the STAKCERT KDD processes are applied within the STAKCERT model for worm detection. Based on the experiment conducted, the STAKCERT model yielded a 98.13% accuracy rate for worm detection by integrating the STAKCERT KDD processes.

Keywords: data mining, incident response, KDD processes, security metrics and worm detection.

Downloads: 1655

7671 Unsupervised Text Mining Approach to Early Warning System

Authors: Ichihan Tai, Bill Olson, Paul Blessner

Abstract:

Traditional early warning systems that alarm against crisis are generally based on structured or numerical data; therefore, a system that can make predictions based on unstructured textual data, an uncorrelated data source, is a great complement to the traditional early warning systems. The Chicago Board Options Exchange (CBOE) Volatility Index (VIX), commonly referred to as the fear index, measures the cost of insurance against market crash, and spikes in the event of crisis. In this study, news data is consumed for prediction of whether there will be a market-wide crisis by predicting the movement of the fear index, and the historical references to similar events are presented in an unsupervised manner. Topic modeling-based prediction and representation are made based on daily news data between 1990 and 2015 from The Wall Street Journal against VIX index data from CBOE.

Keywords: Early Warning System, Knowledge Management, Topic Modeling, Market Prediction.

Downloads: 1920

7670 Application Methodology for the Generation of 3D Thermal Models Using UAV Photogrammetry and Dual Sensors for Mining/Industrial Facilities Inspection

Authors: Javier Sedano-Cibrián, Julio Manuel de Luis-Ruiz, Rubén Pérez-Álvarez, Raúl Pereda-García, Beatriz Malagón-Picón

Abstract:

Structural inspection activities are necessary to ensure the correct functioning of infrastructures. UAV techniques have become more popular than traditional techniques. Specifically, UAV Photogrammetry allows time and cost savings. The development of this technology has permitted the use of low-cost thermal sensors in UAVs. The representation of 3D thermal models with this type of equipment is in continuous evolution. The direct processing of thermal images usually leads to errors and inaccurate results. In this paper, a methodology is proposed for the generation of 3D thermal models using dual sensors, which involves the application of RGB and thermal images in parallel. Hence, the RGB images are used as the basis for the generation of the model geometry, and the thermal images are the source of the surface temperature information that is projected onto the model. Mining/industrial facilities representations that are obtained can be used for inspection activities.

Keywords: Aerial thermography, data processing, drone, low-cost, point cloud.

Downloads: 341

7669 2D Graphical Analysis of Wastewater Influent Capacity Time Series

Authors: Monika Chuchro, Maciej Dwornik

Abstract:

The extraction of meaningful information from an image could be an alternative method for time series analysis. In this paper, we propose a graphical analysis of time series grouped into a table, with an adjusted colour scale for numerical values. The advantages of this method are also discussed. The proposed method is easy to understand and flexible enough to accommodate standard methods of pattern recognition and verification, especially for noisy environmental data.
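
One way to read the idea of a time series grouped into a table with a colour scale is sketched below in Python. The abstract does not say how the series is grouped or how the scale is adjusted, so the fixed `period`, the linear min-max scaling and both function names are assumptions made for illustration.

```python
def series_to_table(values, period):
    """Split a 1D series into rows of `period` consecutive values.

    An incomplete final row is dropped so that every row has equal length.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    return [
        values[i:i + period]
        for i in range(0, len(values) - period + 1, period)
    ]


def to_colour_levels(table, levels=5):
    """Map each value to an integer colour bin in 0 .. levels - 1.

    Linear min-max scaling over the whole table (assumed choice).
    """
    flat = [v for row in table for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1  # avoid division by zero for a constant series
    return [
        [min(int((v - lo) / span * levels), levels - 1) for v in row]
        for row in table
    ]
```

With hourly data and `period=24`, each row would hold one day, so a daily pattern would appear as a vertical band of similar colour bins.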

Keywords: graphical analysis, time series, seasonality, noisy environmental data

Downloads: 1450

7668 A Self Supervised Bi-directional Neural Network (BDSONN) Architecture for Object Extraction Guided by Beta Activation Function and Adaptive Fuzzy Context Sensitive Thresholding

Authors: Siddhartha Bhattacharyya, Paramartha Dutta, Ujjwal Maulik, Prashanta Kumar Nandi

Abstract:

A multilayer self organizing neural network (MLSONN) architecture for binary object extraction, guided by a beta activation function and characterized by backpropagation of errors estimated from the linear indices of fuzziness of the network output states, is discussed. Since the MLSONN architecture is designed to operate in a single point fixed/uniform thresholding scenario, it does not take into cognizance the heterogeneity of image information in the extraction process. The performance of the MLSONN architecture with representative values of the threshold parameters of the beta activation function employed is also studied. A three layer bidirectional self organizing neural network (BDSONN) architecture comprising fully connected neurons, for the extraction of objects from a noisy background and capable of incorporating the underlying image context heterogeneity through variable and adaptive thresholding, is proposed in this article. The input layer of the network architecture represents the fuzzy membership information of the image scene to be extracted. The second layer (the intermediate layer) and the final layer (the output layer) of the network architecture deal with the self supervised object extraction task by bi-directional propagation of the network states. Each layer except the output layer is connected to the next layer following a neighborhood based topology. The output layer neurons are, in turn, connected to the intermediate layer following a similar topology, thus forming a counter-propagating architecture with the intermediate layer. The novelty of the proposed architecture is that the assignment/updating of the inter-layer connection weights is done using the relative fuzzy membership values at the constituent neurons in the different network layers. Another interesting feature of the network lies in the fact that the processing capabilities of the intermediate and the output layer neurons are guided by a beta activation function, which uses image context sensitive adaptive thresholding arising out of the fuzzy cardinality estimates of the different network neighborhood fuzzy subsets, rather than resorting to fixed and single point thresholding. An application of the proposed architecture for object extraction is demonstrated using a synthetic and a real life image. The extraction efficiency of the proposed network architecture is evaluated by a proposed system transfer index characteristic of the network.

Keywords: Beta activation function, fuzzy cardinality, multilayer self organizing neural network, object extraction.

Downloads: 1565

7667 Discovery of Sequential Patterns Based On Constraint Patterns

Authors: Shigeaki Sakurai, Youichi Kitahata, Ryohei Orihara

Abstract:

This paper proposes a method that discovers sequential patterns corresponding to a user's interests from sequential data. This method expresses the interests as constraint patterns. The constraint patterns can define relationships among attributes of the items composing the data. The method recursively decomposes the constraint patterns into constraint subpatterns. The method evaluates the constraint subpatterns in order to efficiently discover sequential patterns satisfying the constraint patterns. This paper also applies the method to sequential data composed of stock price indexes and verifies its effectiveness by comparing it with a method that does not use the constraint patterns.

Keywords: Sequential pattern mining, Constraint pattern, Attribute constraint, Stock price indexes

Downloads: 1423

7666 Total and Leachable Concentration of Trace Elements in Soil towards Human Health Risk, Related with Coal Mine in Jorong, South Kalimantan, Indonesia

Authors: Arie Pujiwati, Kengo Nakamura, Noriaki Watanabe, Takeshi Komai

Abstract:

Coal mining is well known to cause considerable environmental impacts, including trace element contamination of soil. This study aimed to assess the trace element (As, Cd, Co, Cu, Ni, Pb, Sb, and Zn) contamination of soil in the vicinity of coal mining activities, using the case study of Asam-asam River basin, South Kalimantan, Indonesia, and to assess the human health risk, incorporating total and bioavailable (water-leachable and acid-leachable) concentrations. The results show the enrichment of As and Co in soil, surpassing the background soil value. Contamination was evaluated based on the index of geo-accumulation, I_geo, and the pollution index, PI. I_geo values showed that the soil was generally uncontaminated (I_geo ≤ 0), except for elevated As and Co. Mean PI for Ni and Cu indicated slight contamination. Regarding the assessment of health risks, the Hazard Index, HI, showed adverse risks (HI > 1) for Ni, Co, and As. Further, Ni and As were found to pose unacceptable carcinogenic risk (risk > 1 × 10^-5). Farming, settlement, and plantation were found to present greater risk than coal mines. These results show that coal mining activity in the study area contaminates the soils with particular elements and may pose a potential human health risk in its surrounding area. This study is important for setting appropriate countermeasure actions and improving basic coal mining management in Indonesia.
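
The two contamination indices named in the abstract can be computed as in the Python sketch below. It assumes the commonly used definitions, I_geo = log2(C / (1.5 × B)) and PI = C / B, where C is the measured concentration and B the background (or reference) value; the abstract does not state which reference values the authors used, and the function names are hypothetical.

```python
import math


def geo_accumulation_index(concentration, background):
    """I_geo = log2(C / (1.5 * B)); values <= 0 are read as uncontaminated."""
    return math.log2(concentration / (1.5 * background))


def pollution_index(concentration, background):
    """Single-element pollution index PI = C / B (assumed definition)."""
    return concentration / background
```

For example, a concentration equal to 1.5 times the background gives I_geo = 0, the boundary of the uncontaminated class quoted in the abstract. These inputs are illustrative only and are not data from the study.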

Keywords: Coal mine, risk, soil, trace elements.

Downloads: 1175

7665 Laser Data Based Automatic Generation of Lane-Level Road Map for Intelligent Vehicles

Authors: Zehai Yu, Hui Zhu, Linglong Lin, Huawei Liang, Biao Yu, Weixin Huang

Abstract:

With the development of intelligent vehicle systems, a high-precision road map is increasingly needed in many aspects. The automatic lane lines extraction and modeling are the most essential steps for the generation of a precise lane-level road map. In this paper, an automatic lane-level road map generation system is proposed. To extract the road markings on the ground, the multi-region Otsu thresholding method is applied, which calculates the intensity value of laser data that maximizes the variance between background and road markings. The extracted road marking points are then projected to the raster image and clustered using a two-stage clustering algorithm. Lane lines are subsequently recognized from these clusters by the shape features of their minimum bounding rectangle. To ensure the storage efficiency of the map, the lane lines are approximated to cubic polynomial curves using a Bayesian estimation approach. The proposed lane-level road map generation system has been tested on urban and expressway conditions in Hefei, China. The experimental results on the datasets show that our method can achieve excellent extraction and clustering effect, and the fitted lines can reach a high position accuracy with an error of less than 10 cm.
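
The thresholding step can be illustrated with a plain single-region Otsu threshold in Python. This is a simplified sketch: the paper applies a multi-region variant to laser intensity data, whereas the function below (the name `otsu_threshold` and the assumption of integer intensities in 0..255 are illustrative) treats all samples as one region.

```python
def otsu_threshold(values, levels=256):
    """Return the threshold t that maximises between-class variance.

    `values` are integer intensities in [0, levels). Samples <= t form one
    class (e.g. background) and samples > t the other (e.g. road markings).
    """
    hist = [0] * levels
    for v in values:
        hist[v] += 1

    total = len(values)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0    # number of samples at or below the candidate threshold
    sum0 = 0  # sum of their intensities
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so the split is undefined
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance (constant factor dropped).
        score = w0 * w1 * (m0 - m1) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t
```

A multi-region version would presumably run this per region of the scan, but how the regions are defined is not described in the abstract.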

Keywords: Curve fitting, lane-level road map, line recognition, multi-thresholding, two-stage clustering.

Downloads: 512

7664 The Power of Indigenous Peoples in Decision-Making Processes of Mining Projects: The Pilbara Region

Authors: K. N. Penna, J. P. English

Abstract:

The destruction of the Juukan Gorge rock shelters in 2020 has catalysed impetus within Australian society for a significant change in engagement with Indigenous Peoples, and the approach to Indigenous cultural heritage, both within the Pilbara region and more broadly across Australia. Culture-based and people-centred approaches are inherent to inclusive sustainable development and Free, Prior, Informed Consent, outcomes encouraged by international and local recommendations on the human rights and cultural heritage preservation of Indigenous peoples. In this paper, we present an interpretive model of an evolved process for mining project development, incorporating culture-based and people-centred approaches, based on the Theory U system change method. The evolved process advocates a change in organisational mindset and culture, and a comprehensive understanding of Indigenous Peoples’ culture and values, as the foundations for increasing their influence and achieving mutually beneficial developments.

Keywords: Indigenous Engagement, mining industry, culture-based approach, people-centred approach, Theory U.

Downloads: 437

7663 EDULOGIC+ - Knowledge Management through Data Analysis in Education

Authors: Alok Sharma, Dr. Harvinder S. Saini, Raviteja Tiruvury

Abstract:

This paper outlines the application of Knowledge Management (KM) principles in the context of Educational institutions. The paper caters to the needs of the engineering institutions for imparting quality education by delineating the instruction delivery process in a highly structured, controlled and quantified manner. This is done using a software tool EDULOGIC+. The central idea has been based on the engineering education pattern in Indian Universities/ Institutions. The data, contents and results produced over contiguous years build the necessary ground for managing the related accumulated knowledge. Application of KM has been explained using certain examples of data analysis and knowledge extraction.

Keywords: Education software system, information system, knowledge management.

Downloads: 1753

7662 Growing Self Organising Map Based Exploratory Analysis of Text Data

Authors: Sumith Matharage, Damminda Alahakoon

Abstract:

Textual data plays an important role in the modern world. The possibilities of applying data mining techniques to uncover hidden information present in large volumes of text collections are immense. The Growing Self Organizing Map (GSOM) is a highly successful member of the Self Organising Map family and has been used as a clustering and visualisation tool across a wide range of disciplines to discover hidden patterns present in the data. A comprehensive analysis of the GSOM’s capabilities as a text clustering and visualisation tool has so far not been published. These functionalities, namely map visualisation capabilities, automatic cluster identification and hierarchical clustering capabilities, are presented in this paper and are further demonstrated with experiments on a benchmark text corpus.

Keywords: Text Clustering, Growing Self Organizing Map, Automatic Cluster Identification, Hierarchical Clustering.

Downloads: 1996

7661 Entropy Based Data Hiding for Document Images

Authors: Swetha Kurup, Sridhar G., Sridhar V.

Abstract:

In this paper we present a novel technique for data hiding in binary document images. We use the concept of entropy to identify document-specific least distortive areas throughout the binary document image. The document image is treated as any other image, and the proposed method utilizes the standard document characteristics for the embedding process. The proposed method minimizes perceptual distortion due to embedding and allows watermark extraction without requiring any side information at the decoder end.

Keywords: Entropy, Steganography, Watermarking.

Downloads: 1530

7660 Web Data Scraping Technology Using Term Frequency Inverse Document Frequency to Enhance the Big Data Quality on Sentiment Analysis

Authors: Sangita Pokhrel, Nalinda Somasiri, Rebecca Jeyavadhanam, Swathi Ganesan

Abstract:

Tourism is a booming industry with huge future potential for global wealth and employment. Countless data are generated on social media sites every day, creating numerous opportunities to bring more insights to decision-makers. The integration of big data technology into the tourism industry will allow companies to determine where their customers have been and what they like. This information can then be used by businesses, such as those in charge of managing visitor centres or hotels, and tourists can get a clear idea of places before visiting. From a technical perspective, natural language is processed by analysing the sentiment features of online reviews from tourists, and we then provide an enhanced long short-term memory (LSTM) framework for sentiment feature extraction of travel reviews. We constructed a web review database using a crawler and web scraping technique for experimental validation to evaluate the effectiveness of our methodology. The review sentences were first classified with the VADER and RoBERTa models to obtain the polarity of the reviews. In this paper, we studied feature extraction methods such as Count Vectorization and Term Frequency-Inverse Document Frequency (TFIDF) Vectorization, and implemented a Convolutional Neural Network (CNN) classifier for sentiment analysis to decide whether the tourist’s attitude towards the destination is positive, negative, or simply neutral, based on the review text posted online. The results demonstrated that, after pre-processing and cleaning the dataset, the CNN algorithm achieved an accuracy of 96.12% for the positive and negative sentiment analysis.
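
The TFIDF weighting mentioned in the abstract can be sketched in plain Python. The version below uses the textbook form, term frequency multiplied by log(N / document frequency), with whitespace tokenisation; library implementations often use smoothed variants, and the abstract does not say which form the authors used. The function name `tf_idf` and the sample reviews in the note are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    weight = (count of term in doc / doc length) * log(N / docs containing term)
    """
    tokenised = [doc.lower().split() for doc in documents]
    n_docs = len(tokenised)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenised for term in set(tokens))

    weights = []
    for tokens in tokenised:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights
```

For the two reviews "great hotel great view" and "poor hotel", the word "hotel" occurs in both documents and therefore receives weight zero, while "great" and "poor" keep positive weights.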

Keywords: Counter vectorization, Convolutional Neural Network, Crawler, data technology, Long Short-Term Memory, LSTM, Web Scraping, sentiment analysis.

Downloads: 175

7659 An Attribute-Centre Based Decision Tree Classification Algorithm

Authors: Gökhan Silahtaroğlu

Abstract:

Decision tree algorithms have a very important place among the classification models of data mining. In the literature, algorithms use the entropy concept or the Gini index to form the tree. The shape of the classes and their closeness to each other are some of the factors that affect the performance of the algorithm. In this paper we introduce a new decision tree algorithm which employs a data (attribute) folding method and the variation of the class variables over the branches to be created. A comparative performance analysis has been carried out between the proposed algorithm and C4.5.
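
For reference, the two conventional split criteria that the abstract contrasts with its own approach can be written as follows in Python. This sketch covers only entropy and the Gini index; the proposed attribute-centre method is not described in enough detail to reproduce. The function names are illustrative.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (bits) of the class labels at a node."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of the class labels at a node."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def weighted_impurity(groups, impurity):
    """Size-weighted impurity of the child nodes produced by a split."""
    total = sum(len(g) for g in groups)
    return sum(len(g) / total * impurity(g) for g in groups if g)
```

A candidate split is usually scored by the drop from the parent's impurity to the weighted impurity of its children; a split that separates the classes perfectly scores zero under both measures.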

Keywords: Classification, decision tree, split, pruning, entropy, gini.

Downloads: 1369

7658 The Benefits of End-To-End Integrated Planning from the Mine to Client Supply for Minimizing Penalties

Authors: G. Martino, F. Silva, E. Marchal

Abstract:

Control over the characteristics of the delivered iron ore blend is one of the most important aspects of the mining business. The iron ore price is a function of its composition, which is the outcome of the beneficiation process. End-to-end integrated planning of mine operations can therefore reduce the risk of penalties on the iron ore price. In a standard iron mining company, the production chain is composed of mining, ore beneficiation, and client supply. When mine planning and client supply decisions are made without coordination, the beneficiation plant struggles to deliver the best blend possible. Technological improvements in several fields have made it possible to bridge the gap between departments and boost integrated decision-making processes. Clusterization and classification algorithms over historical production data generate reasonable predictions of the quality and volume of iron ore produced for each pile of run-of-mine (ROM) processed. Mathematical modeling can use those deterministic relations to propose iron ore blends that better fit specifications within a delivery schedule. Additionally, a model capable of representing the whole production chain can clearly compare the overall impact of different decisions in the process. This study shows how flexibilization combined with a planning optimization model between the mine and the ore beneficiation processes can reduce the risk of out-of-specification deliveries. The model capabilities are illustrated on a hypothetical iron ore mine with a magnetic separation process. Finally, this study shows ways of reducing cost or increasing profit by optimizing process indicators across the production chain and integrating the different plans with the sales decisions.

Keywords: Clusterization and classification algorithms, integrated planning, optimization, mathematical modeling, penalty minimization.

Downloads 645

7657 Automatic Road Network Recognition and Extraction for Urban Planning

Authors: D. B. L. Bong, K. C. Lai, A. Joseph

Abstract:

Road maps have numerous uses in daily activities, but constructing and updating a road map whenever changes occur is a hassle. At Universiti Malaysia Sarawak, research on Automatic Road Extraction (ARE) was undertaken to address the difficulty of updating road maps. The research started with Satellite Images (SI), in what is called the ARE-SI project. A Hybrid Simple Colour Space Segmentation & Edge Detection (Hybrid SCSS-EDGE) algorithm was developed to extract roads automatically from satellite images. In order to extract the road network accurately, the satellite image must be analyzed prior to the extraction process. The characteristics of these elements are analyzed, and the relationships among them are then determined. In this study, the road regions are extracted based on colour space elements and edge details of roads. In addition, an edge detection method is applied to further filter out non-road regions. The extracted road regions are validated using a segmentation method. These results are valuable for building road maps and for detecting changes to an existing road database. The proposed Hybrid SCSS-EDGE algorithm performs these tasks fully automatically: the user only needs to input a high-resolution satellite image and wait for the result. Moreover, the system can handle complex road networks and generate the extraction result in seconds.

Keywords: Road Network Recognition, Colour Space, Edge Detection, Urban Planning.

Downloads 2994

7656 Combining Bagging and Boosting

Authors: S. B. Kotsiantis, P. E. Pintelas

Abstract:

Bagging and boosting are among the most popular resampling ensemble methods that generate and combine a diversity of classifiers using the same learning algorithm for the base classifiers. Boosting algorithms are considered stronger than bagging on noise-free data. However, there are strong empirical indications that bagging is much more robust than boosting in noisy settings. For this reason, in this work we built an ensemble that applies a voting methodology to bagging and boosting ensembles with 10 sub-classifiers each. We compared it with simple bagging and boosting ensembles of 25 sub-classifiers, as well as with other well-known combining methods, on standard benchmark datasets, and the proposed technique was the most accurate.
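The combination described in this abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes 1-D decision stumps as the base classifier, an AdaBoost-style reweighting for the boosting ensemble, and a sum of the two ensembles' vote scores as the voting rule. The function names and the toy data are hypothetical.

```python
import math
import random

def fit_stump(xs, ys, weights):
    """Fit a weighted 1-D decision stump; labels are -1 or +1."""
    best = None
    for threshold in sorted(set(xs)):
        for sign in (1, -1):
            # Weighted error of the rule "predict `sign` when x >= threshold".
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if (sign if x >= threshold else -sign) != y)
            if best is None or err < best[0]:
                best = (err, threshold, sign)
    return best  # (weighted error, threshold, sign)

def stump_predict(stump, x):
    _, threshold, sign = stump
    return sign if x >= threshold else -sign

def bagging_score(xs, ys, n_models, rng):
    """Train stumps on bootstrap resamples; the score is their mean vote in [-1, 1]."""
    n = len(xs)
    stumps = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx],
                                [ys[i] for i in idx],
                                [1.0 / n] * n))
    return lambda x: sum(stump_predict(s, x) for s in stumps) / n_models

def boosting_score(xs, ys, n_models):
    """AdaBoost-style ensemble; the score is the alpha-weighted vote in [-1, 1]."""
    n = len(xs)
    weights = [1.0 / n] * n
    rounds = []
    for _ in range(n_models):
        stump = fit_stump(xs, ys, weights)
        # Clamp the error so alpha stays finite when a stump is perfect.
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        rounds.append((alpha, stump))
        # Increase the weight of misclassified points, then renormalise.
        weights = [w * math.exp(-alpha * y * stump_predict(stump, x))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    alpha_sum = sum(a for a, _ in rounds) or 1.0
    return lambda x: sum(a * stump_predict(s, x) for a, s in rounds) / alpha_sum

def combined_predict(scorers, x):
    """Vote across ensembles by summing their scores (assumed voting rule)."""
    return 1 if sum(score(x) for score in scorers) >= 0 else -1

# Toy, noise-free data: the label flips from -1 to +1 at x = 10.
xs = list(range(20))
ys = [-1 if x < 10 else 1 for x in xs]
rng = random.Random(0)
scorers = [bagging_score(xs, ys, 10, rng), boosting_score(xs, ys, 10)]
predictions = [combined_predict(scorers, x) for x in xs]
```

On this clean toy set both ensembles agree, so the sketch only shows the structure of the combination; the robustness differences mentioned in the abstract would appear only on noisy data.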

Keywords: data mining, machine learning, pattern recognition.

Downloads 2562

7655 An Advanced Time-Frequency Domain Method for PD Extraction with Non-Intrusive Measurement

Authors: Guomin Luo, Daming Zhang, Yong Kwee Koh, Kim Teck Ng, Helmi Kurniawan, Weng Hoe Leong

Abstract:

Partial discharge (PD) detection is an important method for evaluating the insulation condition of metal-clad apparatus. Non-intrusive sensors, which are easy to install and do not interrupt operation, are preferred for on-site PD detection. However, such detection often lacks accuracy because of interference in the PD signals. This paper introduces a novel PD extraction method that uses frequency analysis and entropy-based time-frequency (TF) analysis. The repetitive pulses from the converter are first removed via frequency analysis. Then the relative entropy and relative peak frequency of each pulse (i.e., its time-indexed vector TF spectrum) are calculated, and all pulses with similar parameters are grouped. According to the characteristics of the non-intrusive sensor and the frequency distribution of PDs, the PD pulses and the interference pulses are separated. Finally, the PD signal and the interference are recovered via the inverse TF transform. The de-noised result for noisy PD data demonstrates that the combination of frequency and time-frequency techniques can discriminate PDs from interference with various frequency distributions.

Keywords: Entropy, Fourier analysis, non-intrusive measurement, time-frequency analysis, partial discharge

Downloads 1589

7654 Application of a New Hybrid Optimization Algorithm on Cluster Analysis

Authors: T. Niknam, M. Nayeripour, B. Bahmani Firouzi

Abstract:

Clustering techniques have received attention in many areas, including engineering, medicine, biology and data mining. The purpose of clustering is to group together data points that are close to one another. The K-means algorithm is one of the most widely used clustering techniques. However, K-means has two shortcomings: it depends on the initial state and converges to local optima, and global solutions of large problems cannot be found with a reasonable amount of computational effort. Many studies on clustering have been carried out to overcome the local optima problem. This paper presents an efficient hybrid evolutionary optimization algorithm that combines Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO), called PSO-ACO, for optimally clustering N objects into K clusters. The new PSO-ACO algorithm is tested on several data sets, and its performance is compared with that of ACO, PSO and K-means clustering. The simulation results show that the proposed evolutionary optimization algorithm is robust and suitable for handling data clustering.
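The dependence of K-means on its initial state can be seen on a very small example. The sketch below is a plain K-means loop, not the PSO-ACO algorithm from the paper; the function names, the four sample points and the two starting configurations are assumptions chosen for illustration.

```python
def dist2(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def kmeans(points, init_centroids, max_iters=100):
    """Run K-means from the given starting centroids; return (centroids, SSE)."""
    centroids = [tuple(c) for c in init_centroids]
    for _ in range(max_iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda j: dist2(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        updated = [mean(c) if c else centroids[j]
                   for j, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    sse = sum(min(dist2(p, c) for c in centroids) for p in points)
    return centroids, sse

# Four corners of a 4 x 1 rectangle, to be split into K = 2 clusters.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]
_, sse_poor = kmeans(points, [(2.0, 0.0), (2.0, 1.0)])  # splits bottom/top
_, sse_good = kmeans(points, [(0.0, 0.5), (4.0, 0.5)])  # splits left/right
print(sse_poor, sse_good)
```

Both runs stop at a fixed point, but the first start leaves the sum of squared errors at 16.0 while the second reaches 1.0, which is the kind of local optimum that global search methods aim to escape.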

Keywords: Ant Colony Optimization (ACO), Data clustering, Hybrid evolutionary optimization algorithm, K-means clustering, Particle Swarm Optimization (PSO).

Downloads 2198

7653 Comparative Study of Universities’ Web Structure Mining

Authors: Z. Abdullah, A. R. Hamdan

Abstract:

This paper analyzes the ranking of the website of University of Malaysia Terengganu (UMT) on the World Wide Web. Little research has been done on comparing the rankings of universities’ websites, so this study helps determine whether the existing UMT website serves its purpose, which is to introduce UMT to the world. The ranking is based on hub and authority values, which depend on the structure of the website. These values are computed using two web-searching algorithms, HITS and SALSA. The websites of three other universities, UM, Harvard and Stanford, are used as benchmarks. The results clearly show that more work has to be done on the existing UMT website, since pages that are important according to the benchmarks do not exist among UMT’s pages. The ranking of UMT’s website will act as a guideline for web developers in building a more efficient website.
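Hub and authority values of the kind mentioned here can be computed with a short power iteration. The sketch below covers the basic HITS update only (SALSA is not shown); the tiny link graph and its page names are hypothetical and do not come from the paper.

```python
import math

def normalise(scores):
    """Scale a score dictionary to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {node: v / norm for node, v in scores.items()}

def hits(links, iterations=50):
    """Basic HITS: `links` maps each page to the list of pages it links to."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority of a page: sum of the hub scores of pages linking to it.
        auth = {n: 0.0 for n in nodes}
        for source, targets in links.items():
            for target in targets:
                auth[target] += hub[source]
        auth = normalise(auth)
        # Hub score of a page: sum of the authority scores of pages it links to.
        hub = normalise({n: sum(auth[t] for t in links.get(n, []))
                         for n in nodes})
    return hub, auth

# Hypothetical site structure: two index-like pages pointing at content pages.
links = {
    "home": ["about", "research"],
    "news": ["research"],
}
hub, auth = hits(links)
print(sorted(auth, key=auth.get, reverse=True))
```

In this toy graph "research" receives links from both index pages and ends up with the highest authority, while "home" links to the most authoritative pages and ends up as the strongest hub.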

Keywords: Algorithm, ranking, website, web structure mining.

Downloads 1667

7652 Time Series Regression with Meta-Clusters

Authors: Monika Chuchro

Abstract:

This paper presents a preliminary attempt to apply classification of time series using meta-clusters in order to improve the quality of regression models. Clustering was performed to obtain normally distributed subgroups of time series data from wastewater treatment plant inflow data, which is composed of several groups differing in mean value. Two simple algorithms, K-means and EM, were chosen as clustering methods. The Rand index was used to measure similarity. After simple meta-clustering, a regression model was fitted for each subgroup. The final model was the sum of the subgroup models. The quality of the obtained model was compared with a regression model built using the same explanatory variables but without clustering the data. Results were compared using the coefficient of determination (R²), a measure of prediction accuracy, the mean absolute percentage error (MAPE), and a comparison on a line chart. Preliminary results allow us to foresee the potential of the presented technique.
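The two numeric quality measures named in this abstract follow standard definitions, which can be written out as below. This is a sketch of those textbook formulas only; the sample values are made up for illustration and are not results from the paper.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    ss_tot = sum((y - mean_actual) ** 2 for y in actual)
    return 1.0 - ss_res / ss_tot

def mape(actual, predicted):
    """Mean absolute percentage error, in percent; actual values must be non-zero."""
    n = len(actual)
    return 100.0 / n * sum(abs((y - p) / y) for y, p in zip(actual, predicted))

# Made-up inflow values and model predictions.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
print(round(r_squared(actual, predicted), 4), round(mape(actual, predicted), 2))
```

A higher R² and a lower MAPE both indicate a better fit, so comparing a clustered model against an unclustered one amounts to computing both measures for each model on the same data.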

Keywords: Clustering, Data analysis, Data mining, Predictive models.

Downloads 1951