Search results for: Data mining andInformation Extraction
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8050

Search results for: Data mining andInformation Extraction

7840 On Pattern-Based Programming towards the Discovery of Frequent Patterns

Authors: Kittisak Kerdprasop, Nittaya Kerdprasop

Abstract:

The problem of frequent pattern discovery is defined as the process of searching for patterns such as sets of features or items that appear in data frequently. Finding such frequent patterns has become an important data mining task because it reveals associations, correlations, and many other interesting relationships hidden in a database. Most of the proposed frequent pattern mining algorithms have been implemented with imperative programming languages. Such paradigm is inefficient when set of patterns is large and the frequent pattern is long. We suggest a high-level declarative style of programming apply to the problem of frequent pattern discovery. We consider two languages: Haskell and Prolog. Our intuitive idea is that the problem of finding frequent patterns should be efficiently and concisely implemented via a declarative paradigm since pattern matching is a fundamental feature supported by most functional languages and Prolog. Our frequent pattern mining implementation using the Haskell and Prolog languages confirms our hypothesis about conciseness of the program. The comparative performance studies on line-of-code, speed and memory usage of declarative versus imperative programming have been reported in the paper.

Keywords: Frequent pattern mining, functional programming, pattern matching, logic programming.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1296
7839 Web Traffic Mining using Neural Networks

Authors: Farhad F. Yusifov

Abstract:

With the explosive growth of data available on the Internet, personalization of this information space become a necessity. At present time with the rapid increasing popularity of the WWW, Websites are playing a crucial role to convey knowledge and information to the end users. Discovering hidden and meaningful information about Web users usage patterns is critical to determine effective marketing strategies to optimize the Web server usage for accommodating future growth. The task of mining useful information becomes more challenging when the Web traffic volume is enormous and keeps on growing. In this paper, we propose a intelligent model to discover and analyze useful knowledge from the available Web log data.

Keywords: Clustering, Self organizing map, Web log files, Web traffic.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1565
7838 Idiopathic Constipation can be Subdivided in Clinical Subtypes: Data Mining by Cluster Analysis on a Population based Study

Authors: Mauro Giacomini, Stefania Bertone, Carlo Mansi, Pietro Dulbecco, Vincenzo Savarino

Abstract:

The prevalence of non organic constipation differs from country to country and the reliability of the estimate rates is uncertain. Moreover, the clinical relevance of subdividing the heterogeneous functional constipation disorders into pre-defined subgroups is largely unknown.. Aim: to estimate the prevalence of constipation in a population-based sample and determine whether clinical subgroups can be identified. An age and gender stratified sample population from 5 Italian cities was evaluated using a previously validated questionnaire. Data mining by cluster analysis was used to determine constipation subgroups. Results: 1,500 complete interviews were obtained from 2,083 contacted households (72%). Self-reported constipation correlated poorly with symptombased constipation found in 496 subjects (33.1%). Cluster analysis identified four constipation subgroups which correlated to subgroups identified according to pre-defined symptom criteria. Significant differences in socio-demographics and lifestyle were observed among subgroups.

Keywords: Cluster analysis, constipation, data mining, statistical analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1264
7837 A Decision Support System for Predicting Hospitalization of Hemodialysis Patients

Authors: Jinn-Yi Yeh, Tai-Hsi Wu

Abstract:

Hemodialysis patients might suffer from unhealthy care behaviors or long-term dialysis treatments. Ultimately they need to be hospitalized. If the hospitalization rate of a hemodialysis center is high, its quality of service would be low. Therefore, how to decrease hospitalization rate is a crucial problem for health care. In this study we combined temporal abstraction with data mining techniques for analyzing the dialysis patients' biochemical data to develop a decision support system. The mined temporal patterns are helpful for clinicians to predict hospitalization of hemodialysis patients and to suggest them some treatments immediately to avoid hospitalization.

Keywords: Hemodialysis, Temporal abstract, Data mining, Healthcare quality.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1694
7836 Determining Cluster Boundaries Using Particle Swarm Optimization

Authors: Anurag Sharma, Christian W. Omlin

Abstract:

Self-organizing map (SOM) is a well known data reduction technique used in data mining. Data visualization can reveal structure in data sets that is otherwise hard to detect from raw data alone. However, interpretation through visual inspection is prone to errors and can be very tedious. There are several techniques for the automatic detection of clusters of code vectors found by SOMs, but they generally do not take into account the distribution of code vectors; this may lead to unsatisfactory clustering and poor definition of cluster boundaries, particularly where the density of data points is low. In this paper, we propose the use of a generic particle swarm optimization (PSO) algorithm for finding cluster boundaries directly from the code vectors obtained from SOMs. The application of our method to unlabeled call data for a mobile phone operator demonstrates its feasibility. PSO algorithm utilizes U-matrix of SOMs to determine cluster boundaries; the results of this novel automatic method correspond well to boundary detection through visual inspection of code vectors and k-means algorithm.

Keywords: Particle swarm optimization, self-organizing maps, clustering, data mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1681
7835 Numerical Modeling of Artisanal and Small-Scale Mining of Coltan in the African Great Lakes Region

Authors: Sergio Perez Rodriguez

Abstract:

Findings of a production model of Artisanal and Small-Scale Mining (ASM) of coltan ore by an average Democratic Republic of Congo (DRC) mineworker are presented in this paper. These can be used as a reference for a similar characterization of the daily labor of counterparts from other countries in the Africa's Great Lakes region. To that end, the Fundamental Equation of Mineral Production has been applied in this paper, considering a miner's average daily output of coltan, estimated in the base of gross statistical data gathered from reputable sources. Results indicate daily yields of individual miners in the order of 300 g of coltan ore, with hourly peaks of production in the range of 30 to 40 g of the mineral. Yields are expected to be in the order of 5 g or less during the least productive hours. These outputs are expected to be achieved during the halves of the eight to 10 hours of daily working sessions that these artisanal laborers can attend during the mining season.

Keywords: Coltan, mineral production, Production to Reserve ratio, artisanal mining, small-scale mining, ASM, human work, Great Lakes region, Democratic Republic of Congo.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 137
7834 Multimedia Data Fusion for Event Detection in Twitter by Using Dempster-Shafer Evidence Theory

Authors: Samar M. Alqhtani, Suhuai Luo, Brian Regan

Abstract:

Data fusion technology can be the best way to extract useful information from multiple sources of data. It has been widely applied in various applications. This paper presents a data fusion approach in multimedia data for event detection in twitter by using Dempster-Shafer evidence theory. The methodology applies a mining algorithm to detect the event. There are two types of data in the fusion. The first is features extracted from text by using the bag-ofwords method which is calculated using the term frequency-inverse document frequency (TF-IDF). The second is the visual features extracted by applying scale-invariant feature transform (SIFT). The Dempster - Shafer theory of evidence is applied in order to fuse the information from these two sources. Our experiments have indicated that comparing to the approaches using individual data source, the proposed data fusion approach can increase the prediction accuracy for event detection. The experimental result showed that the proposed method achieved a high accuracy of 0.97, comparing with 0.93 with texts only, and 0.86 with images only.

Keywords: Data fusion, Dempster-Shafer theory, data mining, event detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1758
7833 Using Phase Equilibrium Theory to Calculate Solubility of γ-Oryzanol in Supercritical CO2

Authors: Boy Arief Fachri

Abstract:

Even its content is rich in antioxidants ϒ-oryzanol, rice bran is not used properly as functional food. This research aims to (1) extract ϒ-oryzanol; (2) determine the solubility of ϒ-oryzanol in supercritical CO2 based on phase equilibrium theory; and (3) study the effect of process variables on solubility. Extraction experiments were carried out for rice bran (5 g) at various extraction pressures, temperatures and reaction times. The flowrate of supercritical fluid through the extraction vessel was 25 g/min. The extracts were collected and analysed with high-pressure liquid chromatography (HPLC). The conclusion based on the experiments are as: (1) The highest experimental solubility was 0.303 mcg/mL RBO at T= 60°C, P= 90 atm, t= 30 min; (2) Solubility of ϒ-oryzanol was influenced by pressure and temperature. As the pressure and temperature increase, the solubility increases; (3) The solubility data of supercritical extraction can be successfully determined using phase equilibrium theory. Meanwhile, tocopherol was found and slightly investigated in this work.

Keywords: Rice bran, solubility, supercritical CO2, ϒ-orizanol.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2103
7832 Optimization of Soybean Oil by Modified Supercritical Carbon Dioxide

Authors: N. R. Putra, A. H. Abdul Aziz, A. S. Zaini, Z. Idham, F. Idrus, M. Z. Bin Zullyadini, M. A. Che Yunus

Abstract:

The content of omega-3 in soybean oil is important in the development of infants and is an alternative for the omega-3 in fish oils. The investigation of extraction of soybean oil is needed to obtain the bioactive compound in the extract. Supercritical carbon dioxide extraction is modern and green technology to extract herbs and plants to obtain high quality extract due to high diffusivity and solubility of the solvent. The aim of this study was to obtain the optimum condition of soybean oil extraction by modified supercritical carbon dioxide. The soybean oil was extracted by using modified supercritical carbon dioxide (SC-CO2) under the temperatures of 40, 60, 80 °C, pressures of 150, 250, 350 Bar, and constant flow-rate of 10 g/min as the parameters of extraction processes. An experimental design was performed in order to optimize three important parameters of SC-CO2 extraction which are pressure (X1), temperature (X2) to achieve optimum yields of soybean oil. Box Behnken Design was applied for experimental design. From the optimization process, the optimum condition of extraction of soybean oil was obtained at pressure 338 Bar and temperature 80 °C with oil yield of 2.713 g. Effect of pressure is significant on the extraction of soybean oil by modified supercritical carbon dioxide. Increasing of pressure will increase the oil yield of soybean oil.

Keywords: Soybean oil, SC-CO2 extraction, yield, optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 889
7831 Novelty as a Measure of Interestingness in Knowledge Discovery

Authors: Vasudha Bhatnagar, Ahmed Sultan Al-Hegami, Naveen Kumar

Abstract:

Rule Discovery is an important technique for mining knowledge from large databases. Use of objective measures for discovering interesting rules leads to another data mining problem, although of reduced complexity. Data mining researchers have studied subjective measures of interestingness to reduce the volume of discovered rules to ultimately improve the overall efficiency of KDD process. In this paper we study novelty of the discovered rules as a subjective measure of interestingness. We propose a hybrid approach based on both objective and subjective measures to quantify novelty of the discovered rules in terms of their deviations from the known rules (knowledge). We analyze the types of deviation that can arise between two rules and categorize the discovered rules according to the user specified threshold. We implement the proposed framework and experiment with some public datasets. The experimental results are promising.

Keywords: Knowledge Discovery in Databases (KDD), Interestingness, Subjective Measures, Novelty Index.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1766
7830 Selective Solvent Extraction of Calcium and Magnesium from Concentrate Nickel Solutions Using Mixtures of Cyanex 272 and D2EHPA

Authors: Alexandre S. Guimarães, Marcelo B. Mansur

Abstract:

The performance of organophosphorus extractants Cyanex 272 and D2EHPA on the purification of concentrate nickel sulfate solutions was evaluated. Batch scale tests were carried out at pH range of 2 to 7 using a laboratory solution simulating concentrate nickel liquors as those typically obtained when sulfate intermediates from nickel laterite are re-leached and treated for the selective removal of cobalt, zinc, manganese and copper with Cyanex 272 ([Ca] = 0.57 g/L, [Mg] = 3.2 g/L, and [Ni] = 88 g/L). The increase on the concentration of D2EHPA favored the calcium extraction. The extraction of magnesium is dependent on the pH and of ratio of extractants D2EHPA and Cyanex 272 in the organic phase. The composition of the investigated organic phase did not affect nickel extraction. The number of stages is dependent on the magnesium extraction. The most favorable operating condition to selectively remove calcium and magnesium was determined.

Keywords: Solvent extraction, organophosphorus extractants, alkaline earth metals, nickel.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2427
7829 Modified Buck Boost Circuit for Linear and Non-Linear Piezoelectric Energy Harvesting

Authors: I Made Darmayuda, Chai Tshun Chuan Kevin, Je Minkyu

Abstract:

Plenty researches have reported techniques to harvest energy from piezoelectric transducer. In the earlier years, the researches mainly report linear energy harvesting techniques whereby interface circuitry is designed to have input impedance that match with the impedance of the piezoelectric transducer. In recent years non-linear techniques become more popular. The non-linear technique employs voltage waveform manipulation to boost the available-for-extraction energy at the time of energy transfer.  The fact that non-linear energy extraction provides larger available-for-extraction energy doesn’t mean the linear energy extraction is completely obsolete. In some scenarios, such as where initial power is not available, linear energy extraction is still preferred. A modified Buck Boost circuit which is capable of harvesting piezoelectric energy using both linear and non-linear techniques is reported in this paper. Efficiency of at least 64% can be achieved using this circuit. For linear extraction, the modified Buck Boost circuit is controlled using a fix frequency and duty cycle clock. A voltage sensor and a pulse generator are added as the controller for the non-linear extraction technique. 

Keywords: Buck boost, energy harvester, linear energy harvester, non-linear energy harvester, piezoelectric, synchronized charge extraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2402
7828 Knowledge Discovery Techniques for Talent Forecasting in Human Resource Application

Authors: Hamidah Jantan, Abdul Razak Hamdan, Zulaiha Ali Othman

Abstract:

Human Resource (HR) applications can be used to provide fair and consistent decisions, and to improve the effectiveness of decision making processes. Besides that, among the challenge for HR professionals is to manage organization talents, especially to ensure the right person for the right job at the right time. For that reason, in this article, we attempt to describe the potential to implement one of the talent management tasks i.e. identifying existing talent by predicting their performance as one of HR application for talent management. This study suggests the potential HR system architecture for talent forecasting by using past experience knowledge known as Knowledge Discovery in Database (KDD) or Data Mining. This article consists of three main parts; the first part deals with the overview of HR applications, the prediction techniques and application, the general view of Data mining and the basic concept of talent management in HRM. The second part is to understand the use of Data Mining technique in order to solve one of the talent management tasks, and the third part is to propose the potential HR system architecture for talent forecasting.

Keywords: HR Application, Knowledge Discovery inDatabase (KDD), Talent Forecasting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4446
7827 Accelerated Microwave Extraction of Natural Product using the Cryogrinding

Authors: F. Benkaci-Ali, R. Mekaoui, G. Eppe, E. De Pau, J. F. Faucont

Abstract:

Team distillation assisted by microwave extraction (SDAM) considered as accelerated technique extraction is a combination of microwave heating and steam distillation, performed at atmospheric pressure. SDAM has been compared with the same technique coupled with the cryogrinding of seeds (SDAM -CG). Isolation and concentration of volatile compounds are performed by a single stage for the extraction of essential oil from Cuminum cyminum seeds. The essential oils extracted by these two methods for 5 min were quantitatively (yield) and qualitatively (aromatic profile) no similar. These methods yield an essential oil with higher amounts of more valuable oxygenated compounds, and allow substantial savings of costs, in terms of time, energy and plant material. SDAM and SDAM-CG is a green technology and appears as a good alternative for the extraction of essential oils from aromatic plants.

Keywords: Steam distillation, microwave extraction, Cuminum cyminum, chromatography, mass spectrometry

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2302
7826 Video Mining for Creative Rendering

Authors: Mei Chen

Abstract:

More and more home videos are being generated with the ever growing popularity of digital cameras and camcorders. For many home videos, a photo rendering, whether capturing a moment or a scene within the video, provides a complementary representation to the video. In this paper, a video motion mining framework for creative rendering is presented. The user-s capture intent is derived by analyzing video motions, and respective metadata is generated for each capture type. The metadata can be used in a number of applications, such as creating video thumbnail, generating panorama posters, and producing slideshows of video.

Keywords: Motion mining, semantic abstraction, video mining, video representation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1611
7825 Content Based Sampling over Transactional Data Streams

Authors: Mansour Tarafdar, Mohammad Saniee Abade

Abstract:

This paper investigates the problem of sampling from transactional data streams. We introduce CFISDS as a content based sampling algorithm that works on a landmark window model of data streams and preserve more informed sample in sample space. This algorithm that work based on closed frequent itemset mining tasks, first initiate a concept lattice using initial data, then update lattice structure using an incremental mechanism.Incremental mechanism insert, update and delete nodes in/from concept lattice in batch manner. Presented algorithm extracts the final samples on demand of user. Experimental results show the accuracy of CFISDS on synthetic and real datasets, despite on CFISDS algorithm is not faster than exist sampling algorithms such as Z and DSS.

Keywords: Sampling, data streams, closed frequent item set mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1672
7824 Linguistic Summarization of Structured Patent Data

Authors: E. Y. Igde, S. Aydogan, F. E. Boran, D. Akay

Abstract:

Patent data have an increasingly important role in economic growth, innovation, technical advantages and business strategies and even in countries competitions. Analyzing of patent data is crucial since patents cover large part of all technological information of the world. In this paper, we have used the linguistic summarization technique to prove the validity of the hypotheses related to patent data stated in the literature.

Keywords: Data mining, fuzzy sets, linguistic summarization, patent data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1172
7823 Advanced Information Extraction with n-gram based LSI

Authors: Ahmet Güven, Ö. Özgür Bozkurt, Oya Kalıpsız

Abstract:

Number of documents being created increases at an increasing pace while most of them being in already known topics and little of them introducing new concepts. This fact has started a new era in information retrieval discipline where the requirements have their own specialties. That is digging into topics and concepts and finding out subtopics or relations between topics. Up to now IR researches were interested in retrieving documents about a general topic or clustering documents under generic subjects. However these conventional approaches can-t go deep into content of documents which makes it difficult for people to reach to right documents they were searching. So we need new ways of mining document sets where the critic point is to know much about the contents of the documents. As a solution we are proposing to enhance LSI, one of the proven IR techniques by supporting its vector space with n-gram forms of words. Positive results we have obtained are shown in two different application area of IR domain; querying a document database, clustering documents in the document database.

Keywords: Document clustering, Information Extraction, Information Retrieval, LSI, n-gram.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1762
7822 Navigation Patterns Mining Approach based on Expectation Maximization Algorithm

Authors: Norwati Mustapha, Manijeh Jalali, Abolghasem Bozorgniya, Mehrdad Jalali

Abstract:

Web usage mining algorithms have been widely utilized for modeling user web navigation behavior. In this study we advance a model for mining of user-s navigation pattern. The model makes user model based on expectation-maximization (EM) algorithm.An EM algorithm is used in statistics for finding maximum likelihood estimates of parameters in probabilistic models, where the model depends on unobserved latent variables. The experimental results represent that by decreasing the number of clusters, the log likelihood converges toward lower values and probability of the largest cluster will be decreased while the number of the clusters increases in each treatment.

Keywords: Web Usage Mining, Expectation maximization, navigation pattern mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1540
7821 Attribute Selection Methods Comparison for Classification of Diffuse Large B-Cell Lymphoma

Authors: Helyane Bronoski Borges, Júlio Cesar Nievola

Abstract:

The most important subtype of non-Hodgkin-s lymphoma is the Diffuse Large B-Cell Lymphoma. Approximately 40% of the patients suffering from it respond well to therapy, whereas the remainder needs a more aggressive treatment, in order to better their chances of survival. Data Mining techniques have helped to identify the class of the lymphoma in an efficient manner. Despite that, thousands of genes should be processed to obtain the results. This paper presents a comparison of the use of various attribute selection methods aiming to reduce the number of genes to be searched, looking for a more effective procedure as a whole.

Keywords: Attribute selection, data mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1374
7820 Optimization of Extraction of Phenolic Compounds from Avicennia marina (Forssk.)Vierh using Response Surface Methodology

Authors: V.Bharathi, Jamila Patterson, R.Rajendiran

Abstract:

Optimization of extraction of phenolic compounds from Avicennia marina using response surface methodology was carried out during the present study. Five levels, three factors rotatable design (CCRD) was utilized to examine the optimum combination of extraction variables based on the TPC of Avicennia marina leaves. The best combination of response function was 78.41 °C, drying temperature; 26.18°C; extraction temperature and 36.53 minutes of extraction time. However, the procedure can be promptly extended to the study of several others pharmaceutical processes like purification of bioactive substances, drying of extracts and development of the pharmaceutical dosage forms for the benefit of consumers.

Keywords: Avicennia marina, Central Composite RotatableDesign (CCRD), Response Surface Methodology, Total Phenoliccontents (TPC)

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2021
7819 Data Mining on the Router Logs for Statistical Application Classification

Authors: M. Rahmati, S.M. Mirzababaei

Abstract:

With the advance of information technology in the new era the applications of Internet to access data resources has steadily increased and huge amount of data have become accessible in various forms. Obviously, the network providers and agencies, look after to prevent electronic attacks that may be harmful or may be related to terrorist applications. Thus, these have facilitated the authorities to under take a variety of methods to protect the special regions from harmful data. One of the most important approaches is to use firewall in the network facilities. The main objectives of firewalls are to stop the transfer of suspicious packets in several ways. However because of its blind packet stopping, high process power requirements and expensive prices some of the providers are reluctant to use the firewall. In this paper we proposed a method to find a discriminate function to distinguish between usual packets and harmful ones by the statistical processing on the network router logs. By discriminating these data, an administrator may take an approach action against the user. This method is very fast and can be used simply in adjacent with the Internet routers.

Keywords: Data Mining, Firewall, Optimization, Packetclassification, Statistical Pattern Recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1609
7818 Reduction of Plants Biodiversity in Hyrcanian Forest by Coal Mining Activities

Authors: Mahsa Tavakoli, Seyed Mohammad Hojjati, Yahya Kooch

Abstract:

Considering that coal mining is one of the important industrial activities, it may cause damages to environment. According to the author’s best knowledge, the effect of traditional coal mining activities on plant biodiversity has not been investigated in the Hyrcanian forests. Therefore, in this study, the effect of coal mining activities on vegetation and tree diversity was investigated in Hyrcanian forest, North Iran. After filed visiting and determining the mine, 16 plots (20×20 m2) were established by systematic-randomly (60×60 m2) in an area of 4 ha (200×200 m2-mine entrance placed at center). An area adjacent to the mine was not affected by the mining activity, and it is considered as the control area. In each plot, the data about trees such as number and type of species were recorded. The biodiversity of vegetation cover was considered 5 square sub-plots (1 m2) in each plot. PAST software and Ecological Methodology were used to calculate Biodiversity indices. The value of Shannon Wiener and Simpson diversity indices for tree cover in control area (1.04±0.34 and 0.62±0.20) was significantly higher than mining area (0.78±0.27 and 0.45±0.14). The value of evenness indices for tree cover in the mining area was significantly lower than that of the control area. The value of Shannon Wiener and Simpson diversity indices for vegetation cover in the control area (1.37±0.06 and 0.69±0.02) was significantly higher than the mining area (1.02±0.13 and 0.50±0.07). The value of evenness index in the control area was significantly higher than the mining area. Plant communities are a good indicator of the changes in the site. Study about changes in vegetation biodiversity and plant dynamics in the degraded land can provide necessary information for forest management and reforestation of these areas.

Keywords: Vegetation biodiversity, species composition, traditional coal mining, caspian forest.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 842
7817 Using Data Mining Techniques for Estimating Minimum, Maximum and Average Daily Temperature Values

Authors: S. Kotsiantis, A. Kostoulas, S. Lykoudis, A. Argiriou, K. Menagias

Abstract:

Estimates of temperature values at a specific time of day, from daytime and daily profiles, are needed for a number of environmental, ecological, agricultural and technical applications, ranging from natural hazards assessments, crop growth forecasting to design of solar energy systems. The scope of this research is to investigate the efficiency of data mining techniques in estimating minimum, maximum and mean temperature values. For this reason, a number of experiments have been conducted with well-known regression algorithms using temperature data from the city of Patras in Greece. The performance of these algorithms has been evaluated using standard statistical indicators, such as Correlation Coefficient, Root Mean Squared Error, etc.

Keywords: regression algorithms, supervised machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3369
7816 Extraction of Natural Colorant from the Flowers of Flame of Forest Using Ultrasound

Authors: Sunny Arora, Meghal A. Desai

Abstract:

An impetus towards green consumerism and implementation of sustainable techniques, consumption of natural products and utilization of environment friendly techniques have gained accelerated acceptance. Butein, a natural colorant, has many medicinal properties apart from its use in dyeing industries. Extraction of butein from the flowers of flame of forest was carried out using ultrasonication bath. Solid loading (2-6 g), extraction time (30-50 min), volume of solvent (30-50 mL) and types of solvent (methanol, ethanol and water) have been studied to maximize the yield of butein using the Taguchi method. The highest yield of butein 4.67% (w/w) was obtained using 4 g of plant material, 40 min of extraction time and 30 mL volume of methanol as a solvent. The present method provided a greater reduction in extraction time compared to the conventional method of extraction. Hence, the outcome of the present investigation could further be utilized to develop the method at a higher scale.

Keywords: Butein, flowers of flame of forest, Taguchi method, ultrasonic bath.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 882
7815 Road Extraction Using Stationary Wavelet Transform

Authors: Somkait Udomhunsakul

Abstract:

In this paper, a novel road extraction method using Stationary Wavelet Transform is proposed. To detect road features from color aerial satellite imagery, Mexican hat Wavelet filters are used by applying the Stationary Wavelet Transform in a multiresolution, multi-scale, sense and forming the products of Wavelet coefficients at a different scales to locate and identify road features at a few scales. In addition, the shifting of road features locations is considered through multiple scales for robust road extraction in the asymmetry road feature profiles. From the experimental results, the proposed method leads to a useful technique to form the basis of road feature extraction. Also, the method is general and can be applied to other features in imagery.

Keywords: Road extraction, Multiresolution, Stationary Wavelet Transform, Multi-scale analysis

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1831
7814 Opinion Mining and Sentiment Analysis on DEFT

Authors: Najiba Ouled Omar, Azza Harbaoui, Henda Ben Ghezala

Abstract:

Current research practices sentiment analysis with a focus on social networks, DEfi Fouille de Texte (DEFT) (Text Mining Challenge) evaluation campaign focuses on opinion mining and sentiment analysis on social networks, especially social network Twitter. It aims to confront the systems produced by several teams from public and private research laboratories. DEFT offers participants the opportunity to work on regularly renewed themes and proposes to work on opinion mining in several editions. The purpose of this article is to scrutinize and analyze the works relating to opinions mining and sentiment analysis in the Twitter social network realized by DEFT. It examines the tasks proposed by the organizers of the challenge and the methods used by the participants.

Keywords: Opinion mining, sentiment analysis, emotion, polarity, annotation, OSEE, figurative language, DEFT, Twitter, Tweet.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 772
7813 Multiple-Level Sequential Pattern Discovery from Customer Transaction Databases

Authors: An Chen, Huilin Ye

Abstract:

Mining sequential patterns from large customer transaction databases has been recognized as a key research topic in database systems. However, the previous works more focused on mining sequential patterns at a single concept level. In this study, we introduced concept hierarchies into this problem and present several algorithms for discovering multiple-level sequential patterns based on the hierarchies. An experiment was conducted to assess the performance of the proposed algorithms. The performances of the algorithms were measured by the relative time spent on completing the mining tasks on two different datasets. The experimental results showed that the performance depends on the characteristics of the datasets and the pre-defined threshold of minimal support for each level of the concept hierarchy. Based on the experimental results, some suggestions were also given for how to select appropriate algorithm for a certain datasets.

Keywords: Data Mining, Multiple-Level Sequential Pattern, Concept Hierarchy, Customer Transaction Database.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1415
7812 Enhancing capabilities of Texture Extraction for Color Image Retrieval

Authors: Pranam Janney, Sridhar G, Sridhar V.

Abstract:

Content-Based Image Retrieval has been a major area of research in recent years. Efficient image retrieval with high precision would require an approach which combines usage of both the color and texture features of the image. In this paper we propose a method for enhancing the capabilities of texture based feature extraction and further demonstrate the use of these enhanced texture features in Texture-Based Color Image Retrieval.

Keywords: Image retrieval, texture feature extraction, color extraction

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1573
7811 Performance Comparison of Particle Swarm Optimization with Traditional Clustering Algorithms used in Self-Organizing Map

Authors: Anurag Sharma, Christian W. Omlin

Abstract:

Self-organizing map (SOM) is a well known data reduction technique used in data mining. It can reveal structure in data sets through data visualization that is otherwise hard to detect from raw data alone. However, interpretation through visual inspection is prone to errors and can be very tedious. There are several techniques for the automatic detection of clusters of code vectors found by SOM, but they generally do not take into account the distribution of code vectors; this may lead to unsatisfactory clustering and poor definition of cluster boundaries, particularly where the density of data points is low. In this paper, we propose the use of an adaptive heuristic particle swarm optimization (PSO) algorithm for finding cluster boundaries directly from the code vectors obtained from SOM. The application of our method to several standard data sets demonstrates its feasibility. PSO algorithm utilizes a so-called U-matrix of SOM to determine cluster boundaries; the results of this novel automatic method compare very favorably to boundary detection through traditional algorithms namely k-means and hierarchical based approach which are normally used to interpret the output of SOM.

Keywords: cluster boundaries, clustering, code vectors, data mining, particle swarm optimization, self-organizing maps, U-matrix.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1879