Search results for: Data mining andInformation Extraction

7905 Efficient Implementation of Serial and Parallel Support Vector Machine Training with a Multi-Parameter Kernel for Large-Scale Data Mining

Authors: Tatjana Eitrich, Bruno Lang

Abstract:

This work deals with aspects of support vector learning for large-scale data mining tasks. Based on a decomposition algorithm that can be run in serial and parallel mode we introduce a data transformation that allows for the usage of an expensive generalized kernel without additional costs. In order to speed up the decomposition algorithm we analyze the problem of working set selection for large data sets and analyze the influence of the working set sizes onto the scalability of the parallel decomposition scheme. Our modifications and settings lead to improvement of support vector learning performance and thus allow using extensive parameter search methods to optimize classification accuracy.

Keywords: Support Vector Machines, Shared Memory Parallel Computing, Large Data

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1574

7904 Customer Need Type Classification Model using Data Mining Techniques for Recommender Systems

Authors: Kyoung-jae Kim

Abstract:

Recommender systems are usually regarded as an important marketing tool in the e-commerce. They use important information about users to facilitate accurate recommendation. The information includes user context such as location, time and interest for personalization of mobile users. We can easily collect information about location and time because mobile devices communicate with the base station of the service provider. However, information about user interest can-t be easily collected because user interest can not be captured automatically without user-s approval process. User interest usually represented as a need. In this study, we classify needs into two types according to prior research. This study investigates the usefulness of data mining techniques for classifying user need type for recommendation systems. We employ several data mining techniques including artificial neural networks, decision trees, case-based reasoning, and multivariate discriminant analysis. Experimental results show that CHAID algorithm outperforms other models for classifying user need type. This study performs McNemar test to examine the statistical significance of the differences of classification results. The results of McNemar test also show that CHAID performs better than the other models with statistical significance.

Keywords: Customer need type, Data mining techniques, Recommender system, Personalization, Mobile user.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2139

7903 Web Traffic Mining using Neural Networks

Authors: Farhad F. Yusifov

Abstract:

With the explosive growth of data available on the Internet, personalization of this information space become a necessity. At present time with the rapid increasing popularity of the WWW, Websites are playing a crucial role to convey knowledge and information to the end users. Discovering hidden and meaningful information about Web users usage patterns is critical to determine effective marketing strategies to optimize the Web server usage for accommodating future growth. The task of mining useful information becomes more challenging when the Web traffic volume is enormous and keeps on growing. In this paper, we propose a intelligent model to discover and analyze useful knowledge from the available Web log data.

Keywords: Clustering, Self organizing map, Web log files, Web traffic.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1598

7902 Data Mining Techniques in Computer-Aided Diagnosis: Non-Invasive Cancer Detection

Authors: Florin Gorunescu

Abstract:

Diagnosis can be achieved by building a model of a certain organ under surveillance and comparing it with the real time physiological measurements taken from the patient. This paper deals with the presentation of the benefits of using Data Mining techniques in the computer-aided diagnosis (CAD), focusing on the cancer detection, in order to help doctors to make optimal decisions quickly and accurately. In the field of the noninvasive diagnosis techniques, the endoscopic ultrasound elastography (EUSE) is a recent elasticity imaging technique, allowing characterizing the difference between malignant and benign tumors. Digitalizing and summarizing the main EUSE sample movies features in a vector form concern with the use of the exploratory data analysis (EDA). Neural networks are then trained on the corresponding EUSE sample movies vector input in such a way that these intelligent systems are able to offer a very precise and objective diagnosis, discriminating between benign and malignant tumors. A concrete application of these Data Mining techniques illustrates the suitability and the reliability of this methodology in CAD.

Keywords: Endoscopic ultrasound elastography, exploratorydata analysis, neural networks, non-invasive cancer detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1858

7901 Idiopathic Constipation can be Subdivided in Clinical Subtypes: Data Mining by Cluster Analysis on a Population based Study

Authors: Mauro Giacomini, Stefania Bertone, Carlo Mansi, Pietro Dulbecco, Vincenzo Savarino

Abstract:

The prevalence of non organic constipation differs from country to country and the reliability of the estimate rates is uncertain. Moreover, the clinical relevance of subdividing the heterogeneous functional constipation disorders into pre-defined subgroups is largely unknown.. Aim: to estimate the prevalence of constipation in a population-based sample and determine whether clinical subgroups can be identified. An age and gender stratified sample population from 5 Italian cities was evaluated using a previously validated questionnaire. Data mining by cluster analysis was used to determine constipation subgroups. Results: 1,500 complete interviews were obtained from 2,083 contacted households (72%). Self-reported constipation correlated poorly with symptombased constipation found in 496 subjects (33.1%). Cluster analysis identified four constipation subgroups which correlated to subgroups identified according to pre-defined symptom criteria. Significant differences in socio-demographics and lifestyle were observed among subgroups.

Keywords: Cluster analysis, constipation, data mining, statistical analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1290

7900 Walsh-Hadamard Transform for Facial Feature Extraction in Face Recognition

Authors: M. Hassan, I. Osman, M. Yahia

Abstract:

This Paper proposes a new facial feature extraction approach, Wash-Hadamard Transform (WHT). This approach is based on correlation between local pixels of the face image. Its primary advantage is the simplicity of its computation. The paper compares the proposed approach, WHT, which was traditionally used in data compression with two other known approaches: the Principal Component Analysis (PCA) and the Discrete Cosine Transform (DCT) using the face database of Olivetti Research Laboratory (ORL). In spite of its simple computation, the proposed algorithm (WHT) gave very close results to those obtained by the PCA and DCT. This paper initiates the research into WHT and the family of frequency transforms and examines their suitability for feature extraction in face recognition applications.

Keywords: Face Recognition, Facial Feature Extraction, Principal Component Analysis, and Discrete Cosine Transform, Wash-Hadamard Transform.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2562

7899 Novelty as a Measure of Interestingness in Knowledge Discovery

Authors: Vasudha Bhatnagar, Ahmed Sultan Al-Hegami, Naveen Kumar

Abstract:

Rule Discovery is an important technique for mining knowledge from large databases. Use of objective measures for discovering interesting rules leads to another data mining problem, although of reduced complexity. Data mining researchers have studied subjective measures of interestingness to reduce the volume of discovered rules to ultimately improve the overall efficiency of KDD process. In this paper we study novelty of the discovered rules as a subjective measure of interestingness. We propose a hybrid approach based on both objective and subjective measures to quantify novelty of the discovered rules in terms of their deviations from the known rules (knowledge). We analyze the types of deviation that can arise between two rules and categorize the discovered rules according to the user specified threshold. We implement the proposed framework and experiment with some public datasets. The experimental results are promising.

Keywords: Knowledge Discovery in Databases (KDD), Interestingness, Subjective Measures, Novelty Index.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1801

7898 A Decision Support System for Predicting Hospitalization of Hemodialysis Patients

Authors: Jinn-Yi Yeh, Tai-Hsi Wu

Abstract:

Hemodialysis patients might suffer from unhealthy care behaviors or long-term dialysis treatments. Ultimately they need to be hospitalized. If the hospitalization rate of a hemodialysis center is high, its quality of service would be low. Therefore, how to decrease hospitalization rate is a crucial problem for health care. In this study we combined temporal abstraction with data mining techniques for analyzing the dialysis patients' biochemical data to develop a decision support system. The mined temporal patterns are helpful for clinicians to predict hospitalization of hemodialysis patients and to suggest them some treatments immediately to avoid hospitalization.

Keywords: Hemodialysis, Temporal abstract, Data mining, Healthcare quality.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1727

7897 Opinion Mining and Sentiment Analysis on DEFT

Authors: Najiba Ouled Omar, Azza Harbaoui, Henda Ben Ghezala

Abstract:

Current research practices sentiment analysis with a focus on social networks, DEfi Fouille de Texte (DEFT) (Text Mining Challenge) evaluation campaign focuses on opinion mining and sentiment analysis on social networks, especially social network Twitter. It aims to confront the systems produced by several teams from public and private research laboratories. DEFT offers participants the opportunity to work on regularly renewed themes and proposes to work on opinion mining in several editions. The purpose of this article is to scrutinize and analyze the works relating to opinions mining and sentiment analysis in the Twitter social network realized by DEFT. It examines the tasks proposed by the organizers of the challenge and the methods used by the participants.

Keywords: Opinion mining, sentiment analysis, emotion, polarity, annotation, OSEE, figurative language, DEFT, Twitter, Tweet.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 811

7896 Reduction of Plants Biodiversity in Hyrcanian Forest by Coal Mining Activities

Authors: Mahsa Tavakoli, Seyed Mohammad Hojjati, Yahya Kooch

Abstract:

Considering that coal mining is one of the important industrial activities, it may cause damages to environment. According to the author’s best knowledge, the effect of traditional coal mining activities on plant biodiversity has not been investigated in the Hyrcanian forests. Therefore, in this study, the effect of coal mining activities on vegetation and tree diversity was investigated in Hyrcanian forest, North Iran. After filed visiting and determining the mine, 16 plots (20×20 m²) were established by systematic-randomly (60×60 m²) in an area of 4 ha (200×200 m²-mine entrance placed at center). An area adjacent to the mine was not affected by the mining activity, and it is considered as the control area. In each plot, the data about trees such as number and type of species were recorded. The biodiversity of vegetation cover was considered 5 square sub-plots (1 m²) in each plot. PAST software and Ecological Methodology were used to calculate Biodiversity indices. The value of Shannon Wiener and Simpson diversity indices for tree cover in control area (1.04±0.34 and 0.62±0.20) was significantly higher than mining area (0.78±0.27 and 0.45±0.14). The value of evenness indices for tree cover in the mining area was significantly lower than that of the control area. The value of Shannon Wiener and Simpson diversity indices for vegetation cover in the control area (1.37±0.06 and 0.69±0.02) was significantly higher than the mining area (1.02±0.13 and 0.50±0.07). The value of evenness index in the control area was significantly higher than the mining area. Plant communities are a good indicator of the changes in the site. Study about changes in vegetation biodiversity and plant dynamics in the degraded land can provide necessary information for forest management and reforestation of these areas.

Keywords: Vegetation biodiversity, species composition, traditional coal mining, caspian forest.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 887

7895 Advanced Information Extraction with n-gram based LSI

Authors: Ahmet Güven, Ö. Özgür Bozkurt, Oya Kalıpsız

Abstract:

Number of documents being created increases at an increasing pace while most of them being in already known topics and little of them introducing new concepts. This fact has started a new era in information retrieval discipline where the requirements have their own specialties. That is digging into topics and concepts and finding out subtopics or relations between topics. Up to now IR researches were interested in retrieving documents about a general topic or clustering documents under generic subjects. However these conventional approaches can-t go deep into content of documents which makes it difficult for people to reach to right documents they were searching. So we need new ways of mining document sets where the critic point is to know much about the contents of the documents. As a solution we are proposing to enhance LSI, one of the proven IR techniques by supporting its vector space with n-gram forms of words. Positive results we have obtained are shown in two different application area of IR domain; querying a document database, clustering documents in the document database.

Keywords: Document clustering, Information Extraction, Information Retrieval, LSI, n-gram.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1794

7894 Knowledge Discovery Techniques for Talent Forecasting in Human Resource Application

Authors: Hamidah Jantan, Abdul Razak Hamdan, Zulaiha Ali Othman

Abstract:

Human Resource (HR) applications can be used to provide fair and consistent decisions, and to improve the effectiveness of decision making processes. Besides that, among the challenge for HR professionals is to manage organization talents, especially to ensure the right person for the right job at the right time. For that reason, in this article, we attempt to describe the potential to implement one of the talent management tasks i.e. identifying existing talent by predicting their performance as one of HR application for talent management. This study suggests the potential HR system architecture for talent forecasting by using past experience knowledge known as Knowledge Discovery in Database (KDD) or Data Mining. This article consists of three main parts; the first part deals with the overview of HR applications, the prediction techniques and application, the general view of Data mining and the basic concept of talent management in HRM. The second part is to understand the use of Data Mining technique in order to solve one of the talent management tasks, and the third part is to propose the potential HR system architecture for talent forecasting.

Keywords: HR Application, Knowledge Discovery inDatabase (KDD), Talent Forecasting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4473

7893 Optimization of Soybean Oil by Modified Supercritical Carbon Dioxide

Authors: N. R. Putra, A. H. Abdul Aziz, A. S. Zaini, Z. Idham, F. Idrus, M. Z. Bin Zullyadini, M. A. Che Yunus

Abstract:

The content of omega-3 in soybean oil is important in the development of infants and is an alternative for the omega-3 in fish oils. The investigation of extraction of soybean oil is needed to obtain the bioactive compound in the extract. Supercritical carbon dioxide extraction is modern and green technology to extract herbs and plants to obtain high quality extract due to high diffusivity and solubility of the solvent. The aim of this study was to obtain the optimum condition of soybean oil extraction by modified supercritical carbon dioxide. The soybean oil was extracted by using modified supercritical carbon dioxide (SC-CO₂) under the temperatures of 40, 60, 80 °C, pressures of 150, 250, 350 Bar, and constant flow-rate of 10 g/min as the parameters of extraction processes. An experimental design was performed in order to optimize three important parameters of SC-CO₂extraction which are pressure (X₁), temperature (X₂) to achieve optimum yields of soybean oil. Box Behnken Design was applied for experimental design. From the optimization process, the optimum condition of extraction of soybean oil was obtained at pressure 338 Bar and temperature 80 °C with oil yield of 2.713 g. Effect of pressure is significant on the extraction of soybean oil by modified supercritical carbon dioxide. Increasing of pressure will increase the oil yield of soybean oil.

Keywords: Soybean oil, SC-CO2 extraction, yield, optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 932

7892 Selective Solvent Extraction of Calcium and Magnesium from Concentrate Nickel Solutions Using Mixtures of Cyanex 272 and D2EHPA

Authors: Alexandre S. Guimarães, Marcelo B. Mansur

Abstract:

The performance of organophosphorus extractants Cyanex 272 and D2EHPA on the purification of concentrate nickel sulfate solutions was evaluated. Batch scale tests were carried out at pH range of 2 to 7 using a laboratory solution simulating concentrate nickel liquors as those typically obtained when sulfate intermediates from nickel laterite are re-leached and treated for the selective removal of cobalt, zinc, manganese and copper with Cyanex 272 ([Ca] = 0.57 g/L, [Mg] = 3.2 g/L, and [Ni] = 88 g/L). The increase on the concentration of D2EHPA favored the calcium extraction. The extraction of magnesium is dependent on the pH and of ratio of extractants D2EHPA and Cyanex 272 in the organic phase. The composition of the investigated organic phase did not affect nickel extraction. The number of stages is dependent on the magnesium extraction. The most favorable operating condition to selectively remove calcium and magnesium was determined.

Keywords: Solvent extraction, organophosphorus extractants, alkaline earth metals, nickel.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2464

7891 Modified Buck Boost Circuit for Linear and Non-Linear Piezoelectric Energy Harvesting

Authors: I Made Darmayuda, Chai Tshun Chuan Kevin, Je Minkyu

Abstract:

Plenty researches have reported techniques to harvest energy from piezoelectric transducer. In the earlier years, the researches mainly report linear energy harvesting techniques whereby interface circuitry is designed to have input impedance that match with the impedance of the piezoelectric transducer. In recent years non-linear techniques become more popular. The non-linear technique employs voltage waveform manipulation to boost the available-for-extraction energy at the time of energy transfer. The fact that non-linear energy extraction provides larger available-for-extraction energy doesn’t mean the linear energy extraction is completely obsolete. In some scenarios, such as where initial power is not available, linear energy extraction is still preferred. A modified Buck Boost circuit which is capable of harvesting piezoelectric energy using both linear and non-linear techniques is reported in this paper. Efficiency of at least 64% can be achieved using this circuit. For linear extraction, the modified Buck Boost circuit is controlled using a fix frequency and duty cycle clock. A voltage sensor and a pulse generator are added as the controller for the non-linear extraction technique.

Keywords: Buck boost, energy harvester, linear energy harvester, non-linear energy harvester, piezoelectric, synchronized charge extraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2429

7890 Determining Cluster Boundaries Using Particle Swarm Optimization

Authors: Anurag Sharma, Christian W. Omlin

Abstract:

Self-organizing map (SOM) is a well known data reduction technique used in data mining. Data visualization can reveal structure in data sets that is otherwise hard to detect from raw data alone. However, interpretation through visual inspection is prone to errors and can be very tedious. There are several techniques for the automatic detection of clusters of code vectors found by SOMs, but they generally do not take into account the distribution of code vectors; this may lead to unsatisfactory clustering and poor definition of cluster boundaries, particularly where the density of data points is low. In this paper, we propose the use of a generic particle swarm optimization (PSO) algorithm for finding cluster boundaries directly from the code vectors obtained from SOMs. The application of our method to unlabeled call data for a mobile phone operator demonstrates its feasibility. PSO algorithm utilizes U-matrix of SOMs to determine cluster boundaries; the results of this novel automatic method correspond well to boundary detection through visual inspection of code vectors and k-means algorithm.

Keywords: Particle swarm optimization, self-organizing maps, clustering, data mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1709

7889 Accelerated Microwave Extraction of Natural Product using the Cryogrinding

Authors: F. Benkaci-Ali, R. Mekaoui, G. Eppe, E. De Pau, J. F. Faucont

Abstract:

Team distillation assisted by microwave extraction (SDAM) considered as accelerated technique extraction is a combination of microwave heating and steam distillation, performed at atmospheric pressure. SDAM has been compared with the same technique coupled with the cryogrinding of seeds (SDAM -CG). Isolation and concentration of volatile compounds are performed by a single stage for the extraction of essential oil from Cuminum cyminum seeds. The essential oils extracted by these two methods for 5 min were quantitatively (yield) and qualitatively (aromatic profile) no similar. These methods yield an essential oil with higher amounts of more valuable oxygenated compounds, and allow substantial savings of costs, in terms of time, energy and plant material. SDAM and SDAM-CG is a green technology and appears as a good alternative for the extraction of essential oils from aromatic plants.

Keywords: Steam distillation, microwave extraction, Cuminum cyminum, chromatography, mass spectrometry

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2335

7888 Multimedia Data Fusion for Event Detection in Twitter by Using Dempster-Shafer Evidence Theory

Authors: Samar M. Alqhtani, Suhuai Luo, Brian Regan

Abstract:

Data fusion technology can be the best way to extract useful information from multiple sources of data. It has been widely applied in various applications. This paper presents a data fusion approach in multimedia data for event detection in twitter by using Dempster-Shafer evidence theory. The methodology applies a mining algorithm to detect the event. There are two types of data in the fusion. The first is features extracted from text by using the bag-ofwords method which is calculated using the term frequency-inverse document frequency (TF-IDF). The second is the visual features extracted by applying scale-invariant feature transform (SIFT). The Dempster - Shafer theory of evidence is applied in order to fuse the information from these two sources. Our experiments have indicated that comparing to the approaches using individual data source, the proposed data fusion approach can increase the prediction accuracy for event detection. The experimental result showed that the proposed method achieved a high accuracy of 0.97, comparing with 0.93 with texts only, and 0.86 with images only.

Keywords: Data fusion, Dempster-Shafer theory, data mining, event detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1793

7887 Using Phase Equilibrium Theory to Calculate Solubility of γ-Oryzanol in Supercritical CO2

Authors: Boy Arief Fachri

Abstract:

Even its content is rich in antioxidants ϒ-oryzanol, rice bran is not used properly as functional food. This research aims to (1) extract ϒ-oryzanol; (2) determine the solubility of ϒ-oryzanol in supercritical CO₂ based on phase equilibrium theory; and (3) study the effect of process variables on solubility. Extraction experiments were carried out for rice bran (5 g) at various extraction pressures, temperatures and reaction times. The flowrate of supercritical fluid through the extraction vessel was 25 g/min. The extracts were collected and analysed with high-pressure liquid chromatography (HPLC). The conclusion based on the experiments are as: (1) The highest experimental solubility was 0.303 mcg/mL RBO at T= 60°C, P= 90 atm, t= 30 min; (2) Solubility of ϒ-oryzanol was influenced by pressure and temperature. As the pressure and temperature increase, the solubility increases; (3) The solubility data of supercritical extraction can be successfully determined using phase equilibrium theory. Meanwhile, tocopherol was found and slightly investigated in this work.

Keywords: Rice bran, solubility, supercritical CO2, ϒ-orizanol.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2129

7886 Attribute Selection Methods Comparison for Classification of Diffuse Large B-Cell Lymphoma

Authors: Helyane Bronoski Borges, Júlio Cesar Nievola

Abstract:

The most important subtype of non-Hodgkin-s lymphoma is the Diffuse Large B-Cell Lymphoma. Approximately 40% of the patients suffering from it respond well to therapy, whereas the remainder needs a more aggressive treatment, in order to better their chances of survival. Data Mining techniques have helped to identify the class of the lymphoma in an efficient manner. Despite that, thousands of genes should be processed to obtain the results. This paper presents a comparison of the use of various attribute selection methods aiming to reduce the number of genes to be searched, looking for a more effective procedure as a whole.

Keywords: Attribute selection, data mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1411

7885 Multiple-Level Sequential Pattern Discovery from Customer Transaction Databases

Authors: An Chen, Huilin Ye

Abstract:

Mining sequential patterns from large customer transaction databases has been recognized as a key research topic in database systems. However, the previous works more focused on mining sequential patterns at a single concept level. In this study, we introduced concept hierarchies into this problem and present several algorithms for discovering multiple-level sequential patterns based on the hierarchies. An experiment was conducted to assess the performance of the proposed algorithms. The performances of the algorithms were measured by the relative time spent on completing the mining tasks on two different datasets. The experimental results showed that the performance depends on the characteristics of the datasets and the pre-defined threshold of minimal support for each level of the concept hierarchy. Based on the experimental results, some suggestions were also given for how to select appropriate algorithm for a certain datasets.

Keywords: Data Mining, Multiple-Level Sequential Pattern, Concept Hierarchy, Customer Transaction Database.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1450

7884 Optimization of Extraction of Phenolic Compounds from Avicennia marina (Forssk.)Vierh using Response Surface Methodology

Authors: V.Bharathi, Jamila Patterson, R.Rajendiran

Abstract:

Optimization of extraction of phenolic compounds from Avicennia marina using response surface methodology was carried out during the present study. Five levels, three factors rotatable design (CCRD) was utilized to examine the optimum combination of extraction variables based on the TPC of Avicennia marina leaves. The best combination of response function was 78.41 °C, drying temperature; 26.18°C; extraction temperature and 36.53 minutes of extraction time. However, the procedure can be promptly extended to the study of several others pharmaceutical processes like purification of bioactive substances, drying of extracts and development of the pharmaceutical dosage forms for the benefit of consumers.

Keywords: Avicennia marina, Central Composite RotatableDesign (CCRD), Response Surface Methodology, Total Phenoliccontents (TPC)

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2057

7883 Content Based Sampling over Transactional Data Streams

Authors: Mansour Tarafdar, Mohammad Saniee Abade

Abstract:

This paper investigates the problem of sampling from transactional data streams. We introduce CFISDS as a content based sampling algorithm that works on a landmark window model of data streams and preserve more informed sample in sample space. This algorithm that work based on closed frequent itemset mining tasks, first initiate a concept lattice using initial data, then update lattice structure using an incremental mechanism.Incremental mechanism insert, update and delete nodes in/from concept lattice in batch manner. Presented algorithm extracts the final samples on demand of user. Experimental results show the accuracy of CFISDS on synthetic and real datasets, despite on CFISDS algorithm is not faster than exist sampling algorithms such as Z and DSS.

Keywords: Sampling, data streams, closed frequent item set mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1701

7882 Using Data Mining Techniques for Estimating Minimum, Maximum and Average Daily Temperature Values

Authors: S. Kotsiantis, A. Kostoulas, S. Lykoudis, A. Argiriou, K. Menagias

Abstract:

Estimates of temperature values at a specific time of day, from daytime and daily profiles, are needed for a number of environmental, ecological, agricultural and technical applications, ranging from natural hazards assessments, crop growth forecasting to design of solar energy systems. The scope of this research is to investigate the efficiency of data mining techniques in estimating minimum, maximum and mean temperature values. For this reason, a number of experiments have been conducted with well-known regression algorithms using temperature data from the city of Patras in Greece. The performance of these algorithms has been evaluated using standard statistical indicators, such as Correlation Coefficient, Root Mean Squared Error, etc.

Keywords: regression algorithms, supervised machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3409

7881 Extraction of Natural Colorant from the Flowers of Flame of Forest Using Ultrasound

Authors: Sunny Arora, Meghal A. Desai

Abstract:

An impetus towards green consumerism and implementation of sustainable techniques, consumption of natural products and utilization of environment friendly techniques have gained accelerated acceptance. Butein, a natural colorant, has many medicinal properties apart from its use in dyeing industries. Extraction of butein from the flowers of flame of forest was carried out using ultrasonication bath. Solid loading (2-6 g), extraction time (30-50 min), volume of solvent (30-50 mL) and types of solvent (methanol, ethanol and water) have been studied to maximize the yield of butein using the Taguchi method. The highest yield of butein 4.67% (w/w) was obtained using 4 g of plant material, 40 min of extraction time and 30 mL volume of methanol as a solvent. The present method provided a greater reduction in extraction time compared to the conventional method of extraction. Hence, the outcome of the present investigation could further be utilized to develop the method at a higher scale.

Keywords: Butein, flowers of flame of forest, Taguchi method, ultrasonic bath.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 935

7880 Data Mining on the Router Logs for Statistical Application Classification

Authors: M. Rahmati, S.M. Mirzababaei

Abstract:

With the advance of information technology in the new era the applications of Internet to access data resources has steadily increased and huge amount of data have become accessible in various forms. Obviously, the network providers and agencies, look after to prevent electronic attacks that may be harmful or may be related to terrorist applications. Thus, these have facilitated the authorities to under take a variety of methods to protect the special regions from harmful data. One of the most important approaches is to use firewall in the network facilities. The main objectives of firewalls are to stop the transfer of suspicious packets in several ways. However because of its blind packet stopping, high process power requirements and expensive prices some of the providers are reluctant to use the firewall. In this paper we proposed a method to find a discriminate function to distinguish between usual packets and harmful ones by the statistical processing on the network router logs. By discriminating these data, an administrator may take an approach action against the user. This method is very fast and can be used simply in adjacent with the Internet routers.

Keywords: Data Mining, Firewall, Optimization, Packetclassification, Statistical Pattern Recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1647

7879 Road Extraction Using Stationary Wavelet Transform

Authors: Somkait Udomhunsakul

Abstract:

In this paper, a novel road extraction method using Stationary Wavelet Transform is proposed. To detect road features from color aerial satellite imagery, Mexican hat Wavelet filters are used by applying the Stationary Wavelet Transform in a multiresolution, multi-scale, sense and forming the products of Wavelet coefficients at a different scales to locate and identify road features at a few scales. In addition, the shifting of road features locations is considered through multiple scales for robust road extraction in the asymmetry road feature profiles. From the experimental results, the proposed method leads to a useful technique to form the basis of road feature extraction. Also, the method is general and can be applied to other features in imagery.

Keywords: Road extraction, Multiresolution, Stationary Wavelet Transform, Multi-scale analysis

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1868

7878 Linguistic Summarization of Structured Patent Data

Authors: E. Y. Igde, S. Aydogan, F. E. Boran, D. Akay

Abstract:

Patent data have an increasingly important role in economic growth, innovation, technical advantages and business strategies and even in countries competitions. Analyzing of patent data is crucial since patents cover large part of all technological information of the world. In this paper, we have used the linguistic summarization technique to prove the validity of the hypotheses related to patent data stated in the literature.

Keywords: Data mining, fuzzy sets, linguistic summarization, patent data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1208

7877 Enhancing capabilities of Texture Extraction for Color Image Retrieval

Authors: Pranam Janney, Sridhar G, Sridhar V.

Abstract:

Content-Based Image Retrieval has been a major area of research in recent years. Efficient image retrieval with high precision would require an approach which combines usage of both the color and texture features of the image. In this paper we propose a method for enhancing the capabilities of texture based feature extraction and further demonstrate the use of these enhanced texture features in Texture-Based Color Image Retrieval.

Keywords: Image retrieval, texture feature extraction, color extraction

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1613

7876 A General Framework for Knowledge Discovery Using High Performance Machine Learning Algorithms

Authors: S. Nandagopalan, N. Pradeep

Abstract:

The aim of this paper is to propose a general framework for storing, analyzing, and extracting knowledge from two-dimensional echocardiographic images, color Doppler images, non-medical images, and general data sets. A number of high performance data mining algorithms have been used to carry out this task. Our framework encompasses four layers namely physical storage, object identification, knowledge discovery, user level. Techniques such as active contour model to identify the cardiac chambers, pixel classification to segment the color Doppler echo image, universal model for image retrieval, Bayesian method for classification, parallel algorithms for image segmentation, etc., were employed. Using the feature vector database that have been efficiently constructed, one can perform various data mining tasks like clustering, classification, etc. with efficient algorithms along with image mining given a query image. All these facilities are included in the framework that is supported by state-of-the-art user interface (UI). The algorithms were tested with actual patient data and Coral image database and the results show that their performance is better than the results reported already.

Keywords: Active Contour, Bayesian, Echocardiographic image, Feature vector.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1707