Search results for: Sequential mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 753

273 Data Preprocessing for Supervised Learning

Authors: S. B. Kotsiantis, D. Kanellopoulos, P. E. Pintelas

Abstract:

Many factors affect the success of Machine Learning (ML) on a given task. The representation and quality of the instance data are first and foremost. If much irrelevant and redundant information is present, or the data are noisy and unreliable, then knowledge discovery during the training phase is more difficult. It is well known that data preparation and filtering steps take a considerable amount of processing time in ML problems. Data pre-processing includes data cleaning, normalization, transformation, feature extraction and selection, etc. The product of data pre-processing is the final training set. It would be nice if a single sequence of data pre-processing algorithms had the best performance for every data set, but this is not the case. Thus, we present the best-known algorithms for each step of data pre-processing so that one can achieve the best performance for a given data set.

Keywords: Data mining, feature selection, data cleaning.

Downloads: 6098
272 Utilizing 5G Mobile Connection as a Node in Layer 1 Proof of Authority Blockchain Used for Microtransaction

Authors: Frode van der Laak

Abstract:

The paper examines the feasibility of using a 5G mobile connection as a node for a Proof of Authority (PoA) blockchain that is used for microtransactions at the same time. It uses the phone number identity of each user, which is linked to the user's crypto wallet address. It also proposes a consensus protocol based on a PoA blockchain; PoA is a permissioned blockchain where consensus is achieved through a set of designated authorities rather than through mining, as is the case with a Proof of Work (PoW) blockchain. This report first explains the concept of a PoA blockchain and how it works. It then discusses the potential benefits and challenges of using a 5G mobile connection as a node in such a blockchain, and finally presents the main open problem statement and proposed solutions with their requirements.

Keywords: 5G, mobile, connection, node, PoA, blockchain, microtransaction.

Downloads: 193
271 Representing Data without Lost Compression Properties in Time Series: A Review

Authors: Nabilah Filzah Mohd Radzuan, Zalinda Othman, Azuraliza Abu Bakar, Abdul Razak Hamdan

Abstract:

Uncertain data is believed to be an important issue in building a prediction model. The main objective of time series uncertainty analysis is to formulate uncertain data in order to gain knowledge and fit a low-dimensional model prior to a prediction task. This paper discusses the performance of a number of techniques for dealing with uncertain data, specifically those that handle uncertain data conditions by minimizing the loss of compression properties.

Keywords: Compression properties, uncertainty, uncertain time series, mining technique, weather prediction.

Downloads: 1620
270 Experience Modularization for New Value of Evanescent Cultural Communities: Developing Creative Tourism Services in Bangkok

Authors: Wuttigrai Ngamsirijit

Abstract:

Creative tourism is an ongoing development in many countries as an attempt to move away from the serial reproduction of culture and to revive the culture. Nevertheless, in destinations with diverse cultural resources with potential, creating new tourism services can be vague. This paper presents how tourism experiences are modularized and consolidated in order to form new creative tourism service offerings in evanescent cultural communities of Bangkok, Thailand. The benefits of data mining in accommodating value co-creation are discussed, and the implications of experience modularization for national creative tourism policy are addressed.

Keywords: Co-creation, Creative tourism, New Service Design

Downloads: 2406
269 MCOKE: Multi-Cluster Overlapping K-Means Extension Algorithm

Authors: Said Baadel, Fadi Thabtah, Joan Lu

Abstract:

Clustering involves the partitioning of n objects into k clusters. Many clustering algorithms use hard-partitioning techniques where each object is assigned to one cluster. In this paper we propose an overlapping algorithm MCOKE which allows objects to belong to one or more clusters. The algorithm is different from fuzzy clustering techniques because objects that overlap are assigned a membership value of 1 (one) as opposed to a fuzzy membership degree. The algorithm is also different from other overlapping algorithms that require a similarity threshold be defined a priori which can be difficult to determine by novice users.

Keywords: Data mining, k-means, MCOKE, overlapping.

Downloads: 2757
268 Assessment of Energy Demand Considering Different Model Simulations in a Low Energy Demand House

Authors: M. Cañada-Soriano, C. Aparicio-Fernández, P. Sebastián Ferrer Gisbert, M. Val Field, J.-L. Vivancos-Bono

Abstract:

The lack of insulation along with the existence of air leakages has a significant impact on the energy performance of buildings. Both lead to increases in the energy demand through additional heating and/or cooling loads. Additionally, they cause thermal discomfort. In order to quantify these uncontrolled air currents, the Blower Door test can be used. It is a standardized procedure that determines the airtightness of a space by characterizing the rate of air leakages through the envelope surface. In this sense, low-energy buildings complying with the Passive House design criteria are required to achieve high levels of airtightness. Due to the invisible nature of air leakages, additional tools such as infrared thermography are often used to identify where the infiltrations take place. The aim of this study is to assess the airtightness of a typical Mediterranean dwelling house, refurbished under the Passive House standard, using the Blower Door test. Moreover, the building energy performance modelling tools TRNSYS (TRaNsient System Simulation program) and TRNFlow (TRaNsient Flow) have been used to estimate the energy demand in different scenarios. In this sense, a sequential implementation of three different energy improvement measures (insulation thickness, glazing type and infiltrations) has been analyzed.

Keywords: Airtightness, blower door, TRNSYS, infrared thermography, energy demand.

Downloads: 223
267 Reasons for Non-Applicability of Software Entropy Metrics for Bug Prediction in Android

Authors: Arvinder Kaur, Deepti Chopra

Abstract:

Software Entropy Metrics for bug prediction have been validated on various software systems by different researchers. In our previous research, we validated that Software Entropy Metrics calculated for Mozilla subsystems predict future bugs reasonably well. In this study, the Software Entropy Metrics are calculated for a subsystem of Android, and it is noticed that these metrics are not suitable for bug prediction. The results are compared with a subsystem of Mozilla, and a comparison is made between the two software systems to determine the reasons why Software Entropy Metrics are not applicable to Android.

Keywords: Android, bug prediction, mining software repositories, Software Entropy.

Downloads: 1094
266 Topology Preservation in SOM

Authors: E. Arsuaga Uriarte, F. Díaz Martín

Abstract:

The SOM has several beneficial features which make it a useful method for data mining. One of the most important features is the ability to preserve the topology in the projection. There are several measures that can be used to quantify the goodness of the map in order to obtain the optimal projection, including the average quantization error and many topological errors. Many researchers have studied how topology preservation should be measured. One option consists of using the topographic error, which considers the ratio of data vectors for which the first and second best-matching units (BMUs) are not adjacent. In this work we present a study of the behaviour of the topographic error in different kinds of maps. We have found that this error devalues rectangular maps, and we have studied the reasons why this happens. Finally, we suggest a new topological error to remedy the deficiency of the topographic error.
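
The topographic error described above can be illustrated with a minimal sketch. It assumes a rectangular lattice in which only horizontally or vertically neighbouring units count as adjacent; the toy map, the data vectors, and all function names are hypothetical and are not taken from the paper.

```python
import math

def best_two_units(vector, weights):
    """Return grid positions of the closest and second-closest map units."""
    ranked = sorted(weights, key=lambda pos: math.dist(vector, weights[pos]))
    return ranked[0], ranked[1]

def are_adjacent(a, b):
    """Assumed adjacency: rectangular lattice, 4-neighbourhood (one grid step apart)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1

def topographic_error(data, weights):
    """Fraction of data vectors whose two best-matching units are not adjacent."""
    errors = 0
    for vector in data:
        first, second = best_two_units(vector, weights)
        if not are_adjacent(first, second):
            errors += 1
    return errors / len(data)

# Hypothetical 2x2 map: keys are grid positions, values are weight vectors.
weights = {
    (0, 0): (0.0, 0.0),
    (0, 1): (1.0, 0.0),
    (1, 0): (0.0, 1.0),
    (1, 1): (0.1, 0.1),  # lies close to unit (0, 0) in data space, but diagonal on the grid
}
data = [(0.02, 0.02), (0.9, 0.0), (0.0, 0.9)]
te = topographic_error(data, weights)
print(te)
```

Under this adjacency assumption, diagonal neighbours on a rectangular grid are counted as errors, which is one plausible way such a measure could penalize rectangular maps; whether this matches the authors' analysis is not stated in the abstract.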

Keywords: Map lattice, Self-Organizing Map, topographic error, topology preservation.

Downloads: 3014
265 Evaluation of the Internal Quality for Pineapple Based on the Spectroscopy Approach and Neural Network

Authors: Nonlapun Meenil, Pisitpong Intarapong, Thitima Wongsheree, Pranchalee Samanpiboon

Abstract:

In Thailand, once pineapples are harvested, they must be classified into two classes based on their sweetness: sweet and unsweet. This paper studies and develops an assessment of the internal quality of pineapples using a low-cost compact spectroscopy sensor, based on the spectroscopy approach and a Neural Network (NN). During the experiments, Batavia pineapples were utilized, generating 100 samples. The extracted pineapple juice of each sample was used to determine the Soluble Solid Content (SSC), which was used to label the samples as sweet or unsweet. In terms of experimental equipment, the sensor cover was specifically designed to hold the sensor and light source so as to read the reflectance at a depth of 5 mm in the pineapple flesh. Using the spectroscopy sensor, data on visible and near-infrared reflectance (Vis-NIR) were collected. The NN was used to classify the pineapple classes. Before the classification step, the preprocessing methods of class balancing, data shuffling, and standardization were applied. The 510 nm and 900 nm reflectance values of the middle parts of the pineapples were used as features of the NN. With the sequential model and ReLU activation function, 100% accuracy on the training set and 76.67% accuracy on the test set were achieved. Based on these results, the low-cost compact spectroscopy sensor achieved favorable results in classifying the sweetness of the two classes of pineapples.

Keywords: Spectroscopy, soluble solid content, pineapple, neural network.

Downloads: 126
264 Enhanced Particle Swarm Optimization Approach for Solving the Non-Convex Optimal Power Flow

Authors: M. R. AlRashidi, M. F. AlHajri, M. E. El-Hawary

Abstract:

An enhanced particle swarm optimization algorithm (PSO) is presented in this work to solve the non-convex OPF problem that has both discrete and continuous optimization variables. The objective functions considered are the conventional quadratic function and the augmented quadratic function. The latter model presents non-differentiable and non-convex regions that challenge most gradient-based optimization algorithms. The optimization variables to be optimized are the generator real power outputs and voltage magnitudes, discrete transformer tap settings, and discrete reactive power injections due to capacitor banks. The set of equality constraints taken into account are the power flow equations while the inequality ones are the limits of the real and reactive power of the generators, voltage magnitude at each bus, transformer tap settings, and capacitor banks reactive power injections. The proposed algorithm combines PSO with Newton-Raphson algorithm to minimize the fuel cost function. The IEEE 30-bus system with six generating units is used to test the proposed algorithm. Several cases were investigated to test and validate the consistency of detecting optimal or near optimal solution for each objective. Results are compared to solutions obtained using sequential quadratic programming and Genetic Algorithms.

Keywords: Particle Swarm Optimization, Optimal Power Flow, Economic Dispatch.

Downloads: 2368
263 Multidimensional Compromise Optimization for Development Ranking of the Gulf Cooperation Council Countries and Turkey

Authors: C. Ardil

Abstract:

In this research, a multidimensional compromise optimization method is proposed for multidimensional decision making analysis in the development ranking of the Gulf Cooperation Council Countries and Turkey. The proposed approach presents ranking solutions resulting from different multicriteria decision analyses, which yield different ranking orders for the same ranking problem, consisting of a set of alternatives in terms of numerous competing criteria, when they are applied to the same numerical data. The multiobjective optimization decision making problem is considered in three sequential steps. In the first step, five different criteria related to the development ranking are gathered from the research field. In the second step, the identified evaluation criteria are weighted objectively using the standard deviation procedure. In the third step, a country selection problem is illustrated with a numerical example as an application of the proposed multidimensional compromise optimization model. Finally, the multidimensional compromise optimization approach is applied to rank the Gulf Cooperation Council Countries and Turkey.
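
The second step above (objective weighting by standard deviation) can be sketched as follows. This is only an illustration of one common form of the procedure, assuming vector normalization of each criterion column followed by weights proportional to the column standard deviations; the decision matrix and function names are hypothetical, and the abstract does not confirm the exact formulas used.

```python
import math
import statistics

def vector_normalize(matrix):
    """Divide each criterion column by its Euclidean norm (vector normalization)."""
    n_cols = len(matrix[0])
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_cols)]
    return [[row[j] / norms[j] for j in range(n_cols)] for row in matrix]

def std_weights(matrix):
    """Weight each criterion by its share of the total population standard deviation."""
    normalized = vector_normalize(matrix)
    sigmas = [statistics.pstdev(column) for column in zip(*normalized)]
    total = sum(sigmas)
    return [sigma / total for sigma in sigmas]

# Hypothetical decision matrix: rows are alternatives, columns are criteria.
scores = [
    [3.0, 10.0, 1.0],
    [4.0, 10.0, 5.0],
    [5.0, 10.0, 9.0],
]
weights = std_weights(scores)
print(weights)
```

A criterion on which all alternatives score the same (the second column here) carries no discriminating information and receives zero weight, while the criterion with the largest normalized spread receives the largest weight.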

Keywords: Standard deviation, performance evaluation, multicriteria decision making, multidimensional compromise optimization, vector normalization, multicriteria analysis, multidimensional decision analysis.

Downloads: 812
262 A New Evolutionary Algorithm for Cluster Analysis

Authors: B.Bahmani Firouzi, T. Niknam, M. Nayeripour

Abstract:

Clustering is a very well known technique in data mining. One of the most widely used clustering techniques is the K-means algorithm. Solutions obtained from this technique depend on the initialization of cluster centers, and the final solution converges to local minima. In order to overcome the shortcomings of the K-means algorithm, this paper proposes a hybrid evolutionary algorithm based on the combination of PSO, SA and K-means algorithms, called PSO-SA-K, which can find better cluster partitions. The performance is evaluated on several benchmark data sets. The simulation results show that the proposed algorithm outperforms previous approaches, such as PSO, SA and K-means, for the partitional clustering problem.

Keywords: Data clustering, Hybrid evolutionary optimization algorithm, K-means algorithm, Simulated Annealing (SA), Particle Swarm Optimization (PSO).

Downloads: 2280
261 Performance Assessment of Computational Grid on Weather Indices from HOAPS Data

Authors: Madhuri Bhavsar, Anupam K Singh, Shrikant Pradhan

Abstract:

Long-term rainfall analysis and prediction is a challenging task, especially in the modern world where the impact of global warming is complicating environmental issues. These data-intensive factors require high-performance computational modeling for accurate prediction. This research paper describes a prototype which is designed and developed on a grid environment using a number of coupled software infrastructural building blocks. This grid-enabled system provides the required computational power, efficiency, resources, user-friendly interface, secured job submission and high throughput. The results obtained using sequential execution and grid-enabled execution show that computational performance improved by between 36% and 75% for a decade of climate parameters. The large variation in performance can be attributed to the varying degree of computational resources available for job execution. Grid Computing enables the dynamic runtime selection, sharing and aggregation of distributed and autonomous resources, which plays an important role not only in business but also in scientific applications and social surroundings. This research paper attempts to explore grid-enabled computing capabilities on weather indices from HOAPS data for climate impact modeling and change detection.

Keywords: Climate model, Computational Grid, Grid Application, Heterogeneous Grid

Downloads: 1443
260 Annual Power Load Forecasting Using Support Vector Regression Machines: A Study on Guangdong Province of China 1985-2008

Authors: Zhiyong Li, Zhigang Chen, Chao Fu, Shipeng Zhang

Abstract:

Load forecasting has always been the essential part of an efficient power system operation and planning. A novel approach based on support vector machines is proposed in this paper for annual power load forecasting. Different kernel functions are selected to construct a combinatorial algorithm. The performance of the new model is evaluated with a real-world dataset, and compared with two neural networks and some traditional forecasting techniques. The results show that the proposed method exhibits superior performance.

Keywords: combinatorial algorithm, data mining, load forecasting, support vector machines

Downloads: 1648
259 Mining News Sites to Create Special Domain News Collections

Authors: David B. Bracewell, Fuji Ren, Shingo Kuroiwa

Abstract:

We present a method to create special domain collections from news sites. The method only requires a single sample article as a seed. No prior corpus statistics are needed and the method is applicable to multiple languages. We examine various similarity measures and the creation of document collections for English and Japanese. The main contributions are as follows. First, the algorithm can build special domain collections from as little as one sample document. Second, unlike other algorithms, it does not require a second “general” corpus to compute statistics. Third, in our testing the algorithm outperformed others in creating collections made up of highly relevant articles.

Keywords: Information Retrieval, News, Special Domain Collections.

Downloads: 1488
258 Prediction of a Human Facial Image by ANN using Image Data and its Content on Web Pages

Authors: Chutimon Thitipornvanid, Siripun Sanguansintukul

Abstract:

Choosing the right metadata is critical, as good information (metadata) attached to an image will improve its visibility among a pile of other images. The image's value is enhanced not only by the quality of the attached metadata but also by the search technique. This study proposes a simple but efficient technique to predict a single human image from a website using the basic image data and the embedded metadata of the image's content appearing on web pages. The result is very encouraging, with a prediction accuracy of 95%. This technique may become a great aid to librarians, researchers and many others for automatically and efficiently identifying a set of human images out of a larger set of images.

Keywords: Metadata, Prediction, Multi-layer perceptron, Human facial image, Image mining.

Downloads: 1215
257 Iterative Clustering Algorithm for Analyzing Temporal Patterns of Gene Expression

Authors: Seo Young Kim, Jae Won Lee, Jong Sung Bae

Abstract:

Microarray experiments are information rich; however, extensive data mining is required to identify the patterns that characterize the underlying mechanisms of action. For biologists, a key aim when analyzing microarray data is to group genes based on the temporal patterns of their expression levels. In this paper, we used an iterative clustering method to find temporal patterns of gene expression. We evaluated the performance of this method by applying it to real sporulation data and simulated data. The patterns obtained using the iterative clustering were found to be superior to those obtained using existing clustering algorithms.

Keywords: Clustering, microarray experiment, temporal pattern of gene expression data.

Downloads: 1360
256 Automatic Detection of Breast Tumors in Sonoelastographic Images Using DWT

Authors: A. Sindhuja, V. Sadasivam

Abstract:

Breast cancer is the most common malignancy in women and the second leading cause of death for women all over the world. The earlier the detection of cancer, the better the treatment. The diagnosis and treatment of the cancer rely on segmentation of sonoelastographic images. Texture features have not previously been considered for sonoelastographic segmentation. Sonoelastographic images of 15 patients containing both benign and malignant tumors are considered for experimentation. The images are enhanced to remove noise in order to improve contrast and emphasize the tumor boundary. Each image is then decomposed into sub-bands using single-level Daubechies wavelets varying from a single coefficient to six coefficients. The Grey Level Co-occurrence Matrix (GLCM) and Local Binary Pattern (LBP) features are extracted from each sub-band and then selected by ranking them using the Sequential Floating Forward Selection (SFFS) technique. The resultant images undergo K-means clustering and then a few post-processing steps to remove false spots. The tumor boundary is detected from the segmented image. It is proposed that the Local Binary Pattern (LBP) from the vertical coefficients of the Daubechies wavelet with two coefficients is best suited for segmentation of sonoelastographic breast images among the wavelet members using one to six coefficients for decomposition. The results are also quantified with the help of an expert radiologist. The proposed work can be used in further diagnostic processes to decide whether the segmented tumor is benign or malignant.

Keywords: Breast Cancer, Segmentation, Sonoelastography, Tumor Detection.

Downloads: 2207
255 Mobile Robot Control by Von Neumann Computer

Authors: E. V. Larkin, T. A. Akimenko, A. V. Bogomolov, A. N. Privalov

Abstract:

The digital control system of mobile robots (MR) is considered. It is shown that sequential interpretation of control algorithm operators, unfolding in physical time, leads to time delays between inputting data from sensors and outputting data to actuators. Another destabilizing control factor is the presence of backlash in the joints of an actuator with an executive unit. A complex model of the control system, which takes into account the dynamics of the MR, the dynamics of the digital controller and the backlash in actuators, is developed. The digital controller model is divided into two parts. The first part describes the control law embedded in the controller in the form of a control program that realizes a polling procedure when organizing transactions to sensors and actuators. The second part describes the time delays that occur in the Von Neumann-type controller when processing data. To estimate time intervals, the algorithm is represented in the form of an ergodic semi-Markov process. For an ergodic semi-Markov process of common form, a method is proposed for estimating the wandering time from one arbitrary state to another arbitrary state. An example shows how the backlash and time delays affect the quality characteristics of the MR control system.

Keywords: Mobile robot, backlash, control algorithm, Von Neumann controller, semi-Markov process, time delay.

Downloads: 370
254 Non-Overlapping Hierarchical Index Structure for Similarity Search

Authors: Mounira Taileb, Sid Lamrous, Sami Touati

Abstract:

In order to accelerate similarity search in high-dimensional databases, we propose a new hierarchical indexing method. It is composed of offline and online phases, and our contribution concerns both. In the offline phase, after gathering all of the data into clusters and constructing a hierarchical index, the main originality of our contribution is a method for constructing bounding forms of clusters that avoids overlapping. For the online phase, our idea considerably improves the performance of similarity search, and we have also developed an adapted search algorithm for this phase. Our method, named NOHIS (Non-Overlapping Hierarchical Index Structure), uses Principal Direction Divisive Partitioning (PDDP) as its clustering algorithm. The principle of PDDP is to divide data recursively into two sub-clusters; the division is done using the hyperplane that is orthogonal to the principal direction derived from the covariance matrix and passes through the centroid of the cluster being divided. The data of each of the two resulting sub-clusters are enclosed by a minimum bounding rectangle (MBR). The two MBRs are oriented along the principal direction. Consequently, the two forms are guaranteed not to overlap. Experiments use databases containing image descriptors. Results show that the proposed method outperforms sequential scan and SR-tree in processing k-nearest neighbors.
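
A single PDDP division step as described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the NOHIS implementation: the principal direction is approximated by power iteration on the covariance matrix, the sample points and function names are hypothetical, and the construction of the oriented MBRs is omitted.

```python
import math

def centroid(points):
    """Component-wise mean of the points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def covariance(points, mean):
    """Population covariance matrix of the points."""
    dim, n = len(mean), len(points)
    cov = [[0.0] * dim for _ in range(dim)]
    for p in points:
        diff = [p[d] - mean[d] for d in range(dim)]
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += diff[i] * diff[j] / n
    return cov

def principal_direction(cov, iterations=200):
    """Approximate the dominant eigenvector of cov by power iteration."""
    dim = len(cov)
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # all points identical: no direction to find
            break
        v = [x / norm for x in w]
    return v

def pddp_split(points):
    """Split points by the hyperplane through the centroid, orthogonal to the principal direction."""
    mean = centroid(points)
    direction = principal_direction(covariance(points, mean))
    left, right = [], []
    for p in points:
        projection = sum((p[d] - mean[d]) * direction[d] for d in range(len(mean)))
        (left if projection <= 0 else right).append(p)
    return left, right

# Hypothetical 2-D descriptors forming two groups along the x-axis.
points = [(0.0, 0.0), (1.0, 0.1), (10.0, 0.0), (11.0, 0.1)]
left, right = pddp_split(points)
print(left, right)
```

Applying `pddp_split` recursively to each half would yield the binary hierarchy; because each split is made by a single hyperplane, rectangles aligned with the principal direction on either side of it cannot intersect.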

Keywords: K-nearest neighbour search, multi-dimensional indexing, multimedia databases, similarity search.

Downloads: 1563
253 WebGD: A CORBA-based Document Classification and Retrieval System on the Web

Authors: Fuyang Peng, Bo Deng, Chao Qi, Mou Zhan

Abstract:

This paper presents the design and implementation of WebGD, a CORBA-based document classification and retrieval system on the Internet. WebGD makes use of techniques such as the Web, CORBA, Java, NLP, fuzzy techniques, knowledge-based processing and database technology. A unified classification and retrieval model, classification and retrieval with a single reasoning engine, and flexible working mode configuration are some of its main features. The architecture of WebGD, the unified classification and retrieval model, the components of the WebGD server and the fuzzy inference engine are discussed in detail in this paper.

Keywords: Text Mining, document classification, knowledge processing, fuzzy logic, Web, CORBA

Downloads: 1851
252 An Overview of Construction and Demolition Waste as Coarse Aggregate in Concrete

Authors: S. R. Shamili, J. Karthikeyan

Abstract:

Rapid population growth and widespread urbanization have greatly accelerated the growth of the construction industry. As a result of these activities, old structures are being demolished to make way for new buildings. Due to these large-scale demolitions, a huge amount of debris is generated all over the world, which ends up in landfills. The use of construction and demolition waste as landfill causes groundwater contamination, which is hazardous. Using construction and demolition waste as aggregate can reduce the use of natural aggregates and the problems of mining. The objective of this study is to provide a detailed overview of how construction and demolition waste material has been used as aggregate in structural concrete. In this study, the preparation, classification, and composition of construction and demolition wastes are also discussed.

Keywords: Aggregate, construction and demolition waste, landfill, large scale demolition.

Downloads: 643
251 Optimized Facial Features-based Age Classification

Authors: Md. Zahangir Alom, Mei-Lan Piao, Md. Shariful Islam, Nam Kim, Jae-Hyeung Park

Abstract:

The evaluation and measurement of human body dimensions are achieved by physical anthropometry. This research was conducted in view of the importance of anthropometric indices of the face in forensic medicine, surgery, and medical imaging. The main goal of this research is to optimize facial feature points by establishing a mathematical relationship among facial features, and to use the optimized feature points for age classification. Since the selected facial feature points are located in the areas of the mouth, nose, eyes and eyebrows on facial images, all desired facial feature points are extracted accurately. According to the proposed method, sixteen Euclidean distances are calculated from the eighteen selected facial feature points, vertically as well as horizontally. The mathematical relationships among horizontal and vertical distances are established. Moreover, it is also discovered that the distances between facial features follow a constant ratio during age progression. The distances between the specified feature points increase with age from childhood, but the ratio of the distances does not change (d = 1.618). Finally, according to the proposed mathematical relationship, four independent feature distances related to eight feature points are selected from the sixteen distances and eighteen feature points, respectively. These four feature distances are used for age classification using the Support Vector Machine (SVM)-Sequential Minimal Optimization (SMO) algorithm, which shows around 96% accuracy. Experimental results show that the proposed system is effective and accurate for age classification.

Keywords: 3D Face Model, Face Anthropometrics, Facial Features Extraction, Feature distances, SVM-SMO

Downloads: 2048
250 Eclectic Rule-Extraction from Support Vector Machines

Authors: Nahla Barakat, Joachim Diederich

Abstract:

Support vector machines (SVMs) have shown superior performance compared to other machine learning techniques, especially in classification problems. Yet one limitation of SVMs is the lack of an explanation capability which is crucial in some applications, e.g. in the medical and security domains. In this paper, a novel approach for eclectic rule-extraction from support vector machines is presented. This approach utilizes the knowledge acquired by the SVM and represented in its support vectors as well as the parameters associated with them. The approach includes three stages; training, propositional rule-extraction and rule quality evaluation. Results from four different experiments have demonstrated the value of the approach for extracting comprehensible rules of high accuracy and fidelity.

Keywords: Data mining, hybrid rule-extraction algorithms, medical diagnosis, SVMs

Downloads: 1713
249 Investigation of Cytotoxic Compounds in Ethyl Acetate and Chloroform Extracts of Nigella sativa by Sulforhodamine-B Assay-Guided Fractionation

Authors: Harshani Uggallage, Kapila D. Dissanayaka

Abstract:

A Sulforhodamine-B assay-guided fractionation on Nigella sativa seeds was conducted to determine the presence of cytotoxic compounds against human hepatoma (HepG2) cells. Initially, a freeze-dried sample of Nigella sativa seeds was sequentially extracted into solvents of increasing polarities. Crude extracts from the sequential extraction of Nigella sativa seeds in chloroform and ethyl acetate showed the highest cytotoxicity. The combined mixture of these two extracts was subjected to bioassay guided fractionation using a modified Kupchan method of partitioning, followed by Sephadex® LH-20 chromatography. This chromatographic separation process resulted in a column fraction with a convincing IC50 (half-maximal inhibitory concentration) value of 13.07 µg/ml, which is considerable for developing therapeutic drug leads against human hepatoma. Reversed phase High-Performance Liquid Chromatography (HPLC) was finally conducted for the same column fraction and the result indicates the presence of one or several main cytotoxic compounds against human HepG2 cells.

Keywords: Cytotoxic compounds, half-maximal inhibitory concentration, high-performance liquid chromatography, human HepG2 cells, Nigella sativa seeds, Sulforhodamine-B assay-guided fractionation.

Downloads 446
248 Fast Painting with Different Colors Using Cross Correlation in the Frequency Domain

Authors: Hazem M. El-Bakry

Abstract:

In this paper, a new technique for fast painting with different colors is presented. The idea of painting relies on applying masks with different colors to the background. Fast painting is achieved by applying these masks in the frequency domain instead of the spatial (time) domain. New colors can be generated automatically as a result of the cross correlation operation. This idea was previously applied successfully for faster detection of specific data (face, object, pattern, and code) using neural algorithms. Here, instead of performing cross correlation between the input data (e.g., an image or a stream of sequential data) and the weights of neural networks, the cross correlation is performed between the colored masks and the background. Furthermore, this approach is developed to reduce the computation steps required by the painting operation. The divide and conquer principle is applied through background decomposition. Each background is divided into small sub-backgrounds, and each sub-background is then processed separately by a single faster painting algorithm. Moreover, the fastest painting is achieved by using parallel processing techniques to paint the resulting sub-backgrounds with the same number of faster painting algorithms. In contrast to using the faster painting algorithm alone, the speed-up ratio increases with the size of the background when the faster painting algorithm is combined with background decomposition. Simulation results show that painting in the frequency domain is faster than painting in the spatial domain.
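
The equivalence this abstract relies on, that cross correlation computed in the frequency domain gives the same result as in the spatial domain, can be illustrated with a minimal NumPy sketch. This is not the author's implementation: the function names, the array sizes, and the use of a single-channel background are assumptions made for illustration only.

```python
import numpy as np

def cross_correlate_spatial(background, mask):
    """Direct (spatial-domain) cross correlation over all fully overlapping positions."""
    H, W = background.shape
    h, w = mask.shape
    out = np.zeros((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(background[i:i + h, j:j + w] * mask)
    return out

def cross_correlate_frequency(background, mask):
    """Same result via the FFT: corr = IFFT(FFT(background) * conj(FFT(mask)))."""
    H, W = background.shape
    h, w = mask.shape
    f_bg = np.fft.rfft2(background)
    # Zero-pad the mask to the background size before transforming.
    f_mask = np.fft.rfft2(mask, s=(H, W))
    full = np.fft.irfft2(f_bg * np.conj(f_mask), s=(H, W))
    # Keep only shifts where the mask lies fully inside the background,
    # so the circular wrap-around of the FFT never contributes.
    return full[:H - h + 1, :W - w + 1]

# Hypothetical data: a random single-channel background and a small mask.
rng = np.random.default_rng(0)
background = rng.random((16, 12))
mask = rng.random((3, 4))
same = np.allclose(cross_correlate_spatial(background, mask),
                   cross_correlate_frequency(background, mask))
```

The spatial version costs on the order of H·W·h·w multiplications, while the FFT version costs on the order of H·W·log(H·W) regardless of mask size, which is the usual source of the speed-up for larger masks.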

Keywords: Fast Painting, Cross Correlation, Frequency Domain, Parallel Processing

Downloads 1797
247 Issue Reorganization Using the Measure of Relevance

Authors: William Wong Xiu Shun, Yoonjin Hyun, Mingyu Kim, Seongi Choi, Namgyu Kim

Abstract:

The need to extract R&D keywords from issues and use them to retrieve R&D information is increasing rapidly. However, it is difficult to identify related issues or to distinguish between them. Although the similarity between issues cannot be identified directly, an R&D lexicon makes it possible to determine which issues share the same R&D keywords. In detail, the R&D keywords associated with a particular issue imply the key technology elements needed to solve that issue. Furthermore, the relationship among issues that share the same R&D keywords can be shown more systematically by clustering them according to keywords. Thus, sharing R&D results and reusing R&D technology can be facilitated. Indirectly, redundant investment in R&D can be reduced, as the relevant R&D information can be shared among corresponding issues and the reusability of related R&D can be improved. Therefore, a methodology to cluster issues from the perspective of common R&D keywords is proposed to satisfy these demands.
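
The idea of grouping issues by the R&D keywords they share can be sketched as follows. The abstract does not specify the clustering method, so everything here is an assumption for illustration: the Jaccard overlap measure, the 0.5 threshold, the single-link grouping via union-find, and the example issues and keywords are all hypothetical.

```python
from itertools import combinations

def jaccard(a, b):
    """Overlap between two keyword sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_issues(issue_keywords, threshold=0.5):
    """Group issues whose keyword sets overlap by at least `threshold`.

    Issues are linked pairwise and linked groups are merged (single-link),
    using a small union-find structure.
    """
    issues = list(issue_keywords)
    parent = {issue: issue for issue in issues}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in combinations(issues, 2):
        if jaccard(issue_keywords[a], issue_keywords[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for issue in issues:
        groups.setdefault(find(issue), set()).add(issue)
    return sorted(sorted(group) for group in groups.values())

# Hypothetical issues mapped to hypothetical R&D keywords from a lexicon.
example = {
    "battery fire": {"lithium-ion", "thermal management"},
    "EV range": {"lithium-ion", "thermal management", "fast charging"},
    "crop disease": {"genome sequencing"},
}
clusters = cluster_issues(example, threshold=0.5)
```

In this toy input the first two issues share two of three distinct keywords and end up in one cluster, while the third issue shares none and stays alone.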

Keywords: Clustering, Social Network Analysis, Text Mining, Topic Analysis.

Downloads 2038
246 Modeling Language for Constructing Solvers in Machine Learning: Reductionist Perspectives

Authors: Tsuyoshi Okita

Abstract:

For a given problem, finding an efficient algorithm has traditionally been the subject of study. However, an alternative approach, orthogonal to this one, has emerged: reduction. In general, the reduction approach studies how to convert an original problem into subproblems. This paper proposes a formal modeling language that supports the reduction approach so that a solver can be built quickly. We show three examples from the wide area of learning problems. The benefit is fast prototyping of algorithms for a new problem. Note that our formal modeling language is not intended to provide an efficient notation for data mining applications, but to assist designers who develop solvers in machine learning.

Keywords: Formal language, statistical inference problem, reduction.

Downloads 1329
245 Ozone Decomposition over Silver-Loaded Perlite

Authors: Krassimir Genov, Vladimir Georgiev, Todor Batakliev, Dipak K. Sarker

Abstract:

The Bulgarian natural expanded mineral obtained from Bentonite AD perlite (a deposit of "The Broken Mountain" for perlite mining, near the village of Vodenicharsko, in the municipality of Djebel) was loaded with silver in ionic form (Ag+, 2 and 5 wt%, by the incipient wetness impregnation method) and as atomic silver (Ag0) using Tollens' reagent (silver mirror reaction). Physicochemical characterization of the samples is provided via DC arc-AES, XRD, DR-IR and UV-VIS. The aim of this work was to obtain and test the silver-loaded catalyst for ozone decomposition. The samples loaded with atomic silver show ca. 80% conversion of ozone 20 minutes after the start of the reaction. The conversion then decreases to ca. 20% but remains stable over prolonged reaction time.

Keywords: aluminum-silicates, Ag/perlite expanded glass, ozone decomposition

Downloads 2270
244 Methodology of Restoration Research in Czech Republic

Authors: M. Rehor, V. Ondracek

Abstract:

Restoration research has recently become fundamentally important in the Czech Republic. The reason is simple: more than 70% of mined brown coal currently comes from the North Bohemian Basin. Open-cast brown coal mining has led to extensive damage to the landscape. Reclamation of phytotoxic areas is one of the serious problems in the North Bohemian Basin. It mainly concerns areas where overburden rocks from the coal bed, enriched with coal, occur. The presented paper describes the characteristics of the important phytotoxic areas and the methodology of their reclamation. The results are documented by long-term monitoring of physical, mineralogical, chemical and pedological parameters of rocks in the testing areas.

Keywords: Brown coal, dump, methodology, restoration.

Downloads 1544