Search results for: positive count data.

7626 Software Test Data Generation using Ant Colony Optimization

Abstract:

State-based testing is frequently used in software testing. Test data generation is one of the key issues in software testing. A properly generated test suite may not only locate the errors in a software system, but also help in reducing the high cost associated with software testing. It is often desired that test data in the form of test sequences within a test suite can be automatically generated to achieve required test coverage. This paper proposes an Ant Colony Optimization approach to test data generation for the state-based software testing.

Keywords: Software testing, ant colony optimization, UML.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3459

7625 Natural Language News Generation from Big Data

Authors: Bastian Haarmann, Lukas Sikorski

Abstract:

In this paper, we introduce an NLG application for the automatic creation of ready-to-publish texts from big data. The resulting fully automatic generated news stories have a high resemblance to the style in which the human writer would draw up such a story. Topics include soccer games, stock exchange market reports, and weather forecasts. Each generated text is unique. Readyto-publish stories written by a computer application can help humans to quickly grasp the outcomes of big data analyses, save timeconsuming pre-formulations for journalists and cater to rather small audiences by offering stories that would otherwise not exist.

Keywords: Big data, natural language generation, publishing, robotic journalism.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1687

7624 Yield Prediction Using Support Vectors Based Under-Sampling in Semiconductor Process

Authors: Sae-Rom Pak, Seung Hwan Park, Jeong Ho Cho, Daewoong An, Cheong-Sool Park, Jun Seok Kim, Jun-Geol Baek

Abstract:

It is important to predict yield in semiconductor test process in order to increase yield. In this study, yield prediction means finding out defective die, wafer or lot effectively. Semiconductor test process consists of some test steps and each test includes various test items. In other world, test data has a big and complicated characteristic. It also is disproportionably distributed as the number of data belonging to FAIL class is extremely low. For yield prediction, general data mining techniques have a limitation without any data preprocessing due to eigen properties of test data. Therefore, this study proposes an under-sampling method using support vector machine (SVM) to eliminate an imbalanced characteristic. For evaluating a performance, randomly under-sampling method is compared with the proposed method using actual semiconductor test data. As a result, sampling method using SVM is effective in generating robust model for yield prediction.

Keywords: Yield Prediction, Semiconductor Test Process, Support Vector Machine, Under Sampling

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2398

7623 A New Model for Discovering XML Association Rules from XML Documents

Authors: R. AliMohammadzadeh, M. Rahgozar, A. Zarnani

Abstract:

The inherent flexibilities of XML in both structure and semantics makes mining from XML data a complex task with more challenges compared to traditional association rule mining in relational databases. In this paper, we propose a new model for the effective extraction of generalized association rules form a XML document collection. We directly use frequent subtree mining techniques in the discovery process and do not ignore the tree structure of data in the final rules. The frequent subtrees based on the user provided support are split to complement subtrees to form the rules. We explain our model within multi-steps from data preparation to rule generation.

Keywords: XML, Data Mining, Association Rule Mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1631

7622 Modelling Silica Optical Fibre Reliability: A Software Application

Authors: I. Severin, M. Caramihai, R. El Abdi, M. Poulain, A. Avadanii

Abstract:

In order to assess optical fiber reliability in different environmental and stress conditions series of testing are performed simulating overlapping of chemical and mechanical controlled varying factors. Each series of testing may be compared using statistical processing: i.e. Weibull plots. Due to the numerous data to treat, a software application has appeared useful to interpret selected series of experiments in function of envisaged factors. The current paper presents a software application used in the storage, modelling and interpretation of experimental data gathered from optical fibre testing. The present paper strictly deals with the software part of the project (regarding the modelling, storage and processing of user supplied data).

Keywords: Optical fibres, computer aided analysis, data models, data processing, graphical user interfaces.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1823

7621 Biological Effects of a Carbohydrate-Binding Protein from an Annelid, Perinereis nuntia Against Human and Phytopathogenic Microorganisms

Authors: Sarkar M. A. Kawsar, Sarkar M. A. Mamun, Md S. Rahman, Hidetaro Yasumitsu, Yasuhiro Ozeki

Abstract:

Lectins have a good scope in current clinical microbiology research. In the present study evaluated the antimicrobial activities of a D-galactose binding lectin (PnL) was purified from the annelid, Perinereis nuntia (polychaeta) by affinity chromatography. The molecular mass of the lectin was determined to be 32 kDa as a single polypeptide by SDS-PAGE under both reducing and non-reducing conditions. The hemagglutinating activity of the PnL showed against trypsinized and glutaraldehyde-fixed human erythrocytes was specifically inhibited by D-Gal, GalNAc, Galβ1-4Glc and Galα1-6Glc. PnL was evaluated for in vitro antibacterial screening studies against 11 gram-positive and gram-negative microorganisms. From the screening results, it was revealed that PnL exhibited significant antibacterial activity against gram-positive bacteria. Bacillus megaterium showed the highest growth inhibition by the lectin (250 μg/disc). However, PnL did not inhibit the growth of gram-negative bacteria such as Vibrio cholerae and Pseudomonas sp. PnL was also examined for in vitro antifungal activity against six fungal phytopathogens. PnL (100 μg/mL) inhibited the mycelial growth of Alternaria alternata (24.4%). These results indicate that future findings of lectin applications obtained from annelids may be of importance to life sciences.

Keywords: Perinereis nuntia, Lectin, Inhibition zone, Mycelial growth.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1978

7620 The Role of Synthetic Data in Aerial Object Detection

Authors: Ava Dodd, Jonathan Adams

Abstract:

The purpose of this study is to explore the characteristics of developing a machine learning application using synthetic data. The study is structured to develop the application for the purpose of deploying the computer vision model. The findings discuss the realities of attempting to develop a computer vision model for practical purpose, and detail the processes, tools and techniques that were used to meet accuracy requirements. The research reveals that synthetic data represent another variable that can be adjusted to improve the performance of a computer vision model. Further, a suite of tools and tuning recommendations are provided.

Keywords: computer vision, machine learning, synthetic data, YOLOv4

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 852

7619 Unsupervised Text Mining Approach to Early Warning System

Authors: Ichihan Tai, Bill Olson, Paul Blessner

Abstract:

Traditional early warning systems that alarm against crisis are generally based on structured or numerical data; therefore, a system that can make predictions based on unstructured textual data, an uncorrelated data source, is a great complement to the traditional early warning systems. The Chicago Board Options Exchange (CBOE) Volatility Index (VIX), commonly referred to as the fear index, measures the cost of insurance against market crash, and spikes in the event of crisis. In this study, news data is consumed for prediction of whether there will be a market-wide crisis by predicting the movement of the fear index, and the historical references to similar events are presented in an unsupervised manner. Topic modeling-based prediction and representation are made based on daily news data between 1990 and 2015 from The Wall Street Journal against VIX index data from CBOE.

Keywords: Early Warning System, Knowledge Management, Topic Modeling, Market Prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1920

7618 An Implementation of Data Reusable MPEG Video Coding Scheme

Authors: Vasily G. Moshnyaga

Abstract:

This paper presents an optimized MPEG2 video codec implementation, which drastically reduces the number of computations and memory accesses required for video compression. Unlike traditional scheme, we reuse data stored in frame memory to omit unnecessary coding operations and memory read/writes for unchanged macroblocks. Due to dynamic memory sharing among reference frames, data-driven macroblock characterization and selective macroblock processing, we perform less than 15% of the total operations required by a conventional coder while maintaining high picture quality.

Keywords: Data reuse, adaptive processing, video coding, MPEG

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1266

7617 Antecedents of Word-of-Mouth for Meat with Traceability: Evidence from Thai Consumers

Authors: Kawpong Polyorat, Nathamon Buaprommee

Abstract:

Because of the outbreak of mad cow disease and bird flu, consumers have become more concerned with quality and safety of meat and poultry. As a consequence, meat traceability has been implemented as a tool to raise the standard in the meat production industry. In Thailand, while traceability is relatively common among the manufacturer-wholesaler-retailers cycle, it is rarely used as a marketing tool specifically designed to persuade consumers who are the actual meat endusers. Therefore, the present study attempts to understand what influences consumers to spread their words-of-mouth (WOM) regarding meat with traceability by conducting a study in Thailand where research in this area is rather scant. Data were collected from one hundred and sixty-seven consumers in the northeastern region and analyzed with SEM. The study results reveal that perceived usefulness of traceability system, social norms, and product class knowledge are significant antecedents where consumers spread positive words regarding meat with traceability system. A number of theoretical and managerial implications as well as future study directions are offered at the end of this study report.

Keywords: Perceived usefulness, product knowledge, social norms, traceability, word-of-mouth,

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1645

7616 Fermentation of Germinated Native Black Rice Milk Mixture by Probiotic Lactic Acid Bacteria

Authors: N. Mongkontanawat

Abstract:

This research aimed to demonstrate probiotic germinated native black rice juice fermentation by lactic acid bacteria (Lactobacillus casei TISTR 390). Germinated native black rice juice was inoculated with a 24-h old lactic culture and incubated at 30 °C for 72 hours. Changes in pH, acidity, total soluble solid, and viable cell counts during fermentation under controlled conditions at 0-h, 24-h, 48-h, and 72-h fermentations were evaluated. The study found out that the change in pH and total soluble solid of probiotic germinated black rice juice significantly (p ≤ 0.05) decreased at 72-h fermentation (5.67±0.12 to 2.86±0.04 and 7.00±0.00 to 6.40±0.00 ºbrix at 0-h and 72-h fermentations, respectively). On the other hand, the amount of titratable acidity expressed as lactic acid and the viable cell count significantly (p≤0.05) increased at 72-h fermentation (0.11±0.06 to 0.43±0.06 (% lactic acid) and 3.60 x 10⁶ to 2.75 x 10⁸ CFU/ml at 0-h and 72-h fermentations, respectively). Interestingly, the amount of γ-Amino Butyric Acid (GABA) had a significant difference (p≤0.05) twice as high as that of the control group (0.25±0.01 and 0.13±0.01 mg/100g, respectively). In addition, the free radical scavenging capacity assayed by DPPH method also showed that the IC₅₀ values were significantly (p≤0.05) higher than the control (147.71±0.96 and 202.55±1.24 mg/ml, respectively). After 4 weeks of cold storage at 4 °C, the viable cell counts of lactic acid bacteria reduced to 1.37 x 10⁶ CFU/ml. In conclusion, fermented germinated native black rice juice could be served as a healthy beverage for vegans and people who are allergic to cow milk products.

Keywords: Germinated native black rice, probiotic, lactic acid bacteria, Lactobacillus casei.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1642

7615 A Hybrid Scheme for on-Line Diagnostic Decision Making Using Optimal Data Representation and Filtering Technique

Authors: Hyun-Woo Cho

Abstract:

The early diagnostic decision making in industrial processes is absolutely necessary to produce high quality final products. It helps to provide early warning for a special event in a process, and finding its assignable cause can be obtained. This work presents a hybrid diagnostic schmes for batch processes. Nonlinear representation of raw process data is combined with classification tree techniques. The nonlinear kernel-based dimension reduction is executed for nonlinear classification decision boundaries for fault classes. In order to enhance diagnosis performance for batch processes, filtering of the data is performed to get rid of the irrelevant information of the process data. For the diagnosis performance of several representation, filtering, and future observation estimation methods, four diagnostic schemes are evaluated. In this work, the performance of the presented diagnosis schemes is demonstrated using batch process data.

Keywords: Diagnostics, batch process, nonlinear representation, data filtering, multivariate statistical approach

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1317

7614 Increasing Replica Consistency Performances with Load Balancing Strategy in Data Grid Systems

Authors: Sarra Senhadji, Amar Kateb, Hafida Belbachir

Abstract:

Data replication in data grid systems is one of the important solutions that improve availability, scalability, and fault tolerance. However, this technique can also bring some involved issues such as maintaining replica consistency. Moreover, as grid environment are very dynamic some nodes can be more uploaded than the others to become eventually a bottleneck. The main idea of our work is to propose a complementary solution between replica consistency maintenance and dynamic load balancing strategy to improve access performances under a simulated grid environment.

Keywords: Consistency, replication, data grid, load balancing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2325

7613 Nonparametric Control Chart Using Density Weighted Support Vector Data Description

Authors: Myungraee Cha, Jun Seok Kim, Seung Hwan Park, Jun-Geol Baek

Abstract:

In manufacturing industries, development of measurement leads to increase the number of monitoring variables and eventually the importance of multivariate control comes to the fore. Statistical process control (SPC) is one of the most widely used as multivariate control chart. Nevertheless, SPC is restricted to apply in processes because its assumption of data as following specific distribution. Unfortunately, process data are composed by the mixture of several processes and it is hard to estimate as one certain distribution. To alternative conventional SPC, therefore, nonparametric control chart come into the picture because of the strength of nonparametric control chart, the absence of parameter estimation. SVDD based control chart is one of the nonparametric control charts having the advantage of flexible control boundary. However,basic concept of SVDD has been an oversight to the important of data characteristic, density distribution. Therefore, we proposed DW-SVDD (Density Weighted SVDD) to cover up the weakness of conventional SVDD. DW-SVDD makes a new attempt to consider dense of data as introducing the notion of density Weight. We extend as control chart using new proposed SVDD and a simulation study of various distributional data is conducted to demonstrate the improvement of performance.

Keywords: Density estimation, Multivariate control chart, Oneclass classification, Support vector data description (SVDD)

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2121

7612 Model-Based Person Tracking Through Networked Cameras

Authors: Kyoung-Mi Lee, Youn-Mi Lee

Abstract:

This paper proposes a way to track persons by making use of multiple non-overlapping cameras. Tracking persons on multiple non-overlapping cameras enables data communication among cameras through the network connection between a camera and a computer, while at the same time transferring human feature data captured by a camera to another camera that is connected via the network. To track persons with a camera and send the tracking data to another camera, the proposed system uses a hierarchical human model that comprises a head, a torso, and legs. The feature data of the person being modeled are transferred to the server, after which the server sends the feature data of the human model to the cameras connected over the network. This enables a camera that captures a person's movement entering its vision to keep tracking the recognized person with the use of the feature data transferred from the server.

Keywords: Person tracking, human model, networked cameras, vision-based surveillance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1489

7611 Slugging Frequency Correlation for Inclined Gas-liquid Flow

Authors: V. Hernandez-Perez, M. Abdulkadir, B. J. Azzopardi

Abstract:

In this work, new experimental data for slugging frequency in inclined gas-liquid flow are reported, and a new correlation is proposed. Scale experiments were carried out using a mixture of air and water in a 6 m long pipe. Two different pipe diameters were used, namely, 38 and 67 mm. The data were taken with capacitance type sensors at a data acquisition frequency of 200 Hz over an interval of 60 seconds. For the range of flow conditions studied, the liquid superficial velocity is observed to influence the frequency strongly. A comparison of the present data with correlations available in the literature reveals a lack of agreement. A new correlation for slug frequency has been proposed for the inclined flow, which represents the main contribution of this work.

Keywords: slug frequency, inclined flow

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3163

7610 FCA-based Conceptual Knowledge Discovery in Folksonomy

Authors: Yu-Kyung Kang, Suk-Hyung Hwang, Kyoung-Mo Yang

Abstract:

The tagging data of (users, tags and resources) constitutes a folksonomy that is the user-driven and bottom-up approach to organizing and classifying information on the Web. Tagging data stored in the folksonomy include a lot of very useful information and knowledge. However, appropriate approach for analyzing tagging data and discovering hidden knowledge from them still remains one of the main problems on the folksonomy mining researches. In this paper, we have proposed a folksonomy data mining approach based on FCA for discovering hidden knowledge easily from folksonomy. Also we have demonstrated how our proposed approach can be applied in the collaborative tagging system through our experiment. Our proposed approach can be applied to some interesting areas such as social network analysis, semantic web mining and so on.

Keywords: Folksonomy data mining, formal concept analysis, collaborative tagging, conceptual knowledge discovery, classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2028

7609 Minimum Data of a Speech Signal as Special Indicators of Identification in Phonoscopy

Authors: Nazaket Gazieva

Abstract:

Voice biometric data associated with physiological, psychological and other factors are widely used in forensic phonoscopy. There are various methods for identifying and verifying a person by voice. This article explores the minimum speech signal data as individual parameters of a speech signal. Monozygotic twins are believed to be genetically identical. Using the minimum data of the speech signal, we came to the conclusion that the voice imprint of monozygotic twins is individual. According to the conclusion of the experiment, we can conclude that the minimum indicators of the speech signal are more stable and reliable for phonoscopic examinations.

Keywords: Biometric voice prints, fundamental frequency, phonogram, speech signal, temporal characteristics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 577

7608 An Empirical Evaluation of Performance of Machine Learning Techniques on Imbalanced Software Quality Data

Authors: Ruchika Malhotra, Megha Khanna

Abstract:

The development of change prediction models can help the software practitioners in planning testing and inspection resources at early phases of software development. However, a major challenge faced during the training process of any classification model is the imbalanced nature of the software quality data. A data with very few minority outcome categories leads to inefficient learning process and a classification model developed from the imbalanced data generally does not predict these minority categories correctly. Thus, for a given dataset, a minority of classes may be change prone whereas a majority of classes may be non-change prone. This study explores various alternatives for adeptly handling the imbalanced software quality data using different sampling methods and effective MetaCost learners. The study also analyzes and justifies the use of different performance metrics while dealing with the imbalanced data. In order to empirically validate different alternatives, the study uses change data from three application packages of open-source Android data set and evaluates the performance of six different machine learning techniques. The results of the study indicate extensive improvement in the performance of the classification models when using resampling method and robust performance measures.

Keywords: Change proneness, empirical validation, imbalanced learning, machine learning techniques, object-oriented metrics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1520

7607 Students’ Perception and Patterns of Listening Behavior in an Online Forum Discussion

Authors: K. L. Wong, I. N. Umar

Abstract:

Online forum is part of a Learning Management System (LMS) environment in which students share their opinions. This study attempts to investigate the perceptions of students towards online forum and their patterns of listening behavior during the forum interaction. The students’ perceptions were measured using a questionnaire, in which seven dimensions were used involving online experience, benefits of forum participation, cost of participation, perceived ease of use, usefulness, attitude, and intention. Meanwhile, their patterns of listening behaviors were obtained using the log file extracted from the LMS. A total of 25 postgraduate students undertaking a course were involved in this study, and their activities in the forum session were recorded by the LMS and used as a log file. The results from the questionnaire analysis indicated that the students perceived that the forum is easy to use, useful, and bring benefits to them. Also, they showed positive attitude towards online forum, and they have the intention to use it in future. Based on the log data, the participants were also divided into six clusters of listening behavior, in which they are different in terms of temporality, breadth, depth and speaking level. The findings were compared to previous clusters grouping and future recommendations are also discussed.

Keywords: e-learning, learning management system, listening behavior, online forum.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2420

7606 Plant Varieties Selection System

Authors: Kitti Koonsanit, Chuleerat Jaruskulchai, Poonsak Miphokasap, Apisit Eiumnoh

Abstract:

In the end of the day, meteorological data and environmental data becomes widely used such as plant varieties selection system. Variety plant selection for planted area is of almost importance for all crops, including varieties of sugarcane. Since sugarcane have many varieties. Variety plant non selection for planting may not be adapted to the climate or soil conditions for planted area. Poor growth, bloom drop, poor fruit, and low price are to be from varieties which were not recommended for those planted area. This paper presents plant varieties selection system for planted areas in Thailand from meteorological data and environmental data by the use of decision tree techniques. With this software developed as an environmental data analysis tool, it can analyze resulting easier and faster. Our software is a front end of WEKA that provides fundamental data mining functions such as classify, clustering, and analysis functions. It also supports pre-processing, analysis, and decision tree output with exporting result. After that, our software can export and display data result to Google maps API in order to display result and plot plant icons effectively.

Keywords: Plant varieties selection system, decision tree, expert recommendation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1793

7605 Jitter Transfer in High Speed Data Links

Authors: Tsunwai Gary Yip

Abstract:

Phase locked loops for data links operating at 10 Gb/s or faster are low phase noise devices designed to operate with a low jitter reference clock. Characterization of their jitter transfer function is difficult because the intrinsic noise of the device is comparable to the random noise level in the reference clock signal. A linear model is proposed to account for the intrinsic noise of a PLL. The intrinsic noise data of a PLL for 10 Gb/s links is presented. The jitter transfer function of a PLL in a test chip for 12.8 Gb/s data links was determined in experiments using the 400 MHz reference clock as the source of simultaneous excitations over a wide range of frequency. The result shows that the PLL jitter transfer function can be approximated by a second order linear model.

Keywords: Intrinsic phase noise, jitter in data link, PLL jitter transfer function, high speed clocking in electronic circuit

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1946

7604 Effect of Plant Nutrients on Anthocyanin Content and Yield Component of Black Glutinous Rice Plants

Authors: Chonlada Bennett, Phumon Sookwong, Sakul Moolkam, Sivapong Naruebal Sugunya Mahatheeranont

Abstract:

The cultivation of black glutinous rice rich in anthocyanins can provide great benefits to both farmers and consumers. Total anthocyanins content and yield component data of black glutinous rice cultivar (KHHK) grown with the addition of mineral elements (Ca, Mg, Cu, Cr, Fe and Se) under soilless conditions were studied. Ca application increased seed anthocyanins content by three-folds compared to controls. Cu application to rice plants obtained the highest number of grains panicle, panicle length and subsequently high panicle weight. Se application had the largest effect on leaf anthocyanins content, the number of tillers, number of panicles and 100-grain weight. These findings showed that the addition of mineral elements had a positive effect on increasing anthocyanins content in black rice plants and seeds as well as the heightened development of black glutinous rice plant growth.

Keywords: Anthocyanins, black glutinous rice, mineral elements, soilless culture.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 849

7603 Knowledge-Driven Decision Support System Based on Knowledge Warehouse and Data Mining by Improving Apriori Algorithm with Fuzzy Logic

Authors: Pejman Hosseinioun, Hasan Shakeri, Ghasem Ghorbanirostam

Abstract:

In recent years, we have seen an increasing importance of research and study on knowledge source, decision support systems, data mining and procedure of knowledge discovery in data bases and it is considered that each of these aspects affects the others. In this article, we have merged information source and knowledge source to suggest a knowledge based system within limits of management based on storing and restoring of knowledge to manage information and improve decision making and resources. In this article, we have used method of data mining and Apriori algorithm in procedure of knowledge discovery one of the problems of Apriori algorithm is that, a user should specify the minimum threshold for supporting the regularity. Imagine that a user wants to apply Apriori algorithm for a database with millions of transactions. Definitely, the user does not have necessary knowledge of all existing transactions in that database, and therefore cannot specify a suitable threshold. Our purpose in this article is to improve Apriori algorithm. To achieve our goal, we tried using fuzzy logic to put data in different clusters before applying the Apriori algorithm for existing data in the database and we also try to suggest the most suitable threshold to the user automatically.

Keywords: Decision support system, data mining, knowledge discovery, data discovery, fuzzy logic.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2132

7602 A New Algorithm for Cluster Initialization

Authors: Moth'd Belal. Al-Daoud

Abstract:

Clustering is a very well known technique in data mining. One of the most widely used clustering techniques is the k-means algorithm. Solutions obtained from this technique are dependent on the initialization of cluster centers. In this article we propose a new algorithm to initialize the clusters. The proposed algorithm is based on finding a set of medians extracted from a dimension with maximum variance. The algorithm has been applied to different data sets and good results are obtained.

Keywords: clustering, k-means, data mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2103

7601 Approximate Frequent Pattern Discovery Over Data Stream

Authors: Kittisak Kerdprasop, Nittaya Kerdprasop

Abstract:

Frequent pattern discovery over data stream is a hard problem because a continuously generated nature of stream does not allow a revisit on each data element. Furthermore, pattern discovery process must be fast to produce timely results. Based on these requirements, we propose an approximate approach to tackle the problem of discovering frequent patterns over continuous stream. Our approximation algorithm is intended to be applied to process a stream prior to the pattern discovery process. The results of approximate frequent pattern discovery have been reported in the paper.

Keywords: Frequent pattern discovery, Approximate algorithm, Data stream analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1342

7600 An Adaptive Hand-Talking System for the Hearing Impaired

Authors: Zhou Yu, Jiang Feng

Abstract:

An adaptive Chinese hand-talking system is presented in this paper. By analyzing the 3 data collecting strategies for new users, the adaptation framework including supervised and unsupervised adaptation methods is proposed. For supervised adaptation, affinity propagation (AP) is used to extract exemplar subsets, and enhanced maximum a posteriori / vector field smoothing (eMAP/VFS) is proposed to pool the adaptation data among different models. For unsupervised adaptation, polynomial segment models (PSMs) are used to help hidden Markov models (HMMs) to accurately label the unlabeled data, then the "labeled" data together with signerindependent models are inputted to MAP algorithm to generate signer-adapted models. Experimental results show that the proposed framework can execute both supervised adaptation with small amount of labeled data and unsupervised adaptation with large amount of unlabeled data to tailor the original models, and both achieve improvements on the performance of recognition rate.

Keywords: sign language recognition, signer adaptation, eMAP/VFS, polynomial segment model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1759

7599 Wavelet-Based Data Compression Technique for Wireless Sensor Networks

Authors: P. Kumsawat, N. Pimpru, K. Attakitmongcol, A.Srikaew

Abstract:

In this paper, we proposed an efficient data compression strategy exploiting the multi-resolution characteristic of the wavelet transform. We have developed a sensor node called “Smart Sensor Node; SSN". The main goals of the SSN design are lightweight, minimal power consumption, modular design and robust circuitry. The SSN is made up of four basic components which are a sensing unit, a processing unit, a transceiver unit and a power unit. FiOStd evaluation board is chosen as the main controller of the SSN for its low costs and high performance. The software coding of the implementation was done using Simulink model and MATLAB programming language. The experimental results show that the proposed data compression technique yields recover signal with good quality. This technique can be applied to compress the collected data to reduce the data communication as well as the energy consumption of the sensor and so the lifetime of sensor node can be extended.

Keywords: Wireless sensor network, wavelet transform, data compression, ZigBee, skipped high-pass sub-band.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2989

7598 Assessment of the Efficacy of Oral Vaccination of Wild Canids and Stray Dogs against Rabies in Azerbaijan

Authors: E. N. Hasanov, K. Y. Yusifova, M. A. Ali

Abstract:

Rabies is a zoonotic disease that causes acute encephalitis in domestic and wild carnivores. The goal of this investigation was to analyze the data on oral vaccination of wild canids and stray dogs in Azerbaijan. Before the start of vaccination campaign conducted by the IDEA (International Dialogue for Environmental Action) Animal Care Center (IACC), all rabies cases in Azerbaijan for the period of 2017-2020 were analyzed. So, 30 regions for oral immunization with the Rabadrop vaccine were selected. In total, 95.9 thousand doses of baits were scattered in 30 regions, 970 (0.97%) remained intact. In addition, a campaign to sterilize and vaccinate stray dogs and cats undoubtedly had a positive impact on reducing the dynamics of rabies incidence. During the period 2017-2020, 2,339 dogs and 2,962 cats were sterilized and vaccinated under this program. It can be noted that the risk of rabies infection can be reduced through special preventive measures against disease reservoirs, which include oral immunization of wild and stray animals.

Keywords: Rabies, vaccination, oral immunization, wild canids, stray dogs, vaccine, disease reservoirs.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 487

7597 Model of Optimal Centroids Approach for Multivariate Data Classification

Authors: Pham Van Nha, Le Cam Binh

Abstract:

Particle swarm optimization (PSO) is a population-based stochastic optimization algorithm. PSO was inspired by the natural behavior of birds and fish in migration and foraging for food. PSO is considered as a multidisciplinary optimization model that can be applied in various optimization problems. PSO’s ideas are simple and easy to understand but PSO is only applied in simple model problems. We think that in order to expand the applicability of PSO in complex problems, PSO should be described more explicitly in the form of a mathematical model. In this paper, we represent PSO in a mathematical model and apply in the multivariate data classification. First, PSOs general mathematical model (MPSO) is analyzed as a universal optimization model. Then, Model of Optimal Centroids (MOC) is proposed for the multivariate data classification. Experiments were conducted on some benchmark data sets to prove the effectiveness of MOC compared with several proposed schemes.

Keywords: Analysis of optimization, artificial intelligence-based optimization, optimization for learning and data analysis, global optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 912