Search results for: Large Data
8007 Yield Prediction Using Support Vectors Based Under-Sampling in Semiconductor Process
Authors: Sae-Rom Pak, Seung Hwan Park, Jeong Ho Cho, Daewoong An, Cheong-Sool Park, Jun Seok Kim, Jun-Geol Baek
Abstract:
Predicting yield in the semiconductor test process is important for increasing yield. In this study, yield prediction means effectively identifying defective dies, wafers, or lots. The semiconductor test process consists of several test steps, and each test includes various test items. In other words, test data are large and complex. They are also disproportionately distributed, as the number of data points belonging to the FAIL class is extremely low. Because of these inherent properties of test data, general data mining techniques are limited for yield prediction unless the data are preprocessed. Therefore, this study proposes an under-sampling method using a support vector machine (SVM) to eliminate the class imbalance. To evaluate performance, a random under-sampling method is compared with the proposed method using actual semiconductor test data. The results show that the SVM-based sampling method is effective in generating a robust model for yield prediction.
Keywords: Yield Prediction, Semiconductor Test Process, Support Vector Machine, Under Sampling
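As an illustration of the random under-sampling baseline that the abstract compares against (not the proposed SVM-based method, whose details the abstract does not give), a minimal sketch might look like the following. The function name `random_under_sample`, the toy data, and the 1:1 balancing ratio are all assumptions made for this example.

```python
import random
from collections import Counter


def random_under_sample(samples, labels, minority_label, seed=0):
    """Keep every minority sample plus an equally sized random subset of the rest.

    Hypothetical helper: the 1:1 target ratio is an assumption, not taken
    from the paper.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    # Draw without replacement so no majority sample is duplicated.
    kept_majority = rng.sample(majority, min(len(minority), len(majority)))
    kept = sorted(minority + kept_majority)
    return [samples[i] for i in kept], [labels[i] for i in kept]


# Toy data (made up): 95 PASS dies and 5 FAIL dies.
samples = [[float(i)] for i in range(100)]
labels = ["FAIL" if i % 20 == 0 else "PASS" for i in range(100)]

balanced_x, balanced_y = random_under_sample(samples, labels, "FAIL")
class_counts = Counter(balanced_y)
```

After the call, `class_counts` holds equal counts for both classes, which is the starting point a classifier would be trained on in such a baseline.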
Downloads 2398
8006 A New Model for Discovering XML Association Rules from XML Documents
Authors: R. AliMohammadzadeh, M. Rahgozar, A. Zarnani
Abstract:
The inherent flexibility of XML in both structure and semantics makes mining from XML data a complex task with more challenges than traditional association rule mining in relational databases. In this paper, we propose a new model for the effective extraction of generalized association rules from an XML document collection. We directly use frequent subtree mining techniques in the discovery process and do not ignore the tree structure of data in the final rules. The frequent subtrees based on the user-provided support are split into complement subtrees to form the rules. We explain our model in multiple steps, from data preparation to rule generation.
Keywords: XML, Data Mining, Association Rule Mining.
Downloads 1631
8005 Modelling Silica Optical Fibre Reliability: A Software Application
Authors: I. Severin, M. Caramihai, R. El Abdi, M. Poulain, A. Avadanii
Abstract:
To assess optical fibre reliability under different environmental and stress conditions, series of tests are performed that simulate overlapping, controlled variation of chemical and mechanical factors. Each test series may be compared using statistical processing, i.e., Weibull plots. Because of the large amount of data to process, a software application proved useful for interpreting selected series of experiments as a function of the factors considered. The current paper presents a software application used in the storage, modelling and interpretation of experimental data gathered from optical fibre testing. The present paper deals strictly with the software part of the project (the modelling, storage and processing of user-supplied data).
Keywords: Optical fibres, computer aided analysis, data models, data processing, graphical user interfaces.
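The Weibull plot mentioned in the abstract can be sketched as follows. This is a generic illustration, not the paper's software: it assumes a two-parameter Weibull distribution, F(s) = 1 - exp(-(s/s0)^m), the common probability estimator F_i = (i - 0.5)/n, and made-up failure stresses. On the plot, ln(-ln(1 - F)) against ln(s) is a straight line whose slope is the Weibull modulus m.

```python
import math


def weibull_plot_points(strengths):
    """Return (ln s, ln(-ln(1 - F))) pairs for a Weibull plot.

    F is estimated as (rank - 0.5) / n, an assumed (but widely used) estimator.
    """
    ordered = sorted(strengths)
    n = len(ordered)
    points = []
    for rank, s in enumerate(ordered, start=1):
        f = (rank - 0.5) / n
        points.append((math.log(s), math.log(-math.log(1.0 - f))))
    return points


def fit_line(points):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up failure stresses for one test series (units are arbitrary here).
strengths = [4.6, 4.9, 5.0, 5.1, 5.2, 5.3, 5.4, 5.6]

points = weibull_plot_points(strengths)
modulus, intercept = fit_line(points)
# The line crosses y = 0 at ln(s0), so s0 = exp(-intercept / m).
scale = math.exp(-intercept / modulus)
```

Comparing `modulus` and `scale` across test series is one simple way to contrast fibre populations aged under different conditions.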
Downloads 1823
8004 Normalizing Scientometric Indicators of Individual Publications Using Local Cluster Detection Methods on Citation Networks
Authors: Levente Varga, Dávid Deritei, Mária Ercsey-Ravasz, Răzvan Florian, Zsolt I. Lázár, István Papp, Ferenc Járai-Szabó
Abstract:
One of the major shortcomings of widely used scientometric indicators is that different disciplines cannot be compared with each other. The issue of cross-disciplinary normalization has long been discussed, but even the classification of publications into scientific domains poses problems. Structural properties of citation networks offer new possibilities; however, the large size and constant growth of these networks call for caution. Here we present a new tool that relies on the structural properties of citation networks to perform cross-field normalization of scientometric indicators of individual publications. Due to the large size of the networks, a systematic procedure for identifying scientific domains based on a local community detection algorithm is proposed. The algorithm is tested with different benchmark and real-world networks. Then, using this algorithm, the mechanism of the scientometric indicator normalization process is shown for a few indicators such as the citation number, P-index and a local version of the PageRank indicator. The fat-tail trend of the article indicator distribution enables us to successfully perform the indicator normalization process.
Keywords: Citation networks, scientometric indicator, cross-field normalization, local cluster detection.
Downloads 725
8003 Design and Optimization for a Compliant Gripper with Force Regulation Mechanism
Authors: Nhat Linh Ho, Thanh-Phong Dao, Shyh-Chour Huang, Hieu Giang Le
Abstract:
This paper presents the design and optimization of a compliant gripper. The gripper is constructed based on the concept of a compliant mechanism with flexure hinges. A passive force regulation mechanism is presented to control the grasping force on a micro-sized object instead of using a force sensor. The force regulation mechanism is designed using planar springs. The gripper is expected to achieve a large displacement range to handle objects of various sizes. First, the statics and dynamics of the gripper are investigated using finite element analysis in ANSYS software. Then, the design parameters of the gripper are optimized via the Taguchi method. An L9 orthogonal array is used to establish the experimental matrix. Subsequently, the signal-to-noise ratio is analyzed to find the optimal solution. Finally, response surface methodology is employed to model the relationship between the design parameters and the output displacement of the gripper. The design of experiments method is then used to analyze sensitivity and determine the effect of each parameter on the displacement. The results showed that the compliant gripper can move with a large displacement of 213.51 mm and the force regulation mechanism is expected to be used for high precision positioning systems.
Keywords: Flexure hinge, compliant mechanism, compliant gripper, force regulation mechanism, Taguchi method, response surface methodology, design of experiment.
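The Taguchi step described in the abstract can be illustrated with a small sketch. It uses the standard L9 orthogonal array (four factors, three levels) and the larger-the-better signal-to-noise ratio, S/N = -10 log10((1/n) Σ 1/y²). The choice of the larger-the-better form is an assumption (the abstract only says a large displacement is desired), and the response values and function names are made up for illustration.

```python
import math

# Standard L9 orthogonal array: 9 runs, 4 factors, 3 levels each.
L9 = [
    (1, 1, 1, 1), (1, 2, 2, 2), (1, 3, 3, 3),
    (2, 1, 2, 3), (2, 2, 3, 1), (2, 3, 1, 2),
    (3, 1, 3, 2), (3, 2, 1, 3), (3, 3, 2, 1),
]


def sn_larger_is_better(values):
    """Larger-the-better S/N ratio in dB for the replicates of one run."""
    return -10.0 * math.log10(sum(1.0 / v ** 2 for v in values) / len(values))


def best_levels(array, sn_values):
    """For each factor, pick the level with the highest mean S/N ratio."""
    best = []
    for factor in range(len(array[0])):
        level_means = {}
        for level in (1, 2, 3):
            vals = [sn for row, sn in zip(array, sn_values) if row[factor] == level]
            level_means[level] = sum(vals) / len(vals)
        best.append(max(level_means, key=level_means.get))
    return best


# Made-up displacement responses, two replicates per run.
responses = [
    [100, 102], [110, 108], [120, 121],
    [150, 149], [160, 162], [170, 171],
    [200, 198], [210, 212], [220, 219],
]

sn_values = [sn_larger_is_better(r) for r in responses]
best = best_levels(L9, sn_values)
```

Here `best` lists, per factor, the level that a main-effects analysis would select; a real study would follow this with a confirmation run.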
Downloads 1613
8002 PointNetLK-OBB: A Point Cloud Registration Algorithm with High Accuracy
Authors: Wenhao Lan, Ning Li, Qiang Tong
Abstract:
To improve the registration accuracy of a source point cloud and template point cloud when the initial relative deflection angle is too large, a PointNetLK algorithm combined with an oriented bounding box (PointNetLK-OBB) is proposed. In this algorithm, the OBB of a 3D point cloud is used to represent the macro feature of source and template point clouds. Under the guidance of the iterative closest point algorithm, the OBB of the source and template point clouds is aligned, and a mirror symmetry effect is produced between them. According to the fitting degree of the source and template point clouds, the mirror symmetry plane is detected, and the optimal rotation and translation of the source point cloud is obtained to complete the 3D point cloud registration task. To verify the effectiveness of the proposed algorithm, a comparative experiment was performed using the publicly available ModelNet40 dataset. The experimental results demonstrate that, compared with PointNetLK, PointNetLK-OBB improves the registration accuracy of the source and template point clouds when the initial relative deflection angle is too large, and the sensitivity of the initial relative position between the source point cloud and template point cloud is reduced. The primary contribution of this paper is the use of PointNetLK to avoid the non-convex problem of traditional point cloud registration and leveraging the regularity of the OBB to avoid the local optimization problem in the PointNetLK context.
Keywords: Mirror symmetry, oriented bounding box, point cloud registration, PointNetLK-OBB.
Downloads 708
8001 The Role of Synthetic Data in Aerial Object Detection
Authors: Ava Dodd, Jonathan Adams
Abstract:
The purpose of this study is to explore the characteristics of developing a machine learning application using synthetic data. The study is structured to develop the application for the purpose of deploying the computer vision model. The findings discuss the realities of attempting to develop a computer vision model for practical purpose, and detail the processes, tools and techniques that were used to meet accuracy requirements. The research reveals that synthetic data represent another variable that can be adjusted to improve the performance of a computer vision model. Further, a suite of tools and tuning recommendations are provided.
Keywords: computer vision, machine learning, synthetic data, YOLOv4
Downloads 852
8000 Multi-Disciplinary Optimisation Methodology for Aircraft Load Prediction
Authors: Sudhir Kumar Tiwari
Abstract:
The paper demonstrates a methodology that can be used at an early design stage of any conventional aircraft. This research assesses the feasibility of deriving a methodology for estimating aircraft loads during the various design phases of a transport-category aircraft by using commercial finite element analysis software, which may yield significant time savings. The early design phase has limited data, and a quickly changing configuration results in a large number of load cases to handle. It is useful to idealize the aircraft as a connection of beams, which can be modelled very accurately using finite element analysis (beam elements). This research explores the correct approach to idealizing an aircraft using beam elements. FEM techniques such as inertia relief were studied for implementation during the course of the work. The correct boundary condition technique is envisaged for generating shear force, bending moment and torque diagrams for the aircraft. The possible applications of this approach in the aircraft design process have been investigated.
Keywords: Multi-disciplinary optimization, aircraft load, finite element analysis, Stick Model.
Downloads 1130
7999 Unsupervised Text Mining Approach to Early Warning System
Authors: Ichihan Tai, Bill Olson, Paul Blessner
Abstract:
Traditional early warning systems that warn of crises are generally based on structured or numerical data; therefore, a system that can make predictions based on unstructured textual data, an uncorrelated data source, is a valuable complement to traditional early warning systems. The Chicago Board Options Exchange (CBOE) Volatility Index (VIX), commonly referred to as the fear index, measures the cost of insurance against a market crash and spikes in the event of a crisis. In this study, news data are used to predict whether there will be a market-wide crisis by predicting the movement of the fear index, and historical references to similar events are presented in an unsupervised manner. Topic modeling-based prediction and representation are performed on daily news data from The Wall Street Journal between 1990 and 2015 against VIX data from CBOE.
Keywords: Early Warning System, Knowledge Management, Topic Modeling, Market Prediction.
Downloads 1920
7998 An Implementation of Data Reusable MPEG Video Coding Scheme
Authors: Vasily G. Moshnyaga
Abstract:
This paper presents an optimized MPEG2 video codec implementation that drastically reduces the number of computations and memory accesses required for video compression. Unlike the traditional scheme, we reuse data stored in frame memory to omit unnecessary coding operations and memory reads and writes for unchanged macroblocks. Thanks to dynamic memory sharing among reference frames, data-driven macroblock characterization and selective macroblock processing, we perform less than 15% of the total operations required by a conventional coder while maintaining high picture quality.
Keywords: Data reuse, adaptive processing, video coding, MPEG
Downloads 1266
7997 A Modelling Study of the Photochemical and Particulate Pollution Characteristics above a Typical Southeast Mediterranean Urban Area
Authors: Kiriaki-Maria Fameli, Vasiliki D. Assimakopoulos, Vasiliki Kotroni
Abstract:
The Greater Athens Area (GAA) faces photochemical and particulate pollution episodes as a result of the combined effects of local pollutant emissions, regional pollution transport, synoptic circulation and topographic characteristics. The area has undergone significant changes since the Athens 2004 Olympic Games because of large-scale infrastructure works that led to a shift of population to areas previously characterized as rural, an increase in the traffic fleet and the operation of highways. However, few recent modelling studies have been performed due to the lack of an accurate, updated emission inventory. The photochemical modelling system MM5/CAMx was applied to study the photochemical and particulate pollution characteristics above the GAA for two distinct ten-day periods in the summers of 2006 and 2010, when air pollution episodes occurred. A new, updated emission inventory based on official data was used. Comparison of modeled results with measurements revealed the importance and accuracy of the new Athens emission inventory as compared to previous modeling studies. The model managed to reproduce the local meteorological conditions and the daily ozone and particulate fluctuations at different locations across the GAA. Higher ozone levels were found in suburban and rural areas as well as over the sea to the south of the basin. Concerning PM10, high concentrations were computed at the city centre and the southeastern suburbs, in agreement with measured data. Source apportionment analysis showed that different sources contribute to the ozone levels, with local sources (traffic, port activities) affecting its formation.
Keywords: Photochemical modelling, urban pollution, greater Athens area, MM5/CAMx.
Downloads 1367
7996 Learning and Practicing Assessment in a Pre-service Teacher Education Program: Comparative Perspective of UK and Pakistani Universities
Authors: Malik Ghulam Behlol, Alison Fox, Faiza Masood, Sabiha Arshad
Abstract:
This paper explores the barriers to the application of learning-supportive assessment in the teaching practicum while investigating the role of university teachers (UT), cooperative teachers (CT), prospective teachers (PT) and heads of the practicum schools (HPS) in selected universities of Pakistan and the UK. It is a qualitative case study, and data were collected through lesson observation of UT in the pre-service teacher education setting and PT in practicum schools. Interviews with UT and HPS and focus group discussions with PT were also conducted. The study concluded that, compared with their UK counterparts, PTs in Pakistan face significant barriers in applying learning-supportive assessment in school practicum settings because of large class sizes, lack of institutionalised collaboration between universities and schools, poor modelling of the lesson, ineffective feedback practices, lower-order thinking assignments, and limited opportunities to use technology in school settings.
Keywords: Learning supportive assessment, pre-service teacher education, theory-practice gap, teacher education.
Downloads 192
7995 A Hybrid Scheme for on-Line Diagnostic Decision Making Using Optimal Data Representation and Filtering Technique
Authors: Hyun-Woo Cho
Abstract:
Early diagnostic decision making in industrial processes is absolutely necessary to produce high quality final products. It helps to provide early warning of a special event in a process and allows its assignable cause to be found. This work presents a hybrid diagnostic scheme for batch processes. Nonlinear representation of raw process data is combined with classification tree techniques. Nonlinear kernel-based dimension reduction is performed to obtain nonlinear classification decision boundaries for fault classes. To enhance diagnosis performance for batch processes, the data are filtered to remove irrelevant information from the process data. Four diagnostic schemes are evaluated to compare the diagnosis performance of several representation, filtering, and future-observation estimation methods. The performance of the presented diagnosis schemes is demonstrated using batch process data.
Keywords: Diagnostics, batch process, nonlinear representation, data filtering, multivariate statistical approach
Downloads 1317
7994 Increasing Replica Consistency Performances with Load Balancing Strategy in Data Grid Systems
Authors: Sarra Senhadji, Amar Kateb, Hafida Belbachir
Abstract:
Data replication in data grid systems is one of the important solutions for improving availability, scalability, and fault tolerance. However, this technique can also raise related issues such as maintaining replica consistency. Moreover, because grid environments are very dynamic, some nodes can become more heavily loaded than others and eventually become a bottleneck. The main idea of our work is to propose a solution that combines replica consistency maintenance with a dynamic load balancing strategy to improve access performance in a simulated grid environment.
Keywords: Consistency, replication, data grid, load balancing.
Downloads 2325
7993 Mixed Model Assembly Line Sequencing In Make to Order System with Available to Promise Consideration
Authors: N. Manavizadeh, A. Dehghani, M. Rabbani
Abstract:
Mixed model assembly lines (MMAL) are a type of production line where a variety of product models with similar product characteristics are assembled. The effective design of these lines requires that the schedule for assembling the different products be determined. In this paper we fit the sequencing problem to the main characteristics of a make to order (MTO) environment. The problem addressed is a multiple objective sequencing problem for mixed model assembly lines. It is solved with the weighted sum method (WSM) using GAMS software for small problems and with an effective GA for large-scale problems, because the problem is NP-hard and finding the optimum solution for large problems is very time-consuming. Three practically important objectives are considered: minimizing total utility work, keeping production rate variation constant, and minimizing earliness and tardiness costs, which account for the priority of each customer and different due dates, a realistic situation in mixed model assembly lines. This is the first time different attributes have been considered to prioritize customers, which helps the company reduce the cost of earliness and tardiness. This mechanism is a way to apply an advanced available to promise (ATP) approach in mixed model assembly line sequencing, which is the main contribution of this paper.
Keywords: Available to promise, Earliness & Tardiness, GA, Mixed-Model assembly line Sequencing.
Downloads 2533
7992 Continuous Feature Adaptation for Non-Native Speech Recognition
Authors: Y. Deng, X. Li, C. Kwan, B. Raj, R. Stern
Abstract:
The current speech interfaces in many military applications may be adequate for native speakers. However, the recognition rate drops considerably for non-native speakers (people with foreign accents). This is mainly because non-native speakers show large temporal and intra-phoneme variations when they pronounce the same words. The problem is further complicated by the presence of strong environmental noise such as tank noise, helicopter noise, etc. In this paper, we propose a novel continuous acoustic feature adaptation algorithm for on-line accent and environmental adaptation. Implemented by incremental singular value decomposition (SVD), the algorithm captures local acoustic variation and runs in real time. This feature-based adaptation method is then integrated with the conventional model-based maximum likelihood linear regression (MLLR) algorithm. Extensive experiments have been performed on the NATO non-native speech corpus with a baseline acoustic model trained on native American English. The proposed feature-based adaptation algorithm improved the average recognition accuracy by 15%, while the MLLR model-based adaptation achieved an 11% improvement. The corresponding word error rate (WER) reduction was 25.8% and 2.73%, as compared to that without adaptation. The combined adaptation achieved an overall recognition accuracy improvement of 29.5% and a WER reduction of 31.8%, as compared to that without adaptation.
Downloads 3217
7991 Learning Algorithms for Fuzzy Inference Systems Composed of Double- and Single-Input Rule Modules
Authors: Hirofumi Miyajima, Kazuya Kishida, Noritaka Shigei, Hiromi Miyajima
Abstract:
Most self-tuning fuzzy systems, which are automatically constructed from learning data, are based on the steepest descent method (SDM). However, this approach often requires a long convergence time and gets stuck in a shallow local minimum. One solution is to use fuzzy rule modules with a small number of inputs, such as DIRMs (Double-Input Rule Modules) and SIRMs (Single-Input Rule Modules). In this paper, we consider a (generalized) DIRMs model composed of double- and single-input rule modules. Further, in order to reduce redundant modules in the (generalized) DIRMs model, pruning and generative learning algorithms for the model are suggested. To show their effectiveness, numerical simulations for function approximation, Box-Jenkins and obstacle avoidance problems are performed.
Keywords: Box-Jenkins’s problem, Double-input rule module, Fuzzy inference model, Obstacle avoidance, Single-input rule module.
Downloads 1957
7990 Nonparametric Control Chart Using Density Weighted Support Vector Data Description
Authors: Myungraee Cha, Jun Seok Kim, Seung Hwan Park, Jun-Geol Baek
Abstract:
In manufacturing industries, advances in measurement have increased the number of monitored variables, and multivariate control has consequently grown in importance. Statistical process control (SPC) is one of the most widely used approaches to multivariate control charts. Nevertheless, SPC is of limited applicability to processes because it assumes that the data follow a specific distribution. Unfortunately, process data are composed of a mixture of several processes, and it is hard to describe them with a single distribution. Nonparametric control charts have therefore emerged as an alternative to conventional SPC, because their strength is that no parameter estimation is required. The SVDD-based control chart is one of the nonparametric control charts and has the advantage of a flexible control boundary. However, the basic concept of SVDD overlooks an important data characteristic, the density distribution. Therefore, we propose DW-SVDD (Density Weighted SVDD) to address this weakness of conventional SVDD. DW-SVDD takes the density of the data into account by introducing the notion of a density weight. We extend the newly proposed SVDD to a control chart, and a simulation study on data from various distributions is conducted to demonstrate the improvement in performance.
Keywords: Density estimation, Multivariate control chart, One-class classification, Support vector data description (SVDD)
Downloads 2121
7989 An Intelligent System for Phish Detection, using Dynamic Analysis and Template Matching
Authors: Chinmay Soman, Hrishikesh Pathak, Vishal Shah, Aniket Padhye, Amey Inamdar
Abstract:
Phishing, or stealing of sensitive information on the web, has dealt a major blow to Internet security in recent times. Most existing anti-phishing solutions fail to handle the fuzziness involved in phish detection, leading to a large number of false positives. This fuzziness is attributed to the use of the highly flexible and, at the same time, highly ambiguous HTML language. We introduce a new perspective against phishing that tries to systematically prove whether a given page is phished or not, using the corresponding original page as the basis of comparison. It analyzes the layout of the pages under consideration to determine the percentage distortion between them, indicative of any form of malicious alteration. The system design represents an intelligent system employing dynamic assessment, which accurately identifies brand-new phishing attacks and will prove effective in reducing the number of false positives. This framework could potentially be used as a knowledge base for educating internet users against phishing.
Keywords: World Wide Web, Phishing, Internet security, data mining.
Downloads 1833
7988 Model-Based Person Tracking Through Networked Cameras
Authors: Kyoung-Mi Lee, Youn-Mi Lee
Abstract:
This paper proposes a way to track persons by making use of multiple non-overlapping cameras. Tracking persons on multiple non-overlapping cameras enables data communication among cameras through the network connection between a camera and a computer, while at the same time transferring human feature data captured by a camera to another camera that is connected via the network. To track persons with a camera and send the tracking data to another camera, the proposed system uses a hierarchical human model that comprises a head, a torso, and legs. The feature data of the person being modeled are transferred to the server, after which the server sends the feature data of the human model to the cameras connected over the network. This enables a camera that captures a person's movement entering its vision to keep tracking the recognized person with the use of the feature data transferred from the server.
Keywords: Person tracking, human model, networked cameras, vision-based surveillance.
Downloads 1489
7987 Slugging Frequency Correlation for Inclined Gas-liquid Flow
Authors: V. Hernandez-Perez, M. Abdulkadir, B. J. Azzopardi
Abstract:
In this work, new experimental data for slugging frequency in inclined gas-liquid flow are reported, and a new correlation is proposed. Scale experiments were carried out using a mixture of air and water in a 6 m long pipe. Two different pipe diameters were used, namely, 38 and 67 mm. The data were taken with capacitance type sensors at a data acquisition frequency of 200 Hz over an interval of 60 seconds. For the range of flow conditions studied, the liquid superficial velocity is observed to influence the frequency strongly. A comparison of the present data with correlations available in the literature reveals a lack of agreement. A new correlation for slug frequency has been proposed for the inclined flow, which represents the main contribution of this work.
Keywords: slug frequency, inclined flow
Downloads 3163
7986 PM10 Prediction and Forecasting Using CART: A Case Study for Pleven, Bulgaria
Authors: Snezhana G. Gocheva-Ilieva, Maya P. Stoimenova
Abstract:
Ambient air pollution with fine particulate matter (PM10) is a systematic, permanent problem in many countries around the world. The accumulation of a large number of measurements of both PM10 concentrations and the accompanying atmospheric factors allows for statistical modeling to detect dependencies and forecast future pollution. This study applies the classification and regression trees (CART) method for building and analyzing PM10 models. In the empirical study, average daily air data for the city of Pleven, Bulgaria, over a period of 5 years are used. Predictors in the models are seven meteorological variables, time variables, lagged PM10 variables, and some lagged meteorological variables, delayed by 1 or 2 days with respect to the initial time series. The degree of influence of the predictors in the models is determined. The selected best CART models are used to forecast PM10 concentrations two days ahead of the last date in the modeling procedure and show very accurate results.
Keywords: Cross-validation, decision tree, lagged variables, short-term forecasting.
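The lagged predictors described in the abstract (values delayed by 1 or 2 days) can be built along the following lines. This is a minimal sketch only: the column names `pm10` and `temp`, the helper `add_lags`, and the numbers are hypothetical and not taken from the study.

```python
def add_lags(rows, columns, lags=(1, 2)):
    """Append lagged copies of the given columns to each daily record.

    `rows` is a list of dicts ordered by day. The first max(lags) days are
    dropped because their lagged values are not available.
    """
    max_lag = max(lags)
    out = []
    for t in range(max_lag, len(rows)):
        record = dict(rows[t])
        for col in columns:
            for lag in lags:
                record[f"{col}_lag{lag}"] = rows[t - lag][col]
        out.append(record)
    return out


# Made-up daily averages for four consecutive days.
daily = [
    {"pm10": 40.0, "temp": 5.0},
    {"pm10": 55.0, "temp": 3.0},
    {"pm10": 70.0, "temp": 1.0},
    {"pm10": 48.0, "temp": 4.0},
]

lagged = add_lags(daily, ["pm10", "temp"])
```

Each resulting record pairs the current day's values with those of the previous one and two days, which is the kind of table a regression tree could then be trained on.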
Downloads 738
7985 FCA-based Conceptual Knowledge Discovery in Folksonomy
Authors: Yu-Kyung Kang, Suk-Hyung Hwang, Kyoung-Mo Yang
Abstract:
Tagging data, consisting of users, tags and resources, constitute a folksonomy, a user-driven, bottom-up approach to organizing and classifying information on the Web. Tagging data stored in a folksonomy contain a great deal of useful information and knowledge. However, finding an appropriate approach for analyzing tagging data and discovering hidden knowledge in them remains one of the main problems in folksonomy mining research. In this paper, we propose a folksonomy data mining approach based on formal concept analysis (FCA) for easily discovering hidden knowledge in a folksonomy. We also demonstrate through an experiment how the proposed approach can be applied in a collaborative tagging system. The proposed approach can be applied to areas such as social network analysis and semantic web mining.
Keywords: Folksonomy data mining, formal concept analysis, collaborative tagging, conceptual knowledge discovery, classification.
Downloads 2028
7984 Minimum Data of a Speech Signal as Special Indicators of Identification in Phonoscopy
Authors: Nazaket Gazieva
Abstract:
Voice biometric data associated with physiological, psychological and other factors are widely used in forensic phonoscopy. There are various methods for identifying and verifying a person by voice. This article explores the minimum speech signal data as individual parameters of a speech signal. Monozygotic twins are believed to be genetically identical. Using the minimum data of the speech signal, we concluded that the voice imprint of monozygotic twins is individual. Based on the experiment, we conclude that the minimum indicators of the speech signal are more stable and reliable for phonoscopic examinations.
Keywords: Biometric voice prints, fundamental frequency, phonogram, speech signal, temporal characteristics.
Downloads 577
7983 Selenium Content in Agricultural Soils and Wheat from the Balkan Peninsula
Authors: S. Krustev, V. Angelova, P. Zaprjanova
Abstract:
Selenium (Se) is an essential micronutrient for humans and animals, but it is also highly toxic. Its organic compounds play an important role in the biochemistry and nutrition of cells. Concentration levels of this element vary considerably between regions of the world. This study aimed to compare the availability and levels of Se in some rural areas of the Balkan Peninsula and their relationship with the concentrations of other trace elements. For this purpose, soil samples and wheat grains from different regions of Bulgaria, Serbia, North Macedonia, Romania, and Greece, situated far from large industrial centers, were analyzed. The main methods for their determination were atomic spectral techniques: atomic absorption and plasma atomic emission. As a result of this study, data on microelement levels from the main grain-producing regions of the Balkan Peninsula were determined and systematized. The presented results confirm the low levels of Se in this region (0.222–0.962 mg·kg⁻¹ in soils and 0.001–0.005 mg·kg⁻¹ in wheat grains) and indicate that measures are required to offset the effect of this deficiency.
Keywords: Agricultural soils, Balkan Peninsula, rural areas, selenium.
Downloads 655
7982 An Empirical Evaluation of Performance of Machine Learning Techniques on Imbalanced Software Quality Data
Authors: Ruchika Malhotra, Megha Khanna
Abstract:
The development of change prediction models can help software practitioners plan testing and inspection resources in the early phases of software development. However, a major challenge faced during the training of any classification model is the imbalanced nature of software quality data. Data with very few instances of the minority outcome categories lead to an inefficient learning process, and a classification model developed from imbalanced data generally does not predict these minority categories correctly. Thus, for a given dataset, a minority of classes may be change prone whereas a majority of classes may be non-change prone. This study explores various alternatives for handling imbalanced software quality data using different sampling methods and effective MetaCost learners. The study also analyzes and justifies the use of different performance metrics when dealing with imbalanced data. To validate the alternatives empirically, the study uses change data from three application packages of an open-source Android data set and evaluates the performance of six different machine learning techniques. The results indicate extensive improvement in the performance of the classification models when using resampling methods and robust performance measures.
Keywords: Change proneness, empirical validation, imbalanced learning, machine learning techniques, object-oriented metrics.
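To make the idea of resampling imbalanced data concrete, here is a minimal sketch of random oversampling, one common sampling method. The abstract does not say which sampling methods the study used, so this is an assumption chosen for illustration; the function name `random_oversample` and the labels in the usage note are hypothetical.

```python
import random

def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until all classes
    have as many samples as the largest class.

    Returns new (samples, labels) lists; the inputs are not modified.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())

    out_samples, out_labels = [], []
    for y, xs in by_class.items():
        # Draw extra copies with replacement from the existing samples of this class.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_samples.append(x)
            out_labels.append(y)
    return out_samples, out_labels
```

For example, a toy data set with 2 "change-prone" and 8 "not-change-prone" classes would come back with 8 of each. Resampling should be applied to the training split only, so that evaluation still reflects the original class distribution.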
Downloads 1520
7981 Screening of Process Variables for the Production of Extracellular Lipase from Palm Oil by Trichoderma Viride using Plackett-Burman Design
Authors: R. Rajendiran, S. Gayathri devi, B.T. SureshKumar, V. Arul Priya
Abstract:
Plackett-Burman statistical screening of media constituents and operational conditions for extracellular lipase production from an isolate of Trichoderma viride has been carried out in submerged fermentation. This statistical design is used in the early stages of experimentation to screen out unimportant factors from a large number of possible factors. The design involves screening of up to 'n-1' variables in just 'n' experiments. Regression coefficients and t-values were calculated by subjecting the experimental data to statistical analysis using Minitab version 15. The effects of nine process variables were studied in twelve experimental trials. Maximum lipase activity of 7.83 μmol/ml/min was obtained in the 6th trial. A Pareto chart illustrates the order of significance of the variables affecting lipase production. The present study concludes that the most significant variables affecting lipase production were palm oil, yeast extract, K2HPO4, MgSO4 and CaCl2.
Keywords: lipase, submerged fermentation, statistical optimization, Trichoderma viride
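The "n-1 variables in n experiments" property can be seen by constructing a 12-run Plackett-Burman design, the size that matches the twelve trials reported above. The sketch below builds it from the standard published generator row and estimates main effects; the function names are hypothetical, and how the study assigned its nine variables to the eleven available columns is not stated in the abstract.

```python
# Standard 12-run Plackett-Burman generator row (+1 = high level, -1 = low level).
GENERATOR_12 = [+1, +1, -1, +1, +1, +1, -1, -1, -1, +1, -1]

def plackett_burman_12():
    """Return the 12 x 11 design matrix as a list of rows.

    Rows 1-11 are cyclic shifts of the generator row; row 12 sets every
    factor to its low level. Up to 11 factors can be assigned to columns;
    unused columns act as dummy factors.
    """
    n = len(GENERATOR_12)
    rows = [[GENERATOR_12[(j - shift) % n] for j in range(n)] for shift in range(n)]
    rows.append([-1] * n)
    return rows

def main_effects(design, response):
    """Estimate each column's main effect: mean response at +1 minus mean at -1."""
    runs = len(design)
    effects = []
    for j in range(len(design[0])):
        total = sum(row[j] * y for row, y in zip(design, response))
        effects.append(2.0 * total / runs)
    return effects
```

Because every pair of columns is orthogonal, each main effect is estimated independently of the others. With nine variables, nine columns would carry real factors and the remaining two could serve as dummy columns for estimating experimental error.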
Downloads 2320
7980 Plant Varieties Selection System
Authors: Kitti Koonsanit, Chuleerat Jaruskulchai, Poonsak Miphokasap, Apisit Eiumnoh
Abstract:
Meteorological and environmental data are now widely used in applications such as plant variety selection systems. Selecting a suitable variety for the planted area is of utmost importance for all crops, including sugarcane, which has many varieties. A variety planted without such selection may not be adapted to the climate or soil conditions of the planted area. Poor growth, bloom drop, poor fruit, and low prices result from varieties that were not recommended for the planted area. This paper presents a plant variety selection system for planted areas in Thailand that uses meteorological and environmental data with decision tree techniques. Developed as an environmental data analysis tool, the software makes analysis of the results easier and faster. It is a front end to WEKA that provides fundamental data mining functions such as classification, clustering, and analysis. It also supports pre-processing, analysis, and decision tree output with export of results. Finally, the software can export results to the Google Maps API in order to display them and plot plant icons.
Keywords: Plant varieties selection system, decision tree, expert recommendation.
Downloads 1793
7979 Jitter Transfer in High Speed Data Links
Authors: Tsunwai Gary Yip
Abstract:
Phase locked loops for data links operating at 10 Gb/s or faster are low phase noise devices designed to operate with a low jitter reference clock. Characterization of their jitter transfer function is difficult because the intrinsic noise of the device is comparable to the random noise level in the reference clock signal. A linear model is proposed to account for the intrinsic noise of a PLL. The intrinsic noise data of a PLL for 10 Gb/s links is presented. The jitter transfer function of a PLL in a test chip for 12.8 Gb/s data links was determined in experiments using the 400 MHz reference clock as the source of simultaneous excitations over a wide range of frequencies. The result shows that the PLL jitter transfer function can be approximated by a second-order linear model.
Keywords: Intrinsic phase noise, jitter in data link, PLL jitter transfer function, high speed clocking in electronic circuit
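For readers unfamiliar with second-order jitter transfer models, the sketch below evaluates one textbook form, a second-order loop with a single zero, H(s) = (2ζω_n s + ω_n²) / (s² + 2ζω_n s + ω_n²). The abstract does not give the model's exact form or parameters, so this form, the function names, and any natural frequency or damping values passed in are assumptions for illustration only.

```python
import math

def jitter_transfer(f_hz, fn_hz, zeta):
    """Complex gain H(j*2*pi*f) of an assumed second-order loop with one zero.

    fn_hz is the natural frequency and zeta the damping factor; both are
    illustrative parameters, not values from the paper.
    """
    s = 1j * 2.0 * math.pi * f_hz
    wn = 2.0 * math.pi * fn_hz
    numerator = 2.0 * zeta * wn * s + wn ** 2
    denominator = s ** 2 + 2.0 * zeta * wn * s + wn ** 2
    return numerator / denominator

def gain_db(f_hz, fn_hz, zeta):
    """Magnitude of the jitter transfer function in decibels."""
    return 20.0 * math.log10(abs(jitter_transfer(f_hz, fn_hz, zeta)))
```

In this form, jitter well below the natural frequency passes through with unity gain, jitter well above it is attenuated, and there is some peaking near the natural frequency whose height depends on the damping factor.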
Downloads 1946
7978 Socio-Economic Characteristics of Tribal Areas in KwaZulu-Natal, South Africa
Authors: Carilette Fourie, Chris Cloete
Abstract:
The occurrence of traditional authorities and tribal land within South Africa results in unique developmental trends and challenges. Tribal communities, typically located in rural environments, are perceived to be severely affected by poverty and poor living conditions relative to their urban counterparts. The exact extent of the socio-economic disparity between tribal and non-tribal communities is addressed in this paper. After adjustment of available census data to correspond with the delineation of tribal and non-tribal land in the KwaZulu-Natal province, seven selected socio-economic indicators were compared. The investigation revealed that although tribal areas are characterised by low employment rates and educational levels, a young population, fairly large household sizes, lower access to basic services and lower income households that are highly dependent on social grants, tribal area populations do have moderate levels of education, access to formal housing and relatively good access to services.
Keywords: KwaZulu-Natal, tribal areas, traditional authority, socio-economic, well-being.
Downloads 403