Search results for: personal and sensitive data
7276 Adjusted Ratio and Regression Type Estimators for Estimation of Population Mean when some Observations are missing
Authors: Nuanpan Nangsue
Abstract:
Ratio and regression type estimators have been used by previous authors to estimate a population mean for the principal variable from samples in which both auxiliary x and principal y variable data are available. However, missing data are a common problem in statistical analyses with real data. Ratio and regression type estimators have also been used for imputing values of missing y data. In this paper, six new ratio and regression type estimators are proposed for imputing values for any missing y data and estimating a population mean for y from samples with missing x and/or y data. A simulation study has been conducted to compare the six ratio and regression type estimators with a previous estimator of Rueda. Two population sizes N = 1,000 and 5,000 have been considered with sample sizes of 10% and 30% and with correlation coefficients between population variables X and Y of 0.5 and 0.8. In the simulations, 10 and 40 percent of sample y values and 10 and 40 percent of sample x values were randomly designated as missing. The new ratio and regression type estimators give similar mean absolute percentage errors that are smaller than the Rueda estimator for all cases. The new estimators give a large reduction in errors for the case of 40% missing y values and sampling fraction of 30%.
Keywords: Auxiliary variable, missing data, ratio and regression type estimators.
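For reference, the classical ratio and regression estimators of a population mean, together with simple ratio imputation of missing y values, can be sketched as follows. This is a minimal illustration of the textbook estimators only, not the six new estimators proposed in the paper nor the Rueda estimator. The function names are hypothetical, and the sketch assumes that every x value is observed and that the population mean of x is known.

```python
from statistics import mean


def ratio_estimate(x_sample, y_sample, x_pop_mean):
    """Classical ratio estimator: y_bar * (X_bar / x_bar)."""
    return mean(y_sample) * x_pop_mean / mean(x_sample)


def regression_estimate(x_sample, y_sample, x_pop_mean):
    """Classical regression estimator: y_bar + b * (X_bar - x_bar)."""
    x_bar, y_bar = mean(x_sample), mean(y_sample)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(x_sample, y_sample))
    sxx = sum((x - x_bar) ** 2 for x in x_sample)
    slope = sxy / sxx  # least-squares slope of y on x
    return y_bar + slope * (x_pop_mean - x_bar)


def ratio_impute(x_sample, y_sample):
    """Replace missing y values (None) by r * x, with r fitted on complete pairs.

    Assumption for this sketch: x is never missing.
    """
    complete = [(x, y) for x, y in zip(x_sample, y_sample) if y is not None]
    r = sum(y for _, y in complete) / sum(x for x, _ in complete)
    return [y if y is not None else r * x for x, y in zip(x_sample, y_sample)]


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    y = [2.1, None, 6.2, 7.9]
    y_filled = ratio_impute(x, y)
    print(ratio_estimate(x, y_filled, x_pop_mean=5.0))
    print(regression_estimate(x, y_filled, x_pop_mean=5.0))
```

Imputing first and estimating afterwards keeps the two steps separate, so a different imputation rule can be swapped in without touching the estimators.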
Downloads 1731
7275 A Framework for Personalized Multi-Device Information Communicating System
Authors: Rohiza Ahmad, Rozana Kasbon, Eliza Mazmee Mazlan, Aliza Sarlan
Abstract:
Due to the mobility of users, many information systems are now developed with the capability of supporting retrieval of information by both static and mobile users. Hence, the amount, content and format of the information retrieved need to be tailored to the device and the user who requested it. This paper presents a framework for the design and implementation of such a system, which is to be developed for communicating final examination related information to the academic community at one university in Malaysia. The concept of personalization will be implemented in the system so that only highly relevant information is delivered to the users. The personalization concept used will be based on user profiling as well as context. The system in its final state will be accessible through cell phones as well as intranet-connected personal computers.
Keywords: System framework, personalization, information communicating system, multi-device.
Downloads 1385
7274 Efficient Implementation of Serial and Parallel Support Vector Machine Training with a Multi-Parameter Kernel for Large-Scale Data Mining
Authors: Tatjana Eitrich, Bruno Lang
Abstract:
This work deals with aspects of support vector learning for large-scale data mining tasks. Based on a decomposition algorithm that can be run in serial and parallel mode, we introduce a data transformation that allows the use of an expensive generalized kernel without additional costs. In order to speed up the decomposition algorithm, we analyze the problem of working set selection for large data sets and examine the influence of the working set sizes on the scalability of the parallel decomposition scheme. Our modifications and settings improve support vector learning performance and thus allow the use of extensive parameter search methods to optimize classification accuracy.
Keywords: Support Vector Machines, Shared Memory Parallel Computing, Large Data
Downloads 1576
7273 Structural Basis of Resistance of Helicobacter pylori DnaK to Antimicrobial Peptide Pyrrhocoricin
Authors: Musammat F. Nahar, Anna Roujeinikova
Abstract:
Bacterial molecular chaperone DnaK plays an essential role in protein folding, stress response and transmembrane targeting of proteins. DnaKs from many bacterial species, including Escherichia coli, Salmonella typhimurium and Haemophilus influenzae, are the molecular targets for the insect-derived antimicrobial peptide pyrrhocoricin. Pyrrhocoricin-like peptides bind in the substrate recognition tunnel. Despite the high degree of cross-species sequence conservation in the substrate-binding tunnel, some bacteria are not sensitive to pyrrhocoricin. This work addresses the molecular mechanism of resistance of Helicobacter pylori DnaK to pyrrhocoricin. Homology modelling and structural and sequence analysis identify a single amino acid substitution at the interface between the lid and the β-sandwich subdomains of the DnaK substrate-binding domain as the major determinant of its resistance.
Keywords: Helicobacter pylori, molecular chaperone DnaK, pyrrhocoricin, structural biology.
Downloads 1748
7272 Software Test Data Generation using Ant Colony Optimization
Authors: Huaizhong Li, C.Peng Lam
Abstract:
State-based testing is frequently used in software testing. Test data generation is one of the key issues in software testing. A properly generated test suite may not only locate the errors in a software system, but also help in reducing the high cost associated with software testing. It is often desired that test data in the form of test sequences within a test suite can be automatically generated to achieve required test coverage. This paper proposes an Ant Colony Optimization approach to test data generation for the state-based software testing.
Keywords: Software testing, ant colony optimization, UML.
Downloads 3458
7271 Natural Language News Generation from Big Data
Authors: Bastian Haarmann, Lukas Sikorski
Abstract:
In this paper, we introduce an NLG application for the automatic creation of ready-to-publish texts from big data. The resulting fully automatically generated news stories closely resemble the style in which a human writer would draw up such a story. Topics include soccer games, stock exchange market reports, and weather forecasts. Each generated text is unique. Ready-to-publish stories written by a computer application can help humans to quickly grasp the outcomes of big data analyses, save journalists time-consuming pre-formulations, and cater to rather small audiences by offering stories that would otherwise not exist.
Keywords: Big data, natural language generation, publishing, robotic journalism.
Downloads 1686
7270 Technology Integrated Education – Shaping the Personality and Social Development of the Young
Abstract:
There has been a strong link between computer-mediated education and constructivist learning and teaching theory. Acknowledging how well the constructivist doctrine would work online, it has been established that constructivist views of learning correlate well with the philosophy of open and distance learning. Asynchronous and synchronous communications have placed online learning on the right track towards a constructive learning path. This paper is based on the social constructivist framework, in which knowledge is constructed from social communication and interaction. The study explores the possibility of practicing this theory by incorporating online discussion in the syllabus, and the ways it can be implemented to contribute to young people's personality and social development by addressing some aspects that may contribute to social problems, such as prejudice, ignorance and intolerance.
Keywords: Educational Technology, Internet, Personal Development, Student Exchange
Downloads 1804
7269 Yield Prediction Using Support Vectors Based Under-Sampling in Semiconductor Process
Authors: Sae-Rom Pak, Seung Hwan Park, Jeong Ho Cho, Daewoong An, Cheong-Sool Park, Jun Seok Kim, Jun-Geol Baek
Abstract:
It is important to predict yield in the semiconductor test process in order to increase yield. In this study, yield prediction means effectively identifying defective dies, wafers or lots. The semiconductor test process consists of several test steps, and each test includes various test items. In other words, the test data are large and complicated. They are also disproportionately distributed, as the number of data points belonging to the FAIL class is extremely low. For yield prediction, general data mining techniques are limited without data preprocessing because of these inherent properties of the test data. Therefore, this study proposes an under-sampling method using a support vector machine (SVM) to eliminate the class imbalance. To evaluate performance, the random under-sampling method is compared with the proposed method using actual semiconductor test data. The results show that the sampling method using SVM is effective in generating a robust model for yield prediction.
Keywords: Yield Prediction, Semiconductor Test Process, Support Vector Machine, Under Sampling
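The random under-sampling baseline that the abstract compares against can be sketched as below. This is a minimal illustration of plain random under-sampling, not the SVM-based method the paper proposes; the function name, the `"FAIL"`/`"PASS"` labels and the fixed seed are assumptions made for the example.

```python
import random


def random_undersample(records, labels, minority_label, seed=0):
    """Keep every minority record and an equally sized random subset of the rest.

    Hypothetical helper: balances the classes by discarding majority records
    at random, without looking at the feature values.
    """
    rng = random.Random(seed)
    minority = [i for i, lab in enumerate(labels) if lab == minority_label]
    majority = [i for i, lab in enumerate(labels) if lab != minority_label]
    kept_majority = rng.sample(majority, min(len(minority), len(majority)))
    kept = sorted(minority + kept_majority)  # preserve the original order
    return [records[i] for i in kept], [labels[i] for i in kept]


if __name__ == "__main__":
    # Illustrative data only: 95 passing dies and 5 failing dies.
    labels = ["PASS"] * 95 + ["FAIL"] * 5
    records = list(range(100))
    balanced_records, balanced_labels = random_undersample(records, labels, "FAIL")
    print(len(balanced_records), balanced_labels.count("FAIL"))
```

Because the discarded majority records are chosen blindly, informative samples near the class boundary can be lost, which is the kind of weakness a boundary-aware selection aims to avoid.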
Downloads 2396
7268 Analysis of Roasted and Ground Grains on the Seoul (Korea) Market for Their Contaminants of Aflatoxins, Ochratoxin A and Fusarium Toxins by LC-MS/MS
Authors: So-young Jung, Bu-chuhl Choe, Gi-young Shin, Jung-hun Kim, Young-zoo Chae
Abstract:
A sensitive and specific method is presented for the quantitative determination of aflatoxins (B1, B2, G1, G2), deoxynivalenol, fumonisins (B1, B2), ochratoxin A, zearalenone, T-2 and HT-2 in roasted and ground grains using liquid chromatography combined with tandem mass spectrometry. A double extraction using a phosphate buffer solution followed by methanol was applied to achieve effective co-extraction of the 11 mycotoxins. A multitoxin immunoaffinity column for all these mycotoxins was used to clean up the extract. The LODs of the mycotoxins were 0.1~6.1 μg/kg and the LOQs were 0.3~18.4 μg/kg. Forty-seven samples were collected from Seoul (Korea) for mycotoxin contamination monitoring. The results showed that zearalenone and deoxynivalenol occurred frequently. Zearalenone was detected in all samples and deoxynivalenol in 80.9% of samples, in the ranges 0.626 ~ 29.264 μg/kg and N.D ~ 48.332 μg/kg respectively. Fumonisins and ochratoxin A were detected in 46.8% and 17% of samples respectively; aflatoxins and T-2/HT-2 toxins were not detected in any sample.
Keywords: LC-MS/MS, mycotoxins, roasted and ground grains.
Downloads 1995
7267 A New Model for Discovering XML Association Rules from XML Documents
Authors: R. AliMohammadzadeh, M. Rahgozar, A. Zarnani
Abstract:
The inherent flexibilities of XML in both structure and semantics make mining from XML data a complex task with more challenges than traditional association rule mining in relational databases. In this paper, we propose a new model for the effective extraction of generalized association rules from an XML document collection. We directly use frequent subtree mining techniques in the discovery process and do not ignore the tree structure of the data in the final rules. The frequent subtrees, based on the user-provided support, are split into complement subtrees to form the rules. We explain our model in multiple steps, from data preparation to rule generation.
Keywords: XML, Data Mining, Association Rule Mining.
Downloads 1630
7266 Modelling Silica Optical Fibre Reliability: A Software Application
Authors: I. Severin, M. Caramihai, R. El Abdi, M. Poulain, A. Avadanii
Abstract:
In order to assess optical fibre reliability under different environmental and stress conditions, series of tests are performed that simulate overlapping, controlled variations of chemical and mechanical factors. The test series may be compared using statistical processing, e.g. Weibull plots. Because of the large amount of data to treat, a software application proved useful for interpreting selected series of experiments as a function of the factors considered. The current paper presents a software application used in the storage, modelling and interpretation of experimental data gathered from optical fibre testing. The present paper deals strictly with the software part of the project (the modelling, storage and processing of user-supplied data).
Keywords: Optical fibres, computer aided analysis, data models, data processing, graphical user interfaces.
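The Weibull plots mentioned in the abstract rest on a standard linearization: for a two-parameter Weibull distribution with failure probability F(s) = 1 - exp(-(s/s0)^m), plotting ln(-ln(1 - F)) against ln(s) gives a straight line with slope m. The sketch below illustrates that idea only and is not the paper's software. The function names are hypothetical, and the plotting position F_i = (i - 0.5)/n is one common convention, assumed here.

```python
import math


def weibull_plot_points(strengths):
    """Return (ln s, ln(-ln(1 - F))) pairs for a Weibull plot.

    Assumption: failure probabilities use the plotting position (i - 0.5) / n.
    """
    data = sorted(strengths)
    n = len(data)
    points = []
    for i, s in enumerate(data, start=1):
        f = (i - 0.5) / n
        points.append((math.log(s), math.log(-math.log(1.0 - f))))
    return points


def fit_weibull(strengths):
    """Least-squares line through the plot points; returns (modulus m, scale s0)."""
    pts = weibull_plot_points(strengths)
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    m = sxy / sxx                    # slope = Weibull modulus
    intercept = my - m * mx          # intercept = -m * ln(s0)
    return m, math.exp(-intercept / m)


if __name__ == "__main__":
    # Illustrative failure stresses only (arbitrary units), not measured data.
    print(fit_weibull([4.1, 4.4, 4.6, 4.7, 4.9, 5.0, 5.2]))
```

Comparing the fitted slope and scale across test series is one simple way to summarize how a change in test conditions shifts or broadens the strength distribution.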
Downloads 1821
7265 Development and Characterization of Normoxic Polyhydroxyethylacrylate (PHEA) Gel Dosimeter using Raman Spectroscopy
Authors: Aifa Afirah Rozlan, Mohamad Suhaimi Jaafar, Azhar Abdul Rahman
Abstract:
Raman spectroscopy is used to characterize the chemical changes induced by radiation in a normoxic polyhydroxyethylacrylate (PHEA) gel dosimeter. Irradiations in the low dose region are performed, and the polymerization of the PHEA gels is monitored by observing the changes in the Raman shift intensity of the carbon covalent bond of PHEA originating from both the monomer and the cross-linker. A variation in peak intensities with absorbed dose was observed. As the dose increases, the peak intensities of the carbon covalent bond in the polymer gels decrease. This indicates that the amount of absorbed dose affects the polymerization of the polymer gels: as the absorbed dose increases, the polymerization also increases. The results verify that PHEA gel dosimeters are sensitive even in the lower dose region.
Keywords: normoxic polymer gel, ascorbic acid, Raman spectroscopy, radiation dosimetry.
Downloads 1941
7264 The Role of Synthetic Data in Aerial Object Detection
Authors: Ava Dodd, Jonathan Adams
Abstract:
The purpose of this study is to explore the characteristics of developing a machine learning application using synthetic data. The study is structured to develop the application for the purpose of deploying the computer vision model. The findings discuss the realities of attempting to develop a computer vision model for practical purpose, and detail the processes, tools and techniques that were used to meet accuracy requirements. The research reveals that synthetic data represent another variable that can be adjusted to improve the performance of a computer vision model. Further, a suite of tools and tuning recommendations are provided.
Keywords: computer vision, machine learning, synthetic data, YOLOv4
Downloads 847
7263 Unsupervised Text Mining Approach to Early Warning System
Authors: Ichihan Tai, Bill Olson, Paul Blessner
Abstract:
Traditional early warning systems that alarm against crisis are generally based on structured or numerical data; therefore, a system that can make predictions based on unstructured textual data, an uncorrelated data source, is a great complement to the traditional early warning systems. The Chicago Board Options Exchange (CBOE) Volatility Index (VIX), commonly referred to as the fear index, measures the cost of insurance against market crash, and spikes in the event of crisis. In this study, news data is consumed for prediction of whether there will be a market-wide crisis by predicting the movement of the fear index, and the historical references to similar events are presented in an unsupervised manner. Topic modeling-based prediction and representation are made based on daily news data between 1990 and 2015 from The Wall Street Journal against VIX index data from CBOE.
Keywords: Early Warning System, Knowledge Management, Topic Modeling, Market Prediction.
Downloads 1919
7262 An Implementation of Data Reusable MPEG Video Coding Scheme
Authors: Vasily G. Moshnyaga
Abstract:
This paper presents an optimized MPEG2 video codec implementation, which drastically reduces the number of computations and memory accesses required for video compression. Unlike the traditional scheme, we reuse data stored in frame memory to omit unnecessary coding operations and memory reads/writes for unchanged macroblocks. Owing to dynamic memory sharing among reference frames, data-driven macroblock characterization and selective macroblock processing, we perform less than 15% of the total operations required by a conventional coder while maintaining high picture quality.
Keywords: Data reuse, adaptive processing, video coding, MPEG
Downloads 1264
7261 Manual Pit Emptiers and Their Health: Profiles, Determinants and Interventions
Authors: Ivy Chumo, Sheillah Simiyu, Hellen Gitau, Isaac Kisiangani, Caroline Kabaria, Kanyiva Muindi, Blessing Mberu
Abstract:
The global sanitation workforce bridges the gap between sanitation infrastructure and the provision of sanitation services through essential public service work. Manual pit emptiers often perform the work at the cost of their dignity, safety, and health, as their work requires repeated heavy physical activities such as lifting, carrying, pulling, and pushing. This exposes them to occupational and environmental health hazards, risking illness, injury, and death. This study extends previous studies by presenting occupational health risks and suggestions for improvement in informal settlements of Nairobi, Kenya. It is a qualitative study conducted among sanitation stakeholders in the Korogocho, Mukuru and Kibera informal settlements in Nairobi. Data were captured using digital voice recorders, transcribed and thematically analysed. The discussion notes were further supported by observational notes made during the interviews. These formed the basis for a robust picture of the occupational health of manual pit emptiers; a lack or inappropriate use of protective clothing and prolonged working hours were described as contributing to the occupational health hazard. To continue working, manual pit emptiers had devised coping strategies, which include working in groups, improvising protective clothing, sharing the available protective clothing, working at night and consuming alcoholic drinks while at work. Many of these strategies are detrimental to their health. Occupational health hazards among pit emptiers are a key concern for effective working and result from a lack of collaboration amongst stakeholders linked to the health, safety and PPE of pit emptiers. Collaboration amongst sanitation stakeholders is paramount for health and safety, and for ensuring the provision and use of personal protective devices.
Keywords: Sanitation, occupational health, manual emptiers, informal settlements.
Downloads 878
7260 Transliterating Methods of the Kazakh Onyms in the Arabic Language
Authors: K. A. Kydyrbayev, B.N. Zhubatova, G.E. Nadirova, A.A. Mustafayeva
Abstract:
Transliteration is frequently used, especially in writing geographic denominations, personal names (onyms), etc. Proper names (onyms) of all languages must sound similar in translated works as well as in scientific projects and works written in the mother tongue, because we can become acquainted with a nation, its history, culture, traditions and other spiritual values through the onyms of that nation. Therefore it is necessary to systematize the different transliterations of onyms of foreign languages. This paper is dedicated to the problem of developing a scheme for transliterating Kazakh onyms into Arabic. In order to achieve this goal we use scientific or practical types of transliteration. Because this type of transliteration makes it easy to read and write source-language texts in the target language without any diacritical symbols, it is limited by the target language's alphabetic system.
Keywords: The Arabic, Kazakh languages, onyms, transliterating
Downloads 1552
7259 Off-State Leakage Power Reduction by Automatic Monitoring and Control System
Authors: S. Abdollahi Pour, M. Saneei
Abstract:
This paper proposes a new circuit design that monitors the total leakage current during standby mode and generates the optimal reverse body bias voltage, using the adaptive body bias (ABB) technique to compensate for die-to-die parameter variations. Design details of the power monitor are examined using a simulation framework in 65nm and 32nm BTPM model CMOS processes. Experimental results show that the overhead of the proposed circuit in terms of its power consumption is about 10 μW for the 32nm technology and about 12 μW for the 65nm technology at the same power supply voltage as the core power supply. Moreover, the results show that the proposed circuit design is not very sensitive to temperature variations or to process variations. In addition, it uses simple blocks that offer good sensitivity, high speed and a continuous feedback loop.
Keywords: leakage current, leakage power monitor, body biasing, low power
Downloads 1738
7258 A Hybrid Scheme for on-Line Diagnostic Decision Making Using Optimal Data Representation and Filtering Technique
Authors: Hyun-Woo Cho
Abstract:
Early diagnostic decision making in industrial processes is necessary to produce high quality final products. It helps to provide early warning of a special event in a process and to find its assignable cause. This work presents a hybrid diagnostic scheme for batch processes. Nonlinear representation of raw process data is combined with classification tree techniques. Nonlinear kernel-based dimension reduction is executed to obtain nonlinear classification decision boundaries for the fault classes. In order to enhance diagnosis performance for batch processes, the data are filtered to remove irrelevant information from the process data. To compare the diagnosis performance of several representation, filtering, and future observation estimation methods, four diagnostic schemes are evaluated. In this work, the performance of the presented diagnosis schemes is demonstrated using batch process data.
Keywords: Diagnostics, batch process, nonlinear representation, data filtering, multivariate statistical approach
Downloads 1315
7257 Increasing Replica Consistency Performances with Load Balancing Strategy in Data Grid Systems
Authors: Sarra Senhadji, Amar Kateb, Hafida Belbachir
Abstract:
Data replication in data grid systems is one of the important solutions for improving availability, scalability, and fault tolerance. However, this technique also raises some complex issues, such as maintaining replica consistency. Moreover, as grid environments are very dynamic, some nodes can become more heavily loaded than others and eventually turn into a bottleneck. The main idea of our work is to propose a solution that combines replica consistency maintenance with a dynamic load balancing strategy to improve access performance in a simulated grid environment.
Keywords: Consistency, replication, data grid, load balancing.
Downloads 2324
7256 Nonparametric Control Chart Using Density Weighted Support Vector Data Description
Authors: Myungraee Cha, Jun Seok Kim, Seung Hwan Park, Jun-Geol Baek
Abstract:
In manufacturing industries, advances in measurement have increased the number of monitored variables, so multivariate control has become increasingly important. Statistical process control (SPC) is one of the most widely used approaches to multivariate control charts. Nevertheless, the application of SPC to processes is restricted because it assumes that the data follow a specific distribution. Unfortunately, process data are composed of a mixture of several processes and are hard to describe with a single distribution. As an alternative to conventional SPC, nonparametric control charts have therefore come into the picture because of their main strength, the absence of parameter estimation. The SVDD-based control chart is one such nonparametric control chart, with the advantage of a flexible control boundary. However, the basic concept of SVDD overlooks an important data characteristic, the density distribution. Therefore, we propose DW-SVDD (Density Weighted SVDD) to address this weakness of conventional SVDD. DW-SVDD takes the density of the data into account by introducing the notion of a density weight. We extend the newly proposed SVDD into a control chart, and a simulation study on data from various distributions is conducted to demonstrate the improvement in performance.
Keywords: Density estimation, Multivariate control chart, One-class classification, Support vector data description (SVDD)
Downloads 2119
7255 Model-Based Person Tracking Through Networked Cameras
Authors: Kyoung-Mi Lee, Youn-Mi Lee
Abstract:
This paper proposes a way to track persons by making use of multiple non-overlapping cameras. Tracking persons on multiple non-overlapping cameras enables data communication among cameras through the network connection between a camera and a computer, while at the same time transferring human feature data captured by a camera to another camera that is connected via the network. To track persons with a camera and send the tracking data to another camera, the proposed system uses a hierarchical human model that comprises a head, a torso, and legs. The feature data of the person being modeled are transferred to the server, after which the server sends the feature data of the human model to the cameras connected over the network. This enables a camera that captures a person's movement entering its vision to keep tracking the recognized person with the use of the feature data transferred from the server.
Keywords: Person tracking, human model, networked cameras, vision-based surveillance.
Downloads 1488
7254 Slugging Frequency Correlation for Inclined Gas-liquid Flow
Authors: V. Hernandez-Perez, M. Abdulkadir, B. J. Azzopardi
Abstract:
In this work, new experimental data for slugging frequency in inclined gas-liquid flow are reported, and a new correlation is proposed. Scale experiments were carried out using a mixture of air and water in a 6 m long pipe. Two different pipe diameters were used, namely, 38 and 67 mm. The data were taken with capacitance type sensors at a data acquisition frequency of 200 Hz over an interval of 60 seconds. For the range of flow conditions studied, the liquid superficial velocity is observed to influence the frequency strongly. A comparison of the present data with correlations available in the literature reveals a lack of agreement. A new correlation for slug frequency has been proposed for the inclined flow, which represents the main contribution of this work.
Keywords: slug frequency, inclined flow
Downloads 3161
7253 FCA-based Conceptual Knowledge Discovery in Folksonomy
Authors: Yu-Kyung Kang, Suk-Hyung Hwang, Kyoung-Mo Yang
Abstract:
Tagging data, consisting of users, tags and resources, constitute a folksonomy, a user-driven, bottom-up approach to organizing and classifying information on the Web. Tagging data stored in a folksonomy include a lot of very useful information and knowledge. However, finding an appropriate approach for analyzing tagging data and discovering hidden knowledge from them remains one of the main problems in folksonomy mining research. In this paper, we propose a folksonomy data mining approach based on formal concept analysis (FCA) for easily discovering hidden knowledge from a folksonomy. We also demonstrate through an experiment how the proposed approach can be applied in a collaborative tagging system. The proposed approach can be applied to areas such as social network analysis and semantic web mining.
Keywords: Folksonomy data mining, formal concept analysis, collaborative tagging, conceptual knowledge discovery, classification.
Downloads 2027
7252 Minimum Data of a Speech Signal as Special Indicators of Identification in Phonoscopy
Authors: Nazaket Gazieva
Abstract:
Voice biometric data associated with physiological, psychological and other factors are widely used in forensic phonoscopy. There are various methods for identifying and verifying a person by voice. This article explores the minimum speech signal data as individual parameters of a speech signal. Monozygotic twins are believed to be genetically identical. Using the minimum data of the speech signal, we came to the conclusion that the voice imprint of each monozygotic twin is individual. Based on the experiment, we conclude that the minimum indicators of the speech signal are more stable and reliable for phonoscopic examinations.
Keywords: Biometric voice prints, fundamental frequency, phonogram, speech signal, temporal characteristics.
Downloads 572
7251 An Empirical Evaluation of Performance of Machine Learning Techniques on Imbalanced Software Quality Data
Authors: Ruchika Malhotra, Megha Khanna
Abstract:
The development of change prediction models can help software practitioners plan testing and inspection resources at early phases of software development. However, a major challenge faced during the training of any classification model is the imbalanced nature of software quality data. Data with very few instances of the minority outcome category lead to an inefficient learning process, and a classification model developed from imbalanced data generally does not predict the minority category correctly. Thus, for a given dataset, a minority of classes may be change prone whereas a majority of classes may be non-change prone. This study explores various alternatives for adeptly handling imbalanced software quality data using different sampling methods and effective MetaCost learners. The study also analyzes and justifies the use of different performance metrics when dealing with imbalanced data. In order to empirically validate the different alternatives, the study uses change data from three application packages of an open-source Android data set and evaluates the performance of six different machine learning techniques. The results of the study indicate extensive improvement in the performance of the classification models when using resampling methods and robust performance measures.
Keywords: Change proneness, empirical validation, imbalanced learning, machine learning techniques, object-oriented metrics.
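To illustrate why the choice of performance metric matters on imbalanced data, the sketch below compares plain accuracy with sensitivity, specificity, G-mean and balanced accuracy. The abstract does not name the metrics the study uses, so this selection is an assumption, as are the function name and the toy labels; the sketch also assumes both classes appear in the actual labels.

```python
import math


def imbalance_metrics(actual, predicted, positive):
    """Compute accuracy and imbalance-aware metrics for a binary problem.

    Assumption: `actual` contains at least one positive and one negative case.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    sensitivity = tp / (tp + fn)   # recall on the minority (positive) class
    specificity = tn / (tn + fp)   # recall on the majority (negative) class
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "g_mean": math.sqrt(sensitivity * specificity),
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


if __name__ == "__main__":
    # Toy data: 5 change-prone classes out of 100, and a model that
    # always predicts "not change prone".
    actual = [1] * 5 + [0] * 95
    predicted = [0] * 100
    print(imbalance_metrics(actual, predicted, positive=1))
```

In this toy case the trivial model scores high accuracy while its G-mean is zero, which is the kind of gap that motivates imbalance-aware measures.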
Downloads 1519
7250 Automatic Distance Compensation for Robust Voice-based Human-Computer Interaction
Authors: Randy Gomez, Keisuke Nakamura, Kazuhiro Nakadai
Abstract:
Distant-talking voice-based HCI systems suffer from performance degradation due to a mismatch between the acoustic speech (runtime) and the acoustic model (training). The mismatch is caused by the change in the power of the speech signal as observed at the microphones. This change is greatly influenced by the change in distance, which affects speech dynamics inside the room before the signal reaches the microphones. Moreover, as the speech signal is reflected, its acoustical characteristics are also altered by the room properties. In general, power mismatch due to distance is a complex problem. This paper presents a novel approach to dealing with distance-induced mismatch by intelligently sensing instantaneous voice power variation and compensating model parameters. First, the distant-talking speech signal is processed through microphone array processing, and the corresponding distance information is extracted. Distance-sensitive Gaussian Mixture Models (GMMs), pre-trained to capture both speech power and room properties, are used to predict the optimal distance of the speech source. Consequently, pre-computed statistic priors corresponding to the optimal distance are selected to correct the statistics of the generic model, which was frozen during training. Thus, model combinatorics are post-conditioned to match the power of instantaneous speech acoustics at runtime. This results in an improved likelihood of predicting the correct speech command at farther distances. We experiment using real data recorded inside two rooms. Experimental evaluation shows that voice recognition performance using our method is more robust to changes in distance than the conventional approach. In our experiment, under the most acoustically challenging environment (i.e., Room 2: 2.5 meters), our method achieved a 24.2% improvement in recognition performance over the best-performing conventional method.
Keywords: Human Machine Interaction, Human Computer Interaction, Voice Recognition, Acoustic Model Compensation, Acoustic Speech Enhancement.
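The distance-prediction step described in the abstract can be illustrated with a minimal sketch: score the observed features against one GMM per candidate distance and keep the distance with the highest likelihood. This is not the authors' implementation. The one-dimensional log-power feature, the candidate distances, and every mixture parameter below are hypothetical values chosen only for illustration.

```python
import math

def gmm_log_likelihood(samples, weights, means, variances):
    """Total log-likelihood of 1-D samples under a Gaussian mixture."""
    total = 0.0
    for x in samples:
        # Log of each weighted component density evaluated at x.
        terms = [
            math.log(w) - 0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
            for w, m, v in zip(weights, means, variances)
        ]
        # Log-sum-exp keeps the computation numerically stable.
        peak = max(terms)
        total += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total

def select_distance(samples, models):
    """Return the candidate distance whose GMM best explains the samples."""
    return max(models, key=lambda d: gmm_log_likelihood(samples, *models[d]))

# Hypothetical models: distance in metres -> (weights, means, variances).
MODELS = {
    0.5: ([0.6, 0.4], [-10.0, -14.0], [4.0, 6.0]),
    1.5: ([0.5, 0.5], [-20.0, -25.0], [4.0, 6.0]),
    2.5: ([0.5, 0.5], [-30.0, -36.0], [5.0, 8.0]),
}

# Hypothetical observed log-power features (dB) for one utterance.
observed = [-21.0, -19.5, -24.0, -26.0]
best = select_distance(observed, MODELS)
```

In a system like the one described, the selected distance would then index a table of pre-computed priors used to adjust the generic acoustic model; that lookup is omitted here because the abstract does not specify its form.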
Downloads 1883
7249 Plant Varieties Selection System
Authors: Kitti Koonsanit, Chuleerat Jaruskulchai, Poonsak Miphokasap, Apisit Eiumnoh
Abstract:
Meteorological and environmental data are now widely used in applications such as plant variety selection systems. Selecting the plant variety for a planted area is of utmost importance for all crops, including sugarcane, which has many varieties. A variety planted without proper selection may not be adapted to the climate or soil conditions of the planted area. Poor growth, bloom drop, poor fruit, and low prices can result from varieties that were not recommended for the planted area. This paper presents a plant variety selection system for planted areas in Thailand that applies decision tree techniques to meteorological and environmental data. Developed as an environmental data analysis tool, the software makes analysis of results easier and faster. Our software is a front end to WEKA that provides fundamental data mining functions such as classification, clustering, and analysis. It also supports pre-processing, analysis, and decision tree output, with export of results. Finally, the software can export results to the Google Maps API in order to display them and plot plant icons.
Keywords: Plant varieties selection system, decision tree, expert recommendation.
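As a rough illustration of how a decision tree turns site data into a variety recommendation, the sketch below walks a small hand-written tree. It is not the WEKA-based system from the paper. The feature names, thresholds, and variety labels are all hypothetical placeholders.

```python
def recommend(tree, site):
    """Walk a decision tree until a leaf (a variety label) is reached."""
    node = tree
    while isinstance(node, dict):
        # Each internal node tests one feature against a threshold.
        branch = "low" if site[node["feature"]] <= node["threshold"] else "high"
        node = node[branch]
    return node

# Hypothetical tree: internal nodes are dicts, leaves are variety labels.
TREE = {
    "feature": "annual_rainfall_mm",
    "threshold": 1200,
    "low": {
        "feature": "soil_ph",
        "threshold": 6.0,
        "low": "variety A",
        "high": "variety B",
    },
    "high": "variety C",
}

# Hypothetical site description for one planted area.
site = {"annual_rainfall_mm": 1000, "soil_ph": 6.5}
choice = recommend(TREE, site)
```

In practice the tree would be learned from labelled meteorological and environmental records rather than written by hand; the traversal logic stays the same.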
Downloads 1792
7248 Jitter Transfer in High Speed Data Links
Authors: Tsunwai Gary Yip
Abstract:
Phase locked loops for data links operating at 10 Gb/s or faster are low phase noise devices designed to operate with a low jitter reference clock. Characterization of their jitter transfer function is difficult because the intrinsic noise of the device is comparable to the random noise level in the reference clock signal. A linear model is proposed to account for the intrinsic noise of a PLL. The intrinsic noise data of a PLL for 10 Gb/s links is presented. The jitter transfer function of a PLL in a test chip for 12.8 Gb/s data links was determined in experiments using the 400 MHz reference clock as the source of simultaneous excitations over a wide range of frequency. The result shows that the PLL jitter transfer function can be approximated by a second order linear model.
Keywords: Intrinsic phase noise, jitter in data link, PLL jitter transfer function, high speed clocking in electronic circuit
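To make the idea of a second order linear model concrete, the sketch below evaluates one common textbook form of a second-order PLL closed-loop response, H(s) = (2ζω_n s + ω_n²) / (s² + 2ζω_n s + ω_n²). The abstract does not state which second-order form the author fitted, so this choice, along with the natural frequency and damping factor used here, is an assumption for illustration only.

```python
import math

def jitter_transfer_db(f_hz, fn_hz, zeta):
    """Magnitude in dB of an assumed second-order jitter transfer function.

    H(s) = (2*zeta*wn*s + wn**2) / (s**2 + 2*zeta*wn*s + wn**2),
    evaluated at s = j*2*pi*f_hz with wn = 2*pi*fn_hz.
    """
    s = 1j * 2 * math.pi * f_hz
    wn = 2 * math.pi * fn_hz
    h = (2 * zeta * wn * s + wn ** 2) / (s ** 2 + 2 * zeta * wn * s + wn ** 2)
    return 20 * math.log10(abs(h))

# Hypothetical loop parameters: 1 MHz natural frequency, damping factor 0.7.
FN_HZ = 1e6
ZETA = 0.7

low = jitter_transfer_db(1e3, FN_HZ, ZETA)    # well inside the loop bandwidth
peak = jitter_transfer_db(1e6, FN_HZ, ZETA)   # at the natural frequency
high = jitter_transfer_db(1e8, FN_HZ, ZETA)   # far above the loop bandwidth
```

With this form, jitter well below the natural frequency passes through almost unattenuated, a small amount of peaking appears near the natural frequency, and jitter far above it is attenuated at roughly 20 dB per decade.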
Downloads 1945
7247 How to Connect User Research and not so Forthcoming Technology Scenarios – The Extended Home Environment Case Study
Authors: E. Guercio, A. Marcengo, A. Rapp
Abstract:
This paper outlines a methodological framework adopted within an internal Telecomitalia project that aimed to identify, on a user-centred basis, the potential interest in a technological scenario that extends the typical home environment for communication and media consumption into a personal bubble. The problem is that involving users at an early stage in the development of such a disruptive technology scenario, and asking their opinions on something they cannot yet grasp even roughly, could lead to wrong or distorted results. For that reason, we chose an approach that aims indirectly to understand users' hidden needs, in order to obtain a meaningful picture of the possible interest in a technological proposition that is not yet easily understandable.
Keywords: Personas, focus groups, scenarios, extended home environment, telecommunication, media.
Downloads 1589