Search results for: Spectral data
6544 Aggregation Scheduling Algorithms in Wireless Sensor Networks
Authors: Min Kyung An
Abstract:
In Wireless Sensor Networks, which consist of tiny wireless sensor nodes with limited battery power, one of the most fundamental applications is data aggregation, which collects nearby environmental conditions and aggregates the data to a designated destination called a sink node. Because the nodes have limited energy, the important issues in data aggregation are time efficiency and energy consumption, and the related problem, named Minimum Latency Aggregation Scheduling (MLAS), has therefore been the focus of many researchers. Its objective is to compute the minimum latency schedule, that is, a schedule with the minimum number of timeslots such that the sink node can receive the aggregated data from all the other nodes without any collision or interference. For this problem, two interference models, the graph model and the more realistic physical interference model known as Signal-to-Interference-Noise-Ratio (SINR), have been adopted with different power models, uniform power and non-uniform power (with or without power control), and different antenna models, omni-directional and directional. In this survey article, as the problem has been proven to be NP-hard, we present and compare several state-of-the-art approximation algorithms in various models, using latency as the performance measure.
Keywords: Data aggregation, convergecast, gathering, approximation, interference, omni-directional, directional.
Downloads 799
6543 Health Assessment of Electronic Products using Mahalanobis Distance and Projection Pursuit Analysis
Authors: Sachin Kumar, Vasilis Sotiris, Michael Pecht
Abstract:
With increasing complexity in electronic systems, there is a need for system-level anomaly detection and fault isolation. This paper uses anomaly detection based on vector similarity to a training set through two approaches: one that preserves the original information, Mahalanobis Distance (MD), and one that compresses the data into its principal components, Projection Pursuit Analysis. These methods have been used to detect deviations in system performance from normal operation and for critical parameter isolation in multivariate environments. The study evaluates the detection capability of each approach on a set of test data with known faults against a baseline set of data representative of “healthy” systems.
Keywords: Mahalanobis distance, Principal components, Projection pursuit, Health assessment, Anomaly.
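A minimal sketch of the Mahalanobis-distance check described in the abstract above, restricted to two monitored parameters so that the covariance matrix can be inverted by hand. The function names, the two-variable restriction and the idea of comparing the distance to a threshold chosen from healthy data are illustrative assumptions, not the authors' implementation.

```python
from statistics import mean


def covariance_2d(xs, ys):
    """Return the sample covariance entries (sxx, sxy, syy) of two variables."""
    mx, my = mean(xs), mean(ys)
    n = len(xs)
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    return sxx, sxy, syy


def mahalanobis_2d(point, xs, ys):
    """Mahalanobis distance of `point` from the baseline (xs, ys) data."""
    mx, my = mean(xs), mean(ys)
    sxx, sxy, syy = covariance_2d(xs, ys)
    det = sxx * syy - sxy ** 2
    if det == 0:
        raise ValueError("baseline covariance matrix is singular")
    dx, dy = point[0] - mx, point[1] - my
    # d^2 = [dx dy] S^-1 [dx dy]^T, with S^-1 = (1/det) [[syy, -sxy], [-sxy, sxx]]
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return d2 ** 0.5


# Hypothetical "healthy" baseline readings for two parameters.
baseline_x = [0.0, 2.0, 0.0, 2.0]
baseline_y = [0.0, 0.0, 2.0, 2.0]
near = mahalanobis_2d((1.2, 1.1), baseline_x, baseline_y)
far = mahalanobis_2d((3.0, 1.0), baseline_x, baseline_y)
```

A test observation would be flagged as anomalous when its distance exceeds a threshold derived from the healthy baseline; how that threshold is chosen is left open here.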
Downloads 1681
6542 Application of the Data Distribution Service for Flexible Manufacturing Automation
Authors: Marco Ryll, Svetan Ratchev
Abstract:
This paper discusses the applicability of the Data Distribution Service (DDS) for the development of automated and modular manufacturing systems which require a flexible and robust communication infrastructure. DDS is an emergent standard for data-centric publish/subscribe middleware systems that provides an infrastructure for platform-independent many-to-many communication. It particularly addresses the needs of real-time systems that require deterministic data transfer, have low memory footprints and high robustness requirements. After an overview of the standard, several aspects of DDS are related to current challenges for the development of modern manufacturing systems with distributed architectures. Finally, an example application is presented based on a modular active fixturing system to illustrate the described aspects.
Keywords: Flexible Manufacturing, Publish/Subscribe, Plug & Produce.
Downloads 2352
6541 Impacts of Building Design Factors on Auckland School Energy Consumptions
Authors: Bin Su
Abstract:
This study focuses on the impact of school building design factors on winter extra energy consumption, which mainly includes space heating, water heating and other appliances related to winter indoor thermal conditions. A number of Auckland schools were randomly selected for the study, which introduces a method of using real monthly energy consumption data for a year to calculate the winter extra energy data of school buildings. The study seeks to identify the relationships between winter extra energy data and school building design data related to the main architectural features, building envelope and elements of the sample schools. The relationships can be used to estimate the approximate saving in winter extra energy consumption which would result from a changed design datum for future school development, and to identify any major energy-efficient design problems. The relationships are also valuable for developing passive design guides for school energy efficiency.
Keywords: Building energy efficiency, Building thermal design, Building thermal performance, School building design.
Downloads 1945
6540 Tree Based Data Aggregation to Resolve Funneling Effect in Wireless Sensor Network
Authors: G. Rajesh, B. Vinayaga Sundaram, C. Aarthi
Abstract:
In a wireless sensor network, sensor nodes periodically transmit the sensed data to the sink node through multi-hop communication. This high traffic induces congestion at the nodes located one hop from the sink node. The packet transmission and reception rates of these nodes are very high compared to those of other sensor nodes in the network. Therefore, the energy consumption of these nodes is very high, an effect known as the “funneling effect”. The tree-based data aggregation technique (TBDA) is used to reduce the energy consumption of these nodes. The overall performance results show a considerable decrease in the number of packet transmissions to the sink node. The proposed scheme, TBDA, avoids the funneling effect and extends the lifetime of the wireless sensor network. The average-case time complexity for inserting a node in the tree is O(n log n), and the worst-case time complexity is O(n^2).
Keywords: Data Aggregation, Funneling Effect, Traffic Congestion, Wireless Sensor Network.
Downloads 1316
6539 Sleep Scheduling Schemes Based on Location of Mobile User in Sensor-Cloud
Authors: N. Mahendran, R. Priya
Abstract:
Mobile cloud computing (MCC) combined with wireless sensor network (WSN) technology has attracted growing attention from researchers because it combines the sensors' data-gathering ability with the cloud's data-processing capacity. This approach overcomes the limited data storage capacity and computational ability of sensor nodes. The stored data are sent to mobile users when they send a request. Most integrated sensor-cloud schemes fail to consider the following criteria: 1) mobile users request specific data from the cloud based on their present location; 2) power consumption matters, since most sensors are equipped with non-rechargeable batteries and are mostly deployed in hazardous and remote areas. This paper focuses on the above observations and introduces an approach known as the collaborative location-based sleep scheduling (CLSS) scheme. The awake or asleep status of each sensor node is dynamically determined by schedulers, and the scheduling is based purely on the mobile users’ current location; in this manner, a large amount of energy consumption is saved at the WSN. CLSS comprises two different methods: the CLSS1 scheme provides lower energy consumption, and CLSS2 provides scalability and robustness of the integrated WSN.
Keywords: Sleep scheduling, mobile cloud computing, wireless sensor network, integration, location, network lifetime.
Downloads 976
6538 Quantitative and Fourier Transform Infrared Analysis of Saponins from Three Kenyan Ruellia Species: Ruellia prostrata, Ruellia lineari-bracteolata and Ruellia bignoniiflora
Authors: Christine O. Wangia, Jennifer A. Orwa, Francis W. Muregi, Patrick G. Kareru, Kipyegon Cheruiyot, Eric Guantai
Abstract:
Ruellia (syn. Dipteracanthus) species are wild perennial creepers belonging to the Acanthaceae family. These species are reported to possess anti-inflammatory, analgesic, antioxidant, gastroprotective, anticancer, and immuno-stimulant properties. Phytochemical screening of both aqueous and methanolic extracts of Ruellia species revealed the presence of saponins. Saponins have been reported to possess anti-inflammatory, antioxidant, immuno-stimulant, antihepatotoxic, antibacterial, anticarcinogenic, and antiulcerogenic activities. The objective of this study was to quantify and analyze the Fourier transform infrared (FTIR) spectra of saponins in crude extracts of three Kenyan Ruellia species namely Ruellia prostrata (RPM), Ruellia lineari-bracteolata (RLB) and Ruellia bignoniiflora (RBK). Sequential organic extraction of the ground whole plant material was done using petroleum ether (PE), chloroform, ethyl acetate (EtOAc), and absolute methanol by cold maceration, while aqueous extraction was by hot maceration. The plant powders and extracts were mixed with spectroscopic grade KBr and compressed into a pellet. The infrared spectra were recorded using a Shimadzu FTIR spectrophotometer of 8000 series in the range of 3500 cm-1 - 500 cm-1. Quantitative determination of the saponins was done using standard procedures. Quantitative analysis of saponins showed that RPM had the highest quantity of crude saponins (2.05% ± 0.03), followed by RLB (1.4% ± 0.15) and RBK (1.25% ± 0.11), respectively. FTIR spectra revealed the spectral peaks characteristic for saponins in RPM, RLB, and RBK plant powders, aqueous and methanol extracts; O-H absorption (3265 - 3393 cm-1), C-H absorption ranging from 2851 to 2924 cm-1, C=C absorbance (1628 - 1655 cm-1), oligosaccharide linkage (C-O-C) absorption due to sapogenins (1036 - 1042 cm-1). The crude saponins from RPM, RLB and RBK showed similar peaks to their respective extracts. 
The presence of the saponins in extracts of RPM, RLB and RBK may be responsible for some of the biological activities reported in the Ruellia species.
Keywords: Ruellia bignoniiflora, Ruellia lineari-bracteolata, Ruellia prostrata, Saponins.
Downloads 1191
6537 Impact of Safety and Quality Considerations of Housing Clients on the Construction Firms’ Intention to Adopt Quality Function Deployment: A Case of Construction Sector
Authors: Saif Ul Haq
Abstract:
The current study examines the safety and quality considerations of clients of housing projects and their impact on the adoption of Quality Function Deployment (QFD) by construction firms. A mixed-method research technique was used to collect and analyze the data: a survey was conducted to collect data from 220 clients of housing projects in Saudi Arabia, and telephone and Skype interviews were then conducted to collect data from 15 professionals working in the top ten real estate companies of Saudi Arabia. Data were analyzed using partial least squares (PLS) and thematic analysis techniques. Findings reveal that today’s customers prioritize the safety and quality requirements of their houses and that, as a result, construction firms adopt QFD to address the needs of customers. The findings are of great importance for the clients of housing projects as well as for construction firms, as they could apply QFD in housing projects to address the safety and quality concerns of their clients.
Keywords: Construction industry, quality considerations, quality function deployment, safety considerations.
Downloads 899
6536 Tourism Satellite Account: Approach and Information System Development
Authors: Pappas Theodoros, Michael Diakomichalis
Abstract:
Measuring the economic impact of tourism in a benchmark economy is a global concern, with previous measurements being partial and not fully integrated. Tourism is a phenomenon that involves the individual consumption of visitors, which should be observed and measured to reveal the overall contribution of tourism to an economy. The Tourism Satellite Account (TSA) is a critical tool for assessing the annual growth of tourism, providing reliable measurements. This article presents a TSA information system that encompasses all TSA functions, including input, storage, management, and analysis of data, as well as additional future functions, and that enhances the efficiency of tourism data management and the utility of TSA data collection. The methodology and results presented offer new insights for the development and implementation of TSA.
Keywords: Tourism Satellite Account, information system, data-based tourist account.
Downloads 59
6535 Input Data Balancing in a Neural Network PM-10 Forecasting System
Authors: Suk-Hyun Yu, Heeyong Kwon
Abstract:
Recently, PM-10 has become a social and global issue. It is one of the major air pollutants that affect human health, so it needs to be forecasted rapidly and precisely. However, PM-10 comes from various emission sources, and its concentration level depends largely on the meteorological and geographical factors of local and global regions, so forecasting PM-10 concentration is very difficult. A neural network model can be used in this case, but there are few cases of high-concentration PM-10, which makes the learning of the neural network model difficult. In this paper, we suggest a simple input balancing method for cases where the data distribution is uneven. It is based on the probability of appearance of the data. Experimental results show that input balancing makes the neural networks’ learning easier and improves the forecasting rates.
Keywords: AI, air quality prediction, neural networks, pattern recognition, PM-10.
Downloads 826
6534 Tree Based Data Fusion Clustering Routing Algorithm for Illimitable Network Administration in Wireless Sensor Network
Authors: Y. Harold Robinson, M. Rajaram, E. Golden Julie, S. Balaji
Abstract:
In wireless sensor networks, locality and positioning information can be captured using the Global Positioning System (GPS). This message can be congregated initially from spot to identify the system. Users can retrieve information of interest from a wireless sensor network (WSN) by injecting queries and gathering results from the mobile sink nodes. Routing is the process of choosing an optimal path in a mobile network. Intermediate nodes group the device nodes into teams and generate cluster heads that gather the data from each cluster’s nodes and forward the collective data to the base station. WSNs are widely used for gathering data. Since sensors are power-constrained devices, it is vital for them to reduce their power utilization. A tree-based data fusion clustering routing algorithm (TBDFC) is used to reduce energy consumption in wireless sensor networks. Here, the nodes in a tree use the cluster formation, whereas the elevation of the tree is decided based on the distance of the member nodes to the cluster head. Network simulation shows that this scheme improves the power utilization of the nodes and thus considerably extends the network lifetime.
Keywords: WSN, TBDFC, LEACH, PEGASIS, TREEPSI.
Downloads 1116
6533 Holistic Face Recognition using Multivariate Approximation, Genetic Algorithms and AdaBoost Classifier: Preliminary Results
Authors: C. Villegas-Quezada, J. Climent
Abstract:
Several works on facial recognition have dealt with methods that identify isolated characteristics of the face or with templates that encompass several regions of it. This paper introduces a new technique that approaches the problem holistically, dispensing with the need to identify geometrical characteristics or regions of the face. The characterization of a face is achieved by randomly sampling selected attributes of the pixels of its image. From this information we construct a data set that corresponds to the values of low frequencies, gradient, entropy and several other characteristics of the image pixels, generating a set of “p” variables. The multivariate data set is approximated with different polynomials minimizing the data fitness error in the minimax sense (L∞ norm). With the use of a Genetic Algorithm (GA), the method is able to circumvent the problem of dimensionality inherent to higher-degree polynomial approximations. The GA yields the degree and the values of a set of coefficients of the polynomials approximating the image of a face. The system is trained by finding, through a resampling process, a family of characteristic polynomials in several variables (pixel characteristics) for each face (say Fi) in the database. A face (say F) is recognized by finding its characteristic polynomials and applying an AdaBoost Classifier to compare F's polynomials with each of the Fi's polynomials. The winner is the polynomial family closest to F's, corresponding to the target face in the database.
Keywords: AdaBoost Classifier, Holistic Face Recognition, Minimax Multivariate Approximation, Genetic Algorithm.
Downloads 1497
6532 Application of Exact String Matching Algorithms towards SMILES Representation of Chemical Structure
Authors: Ahmad Fadel Klaib, Zurinahni Zainol, Nurul Hashimah Ahamed, Rosma Ahmad, Wahidah Hussin
Abstract:
Bioinformatics and cheminformatics are disciplines that use computers to provide tools for the acquisition, storage, processing, analysis and integration of data, and for the development of potential applications of biological and chemical data. A chemical database is a database designed exclusively to store chemical information. NMRShiftDB is one of the main databases used to represent chemical structures in 2D or 3D. The SMILES format is one of many ways to write a chemical structure in a linear form. In this study we extracted Antimicrobial Structures in SMILES format from NMRShiftDB and stored them in our Local Data Warehouse with their corresponding information. Additionally, we developed a search tool that responds to a user's query using the JME Editor tool, which allows the user to draw or edit molecules and converts the drawn structure into SMILES format. We applied the Quick Search algorithm to search for Antimicrobial Structures in our Local Data Warehouse.
Keywords: Exact String-matching Algorithms, NMRShiftDB, SMILES Format, Antimicrobial Structures.
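The Quick Search step mentioned in the abstract above can be sketched as follows. This is a generic illustration of the Quick Search (Sunday) exact string-matching algorithm applied to a SMILES string, not the authors' tool; the function name and the example query are assumptions. Note that exact string matching finds textual occurrences only and does not perform chemical substructure matching.

```python
def quick_search(pattern, text):
    """Return the start indices of all exact occurrences of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Bad-character table: shift for character c is m minus the index of its
    # last occurrence in the pattern; characters not in the pattern shift m + 1.
    shift = {c: m - i for i, c in enumerate(pattern)}
    matches = []
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            matches.append(pos)
        if pos + m >= n:
            break  # no character follows the current window
        # Shift according to the character just past the current window.
        pos += shift.get(text[pos + m], m + 1)
    return matches


# Aspirin written in SMILES, searched for the textual fragment "C(=O)O".
aspirin = "CC(=O)OC1=CC=CC=C1C(=O)O"
hits = quick_search("C(=O)O", aspirin)
```

Because the shift depends only on the character immediately after the window, the table is cheap to build and the scan often skips several positions at once on short alphabets such as SMILES.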
Downloads 2224
6531 Intrusion Detection based on Distance Combination
Authors: Joffroy Beauquier, Yongjie Hu
Abstract:
The intrusion detection problem has been frequently studied, but intrusion detection methods are often based on a single point of view, which limits the results. In this paper, we introduce a new intrusion detection model based on the combination of different current methods. First, we use a notion of distance to unify the different methods. Second, we combine these methods using the Pearson correlation coefficients, which measure the relationship between two methods, and we obtain a combined distance. If the combined distance is greater than a predetermined threshold, an intrusion is detected. We have implemented and tested the combination model with two different public data sets: the masquerade detection data set collected by Schonlau et al., and the data set of program behaviors from the University of New Mexico. The results of the experiments show that the combination model performs better.
Keywords: Intrusion detection, combination, distance, Pearson correlation coefficients.
Downloads 1842
6530 Fault Tolerance in Distributed Database Systems
Authors: M. A. Adeboyejo, O. O. Adeosun
Abstract:
Pioneer networked systems assume that connections are reliable, and the loss of a connection is treated as a faulty operation. Transient connections are typical of mobile devices. The application areas of data sharing systems such as these lead to the conclusion that network connections may not always be reliable, and that the conventional approaches can be improved. The Nigerian commercial banking industry is a critical system whose operation is becoming increasingly dependent on information technology (IT) driven information systems. The proposed solution to this problem makes use of a hierarchically clustered network structure, which we selected to reflect (as much as possible) the typical organizational structure of Nigerian commercial banks. Representative transactions, such as data updates and replication of the results of such updates, were used to simulate the proposed model to show its applicability.
Keywords: Dependability, reliability, data redundancy.
Downloads 3357
6529 Normalization Discriminant Independent Component Analysis
Authors: Liew Yee Ping, Pang Ying Han, Lau Siong Hoe, Ooi Shih Yin, Housam Khalifa Bashier Babiker
Abstract:
In face recognition, feature extraction techniques attempt to search for an appropriate representation of the data. However, when the feature dimension is larger than the sample size, performance degrades. Hence, we propose a method called Normalization Discriminant Independent Component Analysis (NDICA). The input data are regularized to obtain the most reliable features from the data and processed using Independent Component Analysis (ICA). The proposed method is evaluated on three face databases: Olivetti Research Ltd (ORL), Face Recognition Technology (FERET) and Face Recognition Grand Challenge (FRGC). NDICA showed its effectiveness compared with other unsupervised and supervised techniques.
Keywords: Face recognition, small sample size, regularization, independent component analysis.
Downloads 1954
6528 Daily Global Solar Radiation Modeling Using Multi-Layer Perceptron (MLP) Neural Networks
Authors: Seyed Fazel Ziaei Asl, Ali Karami, Gholamreza Ashari, Azam Behrang, Arezoo Assareh, N. Hedayat
Abstract:
The main objective of this study is to predict daily global solar radiation (GSR) from meteorological variables using multi-layer perceptron (MLP) neural networks. Daily mean air temperature, relative humidity, sunshine hours, evaporation, wind speed, and soil temperature values between 2002 and 2006 for Dezful city in Iran (32° 16' N, 48° 25' E) are used in this study. The measured data between 2002 and 2005 are used to train the neural networks, while the data for 214 days from 2006 are used as testing data.
Keywords: Multi-layer Perceptron (MLP) Neural Networks, Global Solar Radiation (GSR), Meteorological Parameters, Prediction.
Downloads 2983
6527 The Effect of CPU Location in Total Immersion of Microelectronics
Authors: A. Almaneea, N. Kapur, J. L. Summers, H. M. Thompson
Abstract:
Meeting the growth in demand for digital services such as social media, telecommunications, and business and cloud services requires large-scale data centres, which has led to an increase in their end-use energy demand. Generally, over 30% of data centre power is consumed by the necessary cooling overhead. Energy use can thus be reduced by improving the cooling efficiency. Air and liquid can both be used as cooling media for the data centre. Traditional data centre cooling systems use air; however, liquid cooling is recognised as a promising method that can handle more densely packed data centres. Liquid cooling can be classified into three methods: rack heat exchanger, on-chip heat exchanger and full immersion of the microelectronics. This study quantifies the improvements in heat transfer specifically for the case of immersed microelectronics by varying the CPU and heat sink location. Immersion of the server is achieved by filling the gap between the microelectronics and a water jacket with a dielectric liquid, which convects the heat from the CPU to the water jacket on the opposite side. Heat transfer is governed by two physical mechanisms: natural convection in the fixed enclosure filled with dielectric liquid, and forced convection in the water that is pumped through the water jacket. The model in this study is validated against published numerical and experimental work and shows good agreement with it. The results show that the heat transfer performance and Nusselt number (Nu) are improved by 89% by placing the CPU and heat sink on the bottom of the microelectronics enclosure.
Keywords: CPU location, data centre cooling, heat sink in enclosures, Immersed microelectronics, turbulent natural convection in enclosures.
Downloads 2174
6526 A Study on the Cloud Simulation with a Network Topology Generator
Authors: Jun-Kwon Jung, Sung-Min Jung, Tae-Kyung Kim, Tai-Myoung Chung
Abstract:
CloudSim is a useful tool for simulating the cloud environment. It shows the service availability, the power consumption, and the network traffic of services in the cloud environment. Moreover, it can easily calculate a network communication delay from network topology data. CloudSim accepts a file of topology data as input, but it does not provide any process for generating one. Thus, it needs a topology data file generated by some other tool. BRITE is a typical network topology generator that supports various types of topology-generating algorithms. If CloudSim could include BRITE, network simulation for clouds would be easier than with the existing version. This paper shows the potential of a connection between BRITE and CloudSim and proposes a direction for linking them.
Keywords: Cloud, simulation, topology, BRITE, network.
Downloads 3778
6525 Low Power Circuit Architecture of AES Crypto Module for Wireless Sensor Network
Authors: MooSeop Kim, Juhan Kim, Yongje Choi
Abstract:
Recently, much research has been conducted on security for wireless sensor networks and ubiquitous computing. Security issues such as authentication and data integrity are major requirements for constructing sensor network systems. The Advanced Encryption Standard (AES) is considered one of the candidate algorithms for data encryption in wireless sensor networks. In this paper, we present a hardware architecture for implementing a low power AES crypto module. Our low power AES crypto module has an optimized architecture for the data encryption unit and key schedule unit, which could be applicable to wireless sensor networks. We also detail the low power design methods used to design our low power AES crypto module.
Keywords: Algorithm, Low Power Crypto Circuit, AES, Security.
Downloads 2515
6524 Role of Credit on Production Efficiency of Farming Sector in Pakistan (A Data Envelopment Analysis)
Authors: Saima Ayaz, Zakir Hussain, Maqbool Hussain Sial
Abstract:
The study identified the sources of production inefficiency of the farming sector in district Faisalabad in the Punjab province of Pakistan. The Data Envelopment Analysis (DEA) technique was applied to farm-level survey data of 300 farmers for the year 2009. The overall mean efficiency score was 0.78, indicating 22 percent inefficiency among the sample farmers. Computed efficiency scores were then regressed on farm-specific variables using Tobit regression analysis. Farming experience, education, access to farming credit, herd size and number of cultivation practices showed a positive and significant effect on the farmers' technical efficiency.
Keywords: Agricultural credit, DEA, Technical efficiency, Tobit analysis
Downloads 2351
6523 Towards End-To-End Disease Prediction from Raw Metagenomic Data
Authors: Maxence Queyrel, Edi Prifti, Alexandre Templier, Jean-Daniel Zucker
Abstract:
Analysis of the human microbiome using metagenomic sequencing data has demonstrated high ability in discriminating various human diseases. Raw metagenomic sequencing data require multiple complex and computationally heavy bioinformatics steps prior to data analysis. Such data contain millions of short sequences read from the fragmented DNA sequences and stored as fastq files. Conventional processing pipelines consist in multiple steps including quality control, filtering, alignment of sequences against genomic catalogs (genes, species, taxonomic levels, functional pathways, etc.). These pipelines are complex to use, time consuming and rely on a large number of parameters that often provide variability and impact the estimation of the microbiome elements. Training Deep Neural Networks directly from raw sequencing data is a promising approach to bypass some of the challenges associated with mainstream bioinformatics pipelines. Most of these methods use the concept of word and sentence embeddings that create a meaningful and numerical representation of DNA sequences, while extracting features and reducing the dimensionality of the data. In this paper we present an end-to-end approach that classifies patients into disease groups directly from raw metagenomic reads: metagenome2vec. This approach is composed of four steps (i) generating a vocabulary of k-mers and learning their numerical embeddings; (ii) learning DNA sequence (read) embeddings; (iii) identifying the genome from which the sequence is most likely to come and (iv) training a multiple instance learning classifier which predicts the phenotype based on the vector representation of the raw data. An attention mechanism is applied in the network so that the model can be interpreted, assigning a weight to the influence of the prediction for each genome. 
Using two public real-life data sets as well as a simulated one, we demonstrated that this original approach reaches high performance, comparable with state-of-the-art methods applied directly to data processed through mainstream bioinformatics workflows. These results are encouraging for this proof-of-concept work. We believe that with further dedication, the DNN models have the potential to surpass mainstream bioinformatics workflows in disease classification tasks.
Keywords: Metagenomics, phenotype prediction, deep learning, embeddings, multiple instance learning.
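To make step (i) of the approach above more concrete, here is a minimal sketch of how a k-mer vocabulary could be built from raw reads before any embedding is learned. The function names, the example reads and the rule for assigning integer ids are illustrative assumptions; the abstract does not describe how metagenome2vec implements this step.

```python
from collections import Counter


def kmers(read, k):
    """Split a read into its overlapping substrings of length k."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def build_vocabulary(reads, k):
    """Map every k-mer seen in the reads to an integer id.

    Ids are assigned by descending frequency, ties broken alphabetically,
    so the mapping is deterministic for a given input.
    """
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    ordered = sorted(counts, key=lambda km: (-counts[km], km))
    return {km: idx for idx, km in enumerate(ordered)}


# Two hypothetical short reads; real fastq reads are far longer.
vocab = build_vocabulary(["ACGT", "CGTA"], k=2)
encoded = [vocab[km] for km in kmers("ACGT", 2)]
```

The integer-encoded reads would then play the role that tokenized sentences play in word-embedding models.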
Downloads 910
6522 On the Efficient Implementation of a Serial and Parallel Decomposition Algorithm for Fast Support Vector Machine Training Including a Multi-Parameter Kernel
Authors: Tatjana Eitrich, Bruno Lang
Abstract:
This work deals with aspects of support vector machine learning for large-scale data mining tasks. Based on a decomposition algorithm for support vector machine training that can be run in serial as well as shared-memory parallel mode, we introduce a transformation of the training data that allows an expensive generalized kernel to be used without additional cost. We present experiments for the Gaussian kernel, but other kernel functions can be used as well. To further speed up the decomposition algorithm, we analyze the critical problem of working set selection for large training data sets. In addition, we analyze the influence of the working set size on the scalability of the parallel decomposition scheme. Our tests and conclusions led to several modifications of the algorithm and to improved overall support vector machine learning performance. Our method allows extensive parameter search methods to be used to optimize classification accuracy.
Keywords: Support Vector Machine Training, Multi-Parameter Kernels, Shared Memory Parallel Computing, Large Data
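The abstract does not spell out the data transformation or the exact form of the multi-parameter kernel. One common construction, assumed here purely for illustration, gives the Gaussian kernel a separate width parameter per feature and absorbs those parameters into the data by rescaling each feature once, so that the plain Gaussian kernel on the rescaled data returns the same values. All function names are hypothetical.

```python
import math


def generalized_gaussian(x, y, gammas):
    """Gaussian kernel with one width parameter per feature (assumed form)."""
    return math.exp(-sum(g * (a - b) ** 2 for g, a, b in zip(gammas, x, y)))


def rescale(x, gammas):
    """Scale each feature by sqrt(gamma_k); done once, before training."""
    return [math.sqrt(g) * a for g, a in zip(gammas, x)]


def standard_gaussian(x, y):
    """Plain Gaussian kernel with unit width."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)))


x = [1.0, 2.0, 3.0]
y = [0.5, -1.0, 2.0]
gammas = [0.1, 2.0, 0.5]  # illustrative values only

direct = generalized_gaussian(x, y, gammas)
via_rescaling = standard_gaussian(rescale(x, gammas), rescale(y, gammas))
```

Under this assumption the per-feature parameters cost nothing at kernel-evaluation time, because the multiplication happens once per training point rather than once per kernel call.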
Downloads 1443
6521 Establishing a Probabilistic Model of Extrapolated Wind Speed Data for Wind Energy Prediction
Authors: Mussa I. Mgwatu, Reuben R. M. Kainkwa
Abstract:
Wind is among the potential energy resources that can be harnessed to generate electrical power. Because wind speed varies with time and height, it is difficult to predict the generated wind energy accurately. In this paper, an attempt is made to establish a probabilistic model fitting the wind speed data recorded at the Makambako site in Tanzania. Wind speed and direction were measured using an anemometer (type AN1) and a wind vane (type WD1), respectively, both supplied by Delta-T-Devices, at a measurement height of 2 m. Wind speeds were then extrapolated to a height of 10 m using the power law equation with an exponent of 0.47. Data were analysed using MINITAB statistical software to show the variability of wind speeds with time and height, and to determine the underlying probability model of the extrapolated wind speed data. The results show that wind speeds at the Makambako site vary cyclically over time and conform to the Weibull probability distribution. From these results, the Weibull probability density function can be used to predict the wind energy.
Keywords: Probabilistic models, wind speed, wind energy
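The extrapolation step described above follows the standard power law v2 = v1 * (h2 / h1) ** alpha. The sketch below applies it with the heights and exponent quoted in the abstract and also evaluates the standard two-parameter Weibull density. The measured speed and the Weibull shape and scale values are hypothetical placeholders, not results from the paper.

```python
import math


def extrapolate_speed(v_ref, h_ref, h_target, alpha):
    """Power-law wind profile: scale a speed measured at h_ref to h_target."""
    return v_ref * (h_target / h_ref) ** alpha


def weibull_pdf(v, shape, scale):
    """Two-parameter Weibull probability density for a wind speed v >= 0."""
    if v < 0:
        return 0.0
    return (shape / scale) * (v / scale) ** (shape - 1) * math.exp(-((v / scale) ** shape))


# Heights and exponent are taken from the abstract; the 3.0 m/s reading is made up.
v_10m = extrapolate_speed(v_ref=3.0, h_ref=2.0, h_target=10.0, alpha=0.47)

# Shape and scale here are placeholders; the paper's fitted values are not given.
density = weibull_pdf(v_10m, shape=2.0, scale=7.0)
```

With an exponent of 0.47, moving from 2 m to 10 m multiplies every measured speed by 5 ** 0.47, which is roughly 2.13.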
Downloads 2347
6520 Demographic Factors Influencing Employees’ Salary Expectations and Labor Turnover
Authors: M. Osipova
Abstract:
Thanks to the development of information technologies, every sphere of the economy is becoming more and more data-centralized, as people generate huge datasets containing information on every aspect of their lives. Applying research on such data to human resources management yields otherwise scarce statistics on the state of the labor market, including salary expectations and the typical career behavior of potential employees, and this information can become a reliable basis for management decisions. This article presents the results of career behavior research based on freely accessible resume data. The information used for the study is much wider than that usually used in human resources surveys, so there is enough data for statistically significant results even in subgroup analysis.
Keywords: Human resources management, labor market, salary expectations, statistics, turnover.
Downloads 1846
6519 Mathematical Modeling to Predict Surface Roughness in CNC Milling
Authors: Ab. Rashid M.F.F., Gan S.Y., Muhammad N.Y.
Abstract:
Surface roughness (Ra) is one of the most important requirements in the machining process. To obtain better surface roughness, the proper setting of cutting parameters is crucial before the process takes place. This research presents the development of a mathematical model for predicting surface roughness before the milling process, in order to evaluate the fitness of the machining parameters: spindle speed, feed rate and depth of cut. In this study, 84 samples were run using a FANUC CNC Milling α-Τ14ιE. The samples were randomly divided into two data sets: a training set (m=60) and a testing set (m=24). ANOVA showed that at least one of the population regression coefficients was not zero. The Multiple Regression Method was used to determine the correlation between a criterion variable and a combination of predictor variables. It was established that the surface roughness is most influenced by the feed rate. Using the Multiple Regression Method equation, the average percentage deviation was 9.8% for the testing set and 9.7% for the training set. This shows that the statistical model could predict the surface roughness with about 90.2% accuracy on the testing set and 90.3% accuracy on the training set.
Keywords: Surface roughness, regression analysis.
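The accuracy figures quoted in the abstract are consistent with computing accuracy as 100% minus the average percentage deviation (100 - 9.8 = 90.2 and 100 - 9.7 = 90.3). The sketch below shows that arithmetic under the assumption that the deviation is the mean absolute error relative to the measured value; the abstract does not state the exact formula, and the sample roughness values are invented.

```python
def average_percentage_deviation(measured, predicted):
    """Mean of |measured - predicted| / measured, expressed in percent (assumed definition)."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [abs(m - p) / m for m, p in zip(measured, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def accuracy_percent(deviation_percent):
    """Accuracy as the complement of the average percentage deviation."""
    return 100.0 - deviation_percent


# Invented Ra values, for illustration only.
measured_ra = [2.0, 4.0]
predicted_ra = [1.8, 4.4]

deviation = average_percentage_deviation(measured_ra, predicted_ra)
accuracy = accuracy_percent(deviation)
```

Both invented predictions are off by 10% of the measured value, so the deviation is about 10% and the accuracy about 90%.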
Downloads 2131
6518 Nonlinear Transformation of Laser Generated Ultrasonic Pulses in Geomaterials
Authors: Elena B. Cherepetskaya, Alexander A. Karabutov, Natalia B. Podymova, Ivan Sas
Abstract:
Nonlinear evolution of broadband ultrasonic pulses passed through rock specimens is studied using the “GEOSCAN-02M” apparatus. Ultrasonic pulses are excited by pulses of a Q-switched Nd:YAG laser with a duration of 10 ns and an energy of 260 mJ. This energy can be reduced to 20 mJ by light filters. The laser beam radius did not exceed 5 mm. Absorption of the laser pulse in a special material (the optoacoustic generator) excites pulses of longitudinal ultrasonic waves with a duration of 100 ns and a maximum pressure amplitude of 10 MPa. The immersion technique is used to measure the parameters of these ultrasonic pulses after they pass through a specimen; the immersion liquid is distilled water. The reference pulse passed through the cell with water has compression and rarefaction phases. The amplitude of the rarefaction phase is five times lower than that of the compression phase. The spectral range of the reference pulse reaches 10 MHz. Cubic specimens of Karelian gabbro with a rib length of 3 cm are studied. The ultimate uniaxial compressive strength of the specimens is (300±10) MPa. As the reference pulse passes through an area of the specimen without cracks, the compression phase decreases and the rarefaction phase increases due to diffraction and scattering of ultrasound, so the ratio of these phases becomes 2.3:1. After preloading, horizontal cracks appear in the specimens. Their location is found by one-sided scanning of the specimen using backward-mode detection of the ultrasonic pulses reflected from structural defects. Computer processing of these signals yields images of the cross-sections of the specimens with cracks.
When the reference pulse amplitude increases from 0.1 MPa to 5 MPa, the nonlinear transformation of the ultrasonic pulse passed through the specimen with horizontal cracks decreases the amplitude of the rarefaction phase by a factor of 2.5 and increases its duration by a factor of 2.1. When the reference pulse amplitude increases from 5 MPa to 10 MPa, time splitting of the phases is observed for the bipolar pulse passed through the specimen: the compression and rarefaction phases propagate with different velocities. These features of powerful broadband ultrasonic pulses passed through rock specimens can be described by the Preisach-Mayergoyz hysteresis model and can be used to locate cracks in optically opaque materials.
Keywords: Cracks, geological materials, nonlinear evolution of ultrasonic pulses, rock.
Downloads 1895
6517 Parameter Estimation using Maximum Likelihood Method from Flight Data at High Angles of Attack
Authors: Rakesh Kumar, A. K. Ghosh
Abstract:
The paper presents the modeling of nonlinear longitudinal aerodynamics using flight data of the Hansa-3 aircraft at high angles of attack near stall. Kirchhoff's quasi-steady stall model has been used to incorporate nonlinear aerodynamic effects into the aerodynamic model used to estimate the parameters, thereby making the aerodynamic model nonlinear. The Maximum Likelihood method has been applied to the flight data (at high angles of attack) to estimate the parameters (aerodynamic and stall characteristics) using the nonlinear aerodynamic model. To improve the accuracy of the estimates, an approach of fixing the strong parameters has also been presented.
Keywords: Maximum Likelihood, nonlinear, parameters, stall.
Downloads 2216
6516 Network Anomaly Detection using Soft Computing
Authors: Surat Srinoy, Werasak Kurutach, Witcha Chimphlee, Siriporn Chimphlee
Abstract:
One main drawback of intrusion detection systems is their inability to detect new attacks that do not have known signatures. In this paper we discuss an intrusion detection method that uses independent component analysis (ICA) based feature selection heuristics and rough fuzzy clustering of the data. ICA separates the independent components (ICs) from the monitored variables. Rough sets decrease the amount of data and remove redundancy, while fuzzy methods allow objects to belong to several clusters simultaneously, with different degrees of membership. Our approach allows us not only to recognize known attacks but also to detect activity that may be the result of a new, unknown attack. Experimental results are reported on the Knowledge Discovery and Data Mining (KDDCup 1999) dataset.
Keywords: Network security, intrusion detection, rough set, ICA, anomaly detection, independent component analysis, rough fuzzy.
Downloads 1955
6515 Automatic Thresholding for Data Gap Detection for a Set of Sensors in Instrumented Buildings
Authors: Houda Najeh, Stéphane Ploix, Mahendra Pratap Singh, Karim Chabir, Mohamed Naceur Abdelkrim
Abstract:
Building systems are highly vulnerable to different kinds of faults and failures. In fact, various faults, failures and human behaviors can affect building performance. This paper tackles the detection of unreliable sensors in buildings. Several literature surveys on diagnosis techniques for sensor grids in buildings have been published, but all of them treat only bias and outliers. Occurrences of data gaps have likewise not received adequate attention in academia. The proposed methodology comprises automatic thresholding for data gap detection for a set of heterogeneous sensors in instrumented buildings. Sensor measurements are usually considered to be regular time series. In reality, however, sensor values are not uniformly sampled. The question to solve is therefore: beyond which delay should each sensor be considered faulty? Time series analysis is required to detect abnormalities in the delays. The efficiency of the method is evaluated on measurements obtained from a real power plant: an office at Grenoble Institute of Technology equipped with 30 sensors.
Keywords: Building system, time series, diagnosis, outliers, delay, data gap.
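The idea of flagging a sensor when the delay between consecutive readings becomes abnormally long can be sketched as follows. The abstract does not describe how the threshold is derived, so the rule used here (a multiple of the median delay) is only a stand-in, and the function names and the `factor` parameter are hypothetical.

```python
from statistics import median


def delays(timestamps):
    """Time differences between consecutive readings (timestamps in seconds, sorted)."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


def detect_gaps(timestamps, factor=3.0):
    """Return (start, end) pairs whose delay exceeds factor * median delay.

    Assumption: the per-sensor threshold is a multiple of that sensor's median
    delay. This is a placeholder rule, not the paper's thresholding method.
    """
    observed = delays(timestamps)
    if not observed:
        return []
    threshold = factor * median(observed)
    return [
        (timestamps[i], timestamps[i + 1])
        for i, delay in enumerate(observed)
        if delay > threshold
    ]


# Invented readings: roughly one per minute, with one long interruption.
readings = [0, 60, 120, 185, 900, 960]
gaps = detect_gaps(readings)
```

Because the threshold is derived from each sensor's own delays, sensors that report at different nominal rates can be handled by the same function without hand-tuned limits.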
Downloads 903