Search results for: data validation
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7719

Search results for: data validation

7419 Experimental and Numerical Study of A/C Outletsand Its Impact on Room Airflow Characteristics

Authors: Mohammed A. Aziz, Ibrahim A. M. Gad, El Shahat F. A. Mohammed, Ramy H. Mohammed

Abstract:

This paper investigates experimental and numerical study of the airflow characteristics for vortex, round and square ceiling diffusers and its effect on the thermal comfort in a ventilated room. Three different thermal comfort criteria namely; Mean Age of the Air (MAA), ventilation effectiveness (E), and Effective Draft Temperature (EDT) have been used to predict the thermal comfort zone inside the room. In experimental work, a sub-scale room is set-up to measure the temperature field in the room. In numerical analysis, unstructured grids have been used to discretize the numerical domain. Conservation equations are solved using FLUENT commercial flow solver. The code is validated by comparing the numerical results obtained from three different turbulence models with the available experimental data. The comparison between the various numerical models shows that the standard k-ε turbulence model can be used to simulate these cases successfully. After validation of the code, effect of supply air velocity on the flow and thermal field could be investigated and hence the thermal comfort. The results show that the pressure coefficient created by the square diffuser is 1.5 times greater than that created by the vortex diffuser. The velocity decay coefficient is nearly the same for square and round diffusers and is 2.6 times greater than that for the vortex diffuser.

Keywords: Ceiling diffuser, Thermal Comfort, MAA, EDT, Fluent, Turbulence model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2144
7418 Establishing Econometric Modeling Equations for Lumpy Skin Disease Outbreaks in the Nile Delta of Egypt under Current Climate Conditions

Authors: Abdelgawad, Salah El-Tahawy

Abstract:

This paper aimed to establish econometrical equation models for the Nile delta region in Egypt, which will represent a basement for future predictions of Lumpy skin disease outbreaks and its pathway in relation to climate change. Data of lumpy skin disease (LSD) outbreaks were collected from the cattle farms located in the provinces representing the Nile delta region during 1 January, 2015 to December, 2015. The obtained results indicated that there was a significant association between the degree of the LSD outbreaks and the investigated climate factors (temperature, wind speed, and humidity) and the outbreaks peaked during the months of June, July, and August and gradually decreased to the lowest rate in January, February, and December. The model obtained depicted that the increment of these climate factors were associated with evidently increment on LSD outbreaks on the Nile Delta of Egypt. The model validation process was done by the root mean square error (RMSE) and means bias (MB) which compared the number of LSD outbreaks expected with the number of observed outbreaks and estimated the confidence level of the model. The value of RMSE was 1.38% and MB was 99.50% confirming that this established model described the current association between the LSD outbreaks and the change on climate factors and also can be used as a base for predicting the of LSD outbreaks depending on the climatic change on the future.

Keywords: LSD, climate factors, econometric models, Nile Delta.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 963
7417 Addressing Data Security in the Cloud

Authors: Marinela Mircea

Abstract:

The development of information and communication technology, the increased use of the internet, as well as the effects of the recession within the last years, have lead to the increased use of cloud computing based solutions, also called on-demand solutions. These solutions offer a large number of benefits to organizations as well as challenges and risks, mainly determined by data visualization in different geographic locations on the internet. As far as the specific risks of cloud environment are concerned, data security is still considered a peak barrier in adopting cloud computing. The present study offers an approach upon ensuring the security of cloud data, oriented towards the whole data life cycle. The final part of the study focuses on the assessment of data security in the cloud, this representing the bases in determining the potential losses and the premise for subsequent improvements and continuous learning.

Keywords: cloud computing, data life cycle, data security, security assessment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2161
7416 Condensation of Moist Air in Heat Exchanger Using CFD

Authors: Jan Barák, Karel Fraňa, Jörg Stiller

Abstract:

This work presents results of moist air condensation in heat exchanger. It describes theoretical knowledge and definition of moist air. Model with geometry of square canal was created for better understanding and postprocessing of condensation phenomena. Different approaches were examined on this model to find suitable software and model. Obtained knowledge was applied to geometry of real heat exchanger and results from experiment were compared with numerical results. One of the goals is to solve this issue without creating any user defined function in the applied code. It also contains summary of knowledge and outlook for future work.

Keywords: Condensation, exchanger, experiment, validation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5589
7415 Transcriptomics Analysis on Comparing Non-Small Cell Lung Cancer versus Normal Lung, and Early Stage Compared versus Late-Stages of Non-Small Cell Lung Cancer

Authors: Achitphol Chookaew, Paramee Thongsukhsai, Patamarerk Engsontia, Narongwit Nakwan, Pritsana Raugrut

Abstract:

Lung cancer is one of the most common malignancies and primary cause of death due to cancer worldwide. Non-small cell lung cancer (NSCLC) is the main subtype in which majority of patients present with advanced-stage disease. Herein, we analyzed differentially expressed genes to find potential biomarkers for lung cancer diagnosis as well as prognostic markers. We used transcriptome data from our 2 NSCLC patients and public data (GSE81089) composing of 8 NSCLC and 10 normal lung tissues. Differentially expressed genes (DEGs) between NSCLC and normal tissue and between early-stage and late-stage NSCLC were analyzed by the DESeq2. Pairwise correlation was used to find the DEGs with false discovery rate (FDR) adjusted p-value £ 0.05 and |log2 fold change| ³ 4 for NSCLC versus normal and FDR adjusted p-value £ 0.05 with |log2 fold change| ³ 2 for early versus late-stage NSCLC. Bioinformatic tools were used for functional and pathway analysis. Moreover, the top ten genes in each comparison group were verified the expression and survival analysis via GEPIA. We found 150 up-regulated and 45 down-regulated genes in NSCLC compared to normal tissues. Many immnunoglobulin-related genes e.g., IGHV4-4, IGHV5-10-1, IGHV4-31, IGHV4-61, and IGHV1-69D were significantly up-regulated. 22 genes were up-regulated, and five genes were down-regulated in late-stage compared to early-stage NSCLC. The top five DEGs genes were KRT6B, SPRR1A, KRT13, KRT6A and KRT5. Keratin 6B (KRT6B) was the most significantly increased gene in the late-stage NSCLC. From GEPIA analysis, we concluded that IGHV4-31 and IGKV1-9 might be used as diagnostic biomarkers, while KRT6B and KRT6A might be used as prognostic biomarkers. However, further clinical validation is needed.

Keywords: Bioinformatics, differentially expressed genes, non-small cell lung cancer, transcriptomics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 898
7414 A Network Traffic Prediction Algorithm Based On Data Mining Technique

Authors: D. Prangchumpol

Abstract:

This paper is a description approach to predict incoming and outgoing data rate in network system by using association rule discover, which is one of the data mining techniques. Information of incoming and outgoing data in each times and network bandwidth are network performance parameters, which needed to solve in the traffic problem. Since congestion and data loss are important network problems. The result of this technique can predicted future network traffic. In addition, this research is useful for network routing selection and network performance improvement.

Keywords: Traffic prediction, association rule, data mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3670
7413 Fuzzy Processing of Uncertain Data

Authors: Petr Morávek, Miloš Šeda

Abstract:

In practice, we often come across situations where it is necessary to make decisions based on incomplete or uncertain data. In control systems it may be due to the unknown exact mathematical model, or its excessive complexity (e.g. nonlinearity) when it is necessary to simplify it, respectively, to solve it using a rule base. In the case of databases, searching data we compare a similarity measure with of the requirements of the selection with stored data, where both the select query and the data itself may contain vague terms, for example in the form of linguistic qualifiers. In this paper, we focus on the processing of uncertain data in databases and demonstrate it on the example multi-criteria decision making in the selection of variants, specified by higher number of technical parameters.

Keywords: fuzzy logic, linguistic variable, multicriteria decision

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1418
7412 Automated Stereophotogrammetry Data Cleansing

Authors: Stuart Henry, Philip Morrow, John Winder, Bryan Scotney

Abstract:

The stereophotogrammetry modality is gaining more widespread use in the clinical setting. Registration and visualization of this data, in conjunction with conventional 3D volumetric image modalities, provides virtual human data with textured soft tissue and internal anatomical and structural information. In this investigation computed tomography (CT) and stereophotogrammetry data is acquired from 4 anatomical phantoms and registered using the trimmed iterative closest point (TrICP) algorithm. This paper fully addresses the issue of imaging artifacts around the stereophotogrammetry surface edge using the registered CT data as a reference. Several iterative algorithms are implemented to automatically identify and remove stereophotogrammetry surface edge outliers, improving the overall visualization of the combined stereophotogrammetry and CT data. This paper shows that outliers at the surface edge of stereophotogrammetry data can be successfully removed automatically.

Keywords: Data cleansing, stereophotogrammetry.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1842
7411 Surface Elevation Dynamics Assessment Using Digital Elevation Models, Light Detection and Ranging, GPS and Geospatial Information Science Analysis: Ecosystem Modelling Approach

Authors: Ali K. M. Al-Nasrawi, Uday A. Al-Hamdany, Sarah M. Hamylton, Brian G. Jones, Yasir M. Alyazichi

Abstract:

Surface elevation dynamics have always responded to disturbance regimes. Creating Digital Elevation Models (DEMs) to detect surface dynamics has led to the development of several methods, devices and data clouds. DEMs can provide accurate and quick results with cost efficiency, in comparison to the inherited geomatics survey techniques. Nowadays, remote sensing datasets have become a primary source to create DEMs, including LiDAR point clouds with GIS analytic tools. However, these data need to be tested for error detection and correction. This paper evaluates various DEMs from different data sources over time for Apple Orchard Island, a coastal site in southeastern Australia, in order to detect surface dynamics. Subsequently, 30 chosen locations were examined in the field to test the error of the DEMs surface detection using high resolution global positioning systems (GPSs). Results show significant surface elevation changes on Apple Orchard Island. Accretion occurred on most of the island while surface elevation loss due to erosion is limited to the northern and southern parts. Concurrently, the projected differential correction and validation method aimed to identify errors in the dataset. The resultant DEMs demonstrated a small error ratio (≤ 3%) from the gathered datasets when compared with the fieldwork survey using RTK-GPS. As modern modelling approaches need to become more effective and accurate, applying several tools to create different DEMs on a multi-temporal scale would allow easy predictions in time-cost-frames with more comprehensive coverage and greater accuracy. With a DEM technique for the eco-geomorphic context, such insights about the ecosystem dynamic detection, at such a coastal intertidal system, would be valuable to assess the accuracy of the predicted eco-geomorphic risk for the conservation management sustainability. Demonstrating this framework to evaluate the historical and current anthropogenic and environmental stressors on coastal surface elevation dynamism could be profitably applied worldwide.

Keywords: DEMs, eco-geomorphic-dynamic processes, geospatial information science. Remote sensing, surface elevation changes.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1158
7410 An Improved Data Mining Method Applied to the Search of Relationship between Metabolic Syndrome and Lifestyles

Authors: Yi Chao Huang, Yu Ling Liao, Chiu Shuang Lin

Abstract:

A data cutting and sorting method (DCSM) is proposed to optimize the performance of data mining. DCSM reduces the calculation time by getting rid of redundant data during the data mining process. In addition, DCSM minimizes the computational units by splitting the database and by sorting data with support counts. In the process of searching for the relationship between metabolic syndrome and lifestyles with the health examination database of an electronics manufacturing company, DCSM demonstrates higher search efficiency than the traditional Apriori algorithm in tests with different support counts.

Keywords: Data mining, Data cutting and sorting method, Apriori algorithm, Metabolic syndrome

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1588
7409 Data Mining Approach for Commercial Data Classification and Migration in Hybrid Storage Systems

Authors: Mais Haj Qasem, Maen M. Al Assaf, Ali Rodan

Abstract:

Parallel hybrid storage systems consist of a hierarchy of different storage devices that vary in terms of data reading speed performance. As we ascend in the hierarchy, data reading speed becomes faster. Thus, migrating the application’ important data that will be accessed in the near future to the uppermost level will reduce the application I/O waiting time; hence, reducing its execution elapsed time. In this research, we implement trace-driven two-levels parallel hybrid storage system prototype that consists of HDDs and SSDs. The prototype uses data mining techniques to classify application’ data in order to determine its near future data accesses in parallel with the its on-demand request. The important data (i.e. the data that the application will access in the near future) are continuously migrated to the uppermost level of the hierarchy. Our simulation results show that our data migration approach integrated with data mining techniques reduces the application execution elapsed time when using variety of traces in at least to 22%.

Keywords: Data mining, hybrid storage system, recurrent neural network, support vector machine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1736
7408 Association Rules Mining and NOSQL Oriented Document in Big Data

Authors: Sarra Senhadji, Imene Benzeguimi, Zohra Yagoub

Abstract:

Big Data represents the recent technology of manipulating voluminous and unstructured data sets over multiple sources. Therefore, NOSQL appears to handle the problem of unstructured data. Association rules mining is one of the popular techniques of data mining to extract hidden relationship from transactional databases. The algorithm for finding association dependencies is well-solved with Map Reduce. The goal of our work is to reduce the time of generating of frequent itemsets by using Map Reduce and NOSQL database oriented document. A comparative study is given to evaluate the performances of our algorithm with the classical algorithm Apriori.

Keywords: Apriori, Association rules mining, Big Data, data mining, Hadoop, Map Reduce, MongoDB, NoSQL.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 694
7407 Identifying Critical Success Factors for Data Quality Management through a Delphi Study

Authors: Maria Paula Santos, Ana Lucas

Abstract:

Organizations support their operations and decision making on the data they have at their disposal, so the quality of these data is remarkably important and Data Quality (DQ) is currently a relevant issue, the literature being unanimous in pointing out that poor DQ can result in large costs for organizations. The literature review identified and described 24 Critical Success Factors (CSF) for Data Quality Management (DQM) that were presented to a panel of experts, who ordered them according to their degree of importance, using the Delphi method with the Q-sort technique, based on an online questionnaire. The study shows that the five most important CSF for DQM are: definition of appropriate policies and standards, control of inputs, definition of a strategic plan for DQ, organizational culture focused on quality of the data and obtaining top management commitment and support.

Keywords: Critical success factors, data quality, data quality management, Delphi, Q-Sort.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1109
7406 Biotechonomy System Dynamics Modelling: Sustainability of Pellet Production

Authors: Andra Blumberga, Armands Gravelsins, Haralds Vigants, Dagnija Blumberga

Abstract:

The paper discovers biotechonomy development analysis by use of system dynamics modelling. The research is connected with investigations of biomass application for production of bioproducts with higher added value. The most popular bioresource is wood, and therefore, the main question today is about future development and eco-design of products. The paper emphasizes and evaluates energy sector which is open for use of wood logs, wood chips, wood pellets and so on. The main aim for this research study was to build a framework to analyse development perspectives for wood pellet production. To reach the goal, a system dynamics model of energy wood supplies, processing, and consumption is built. Production capacity, energy consumption, changes in energy and technology efficiency, required labour source, prices of wood, energy and labour are taken into account. Validation and verification tests with available data and information have been carried out and indicate that the model constitutes the dynamic hypothesis. It is found that the more is invested into pellets production, the higher the specific profit per production unit compared to wood logs and wood chips. As a result, wood chips production is decreasing dramatically and is replaced by wood pellets. The limiting factor for pellet industry growth is availability of wood sources. This is governed by felling limit set by the government based on sustainable forestry principles.

Keywords: Bioenergy, biotechonomy, system dynamics modelling, wood pellets.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1170
7405 Secure Data Aggregation Using Clusters in Sensor Networks

Authors: Prakash G L, Thejaswini M, S H Manjula, K R Venugopal, L M Patnaik

Abstract:

Wireless sensor network can be applied to both abominable and military environments. A primary goal in the design of wireless sensor networks is lifetime maximization, constrained by the energy capacity of batteries. One well-known method to reduce energy consumption in such networks is data aggregation. Providing efcient data aggregation while preserving data privacy is a challenging problem in wireless sensor networks research. In this paper, we present privacy-preserving data aggregation scheme for additive aggregation functions. The Cluster-based Private Data Aggregation (CPDA)leverages clustering protocol and algebraic properties of polynomials. It has the advantage of incurring less communication overhead. The goal of our work is to bridge the gap between collaborative data collection by wireless sensor networks and data privacy. We present simulation results of our schemes and compare their performance to a typical data aggregation scheme TAG, where no data privacy protection is provided. Results show the efficacy and efficiency of our schemes.

Keywords: Aggregation, Clustering, Query Processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1734
7404 PYTHEIA: A Scale for Assessing Rehabilitation and Assistive Robotics

Authors: Yiannis Koumpouros, Effie Papageorgiou, Alexandra Karavasili, Foteini Koureta

Abstract:

The objective of the present study was to develop a scale called PYTHEIA. The PYTHEIA is a self-reported measure for the assessment of rehabilitation and assistive robotics and other assistive technology devices. The development of PYTHEIA faced the absence of a valid instrument that can be used to evaluate the assistive robotic devices both as a whole, as well as any of their individual components or functionalities implemented. According to the results presented, PYTHEIA is a valid and reliable scale able to be applied to different target groups for the subjective evaluation of various assistive technology devices.

Keywords: Rehabilitation, assistive technology, assistive robots, rehabilitation robots, scale, psychometric test, assessment, validation, user satisfaction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1795
7403 Modeling Engagement with Multimodal Multisensor Data: The Continuous Performance Test as an Objective Tool to Track Flow

Authors: Mohammad H. Taheri, David J. Brown, Nasser Sherkat

Abstract:

Engagement is one of the most important factors in determining successful outcomes and deep learning in students. Existing approaches to detect student engagement involve periodic human observations that are subject to inter-rater reliability. Our solution uses real-time multimodal multisensor data labeled by objective performance outcomes to infer the engagement of students. The study involves four students with a combined diagnosis of cerebral palsy and a learning disability who took part in a 3-month trial over 59 sessions. Multimodal multisensor data were collected while they participated in a continuous performance test. Eye gaze, electroencephalogram, body pose, and interaction data were used to create a model of student engagement through objective labeling from the continuous performance test outcomes. In order to achieve this, a type of continuous performance test is introduced, the Seek-X type. Nine features were extracted including high-level handpicked compound features. Using leave-one-out cross-validation, a series of different machine learning approaches were evaluated. Overall, the random forest classification approach achieved the best classification results. Using random forest, 93.3% classification for engagement and 42.9% accuracy for disengagement were achieved. We compared these results to outcomes from different models: AdaBoost, decision tree, k-Nearest Neighbor, naïve Bayes, neural network, and support vector machine. We showed that using a multisensor approach achieved higher accuracy than using features from any reduced set of sensors. We found that using high-level handpicked features can improve the classification accuracy in every sensor mode. Our approach is robust to both sensor fallout and occlusions. The single most important sensor feature to the classification of engagement and distraction was shown to be eye gaze. It has been shown that we can accurately predict the level of engagement of students with learning disabilities in a real-time approach that is not subject to inter-rater reliability, human observation or reliant on a single mode of sensor input. This will help teachers design interventions for a heterogeneous group of students, where teachers cannot possibly attend to each of their individual needs. Our approach can be used to identify those with the greatest learning challenges so that all students are supported to reach their full potential.

Keywords: Affective computing in education, affect detection, continuous performance test, engagement, flow, HCI, interaction, learning disabilities, machine learning, multimodal, multisensor, physiological sensors, Signal Detection Theory, student engagement.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1262
7402 Rapid Determination of Biochemical Oxygen Demand

Authors: Mayur Milan Kale, Indu Mehrotra

Abstract:

Biochemical Oxygen Demand (BOD) is a measure of the oxygen used in bacteria mediated oxidation of organic substances in water and wastewater. Theoretically an infinite time is required for complete biochemical oxidation of organic matter, but the measurement is made over 5-days at 20 0C or 3-days at 27 0C test period with or without dilution. Researchers have worked to further reduce the time of measurement. The objective of this paper is to review advancement made in BOD measurement primarily to minimize the time and negate the measurement difficulties. Survey of literature review in four such techniques namely BOD-BARTTM, Biosensors, Ferricyanidemediated approach, luminous bacterial immobilized chip method. Basic principle, method of determination, data validation and their advantage and disadvantages have been incorporated of each of the methods. In the BOD-BARTTM method the time lag is calculated for the system to change from oxidative to reductive state. BIOSENSORS are the biological sensing element with a transducer which produces a signal proportional to the analyte concentration. Microbial species has its metabolic deficiencies. Co-immobilization of bacteria using sol-gel biosensor increases the range of substrate. In ferricyanidemediated approach, ferricyanide has been used as e-acceptor instead of oxygen. In Luminous bacterial cells-immobilized chip method, bacterial bioluminescence which is caused by lux genes was observed. Physiological responses is measured and correlated to BOD due to reduction or emission. There is a scope to further probe into the rapid estimation of BOD.

Keywords: BOD, Four methods, Rapid estimation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3641
7401 A New Protocol for Concealed Data Aggregation in Wireless Sensor Networks

Authors: M. Abbasi Dezfouli, S. Mazraeh, M. H. Yektaie

Abstract:

Wireless sensor networks (WSN) consists of many sensor nodes that are placed on unattended environments such as military sites in order to collect important information. Implementing a secure protocol that can prevent forwarding forged data and modifying content of aggregated data and has low delay and overhead of communication, computing and storage is very important. This paper presents a new protocol for concealed data aggregation (CDA). In this protocol, the network is divided to virtual cells, nodes within each cell produce a shared key to send and receive of concealed data with each other. Considering to data aggregation in each cell is locally and implementing a secure authentication mechanism, data aggregation delay is very low and producing false data in the network by malicious nodes is not possible. To evaluate the performance of our proposed protocol, we have presented computational models that show the performance and low overhead in our protocol.

Keywords: Wireless Sensor Networks, Security, Concealed Data Aggregation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1735
7400 IMDC: An Image-Mapped Data Clustering Technique for Large Datasets

Authors: Faruq A. Al-Omari, Nabeel I. Al-Fayoumi

Abstract:

In this paper, we present a new algorithm for clustering data in large datasets using image processing approaches. First the dataset is mapped into a binary image plane. The synthesized image is then processed utilizing efficient image processing techniques to cluster the data in the dataset. Henceforth, the algorithm avoids exhaustive search to identify clusters. The algorithm considers only a small set of the data that contains critical boundary information sufficient to identify contained clusters. Compared to available data clustering techniques, the proposed algorithm produces similar quality results and outperforms them in execution time and storage requirements.

Keywords: Data clustering, Data mining, Image-mapping, Pattern discovery, Predictive analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1500
7399 Computing Entropy for Ortholog Detection

Authors: Hsing-Kuo Pao, John Case

Abstract:

Biological sequences from different species are called or-thologs if they evolved from a sequence of a common ancestor species and they have the same biological function. Approximations of Kolmogorov complexity or entropy of biological sequences are already well known to be useful in extracting similarity information between such sequences -in the interest, for example, of ortholog detection. As is well known, the exact Kolmogorov complexity is not algorithmically computable. In prac-tice one can approximate it by computable compression methods. How-ever, such compression methods do not provide a good approximation to Kolmogorov complexity for short sequences. Herein is suggested a new ap-proach to overcome the problem that compression approximations may notwork well on short sequences. This approach is inspired by new, conditional computations of Kolmogorov entropy. A main contribution of the empir-ical work described shows the new set of entropy-based machine learning attributes provides good separation between positive (ortholog) and nega-tive (non-ortholog) data - better than with good, previously known alter-natives (which do not employ some means to handle short sequences well).Also empirically compared are the new entropy based attribute set and a number of other, more standard similarity attributes sets commonly used in genomic analysis. The various similarity attributes are evaluated by cross validation, through boosted decision tree induction C5.0, and by Receiver Operating Characteristic (ROC) analysis. The results point to the conclu-sion: the new, entropy based attribute set by itself is not the one giving the best prediction; however, it is the best attribute set for use in improving the other, standard attribute sets when conjoined with them.

Keywords: compression, decision tree, entropy, ortholog, ROC.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1827
7398 Supersonic Flow around a Dihedral Airfoil: Modeling and Experimentation Investigation

Authors: A. Naamane, M. Hasnaoui

Abstract:

Numerical modeling of fluid flows, whether compressible or incompressible, laminar or turbulent presents a considerable contribution in the scientific and industrial fields. However, the development of an approximate model of a supersonic flow requires the introduction of specific and more precise techniques and methods. For this purpose, the object of this paper is modeling a supersonic flow of inviscid fluid around a dihedral airfoil. Based on the thin airfoils theory and the non-dimensional stationary Steichen equation of a two-dimensional supersonic flow in isentropic evolution, we obtained a solution for the downstream velocity potential of the oblique shock at the second order of relative thickness that characterizes a perturbation parameter. This result has been dealt with by the asymptotic analysis and characteristics method. In order to validate our model, the results are discussed in comparison with theoretical and experimental results. Indeed, firstly, the comparison of the results of our model has shown that they are quantitatively acceptable compared to the existing theoretical results. Finally, an experimental study was conducted using the AF300 supersonic wind tunnel. In this experiment, we have considered the incident upstream Mach number over a symmetrical dihedral airfoil wing. The comparison of the different Mach number downstream results of our model with those of the existing theoretical data (relative margin between 0.07% and 4%) and with experimental results (concordance for a deflection angle between 1° and 11°) support the validation of our model with accuracy.

Keywords: Asymptotic modelling, dihedral airfoil, supersonic flow, supersonic wind tunnel.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 715
7397 The New Method of Concealed Data Aggregation in Wireless Sensor: A Case Study

Authors: M. Abbasi Dezfouli, S. Mazraeh, M. H. Yektaie

Abstract:

Wireless sensor networks (WSN) consists of many sensor nodes that are placed on unattended environments such as military sites in order to collect important information. Implementing a secure protocol that can prevent forwarding forged data and modifying content of aggregated data and has low delay and overhead of communication, computing and storage is very important. This paper presents a new protocol for concealed data aggregation (CDA). In this protocol, the network is divided to virtual cells, nodes within each cell produce a shared key to send and receive of concealed data with each other. Considering to data aggregation in each cell is locally and implementing a secure authentication mechanism, data aggregation delay is very low and producing false data in the network by malicious nodes is not possible. To evaluate the performance of our proposed protocol, we have presented computational models that show the performance and low overhead in our protocol.

Keywords: Wireless Sensor Networks, Security, Concealed Data Aggregation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1768
7396 Peakwise Smoothing of Data Models using Wavelets

Authors: D Sudheer Reddy, N Gopal Reddy, P V Radhadevi, J Saibaba, Geeta Varadan

Abstract:

Smoothing or filtering of data is first preprocessing step for noise suppression in many applications involving data analysis. Moving average is the most popular method of smoothing the data, generalization of this led to the development of Savitzky-Golay filter. Many window smoothing methods were developed by convolving the data with different window functions for different applications; most widely used window functions are Gaussian or Kaiser. Function approximation of the data by polynomial regression or Fourier expansion or wavelet expansion also gives a smoothed data. Wavelets also smooth the data to great extent by thresholding the wavelet coefficients. Almost all smoothing methods destroys the peaks and flatten them when the support of the window is increased. In certain applications it is desirable to retain peaks while smoothing the data as much as possible. In this paper we present a methodology called as peak-wise smoothing that will smooth the data to any desired level without losing the major peak features.

Keywords: smoothing, moving average, peakwise smoothing, spatialdensity models, planar shape models, wavelets.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1750
7395 A New Precautionary Method for Measurement and Improvement the Data Quality

Authors: Seyed Mohammad Hossein Moossavizadeh, Mehran Mohsenzadeh, Nasrin Arshadi

Abstract:

the data quality is a kind of complex and unstructured concept, which is concerned by information systems managers. The reason of this attention is the high amount of Expenses for maintenance and cleaning of the inefficient data. Such a data more than its expenses of lack of quality, cause wrong statistics, analysis and decisions in organizations. Therefor the managers intend to improve the quality of their information systems' data. One of the basic subjects of quality improvement is the evaluation of the amount of it. In this paper, we present a precautionary method, which with its application the data of information systems would have a better quality. Our method would cover different dimensions of data quality; therefor it has necessary integrity. The presented method has tested on three dimensions of accuracy, value-added and believability and the results confirm the improvement and integrity of this method.

Keywords: Data quality, precaution, information system, measurement, improvement.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1468
7394 An Efficient Data Mining Approach on Compressed Transactions

Authors: Jia-Yu Dai, Don-Lin Yang, Jungpin Wu, Ming-Chuan Hung

Abstract:

In an era of knowledge explosion, the growth of data increases rapidly day by day. Since data storage is a limited resource, how to reduce the data space in the process becomes a challenge issue. Data compression provides a good solution which can lower the required space. Data mining has many useful applications in recent years because it can help users discover interesting knowledge in large databases. However, existing compression algorithms are not appropriate for data mining. In [1, 2], two different approaches were proposed to compress databases and then perform the data mining process. However, they all lack the ability to decompress the data to their original state and improve the data mining performance. In this research a new approach called Mining Merged Transactions with the Quantification Table (M2TQT) was proposed to solve these problems. M2TQT uses the relationship of transactions to merge related transactions and builds a quantification table to prune the candidate itemsets which are impossible to become frequent in order to improve the performance of mining association rules. The experiments show that M2TQT performs better than existing approaches.

Keywords: Association rule, data mining, merged transaction, quantification table.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1960
7393 Dynamic Modeling and Simulation of Threephase Small Power Induction Motor

Authors: Nyein Nyein Soe, Thet Thet Han Yee, Soe Sandar Aung

Abstract:

This paper is proposed the dynamic simulation of small power induction motor based on Mathematical modeling. The dynamic simulation is one of the key steps in the validation of the design process of the motor drive systems and it is needed for eliminating inadvertent design mistakes and the resulting error in the prototype construction and testing. This paper demonstrates the simulation of steady-state performance of induction motor by MATLAB Program Three phase 3 hp induction motor is modeled and simulated with SIMULINK model.

Keywords: Squirrel cage induction motor, modeling andsimulation, MATLAB software, torque, speed.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4520
7392 Weigh-in-Motion Data Analysis Software for Developing Traffic Data for Mechanistic Empirical Pavement Design

Authors: M. A. Hasan, M. R. Islam, R. A. Tarefder

Abstract:

Currently, there are few user friendly Weigh-in- Motion (WIM) data analysis softwares available which can produce traffic input data for the recently developed AASHTOWare pavement Mechanistic-Empirical (ME) design software. However, these softwares have only rudimentary Quality Control (QC) processes. Therefore, they cannot properly deal with erroneous WIM data. As the pavement performance is highly sensible to the quality of WIM data, it is highly recommended to use more refined QC process on raw WIM data to get a good result. This study develops a userfriendly software, which can produce traffic input for the ME design software. This software takes the raw data (Class and Weight data) collected from the WIM station and processes it with a sophisticated QC procedure. Traffic data such as traffic volume, traffic distribution, axle load spectra, etc. can be obtained from this software; which can directly be used in the ME design software.

Keywords: Weigh-in-motion, software, axle load spectra, traffic distribution, AASHTOWare.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1896
7391 Human Growth Curve Estimation through a Combination of Longitudinal and Cross-sectional Data

Authors: Sedigheh Mirzaei S., Debasis Sengupta

Abstract:

Parametric models have been quite popular for studying human growth, particularly in relation to biological parameters such as peak size velocity and age at peak size velocity. Longitudinal data are generally considered to be vital for fittinga parametric model to individual-specific data, and for studying the distribution of these biological parameters in a human population. However, cross-sectional data are easier to obtain than longitudinal data. In this paper, we present a method of combining longitudinal and cross-sectional data for the purpose of estimating the distribution of the biological parameters. We demonstrate, through simulations in the special case ofthePreece Baines model, how estimates based on longitudinal data can be improved upon by harnessing the information contained in cross-sectional data.We study the extent of improvement for different mixes of the two types of data, and finally illustrate the use of the method through data collected by the Indian Statistical Institute.

Keywords: Preece-Baines growth model, MCMC method, Mixed effect model

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2139
7390 Semantic Support for Hypothesis-Based Research from Smart Environment Monitoring and Analysis Technologies

Authors: T. S. Myers, J. Trevathan

Abstract:

Improvements in the data fusion and data analysis phase of research are imperative due to the exponential growth of sensed data. Currently, there are developments in the Semantic Sensor Web community to explore efficient methods for reuse, correlation and integration of web-based data sets and live data streams. This paper describes the integration of remotely sensed data with web-available static data for use in observational hypothesis testing and the analysis phase of research. The Semantic Reef system combines semantic technologies (e.g., well-defined ontologies and logic systems) with scientific workflows to enable hypothesis-based research. A framework is presented for how the data fusion concepts from the Semantic Reef architecture map to the Smart Environment Monitoring and Analysis Technologies (SEMAT) intelligent sensor network initiative. The data collected via SEMAT and the inferred knowledge from the Semantic Reef system are ingested to the Tropical Data Hub for data discovery, reuse, curation and publication.

Keywords: Information architecture, Semantic technologies Sensor networks, Ontologies.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1715