Search results for: data protection
6634 Towards End-To-End Disease Prediction from Raw Metagenomic Data
Authors: Maxence Queyrel, Edi Prifti, Alexandre Templier, Jean-Daniel Zucker
Abstract:
Analysis of the human microbiome using metagenomic sequencing data has demonstrated high ability in discriminating various human diseases. Raw metagenomic sequencing data require multiple complex and computationally heavy bioinformatics steps prior to data analysis. Such data contain millions of short sequences read from the fragmented DNA sequences and stored as fastq files. Conventional processing pipelines consist of multiple steps including quality control, filtering, and alignment of sequences against genomic catalogs (genes, species, taxonomic levels, functional pathways, etc.). These pipelines are complex to use, time consuming and rely on a large number of parameters that often introduce variability and impact the estimation of the microbiome elements. Training Deep Neural Networks directly from raw sequencing data is a promising approach to bypass some of the challenges associated with mainstream bioinformatics pipelines. Most of these methods use the concept of word and sentence embeddings that create a meaningful numerical representation of DNA sequences, while extracting features and reducing the dimensionality of the data. In this paper we present an end-to-end approach that classifies patients into disease groups directly from raw metagenomic reads: metagenome2vec. This approach is composed of four steps: (i) generating a vocabulary of k-mers and learning their numerical embeddings; (ii) learning DNA sequence (read) embeddings; (iii) identifying the genome from which the sequence is most likely to come; and (iv) training a multiple instance learning classifier which predicts the phenotype based on the vector representation of the raw data. An attention mechanism is applied in the network so that the model can be interpreted, assigning each genome a weight that reflects its influence on the prediction.
Using two public real-life data sets as well as a simulated one, we demonstrated that this original approach reaches high performance, comparable with the state-of-the-art methods applied directly on data processed through mainstream bioinformatics workflows. These results are encouraging for this proof of concept work. We believe that with further dedication, the DNN models have the potential to surpass mainstream bioinformatics workflows in disease classification tasks.
Keywords: Metagenomics, phenotype prediction, deep learning, embeddings, multiple instance learning.
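Step (i) of the abstract above, building a k-mer vocabulary from raw reads, can be illustrated with a minimal sketch. The function names, the toy reads, and the choice of k = 3 are assumptions made for illustration and are not taken from metagenome2vec itself.

```python
from collections import Counter


def kmers(read: str, k: int) -> list[str]:
    """Return all overlapping k-mers of a read, left to right."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def build_vocabulary(reads: list[str], k: int) -> dict[str, int]:
    """Map each distinct k-mer to an integer id, most frequent first."""
    counts = Counter(km for read in reads for km in kmers(read, k))
    return {km: idx for idx, (km, _) in enumerate(counts.most_common())}


# Toy reads for illustration only (hypothetical, not real sequencing data).
reads = ["ACGTAC", "CGTACG"]
vocab = build_vocabulary(reads, k=3)

# A read becomes a sequence of integer tokens that an embedding layer could consume.
tokens = [vocab[km] for km in kmers(reads[0], 3)]
```

The integer ids play the role that word indices play in text embedding models; learning the embeddings themselves is outside the scope of this sketch.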
Downloads 909
6633 On the Efficient Implementation of a Serial and Parallel Decomposition Algorithm for Fast Support Vector Machine Training Including a Multi-Parameter Kernel
Authors: Tatjana Eitrich, Bruno Lang
Abstract:
This work deals with aspects of support vector machine learning for large-scale data mining tasks. Based on a decomposition algorithm for support vector machine training that can be run in serial as well as shared memory parallel mode, we introduce a transformation of the training data that allows for the usage of an expensive generalized kernel without additional costs. We present experiments for the Gaussian kernel, but usage of other kernel functions is possible, too. In order to further speed up the decomposition algorithm, we analyze the critical problem of working set selection for large training data sets. In addition, we analyze the influence of the working set sizes on the scalability of the parallel decomposition scheme. Our tests and conclusions led to several modifications of the algorithm and the improvement of overall support vector machine learning performance. Our method allows for using extensive parameter search methods to optimize classification accuracy.
Keywords: Support Vector Machine Training, Multi-Parameter Kernels, Shared Memory Parallel Computing, Large Data
Downloads 1441
6632 Establishing a Probabilistic Model of Extrapolated Wind Speed Data for Wind Energy Prediction
Authors: Mussa I. Mgwatu, Reuben R. M. Kainkwa
Abstract:
Wind is among the potential energy resources which can be harnessed to generate wind energy for conversion into electrical power. Due to the variability of wind speed with time and height, it becomes difficult to predict the generated wind energy accurately. In this paper, an attempt is made to establish a probabilistic model fitting the wind speed data recorded at the Makambako site in Tanzania. Wind speed and direction were measured using an anemometer (type AN1) and a wind vane (type WD1), respectively, both supplied by Delta-T-Devices, at a measurement height of 2 m. Wind speeds were then extrapolated to a height of 10 m using the power law equation with an exponent of 0.47. Data were analysed using MINITAB statistical software to show the variability of wind speeds with time and height, and to determine the underlying probability model of the extrapolated wind speed data. The results show that wind speeds at the Makambako site vary cyclically over time and conform to the Weibull probability distribution. From these results, the Weibull probability density function can be used to predict the wind energy.
Keywords: Probabilistic models, wind speed, wind energy
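The power law extrapolation mentioned in the abstract above has a simple closed form, v(h) = v_ref * (h / h_ref) ** alpha. The sketch below applies it with the heights (2 m to 10 m) and exponent (0.47) stated in the abstract; the measured speeds and the Weibull shape and scale values are hypothetical placeholders, not data from the study.

```python
import math


def extrapolate_wind_speed(v_ref: float, h_ref: float, h_target: float, alpha: float) -> float:
    """Power-law wind profile: v(h) = v_ref * (h / h_ref) ** alpha."""
    if v_ref < 0 or h_ref <= 0 or h_target <= 0:
        raise ValueError("speed must be non-negative and heights positive")
    return v_ref * (h_target / h_ref) ** alpha


def weibull_pdf(v: float, shape: float, scale: float) -> float:
    """Two-parameter Weibull density for wind speed v >= 0."""
    if v < 0:
        return 0.0
    return (shape / scale) * (v / scale) ** (shape - 1) * math.exp(-((v / scale) ** shape))


# Hypothetical speeds (m/s) measured at 2 m, extrapolated to 10 m.
measured = [3.2, 4.1, 5.0]
extrapolated = [extrapolate_wind_speed(v, 2.0, 10.0, 0.47) for v in measured]

# Multiplicative factor implied by the abstract's settings: (10 / 2) ** 0.47, about 2.13.
scale_factor = extrapolate_wind_speed(1.0, 2.0, 10.0, 0.47)

# Density at 6 m/s under hypothetical Weibull parameters (shape 2.0, scale 8.0).
density = weibull_pdf(6.0, shape=2.0, scale=8.0)
```

Because the exponent is fixed, every extrapolated speed is the measured speed times the same factor, so the shape of the fitted distribution is preserved and only its scale changes.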
Downloads 2345
6631 Demographic Factors Influencing Employees’ Salary Expectations and Labor Turnover
Authors: M. Osipova
Abstract:
Thanks to the development of information technologies, every sphere of the economy is becoming more and more data-centered as people generate huge datasets containing information on every aspect of their lives. Applying research on such data to human resources management makes it possible to obtain scarce statistics on the state of the labor market, including salary expectations and the typical career behavior of potential employees, and this information can become a reliable basis for management decisions. This article presents the results of career behavior research based on freely accessible resume data. The information used for the study is much broader than that usually used in human resources surveys. That is why there is enough data for statistically significant results even in subgroup analysis.
Keywords: Human resources management, labor market, salary expectations, statistics, turnover.
Downloads 1845
6630 Mathematical Modeling to Predict Surface Roughness in CNC Milling
Authors: Ab. Rashid M.F.F., Gan S.Y., Muhammad N.Y.
Abstract:
Surface roughness (Ra) is one of the most important requirements in the machining process. In order to obtain better surface roughness, the proper setting of cutting parameters is crucial before the process takes place. This research presents the development of a mathematical model for surface roughness prediction before the milling process in order to evaluate the fitness of the machining parameters: spindle speed, feed rate and depth of cut. 84 samples were run in this study using a FANUC CNC Milling α-Τ14ιE. The samples were randomly divided into two data sets: the training set (m=60) and the testing set (m=24). ANOVA analysis showed that at least one of the population regression coefficients was not zero. The Multiple Regression Method was used to determine the correlation between a criterion variable and a combination of predictor variables. It was established that the surface roughness is most influenced by the feed rate. Using the Multiple Regression Method equation, the average percentage deviation was 9.8% for the testing set and 9.7% for the training data set. This showed that the statistical model could predict the surface roughness with about 90.2% accuracy on the testing data set and 90.3% accuracy on the training data set.
Keywords: Surface roughness, regression analysis.
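The accuracy figures in the abstract above follow from the deviation figures by subtraction (100% minus 9.8% gives 90.2%). The sketch below shows that arithmetic, assuming the deviation is the mean absolute relative error; the abstract does not state the exact definition, and the roughness values used here are hypothetical.

```python
def mean_percentage_deviation(measured: list[float], predicted: list[float]) -> float:
    """Mean absolute relative error, in percent (assumed definition)."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 * sum(abs(m - p) / m for m, p in zip(measured, predicted)) / len(measured)


def accuracy_from_deviation(deviation_percent: float) -> float:
    """Accuracy as the complement of the percentage deviation."""
    return 100.0 - deviation_percent


# Hypothetical Ra values in micrometres, for illustration only.
ra_measured = [1.0, 2.0]
ra_predicted = [1.1, 1.8]
deviation = mean_percentage_deviation(ra_measured, ra_predicted)

# The abstract's reported deviations map to its reported accuracies.
testing_accuracy = accuracy_from_deviation(9.8)
training_accuracy = accuracy_from_deviation(9.7)
```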
Downloads 2130
6629 Parameter Estimation using Maximum Likelihood Method from Flight Data at High Angles of Attack
Authors: Rakesh Kumar, A. K. Ghosh
Abstract:
The paper presents the modeling of nonlinear longitudinal aerodynamics using flight data of Hansa-3 aircraft at high angles of attack near stall. Kirchhoff’s quasi-steady stall model has been used to incorporate nonlinear aerodynamic effects in the aerodynamic model used to estimate the parameters, thereby making the aerodynamic model nonlinear. The Maximum Likelihood method has been applied to the flight data (at high angles of attack) for the estimation of parameters (aerodynamic and stall characteristics) using the nonlinear aerodynamic model. To improve the accuracy level of the estimates, an approach of fixing the strong parameters has also been presented.
Keywords: Maximum Likelihood, nonlinear, parameters, stall.
Downloads 2215
6628 Network Anomaly Detection using Soft Computing
Authors: Surat Srinoy, Werasak Kurutach, Witcha Chimphlee, Siriporn Chimphlee
Abstract:
One main drawback of intrusion detection systems is their inability to detect new attacks which do not have known signatures. In this paper we discuss an intrusion detection method that uses independent component analysis (ICA) based feature selection heuristics and rough fuzzy methods for clustering data. ICA separates the independent components (ICs) from the monitored variables. Rough sets decrease the amount of data and remove redundancy, and fuzzy methods allow objects to belong to several clusters simultaneously, with different degrees of membership. Our approach allows us to recognize not only known attacks but also to detect activity that may be the result of a new, unknown attack. Experimental results are reported on the Knowledge Discovery and Data Mining (KDDCup 1999) dataset.
Keywords: Network security, intrusion detection, rough set, ICA, anomaly detection, independent component analysis, rough fuzzy.
Downloads 1954
6627 Automatic Thresholding for Data Gap Detection for a Set of Sensors in Instrumented Buildings
Authors: Houda Najeh, Stéphane Ploix, Mahendra Pratap Singh, Karim Chabir, Mohamed Naceur Abdelkrim
Abstract:
Building systems are highly vulnerable to different kinds of faults and failures. In fact, various faults, failures and human behaviors could affect the building performance. This paper tackles the detection of unreliable sensors in buildings. Different literature surveys on diagnosis techniques for sensor grids in buildings have been published, but all of them treat only bias and outliers. Occurrences of data gaps have also not been given adequate attention in academia. The proposed methodology comprises automatic thresholding for data gap detection for a set of heterogeneous sensors in instrumented buildings. Sensor measurements are considered to be regular time series. However, in reality, sensor values are not uniformly sampled. So, the issue to solve is: from which delay does each sensor become faulty? The use of time series is required for detection of abnormalities in the delays. The efficiency of the method is evaluated on measurements obtained from a real power plant: an office at Grenoble Institute of Technology equipped with 30 sensors.
Keywords: Building system, time series, diagnosis, outliers, delay, data gap.
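The question posed in the abstract above, namely from which delay a sensor should be considered faulty, can be illustrated with a simple per-sensor threshold on the delays between consecutive readings. The rule used here (a fixed multiple of the median delay) is an assumption chosen for robustness to the gaps themselves; it is not the thresholding method of the paper, and the timestamps are hypothetical.

```python
from statistics import median


def delays(timestamps: list[float]) -> list[float]:
    """Delays between consecutive readings of one sensor."""
    return [b - a for a, b in zip(timestamps, timestamps[1:])]


def gap_threshold(timestamps: list[float], factor: float = 5.0) -> float:
    """Assumed rule: a delay longer than factor * median delay counts as a gap."""
    return factor * median(delays(timestamps))


def find_gaps(timestamps: list[float], threshold: float) -> list[tuple[float, float]]:
    """Return (start, end) pairs of consecutive readings separated by more than threshold."""
    return [(a, b) for a, b in zip(timestamps, timestamps[1:]) if b - a > threshold]


# Hypothetical reading times in seconds: roughly one per minute, with one long silence.
readings = [0, 60, 120, 185, 240, 300, 1500, 1560]
threshold = gap_threshold(readings)
gaps = find_gaps(readings, threshold)
```

A median-based rule is used in this sketch because a mean-plus-standard-deviation rule computed on the same data is inflated by the very gaps it is supposed to detect.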
Downloads 901
6626 Daily and Seasonal Changes of Air Pollution in Kuwait
Authors: H. Ettouney, A. AL-Haddad, S. Saqer
Abstract:
This paper focuses on the assessment of air pollution in Umm-Alhyman, Kuwait, which is located south of oil refineries, a power station, an oil field, and highways. The measurements were made over a period of four days in March and July in 2001, 2004, and 2008. The measured pollutants included methanated and nonmethanated hydrocarbons (MHC, NMHC), CO, CO2, SO2, NOX, O3, and PM10. Meteorological parameters were also measured, including temperature, wind speed and direction, and solar radiation. Over the study period, data analysis showed an increase in measured SO2, NOX and CO by factors of 1.2, 5.5 and 2, respectively. This is explained in terms of an increase in industrial activities, motor vehicle density, and power generation. Predictions of the measured data were made with the ISC-AERMOD software package using the ISCST3 model option. Finally, the measured data were compared against international standards.
Keywords: Air pollution, Emission inventory, ISCST3 model, Modeling
Downloads 2417
6625 Regular Data Broadcasting Plan with Grouping in Wireless Mobile Environment
Authors: John T. Tsiligaridis
Abstract:
The broadcast problem, including the plan design, is considered. The data are inserted and numbered in a predefined order into customized size relations. The server’s ability to create a full, regular Broadcast Plan (RBP) with single and multiple channels after some data transformations is examined. The Regular Geometric Algorithm (RGA) prepares an RBP and enables the users to catch their items while avoiding energy waste on their devices. Moreover, the Grouping Dimensioning Algorithm (GDA), based on integrated relations, can guarantee the discrimination of services with a minimum number of channels. This last property, along with self-monitoring and self-organizing, can be offered by servers today, which also provides channel availability and lower energy consumption by using a smaller number of channels. Simulation results are provided.
Keywords: Broadcast, broadcast plan, mobile computing, wireless networks, scheduling.
Downloads 1452
6624 Survival Model for Partly Interval-Censored Data with Application to Anti D in Rhesus D Negative Studies
Authors: F. A. M. Elfaki, Amar Abobakar, M. Azram, M. Usman
Abstract:
This paper discusses regression analysis of partly interval-censored failure time data, which occur in many fields including demographical, epidemiological, financial, medical and sociological studies. For the problem, we focus on the situation where the survival time of interest can be described by the additive hazards model in the presence of partly interval-censored data. A major advantage of the approach is its simplicity, and it can be easily implemented using R software. Simulation studies are conducted which indicate that the approach performs well in practical situations and is comparable to the existing methods. The methodology is applied to a set of partly interval-censored failure time data arising from anti D in Rhesus D negative studies.
Keywords: Anti D in Rhesus D negative, Cox’s model, EM algorithm.
Downloads 1692
6623 Metabolic Predictive Model for PMV Control Based on Deep Learning
Authors: Eunji Choi, Borang Park, Youngjae Choi, Jinwoo Moon
Abstract:
In this study, a predictive model for estimating the metabolism (MET) of the human body was developed for the optimal control of the indoor thermal environment. Human body images for indoor activities and human body joint coordinate values were collected as data sets, which are used in the predictive model. A deep learning algorithm was used in an initial model, and its numbers of hidden layers and hidden neurons were optimized. Lastly, the model prediction performance was analyzed after the model was trained on the collected data. In conclusion, the possibility of MET prediction was confirmed, and the direction of future study was proposed as developing more varied data and the predictive model.
Keywords: Deep learning, indoor quality, metabolism, predictive model.
Downloads 1192
6622 Optimising Data Transmission in Heterogeneous Sensor Networks
Authors: M. Hammerton, J. Trevathan, T. Myers, W. Read
Abstract:
The transfer rate of messages in distributed sensor network applications is a critical factor in a system's performance. The Sensor Abstraction Layer (SAL) is one such system. SAL is a middleware integration platform for abstracting sensor specific technology in order to integrate heterogeneous types of sensors in a network. SAL uses Java Remote Method Invocation (RMI) as its connection method, which has unsatisfactory transfer rates, especially for streaming data. This paper analyses different connection methods to optimize data transmission in SAL by replacing RMI. Our results show that the most promising Java-based connections were frameworks for Java New Input/Output (NIO) including Apache MINA, JBoss Netty, and xSocket. A test environment was implemented to evaluate each respective framework based on transfer rate, resource usage, and scalability. Test results showed that the most suitable connection method to improve data transmission in SAL is JBoss Netty, as it provides a performance enhancement of 68%.
Keywords: Wireless sensor networks, remote method invocation, transmission time.
Downloads 2036
6621 A Computational Cost-Effective Clustering Algorithm in Multidimensional Space Using the Manhattan Metric: Application to the Global Terrorism Database
Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami
Abstract:
The increasing amount of collected data has limited the performance of current analysis algorithms. Thus, developing new algorithms that are cost-effective in terms of complexity, scalability, and accuracy has raised significant interest. In this paper, a modified, effective k-means based algorithm is developed and tested. The new algorithm aims to reduce the computational load without significantly affecting the quality of the clusterings. The algorithm uses the City Block distance and a new stop criterion to guarantee convergence. Experiments conducted on a real data set show its high performance when compared with the original k-means version.
Keywords: Pattern recognition, partitional clustering, K-means clustering, Manhattan distance, terrorism data analysis.
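A k-means style loop that assigns points by City Block (Manhattan) distance can be sketched as follows. This is only an illustration of the general idea: the stop criterion used here (stop when assignments no longer change, with an iteration cap) is an assumption, since the abstract does not describe the paper's own criterion, and the data points are hypothetical.

```python
Point = tuple[float, ...]


def manhattan(p: Point, q: Point) -> float:
    """City Block distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def assign(points: list[Point], centroids: list[Point]) -> list[int]:
    """Label each point with the index of its nearest centroid under L1 distance."""
    return [min(range(len(centroids)), key=lambda j: manhattan(p, centroids[j])) for p in points]


def update(points: list[Point], labels: list[int], centroids: list[Point]) -> list[Point]:
    """Recompute each centroid as the mean of its members (k-means style update)."""
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == j]
        if not members:
            new_centroids.append(old)  # keep the old centroid for an empty cluster
            continue
        new_centroids.append(tuple(sum(m[d] for m in members) / len(members) for d in range(len(old))))
    return new_centroids


def kmeans_l1(points: list[Point], centroids: list[Point], max_iter: int = 100):
    """Iterate assign/update until labels stop changing (assumed stop rule) or max_iter is hit."""
    labels = assign(points, centroids)
    for _ in range(max_iter):
        centroids = update(points, labels, centroids)
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


# Hypothetical 2-D points forming two well-separated groups.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centers = kmeans_l1(data, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

The Manhattan distance avoids the squaring used by the Euclidean metric, which is one plausible source of the reduced computational load the abstract refers to. Note that the coordinate-wise median, not the mean, minimizes the L1 cost within a cluster (the k-medians variant), which is why this sketch keeps an iteration cap.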
Downloads 1358
6620 HelpMeBreathe: A Web-Based System for Asthma Management
Authors: Alia Al Rayssi, Mahra Al Marar, Alyazia Alkhaili, Reem Al Dhaheri, Shayma Alkobaisi, Hoda Amer
Abstract:
We present in this paper a web-based system called “HelpMeBreathe” for managing asthma. The proposed system provides analytical tools, which allow better understanding of environmental triggers of asthma, hence better support of data-driven decision making. The developed system provides warning messages to a specific asthma patient if the weather in his/her area might cause any difficulty in breathing or could trigger an asthma attack. HelpMeBreathe collects, stores, and analyzes individuals’ moving trajectories and health conditions as well as environmental data. It then processes and displays the patients’ data through an analytical tool that leads to an effective decision making by physicians and other decision makers.
Keywords: Asthma, environmental triggers, map interface, peak flow, web-based system.
Downloads 868
6619 The Development of the Prototype of Bamboo Shading Device
Authors: N. Tuaycharoen, W. Konisranukul
Abstract:
The main aim of this research was to investigate a prototype bamboo shading device. There were two objectives to this study: first, to investigate the effects of non-chemical treatments on bamboo shading devices damaged by powder-post beetles and fungi, and second, to develop a prototype bamboo shading device. The laboratory study of the effects of non-chemical treatments on bamboo shading devices damaged by powder-post beetles showed that, among seven treatments tested, wood vinegar treatment can protect against powder-post beetles better than the original method by up to 92.91%. It was also found that wood vinegar treatment shows the best performance in protection against fungi and works better than the original method by up to 40%. A second experiment was carried out by constructing four bamboo shading devices and installing them on a building for 28 days. All aspects of the shading devices were investigated in terms of their beauty, durability, and ease of construction and assembly. The final prototype was developed from the lessons learned from the test results. In conclusion, this study showed the effectiveness of some natural preservatives against insect and fungi damage, and it also illustrated the characteristics of a prototype bamboo shading device that can be constructed by rural workers within one week.
Keywords: Bamboo, shading device, energy conservation, alternative materials.
Downloads 2475
6618 Application of Neural Networks for 24-Hour-Ahead Load Forecasting
Authors: Fatemeh Mosalman Yazdi
Abstract:
One of the most important requirements for the operation and planning activities of an electrical utility is the prediction of load for the next hour to several days out, known as short-term load forecasting. This paper presents the development of an artificial neural network based short-term load forecasting model. The model can forecast daily load profiles with a lead time of one day for the next 24 hours. In this method, the days of the year are divided into groups using the average temperature. The groups are formed according to the linearity rate of the curve. The final forecast for each group is obtained by considering weekdays and weekends. This paper investigates the effects of temperature and humidity on the consumption curve. To forecast the load curve of holidays, the peak and valley are forecast first, and then the neural network forecast is re-shaped with the new data. The ANN-based load models are trained using hourly historical load data and daily historical max/min temperature and humidity data. The results of testing the system on data from the Yazd utility are reported.
Keywords: Artificial neural network, Holiday forecasting, peak and valley load forecasting, Short-term load-forecasting.
Downloads 2191
6617 On the Joint Optimization of Performance and Power Consumption in Data Centers
Authors: Samee Ullah Khan, C. Ardil
Abstract:
We model the process of a data center as a multi-objective problem of mapping independent tasks onto a set of data center machines that simultaneously minimizes the energy consumption and response time (makespan) subject to the constraints of deadlines and architectural requirements. A simple technique based on multi-objective goal programming is proposed that guarantees a Pareto optimal solution with excellent convergence. The proposed technique is also compared with other traditional approaches. The simulation results show that the proposed technique achieves superior performance compared to the min-min heuristics, and competitive performance relative to the optimal solution implemented in UNDO for small-scale problems.
Keywords: Energy-efficient computing, distributed systems, multi-objective optimization.
Downloads 1690
6616 The Effect of Sowing Time on Phytopathogenic Characteristics and Yield of Sunflower Hybrids
Authors: Adrienn Novák
Abstract:
The field research was carried out at the Látókép AGTC KIT research area of the University of Debrecen in Eastern Hungary, on the aeolian loess of the Hajdúság. We examined the effects of the sowing time on the phytopathogenic characteristics and yield production by applying various fertilizer treatments on two different sunflower genotypes (NK Ferti, PR64H42) in 2012 and 2013. We applied three different sowing times (early, optimal, late) and two different treatment levels of fungicides (control = no fungicides applied, double fungicide protection).
During our investigations, the studied crop years had different sowing time optima in terms of yield amount (2012: early, 2013: average). By Pearson’s correlation analysis, we found that delaying the sowing time pronouncedly decreased the extent of infection in both crop years (Diaporthe: r=0.663**, r=0.681**, Sclerotinia: r=0.465**, r=0.622**). The fungicide treatment not only decreased the extent of infection, but had a yield increasing effect too (2012: r=0.498**, 2013: r=0.603**). In 2012, delaying the sowing time increased the yield amount (r=0.600**), but in 2013 it decreased it (r=0.356*).
Keywords: Fungicide treatment, genotypes, sowing time, yield, sunflower.
Downloads 1824
6615 Preparing Data for Calibration of Mechanistic-Empirical Pavement Design Guide in Central Saudi Arabia
Authors: Abdulraaof H. Alqaili, Hamad A. Alsoliman
Abstract:
Through progress in pavement design developments, a pavement design method was developed, which is titled the Mechanistic Empirical Pavement Design Guide (MEPDG). Nowadays, growth in the road network and highways is observed in Saudi Arabia as a result of increasing traffic volume. Therefore, the MEPDG is currently implemented for flexible pavement design by the Saudi Ministry of Transportation. Implementation of MEPDG for local pavement design requires the calibration of distress models under the local conditions (traffic, climate, and materials). This paper aims to prepare data for calibration of MEPDG in Central Saudi Arabia. Thus, the first goal is data collection for the design of flexible pavement from the local conditions of the Riyadh region. Since the collected data must be converted into input data, the main goal of this paper is the analysis of the collected data. The data analysis in this paper includes processing each of the following: Trucks Classification, Traffic Growth Factor, Annual Average Daily Truck Traffic (AADTT), Monthly Adjustment Factors (MAFi), Vehicle Class Distribution (VCD), Truck Hourly Distribution Factors, Axle Load Distribution Factors (ALDF), Number of axle types (single, tandem, and tridem) per truck class, cloud cover percent, and road sections selected for the local calibration. Detailed descriptions of the input parameters are given in this paper, which provides an approach for successful implementation of MEPDG. Local calibration of MEPDG to the conditions of the Riyadh region can be performed based on the findings in this paper.
Keywords: Mechanistic-empirical pavement design guide, traffic characteristics, materials properties, climate, Riyadh.
Downloads 1219
6614 Bayesian Decision Approach to Protection on the Flood Event in Upper Ayeyarwady River, Myanmar
Authors: Min Min Swe Zin
Abstract:
This paper introduces the foundations of Bayesian probability theory and the Bayesian decision method. The main goal of Bayesian decision theory is to minimize the expected loss of a decision or minimize the expected risk. The purposes of this study are to review the decision process on the issue of flood occurrences and to suggest a possible process for decision improvement. This study examines the problem structure of flood occurrences and theoretically explicates the decision-analytic approach based on Bayesian decision theory and its application to flood occurrences in Environmental Engineering. In this study, we discuss flood occurrences based on the annual maximum water level in cm, using the 43-year record available from 1965 to 2007 at the gauging station of Sagaing on the Ayeyarwady River, with a drainage area of 120193 sq km, by means of the Bayesian decision method. As a result, we discuss the loss and risk associated with whether vast areas of agricultural land will be inundated in the coming year, based on the two standard maximum water levels during the 43 years. We also forecast the safety of these lands from flood water during the next 10 years.
Keywords: Bayesian decision method, conditional binomial distribution, minimax rules, prior beta distribution.
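The combination named in the keywords above, a beta prior with a binomial likelihood followed by a minimum expected loss decision, can be sketched in a few lines. Only the record length of 43 years comes from the abstract; the number of flood years, the prior parameters, the action names, and the loss table are hypothetical values chosen for illustration.

```python
def beta_posterior(a: float, b: float, successes: int, trials: int) -> tuple[float, float]:
    """Conjugate update: Beta(a, b) prior with a binomial likelihood gives Beta(a + k, b + n - k)."""
    return a + successes, b + trials - successes


def posterior_mean(a: float, b: float) -> float:
    """Mean of Beta(a, b); also the predictive probability of the event in the next trial."""
    return a / (a + b)


def expected_losses(p_event: float, loss: dict[str, tuple[float, float]]) -> dict[str, float]:
    """loss[action] = (loss if the event occurs, loss if it does not)."""
    return {action: p_event * l_yes + (1.0 - p_event) * l_no for action, (l_yes, l_no) in loss.items()}


def bayes_action(p_event: float, loss: dict[str, tuple[float, float]]) -> str:
    """Choose the action with the smallest expected loss."""
    risks = expected_losses(p_event, loss)
    return min(risks, key=risks.get)


# Hypothetical: 12 of the 43 recorded years exceeded the flood level; uniform Beta(1, 1) prior.
a_post, b_post = beta_posterior(1.0, 1.0, successes=12, trials=43)
p_flood = posterior_mean(a_post, b_post)

# Hypothetical loss table in arbitrary units.
loss_table = {"protect": (1.0, 1.0), "do_nothing": (10.0, 0.0)}
action = bayes_action(p_flood, loss_table)
```

With these placeholder numbers, protecting costs 1 unit regardless of the outcome while doing nothing costs about 2.9 units in expectation, so the rule selects "protect". Different loss values would change the decision.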
Downloads 1578
6613 Economized Sensor Data Processing with Vehicle Platooning
Authors: Henry Hexmoor, Kailash Yelasani
Abstract:
We present vehicular platooning as a special case of a crowd-sensing framework where sharing sensory information among a crowd is used for their collective benefit. After offering an abstract policy that governs processes involving a vehicular platoon, we review several common scenarios and components surrounding vehicular platooning. We then present a simulated prototype that illustrates the efficiency of road usage and vehicle travel time derived from platooning. We argue that one of the paramount benefits of platooning, overlooked elsewhere, is the substantial computational savings (i.e., economizing benefits) in the acquisition and processing of sensory data among vehicles sharing the road. The most capable vehicle can share data gathered from its sensors with nearby vehicles grouped into a platoon.
Keywords: Cloud network, collaboration, Internet of Things, social network.
Downloads 708
6612 Machine Learning Development Audit Framework: Assessment and Inspection of Risk and Quality of Data, Model and Development Process
Authors: Jan Stodt, Christoph Reich
Abstract:
The usage of machine learning models for prediction is growing rapidly, and proof that the intended requirements are met is essential. Audits are a proven method to determine whether requirements or guidelines are met. However, machine learning models have intrinsic characteristics, such as the quality of training data, that make it difficult to demonstrate the required behavior and make audits more challenging. This paper describes an ML audit framework that evaluates and reviews the risks of machine learning applications, the quality of the training data, and the machine learning model. We evaluate and demonstrate the functionality of the proposed framework by auditing a steel plate fault prediction model.
Keywords: Audit, machine learning, assessment, metrics.
Downloads 1022
6611 Evaluation of Antioxidant Activities of Cabbage (Brassica oleracea L. var. capitata L.)
Authors: Rutanachai Thaipratum
Abstract:
At present, it is widely known that free radicals are causes of illnesses such as cancers, coronary heart disease, Alzheimer’s disease and aging. One method of protection from free radicals is the consumption of antioxidant-containing foods or herbs. Several analytical methods have been used for qualitative and quantitative determination of antioxidants. This project aimed to evaluate the antioxidant activity of ethanolic and aqueous extracts from cabbage (Brassica oleracea L. var. capitata L.) measured by the DPPH and Hydroxyl radical scavenging methods. The results show that the average antioxidant activities measured in the ethanolic extract (µmol Ascorbic acid equivalent/g fresh mass) were 7.316 ± 0.715 and 4.66 ± 1.029 as determined by the DPPH and Hydroxyl radical scavenging activity assays, respectively. The average antioxidant activities measured in the aqueous extract (µmol Ascorbic acid equivalent/g fresh mass) were 15.141 ± 2.092 and 4.955 ± 1.975 as determined by the DPPH and Hydroxyl radical scavenging activity assays, respectively.
Keywords: Free radical, antioxidant, cabbage, Brassica oleracea L. var. capitata L.
Downloads 2278
6610 Information Seeking through Assimilation Process in Thai Organization
Authors: Pornprom Chomngam
Abstract:
The purpose of this study is to examine employee assessments of the usefulness and value of the different types of information available to employees during the process of organizational assimilation. Participants in the study were 247 “new” employees at Bangkok Bank, which considers employees who have been with the bank for less than 18 months to be new. Questionnaires were administered to all of the bank's new employees to obtain the data for this study. Repeated measures analysis was used to analyze the data, which were summed and coded using the Statistical Package for the Social Sciences. Newcomers indicate that social information is the most useful, followed by job (technical, referent, and appraisal), political, normative, and organizational information. Essentially, newcomers evaluate social, job, and political information as highly useful, while normative and organizational information are rated as moderately useful.
Keywords: Information seeking, organization assimilation.
Downloads 1659
6609 Image Steganography Using Least Significant Bit Technique
Authors: Preeti Kumari, Ridhi Kapoor
Abstract:
Security is among the most important issues in communication today. Steganography is the process of hiding important data inside other data, such as text, audio, video, or images. The aim is to provide availability, confidentiality, integrity, and authenticity of data. A steganographic technique embeds hidden content in unremarkable cover media so as not to arouse the suspicion of eavesdroppers, third parties, or hackers. Many compression, encryption, decryption, and embedding methods are used in digital image steganography. Compression produces noise in the image; to withstand this noise, the LSB insertion technique is used. The performance of the proposed embedding system is discussed with respect to the security of the secret message and robustness. We also demonstrate the maximum steganographic capacity and the visual distortion.
Keywords: Steganography, LSB, encoding, information hiding, color image.
Downloads 1091
6608 Pavement Roughness Prediction Systems: A Bump Integrator Approach
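The LSB insertion idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' embedding system: the function names `embed_lsb` and `extract_lsb` are hypothetical, the cover image is assumed to be a flat sequence of 8-bit channel values, and the compression and encryption steps the abstract refers to are left out.

```python
def embed_lsb(cover: bytes, message: bytes) -> bytes:
    """Hide `message` in the least significant bits of `cover`.

    `cover` stands in for raw 8-bit pixel channel values (an assumption);
    one message bit is stored per cover byte, most significant bit first.
    """
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("message does not fit in the cover data")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the message bit.
        stego[i] = (stego[i] & 0xFE) | bit
    return bytes(stego)


def extract_lsb(stego: bytes, n_bytes: int) -> bytes:
    """Recover `n_bytes` of hidden data from the low bits of `stego`."""
    out = bytearray()
    for b in range(n_bytes):
        value = 0
        for i in range(8):
            value = (value << 1) | (stego[b * 8 + i] & 1)
        out.append(value)
    return bytes(out)
```

Under this scheme the capacity is one message bit per cover byte, and each cover value changes by at most 1, which is why the visual distortion stays small.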
Authors: Manish Pal, Rumi Sutradhar
Abstract:
Pavement surface unevenness plays a pivotal role in the roughness index of a road, which affects riding comfort. Riding comfort refers to the degree of protection offered to vehicle occupants from uneven elements in the road surface. A lower roughness index value is therefore preferable for better riding quality. Roughness is generally defined as an expression of irregularities in the pavement surface, which can be measured using equipment such as MERLIN, the Bump Integrator, and the Profilometer. Among these, the Bump Integrator is quite simple and less time-consuming for long road sections. A case study is conducted on low-volume roads in the West District of Tripura to determine the roughness index (RI) using the Bump Integrator at the standard speed of 32 km/h. However, it is difficult to maintain the requisite standard speed throughout a road section, and the speed of the Bump Integrator (BI) has to be lowered or raised in some situations. It therefore becomes necessary to convert roughness index values measured at other speeds to the standard speed of 32 km/h. This paper presents such a roughness index conversion model. Using SPSS (Statistical Package for the Social Sciences) software, a generalized equation is derived relating the RI value at the standard speed of 32 km/h to the RI value at other speeds.
Keywords: Bump Integrator, Pavement Distresses, Roughness Index, SPSS.
Downloads 6669
6607 Implementation of the Outputs of Computer Simulation to Support Decision-Making Processes
Authors: Jiří Barta
Abstract:
At present, awareness, education, computer simulation, and information systems protection are serious and relevant topics. The article deals with the perspectives and possibilities of incorporating emergency or natural hazard threats into a system developed for communication among members of crisis management staffs. The Czech Hydro-Meteorological Institute, with its System of Integrated Warning Service, represents the largest usable base of information. National information systems are connected to foreign systems, especially to the flood emergency systems of neighboring countries and to the systems of the European Union and of international organizations of which the Czech Republic is a member. Using the outputs of particular information systems and computer simulations on a single communication interface for crisis management staff, and establishing interoperability across the network, will save time in decision-making when dealing with extraordinary events and crisis situations. Faster management of an extraordinary event or a crisis situation will bring positive effects and minimize the negative impact on the environment.
Keywords: Computer simulation, communication, continuity, critical infrastructure, information systems, safety.
Downloads 1718
6606 Using Data Mining Techniques for Finding Cardiac Outlier Patients
Authors: Farhan Ismaeel Dakheel, Raoof Smko, K. Negrat, Abdelsalam Almarimi
Abstract:
In this paper we used data mining techniques to identify outlier patients who use large amounts of drugs over a long period of time. Any healthcare or health insurance system has to deal with the quantities of drugs used by patients with chronic diseases. In the Kingdom of Bahrain, about 20% of the health budget is spent on medications. Managers of healthcare systems do not have enough information about how patients with chronic diseases use drugs, whether there is any misuse, or whether there are outlier patients. This work was done in cooperation with the information department of the Bahrain Defence Force hospital; we selected the data for cardiac patients in the period from 1/1/2008 to 31/12/2008 as the data for the model in this paper. We used three techniques to analyze drug utilization by cardiac patients. First we applied a clustering technique, then we measured clustering validity, and finally we applied a decision tree as the classification algorithm. The clustering results show that the 1603 patients, who received 15,806 prescriptions during this period, can be partitioned into three groups according to drug utilization; 23 patients (2.59%), who received 1316 prescriptions (8.32%), are classified as outliers. The classification algorithm shows that the average drug utilization, the age, and the gender of the patient can be considered the main predictive factors in the induced model.
Keywords: Data Mining, Clustering, Classification, Drug Utilization.
Downloads 1897
6605 Retail Strategy to Reduce Waste Keeping High Profit Utilizing Taylor's Law in Point-of-Sales Data
Authors: Gen Sakoda, Hideki Takayasu, Misako Takayasu
Abstract:
Waste reduction is a fundamental problem for sustainability. Methods for waste reduction with point-of-sales (POS) data are proposed, utilizing the knowledge of a recent econophysics study on a statistical property of POS data. Concretely, a non-stationary time series analysis method based on the Particle Filter is developed, which takes into account the abnormal fluctuation scaling known as Taylor's law. This method is extended to handle sales data that are incomplete because of stock-outs by introducing maximum likelihood estimation for censored data. A method for determining the optimal stock level by pricing the cost of waste reduction is also proposed. This study focuses on examining the methods for large sales numbers, where Taylor's law is evident. Numerical analysis using aggregated POS data shows the effectiveness of the methods in reducing food waste while maintaining a high profit for large sales numbers. Moreover, pricing the cost of waste reduction reveals that a small profit loss yields a substantial waste reduction, especially when the proportionality constant of Taylor’s law is small. Specifically, a profit loss of around 1% halves disposal when this constant is 0.12, which is the actual value for the processed food items used in this research. The methods provide practical and effective solutions for waste reduction while keeping a high profit, especially with large sales numbers.
Keywords: Food waste reduction, particle filter, point of sales, sustainable development goals, Taylor's Law, time series analysis.
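The abstract does not state the functional form it uses for Taylor's law. As a rough illustration only, the sketch below assumes the common power-law form, standard deviation = a · mean^b, and estimates the proportionality constant a and the exponent b by least squares on log-transformed values. The function name `fit_taylor_law` and its inputs are hypothetical, and this is not the Particle Filter method of the paper.

```python
import math


def fit_taylor_law(means, stds):
    """Fit std = a * mean**b by ordinary least squares in log-log space.

    `means` and `stds` are per-item sample means and standard deviations
    of sales counts (hypothetical inputs; all values must be positive).
    Returns the tuple (a, b).
    """
    xs = [math.log(m) for m in means]
    ys = [math.log(s) for s in stds]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                        # slope: scaling exponent
    a = math.exp(mean_y - b * mean_x)    # intercept: proportionality constant
    return a, b
```

With an exponent close to 1, the standard deviation is proportional to the mean, and a is then the proportionality constant in that relation.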
Downloads 870