Search results for: Mekelle outlier
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 72

Search results for: Mekelle outlier

42 Density-based Denoising of Point Cloud

Authors: Faisal Zaman, Ya Ping Wong, Boon Yian Ng

Abstract:

Point cloud source data for surface reconstruction is usually contaminated with noise and outliers. To overcome this, we present a novel approach using modified kernel density estimation (KDE) technique with bilateral filtering to remove noisy points and outliers. First we present a method for estimating optimal bandwidth of multivariate KDE using particle swarm optimization technique which ensures the robust performance of density estimation. Then we use mean-shift algorithm to find the local maxima of the density estimation which gives the centroid of the clusters. Then we compute the distance of a certain point from the centroid. Points belong to outliers then removed by automatic thresholding scheme which yields an accurate and economical point surface. The experimental results show that our approach comparably robust and efficient.

Keywords: point preprocessing, outlier removal, surface reconstruction, kernel density estimation

Procedia PDF Downloads 328
41 Single-Cell Visualization with Minimum Volume Embedding

Authors: Zhenqiu Liu

Abstract:

Visualizing the heterogeneity within cell-populations for single-cell RNA-seq data is crucial for studying the functional diversity of a cell. However, because of the high level of noises, outlier, and dropouts, it is very challenging to measure the cell-to-cell similarity (distance), visualize and cluster the data in a low-dimension. Minimum volume embedding (MVE) projects the data into a lower-dimensional space and is a promising tool for data visualization. However, it is computationally inefficient to solve a semi-definite programming (SDP) when the sample size is large. Therefore, it is not applicable to single-cell RNA-seq data with thousands of samples. In this paper, we develop an efficient algorithm with an accelerated proximal gradient method and visualize the single-cell RNA-seq data efficiently. We demonstrate that the proposed approach separates known subpopulations more accurately in single-cell data sets than other existing dimension reduction methods.

Keywords: single-cell RNA-seq, minimum volume embedding, visualization, accelerated proximal gradient method

Procedia PDF Downloads 208
40 An Efficient Fundamental Matrix Estimation for Moving Object Detection

Authors: Yeongyu Choi, Ju H. Park, S. M. Lee, Ho-Youl Jung

Abstract:

In this paper, an improved method for estimating fundamental matrix is proposed. The method is applied effectively to monocular camera based moving object detection. The method consists of corner points detection, moving object’s motion estimation and fundamental matrix calculation. The corner points are obtained by using Harris corner detector, motions of moving objects is calculated from pyramidal Lucas-Kanade optical flow algorithm. Through epipolar geometry analysis using RANSAC, the fundamental matrix is calculated. In this method, we have improved the performances of moving object detection by using two threshold values that determine inlier or outlier. Through the simulations, we compare the performances with varying the two threshold values.

Keywords: corner detection, optical flow, epipolar geometry, RANSAC

Procedia PDF Downloads 383
39 Literature Review on Text Comparison Techniques: Analysis of Text Extraction, Main Comparison and Visual Representation Tools

Authors: Andriana Mkrtchyan, Vahe Khlghatyan

Abstract:

The choice of a profession is one of the most important decisions people make throughout their life. With the development of modern science, technologies, and all the spheres existing in the modern world, more and more professions are being arisen that complicate even more the process of choosing. Hence, there is a need for a guiding platform to help people to choose a profession and the right career path based on their interests, skills, and personality. This review aims at analyzing existing methods of comparing PDF format documents and suggests that a 3-stage approach is implemented for the comparison, that is – 1. text extraction from PDF format documents, 2. comparison of the extracted text via NLP algorithms, 3. comparison representation using special shape and color psychology methodology.

Keywords: color psychology, data acquisition/extraction, data augmentation, disambiguation, natural language processing, outlier detection, semantic similarity, text-mining, user evaluation, visual search

Procedia PDF Downloads 46
38 One-Class Support Vector Machine for Sentiment Analysis of Movie Review Documents

Authors: Chothmal, Basant Agarwal

Abstract:

Sentiment analysis means to classify a given review document into positive or negative polar document. Sentiment analysis research has been increased tremendously in recent times due to its large number of applications in the industry and academia. Sentiment analysis models can be used to determine the opinion of the user towards any entity or product. E-commerce companies can use sentiment analysis model to improve their products on the basis of users’ opinion. In this paper, we propose a new One-class Support Vector Machine (One-class SVM) based sentiment analysis model for movie review documents. In the proposed approach, we initially extract features from one class of documents, and further test the given documents with the one-class SVM model if a given new test document lies in the model or it is an outlier. Experimental results show the effectiveness of the proposed sentiment analysis model.

Keywords: feature selection methods, machine learning, NB, one-class SVM, sentiment analysis, support vector machine

Procedia PDF Downloads 489
37 Internet Purchases in European Union Countries: Multiple Linear Regression Approach

Authors: Ksenija Dumičić, Anita Čeh Časni, Irena Palić

Abstract:

This paper examines economic and Information and Communication Technology (ICT) development influence on recently increasing Internet purchases by individuals for European Union member states. After a growing trend for Internet purchases in EU27 was noticed, all possible regression analysis was applied using nine independent variables in 2011. Finally, two linear regression models were studied in detail. Conducted simple linear regression analysis confirmed the research hypothesis that the Internet purchases in analysed EU countries is positively correlated with statistically significant variable Gross Domestic Product per capita (GDPpc). Also, analysed multiple linear regression model with four regressors, showing ICT development level, indicates that ICT development is crucial for explaining the Internet purchases by individuals, confirming the research hypothesis.

Keywords: European union, Internet purchases, multiple linear regression model, outlier

Procedia PDF Downloads 283
36 Hybrid Robust Estimation via Median Filter and Wavelet Thresholding with Automatic Boundary Correction

Authors: Alsaidi M. Altaher, Mohd Tahir Ismail

Abstract:

Wavelet thresholding has been a power tool in curve estimation and data analysis. In the presence of outliers this non parametric estimator can not suppress the outliers involved. This study proposes a new two-stage combined method based on the use of the median filter as primary step before applying wavelet thresholding. After suppressing the outliers in a signal through the median filter, the classical wavelet thresholding is then applied for removing the remaining noise. We use automatic boundary corrections; using a low order polynomial model or local polynomial model as a more realistic rule to correct the bias at the boundary region; instead of using the classical assumptions such periodic or symmetric. A simulation experiment has been conducted to evaluate the numerical performance of the proposed method. Results show strong evidences that the proposed method is extremely effective in terms of correcting the boundary bias and eliminating outlier’s sensitivity.

Keywords: boundary correction, median filter, simulation, wavelet thresholding

Procedia PDF Downloads 406
35 A Machine Learning-Based Analysis of Autism Prevalence Rates across US States against Multiple Potential Explanatory Variables

Authors: Ronit Chakraborty, Sugata Banerji

Abstract:

There has been a marked increase in the reported prevalence of Autism Spectrum Disorder (ASD) among children in the US over the past two decades. This research has analyzed the growth in state-level ASD prevalence against 45 different potentially explanatory factors, including socio-economic, demographic, healthcare, public policy, and political factors. The goal was to understand if these factors have adequate predictive power in modeling the differential growth in ASD prevalence across various states and if they do, which factors are the most influential. The key findings of this study include (1) the confirmation that the chosen feature set has considerable power in predicting the growth in ASD prevalence, (2) the identification of the most influential predictive factors, (3) given the nature of the most influential predictive variables, an indication that a considerable portion of the reported ASD prevalence differentials across states could be attributable to over and under diagnosis, and (4) identification of Florida as a key outlier state pointing to a potential under-diagnosis of ASD there.

Keywords: autism spectrum disorder, clustering, machine learning, predictive modeling

Procedia PDF Downloads 79
34 Evaluation of Fusion Sonar and Stereo Camera System for 3D Reconstruction of Underwater Archaeological Object

Authors: Yadpiroon Onmek, Jean Triboulet, Sebastien Druon, Bruno Jouvencel

Abstract:

The objective of this paper is to develop the 3D underwater reconstruction of archaeology object, which is based on the fusion between a sonar system and stereo camera system. The underwater images are obtained from a calibrated camera system. The multiples image pairs are input, and we first solve the problem of image processing by applying the well-known filter, therefore to improve the quality of underwater images. The features of interest between image pairs are selected by well-known methods: a FAST detector and FLANN descriptor. Subsequently, the RANSAC method is applied to reject outlier points. The putative inliers are matched by triangulation to produce the local sparse point clouds in 3D space, using a pinhole camera model and Euclidean distance estimation. The SFM technique is used to carry out the global sparse point clouds. Finally, the ICP method is used to fusion the sonar information with the stereo model. The final 3D models have a précised by measurement comparing with the real object.

Keywords: 3D reconstruction, archaeology, fusion, stereo system, sonar system, underwater

Procedia PDF Downloads 284
33 Prevalence of SARS-CoV-2 Infection and Associated Risk Factors in Selected Health Facilities of Tigray, Ethiopia: Cross-Sectional Study Design, 2023

Authors: Weldegerima Gebremedhin Hagos

Abstract:

Background: The Coronavirus disease of 2019 (COVID-19) is a catastrophic emerging global health threat caused by the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2). COVID-19 has a wide range of complications and sequels. It is devastating in developing countries, causing serious health and socioeconomic crises as a result of the increasingly overburdened healthcare system. Ethiopia reported the first case of SARS-CoV-2 on 13th March 2020, with community transmission ensuing by mid-May. The aim of this study was conducted to determine the prevalence of SARS-CoV-2 infection in Tigray, Ethiopia. Methods: Facility-based correctional study designs were used on a total of 380 study participants from March 2023 up to May 2023 in two general hospitals and one comprehensive specialized hospital in Tigray, Ethiopia. A pre-structured questionnaire was used to assess information regarding the socio-demographic, clinical data and other risk factors. A nasal swap was taken by trained health professionals, and the laboratory analysis was done by RT-PCR (quant studio 7-flex, applied biosystems) in Tigrai Health Research Institute and Mekelle University Medical Microbiology Research Laboratory. Result: The mean age of the study participants was 31 (SD+/-3.5) years, with 65% being male and 35% female. The overall seropositivity of sars-cov-2 among the study participants was 5.5%. The prevalence was higher in males (6.2%) than females which were (4.7%). Sars-cov-2 infection was significantly associated with a history of lack of vaccination (p-value 0.002). There was no significant association between seropositivity and demographic factors (P > 0.05). Conclusion: The seroprevalence of SARS-CoV-2 among the study participants is high. Those study participants with a previous history of vaccination have a low probability of developing COVID-19 infection. A low SARS-CoV-2 infection rate was recorded in those who frequently use masks.

Keywords: prevalence, SARS-CoV-2, infection, risk factors

Procedia PDF Downloads 32
32 Moving Object Detection Using Histogram of Uniformly Oriented Gradient

Authors: Wei-Jong Yang, Yu-Siang Su, Pau-Choo Chung, Jar-Ferr Yang

Abstract:

Moving object detection (MOD) is an important issue in advanced driver assistance systems (ADAS). There are two important moving objects, pedestrians and scooters in ADAS. In real-world systems, there exist two important challenges for MOD, including the computational complexity and the detection accuracy. The histogram of oriented gradient (HOG) features can easily detect the edge of object without invariance to changes in illumination and shadowing. However, to reduce the execution time for real-time systems, the image size should be down sampled which would lead the outlier influence to increase. For this reason, we propose the histogram of uniformly-oriented gradient (HUG) features to get better accurate description of the contour of human body. In the testing phase, the support vector machine (SVM) with linear kernel function is involved. Experimental results show the correctness and effectiveness of the proposed method. With SVM classifiers, the real testing results show the proposed HUG features achieve better than classification performance than the HOG ones.

Keywords: moving object detection, histogram of oriented gradient, histogram of uniformly-oriented gradient, linear support vector machine

Procedia PDF Downloads 569
31 Introduction of Robust Multivariate Process Capability Indices

Authors: Behrooz Khalilloo, Hamid Shahriari, Emad Roghanian

Abstract:

Process capability indices (PCIs) are important concepts of statistical quality control and measure the capability of processes and how much processes are meeting certain specifications. An important issue in statistical quality control is parameter estimation. Under the assumption of multivariate normality, the distribution parameters, mean vector and variance-covariance matrix must be estimated, when they are unknown. Classic estimation methods like method of moment estimation (MME) or maximum likelihood estimation (MLE) makes good estimation of the population parameters when data are not contaminated. But when outliers exist in the data, MME and MLE make weak estimators of the population parameters. So we need some estimators which have good estimation in the presence of outliers. In this work robust M-estimators for estimating these parameters are used and based on robust parameter estimators, robust process capability indices are introduced. The performances of these robust estimators in the presence of outliers and their effects on process capability indices are evaluated by real and simulated multivariate data. The results indicate that the proposed robust capability indices perform much better than the existing process capability indices.

Keywords: multivariate process capability indices, robust M-estimator, outlier, multivariate quality control, statistical quality control

Procedia PDF Downloads 268
30 PCR Detection, Histopathological Characterization, and Autogenous Immunization of Bovine Papillomatosis (Wart) in Cattle, in Mekelle, Northern Ethiopia

Authors: Kidane Workelul, Yohans Tekle, Guesh Negash, Haftay Abraha, Nigus Abebe Shumuye, Yisehak Tsegaye Redda

Abstract:

Bovine papillomatosis (wart) is one of the economically important bovine skin diseases worldwide, caused by a group of viruses named papillomaviruses (PVs). However, it has often been misdiagnosed as other skin diseases and remained untreated. In order to determine the status of the diseases, twenty-two farms were visited, and fourteen infected cattle with cutaneous papillomatosis were identified from a total of 235. Papilloma biopsies were taken for molecular and histopathological characterization, the therapeutic trial of an autogenous vaccine was evaluated on infected animals. The overall status of bovine papillomatosis in this study was calculated as 5.96% (14/235). The disease was found to be statistically significant in the age groups less than two years (X² = 26.69, P = 0.0001). The more prominent histologically characterized lesions in the sampled tissue were identified as squamous papilloma and fibro-papilloma. The Polymerase Chain Reaction (PCR) based identification revealed that all the clinically and histo-pathologically characterized papillomatosis cases were found to be infected with Bovine Papilloma Virus1(BPV1), indicating that BPV1 was the most common and sole causative agent of the diseases in the study area. In immunizing active bovine papillomatosis, an autogenous vaccine therapeutic trial demonstrated excellent results, with practically full recovery and no recurrence of the infection. Hence, it is concluded that bovine papillomatosis is an economically important disease of young age group cattle as well as a treatable disease. So, the production of marketable autogenous vaccines against bovine papillomatosis should be started and given at an early stage.

Keywords: autogenous vaccine, bovine papillomatosis, bovine papilloma virus1 clinical-pathology, polymerase chine reaction, wart

Procedia PDF Downloads 57
29 Market Chain Analysis of Onion: The Case of Northern Ethiopia

Authors: Belayneh Yohannes

Abstract:

In Ethiopia, onion production is increasing from time to time mainly due to its high profitability per unit area. Onion has a significant contribution to generating cash income for farmers in the Raya Azebo district. Therefore, enhancing onion producers’ access to the market and improving market linkage is an essential issue. Hence, this study aimed to analyze structure-conduct-performance of onion market and identifying factors affecting the market supply of onion producers. Data were collected from both primary and secondary sources. Primary data were collected from 150 farm households and 20 traders. Four onion marketing channels were identified in the study area. The highest total gross margin is 27.6 in channel IV. The highest gross marketing margin of producers of the onion market is 88% in channel II. The result from the analysis of market concentration indicated that the onion market is characterized by a strong oligopolistic market structure, with the buyers’ concentration ratio of 88.7 in Maichew town and 82.7 in Mekelle town. Lack of capital, licensing problems, and seasonal supply was identified as the major entry barrier to onion marketing. Market conduct shows that the price of onion is set by traders while producers are price takers. Multiple linear regression model results indicated that family size in adult equivalent, irrigated land size, access to information, frequency of extension contact, and ownership of transport significantly determined the quantity of onion supplied to the market. It is recommended that strengthening and diversifying extension services in information, marketing, post-harvest handling, irrigation application, and water harvest technology is highly important.

Keywords: oligopoly, onion, market chain, multiple linear regression

Procedia PDF Downloads 111
28 Comparison Between a Droplet Digital PCR and Real Time PCR Method in Quantification of HBV DNA

Authors: Surangrat Srisurapanon, Chatchawal Wongjitrat, Navin Horthongkham, Ruengpung Sutthent

Abstract:

HBV infection causes a potential serious public health problem. The ability to detect the HBV DNA concentration is of the importance and improved continuously. By using quantitative Polymerase Chain Reaction (qPCR), several factors in standardized; source of material, calibration standard curve and PCR efficiency are inconsistent. Digital PCR (dPCR) is an alternative PCR-based technique for absolute quantification using Poisson's statistics without requiring a standard curve. Therefore, the aim of this study is to compare the data set of HBV DNA generated between dPCR and qPCR methods. All samples were quantified by Abbott’s real time PCR and 54 samples with 2 -6 log10 HBV DNA were selected for comparison with dPCR. Of these 54 samples, there were two outlier samples defined as negative by dPCR. Of these two, samples were defined as negative by dPCR, whereas 52 samples were positive by both the tests. The difference between the two assays was less than 0.25 log IU/mL in 24/52 samples (46%) of paired samples; less than 0.5 log IU/mL in 46/52 samples (88%) and less than 1 log in 50/52 samples (96%). The correlation coefficient was r=0.788 and P-value <0.0001. Comparison to qPCR, data generated by dPCR tend to be the overestimation in the sample with low HBV DNA concentration and underestimated in the sample with high viral load. The variation in DNA by dPCR measurement might be due to the pre-amplification bias, template. Moreover, a minor drawback of dPCR is the large quantity of DNA had to be used when compare to the qPCR. Since the technology is relatively new, the limitations of this assay will be improved.

Keywords: hepatitis B virus, real time PCR, digital PCR, DNA quantification

Procedia PDF Downloads 461
27 A Robust and Adaptive Unscented Kalman Filter for the Air Fine Alignment of the Strapdown Inertial Navigation System/GPS

Authors: Jian Shi, Baoguo Yu, Haonan Jia, Meng Liu, Ping Huang

Abstract:

Adapting to the flexibility of war, a large number of guided weapons launch from aircraft. Therefore, the inertial navigation system loaded in the weapon needs to undergo an alignment process in the air. This article proposes the following methods to the problem of inaccurate modeling of the system under large misalignment angles, the accuracy reduction of filtering caused by outliers, and the noise changes in GPS signals: first, considering the large misalignment errors of Strapdown Inertial Navigation System (SINS)/GPS, a more accurate model is made rather than to make a small-angle approximation, and the Unscented Kalman Filter (UKF) algorithms are used to estimate the state; then, taking into account the impact of GPS noise changes on the fine alignment algorithm, the innovation adaptive filtering algorithm is introduced to estimate the GPS’s noise in real-time; at the same time, in order to improve the anti-interference ability of the air fine alignment algorithm, a robust filtering algorithm based on outlier detection is combined with the air fine alignment algorithm to improve the robustness of the algorithm. The algorithm can improve the alignment accuracy and robustness under interference conditions, which is verified by simulation.

Keywords: air alignment, fine alignment, inertial navigation system, integrated navigation system, UKF

Procedia PDF Downloads 142
26 Battery Grading Algorithm in 2nd-Life Repurposing LI-Ion Battery System

Authors: Ya L. V., Benjamin Ong Wei Lin, Wanli Niu, Benjamin Seah Chin Tat

Abstract:

This article introduces a methodology that improves reliability and cyclability of 2nd-life Li-ion battery system repurposed as an energy storage system (ESS). Most of the 2nd-life retired battery systems in the market have module/pack-level state-of-health (SOH) indicator, which is utilized for guiding appropriate depth-of-discharge (DOD) in the application of ESS. Due to the lack of cell-level SOH indication, the different degrading behaviors among various cells cannot be identified upon reaching retired status; in the end, considering end-of-life (EOL) loss and pack-level DOD, the repurposed ESS has to be oversized by > 1.5 times to complement the application requirement of reliability and cyclability. This proposed battery grading algorithm, using non-invasive methodology, is able to detect outlier cells based on historical voltage data and calculate cell-level historical maximum temperature data using semi-analytic methodology. In this way, the individual battery cell in the 2nd-life battery system can be graded in terms of SOH on basis of the historical voltage fluctuation and estimated historical maximum temperature variation. These grades will have corresponding DOD grades in the application of the repurposed ESS to enhance system reliability and cyclability. In all, this introduced battery grading algorithm is non-invasive, compatible with all kinds of retired Li-ion battery systems which lack of cell-level SOH indication, as well as potentially being embedded into battery management software for preventive maintenance and real-time cyclability optimization.

Keywords: battery grading algorithm, 2nd-life repurposing battery system, semi-analytic methodology, reliability and cyclability

Procedia PDF Downloads 180
25 Understanding Tourism Innovation through Fuzzy Measures

Authors: Marcella De Filippo, Delio Colangelo, Luca Farnia

Abstract:

In recent decades, the hyper-competition of tourism scenario has implicated the maturity of many businesses, attributing a central role to innovative processes and their dissemination in the economy of company management. At the same time, it has defined the need for monitoring the application of innovations, in order to govern and improve the performance of companies and destinations. The study aims to analyze and define the innovation in the tourism sector. The research actions have concerned, on the one hand, some in-depth interviews with experts, identifying innovation in terms of process and product, digitalization, sustainability policies and, on the other hand, to evaluate the interaction between these factors, in terms of substitutability and complementarity in management scenarios, in order to identify which one is essential to be competitive in the global scenario. Fuzzy measures and Choquet integral were used to elicit Experts’ preferences. This method allows not only to evaluate the relative importance of each pillar, but also and more interestingly, the level of interaction, ranging from complementarity to substitutability, between pairs of factors. The results of the survey are the following: in terms of Shapley values, Experts assert that Innovation is the most important factor (32.32), followed by digitalization (31.86), Network (20.57) and Sustainability (15.25). In terms of Interaction indices, given the low degree of consensus among experts, the interaction between couples of criteria on average could be ignored; however, it is worth to note that the factors innovations and digitalization are those in which experts express the highest degree of interaction. However for some of them, these factors have a moderate level of complementarity (with a pick of 57.14), and others consider them moderately substitutes (with a pick of -39.58). Another example, although outlier is the interaction between network and digitalization, in which an expert consider them markedly substitutes (-77.08).

Keywords: innovation, business model, tourism, fuzzy

Procedia PDF Downloads 246
24 InSAR Times-Series Phase Unwrapping for Urban Areas

Authors: Hui Luo, Zhenhong Li, Zhen Dong

Abstract:

The analysis of multi-temporal InSAR (MTInSAR) such as persistent scatterer (PS) and small baseline subset (SBAS) techniques usually relies on temporal/spatial phase unwrapping (PU). Unfortunately, it always fails to unwrap the phase for two reasons: 1) spatial phase jump between adjacent pixels larger than π, such as layover and high discontinuous terrain; 2) temporal phase discontinuities such as time varied atmospheric delay. To overcome these limitations, a least-square based PU method is introduced in this paper, which incorporates baseline-combination interferograms and adjacent phase gradient network. Firstly, permanent scatterers (PS) are selected for study. Starting with the linear baseline-combination method, we obtain equivalent 'small baseline inteferograms' to limit the spatial phase difference. Then, phase different has been conducted between connected PSs (connected by a specific networking rule) to suppress the spatial correlated phase errors such as atmospheric artifact. After that, interval phase difference along arcs can be computed by least square method and followed by an outlier detector to remove the arcs with phase ambiguities. Then, the unwrapped phase can be obtained by spatial integration. The proposed method is tested on real data of TerraSAR-X, and the results are also compared with the ones obtained by StaMPS(a software package with 3D PU capabilities). By comparison, it shows that the proposed method can successfully unwrap the interferograms in urban areas even when high discontinuities exist, while StaMPS fails. At last, precise DEM errors can be got according to the unwrapped interferograms.

Keywords: phase unwrapping, time series, InSAR, urban areas

Procedia PDF Downloads 128
23 Development of Energy Benchmarks Using Mandatory Energy and Emissions Reporting Data: Ontario Post-Secondary Residences

Authors: C. Xavier Mendieta, J. J McArthur

Abstract:

Governments are playing an increasingly active role in reducing carbon emissions, and a key strategy has been the introduction of mandatory energy disclosure policies. These policies have resulted in a significant amount of publicly available data, providing researchers with a unique opportunity to develop location-specific energy and carbon emission benchmarks from this data set, which can then be used to develop building archetypes and used to inform urban energy models. This study presents the development of such a benchmark using the public reporting data. The data from Ontario’s Ministry of Energy for Post-Secondary Educational Institutions are being used to develop a series of building archetype dynamic building loads and energy benchmarks to fill a gap in the currently available building database. This paper presents the development of a benchmark for college and university residences within ASHRAE climate zone 6 areas in Ontario using the mandatory disclosure energy and greenhouse gas emissions data. The methodology presented includes data cleaning, statistical analysis, and benchmark development, and lessons learned from this investigation are presented and discussed to inform the development of future energy benchmarks from this larger data set. The key findings from this initial benchmarking study are: (1) the importance of careful data screening and outlier identification to develop a valid dataset; (2) the key features used to develop a model of the data are building age, size, and occupancy schedules and these can be used to estimate energy consumption; and (3) policy changes affecting the primary energy generation significantly affected greenhouse gas emissions, and consideration of these factors was critical to evaluate the validity of the reported data.

Keywords: building archetypes, data analysis, energy benchmarks, GHG emissions

Procedia PDF Downloads 284
22 A Case Study on the Drivers of Household Water Consumption for Different Socio-Economic Classes in Selected Communities of Metro Manila, Philippines

Authors: Maria Anjelica P. Ancheta, Roberto S. Soriano, Erickson L. Llaguno

Abstract:

The main purpose of this study is to examine whether there is a significant relationship between socio-economic class and household water supply demand, through determining or verifying the factors governing water use consumption patterns of households from a sampling from different socio-economic classes in Metro Manila, the national capital region of the Philippines. This study is also an opportunity to augment the lack of local academic literature due to the very few publications on urban household water demand after 1999. In over 600 Metro Manila households, a rapid survey was conducted on their average monthly water consumption and habits on household water usage. The questions in the rapid survey were based on an extensive review of literature on urban household water demand. Sample households were divided into socio-economic classes A-B and C-D. Cluster analysis, dummy coding and outlier tests were done to prepare the data for regression analysis. Subsequently, backward stepwise regression analysis was used in order to determine different statistical models to describe the determinants of water consumption. The key finding of this study is that the socio-economic class of a household in Metro Manila is a significant factor in water consumption. A-B households consume more water in contrast to C-D families based on the mean average water consumption for A-B and C-D households are 36.75 m3 and 18.92 m3, respectively. The most significant proxy factors of socio-economic class that were related to household water consumption were examined in order to suggest improvements in policy formulation and household water demand management.

Keywords: household water uses, socio-economic classes, urban planning, urban water demand management

Procedia PDF Downloads 275
21 A Visual Analytics Tool for the Structural Health Monitoring of an Aircraft Panel

Authors: F. M. Pisano, M. Ciminello

Abstract:

Aerospace, mechanical, and civil engineering infrastructures can take advantages from damage detection and identification strategies in terms of maintenance cost reduction and operational life improvements, as well for safety scopes. The challenge is to detect so called “barely visible impact damage” (BVID), due to low/medium energy impacts, that can progressively compromise the structure integrity. The occurrence of any local change in material properties, that can degrade the structure performance, is to be monitored using so called Structural Health Monitoring (SHM) systems, in charge of comparing the structure states before and after damage occurs. SHM seeks for any "anomalous" response collected by means of sensor networks and then analyzed using appropriate algorithms. Independently of the specific analysis approach adopted for structural damage detection and localization, textual reports, tables and graphs describing possible outlier coordinates and damage severity are usually provided as artifacts to be elaborated for information extraction about the current health conditions of the structure under investigation. Visual Analytics can support the processing of monitored measurements offering data navigation and exploration tools leveraging the native human capabilities of understanding images faster than texts and tables. Herein, a SHM system enrichment by integration of a Visual Analytics component is investigated. Analytical dashboards have been created by combining worksheets, so that a useful Visual Analytics tool is provided to structural analysts for exploring the structure health conditions examined by a Principal Component Analysis based algorithm.

Keywords: interactive dashboards, optical fibers, structural health monitoring, visual analytics

Procedia PDF Downloads 107
20 Normalizing Flow to Augmented Posterior: Conditional Density Estimation with Interpretable Dimension Reduction for High Dimensional Data

Authors: Cheng Zeng, George Michailidis, Hitoshi Iyatomi, Leo L. Duan

Abstract:

The conditional density characterizes the distribution of a response variable y given other predictor x and plays a key role in many statistical tasks, including classification and outlier detection. Although there has been abundant work on the problem of Conditional Density Estimation (CDE) for a low-dimensional response in the presence of a high-dimensional predictor, little work has been done for a high-dimensional response such as images. The promising performance of normalizing flow (NF) neural networks in unconditional density estimation acts as a motivating starting point. In this work, the authors extend NF neural networks when external x is present. Specifically, they use the NF to parameterize a one-to-one transform between a high-dimensional y and a latent z that comprises two components [zₚ, zₙ]. The zₚ component is a low-dimensional subvector obtained from the posterior distribution of an elementary predictive model for x, such as logistic/linear regression. The zₙ component is a high-dimensional independent Gaussian vector, which explains the variations in y not or less related to x. Unlike existing CDE methods, the proposed approach coined Augmented Posterior CDE (AP-CDE) only requires a simple modification of the common normalizing flow framework while significantly improving the interpretation of the latent component since zₚ represents a supervised dimension reduction. In image analytics applications, AP-CDE shows good separation of 𝑥-related variations due to factors such as lighting condition and subject id from the other random variations. Further, the experiments show that an unconditional NF neural network based on an unsupervised model of z, such as a Gaussian mixture, fails to generate interpretable results.

Keywords: conditional density estimation, image generation, normalizing flow, supervised dimension reduction

Procedia PDF Downloads 74
19 A Deep Learning Model with Greedy Layer-Wise Pretraining Approach for Optimal Syngas Production by Dry Reforming of Methane

Authors: Maryam Zarabian, Hector Guzman, Pedro Pereira-Almao, Abraham Fapojuwo

Abstract:

Dry reforming of methane (DRM) has sparked significant industrial and scientific interest not only as a viable alternative for addressing the environmental concerns of two main contributors of the greenhouse effect, i.e., carbon dioxide (CO₂) and methane (CH₄), but also produces syngas, i.e., a mixture of hydrogen (H₂) and carbon monoxide (CO) utilized by a wide range of downstream processes as a feedstock for other chemical productions. In this study, we develop an AI-enable syngas production model to tackle the problem of achieving an equivalent H₂/CO ratio [1:1] with respect to the most efficient conversion. Firstly, the unsupervised density-based spatial clustering of applications with noise (DBSAN) algorithm removes outlier data points from the original experimental dataset. Then, random forest (RF) and deep neural network (DNN) models employ the error-free dataset to predict the DRM results. DNN models inherently would not be able to obtain accurate predictions without a huge dataset. To cope with this limitation, we employ reusing pre-trained layers’ approaches such as transfer learning and greedy layer-wise pretraining. Compared to the other deep models (i.e., pure deep model and transferred deep model), the greedy layer-wise pre-trained deep model provides the most accurate prediction as well as similar accuracy to the RF model with R² values 1.00, 0.999, 0.999, 0.999, 0.999, and 0.999 for the total outlet flow, H₂/CO ratio, H₂ yield, CO yield, CH₄ conversion, and CO₂ conversion outputs, respectively.

Keywords: artificial intelligence, dry reforming of methane, artificial neural network, deep learning, machine learning, transfer learning, greedy layer-wise pretraining

Procedia PDF Downloads 66
18 Identifying Controlling Factors for the Evolution of Shallow Groundwater Chemistry of Ellala Catchment, Northern Ethiopia

Authors: Grmay Kassa Brhane, Hailemariam Siyum Mekonen

Abstract:

This study was designed to identify the hydrogeochemical and anthropogenic processes controlling the evaluation of groundwater chemistry in the Ellala catchment which covers about 296.5 km2 areal extent. The chemical analysis revealed that the major ions in the groundwater are Ca2+, Mg2+, Na+, and K+ (cations) and HCO3-, PO43-, Cl-, NO3-, and SO42-(anions). Most of the groundwater samples (68.42%) revealed that the groundwater in the catchment is non-alkaline. In addition to the contribution of aquifer material, the solid materials and liquid wastes discharged from different sources can be the main sources of pH and EC in the groundwater. It is observed that the EC of the groundwater is fairly correlated with the DTS. This indicates that high mineralized water is more conductor than water with low concentration. The degree of salinity of the groundwater increases along the groundwater flow path from East to West; then, areas surrounding Mekelle City are highly saline due to the liquid and solid wastes discharged from the city and the industries. The groundwater facies in the catchment are predominated with calcium, magnesium, and bicarbonate which are labeled as Ca-Mg-HCO3 and Mg-Ca-HCO3. The main geochemical process controlling the evolution of the groundwater chemistry in the catchment is rock-water interaction, particularly carbonate dissolution. Due to the clay layer in the aquifer, the reverse is ion exchange. Non-significant silicate weathering and halite dissolution also contribute to the evolution of groundwater chemistry in the catchment. The groundwater in the catchment is dominated by the meteoritic origin although it needs further groundwater chemistry study with isotope dating analysis. The groundwater is under-saturated with calcite, dolomite, and aragonite minerals; hence, the more these minerals encounter the groundwater, the more the minerals dissolve. The main source of calcium and magnesium in groundwater is the dissolution of carbonate minerals (calcite and dolomite) since carbonate rocks are the dominant aquifer materials in the catchment. In addition to this, the weathering of dolerite rock is a possible source of magnesium ions. The relatively higher concentration of sodium over chloride indicates that the source of sodium-ion is reverse ion exchange and/or weathering of sodium-bearing materials, such as shale and dolerite rather than halite dissolution. High concentration of phosphate, nitrate, and chloride in the groundwater is the main anthropogenic source that needs treatment, quality control, and management in the catchment. From the Base Exchange Index Analysis, it is possible to understand that, in the catchment, the groundwater is dominated by the meteoritic origin, although it needs further groundwater chemistry study with isotope dating analysis.

Keywords: Ellala catchment, factor, chemistry, geochemical, groundwater

Procedia PDF Downloads 50
17 Outcome of Using Penpat Pinyowattanasilp Equation for Prediction of 24-Hour Uptake, First and Second Therapeutic Doses Calculation in Graves’ Disease Patient

Authors: Piyarat Parklug, Busaba Supawattanaobodee, Penpat Pinyowattanasilp

Abstract:

The radioactive iodine thyroid uptake (RAIU) has been widely used to differentiate the cause of thyrotoxicosis and treatment. Twenty-four hours RAIU is routinely used to calculate the dose of radioactive iodine (RAI) therapy; however, 2 days protocol is required. This study aims to evaluate the modification of Penpat Pinyowattanasilp equation application by the exclusion of outlier data, 3 hours RAIU less than 20% and more than 80%, to improve prediction of 24-hour uptake. The equation is predicted 24 hours RAIU (P24RAIU) = 32.5+0.702 (3 hours RAIU). Then calculating separation first and second therapeutic doses in Graves’ disease patients. Methods; This study was a retrospective study at Faculty of Medicine Vajira Hospital in Bangkok, Thailand. Inclusion were Graves’ disease patients who visited RAI clinic between January 2014-March 2019. We divided subjects into 2 groups according to first and second therapeutic doses. Results; Our study had a total of 151 patients. The study was done in 115 patients with first RAI dose and 36 patients with second RAI dose. The P24RAIU are highly correlated with actual 24-hour RAIU in first and second therapeutic doses (r = 0.913, 95% CI = 0.876 to 0.939 and r = 0.806, 95% CI = 0.649 to 0.897). Bland-Altman plot shows that mean differences between predictive and actual 24 hours RAI in the first dose and second dose were 2.14% (95%CI 0.83-3.46) and 1.37% (95%CI -1.41-4.14). The mean first actual and predictive therapeutic doses are 8.33 ± 4.93 and 7.38 ± 3.43 milliCuries (mCi) respectively. The mean second actual and predictive therapeutic doses are 6.51 ± 3.96 and 6.01 ± 3.11 mCi respectively. The predictive therapeutic doses are highly correlated with the actual dose in first and second therapeutic doses (r = 0.907, 95% CI = 0.868 to 0.935 and r = 0.953, 95% CI = 0.909 to 0.976). Bland-Altman plot shows that mean difference between predictive and actual P24RAIU in the first dose and second dose were less than 1 mCi (-0.94 and -0.5 mCi). This modification equation application is simply used in clinical practice especially patient with 3 hours RAIU in range of 20-80% in a Thai population. Before use, this equation for other population should be tested for the correlation.

Keywords: equation, Graves’disease, prediction, 24-hour uptake

Procedia PDF Downloads 122
16 Implementation of Algorithm K-Means for Grouping District/City in Central Java Based on Macro Economic Indicators

Authors: Nur Aziza Luxfiati

Abstract:

Clustering is partitioning data sets into sub-sets or groups in such a way that elements certain properties have shared property settings with a high level of similarity within one group and a low level of similarity between groups. . The K-Means algorithm is one of thealgorithmsclustering as a grouping tool that is most widely used in scientific and industrial applications because the basic idea of the kalgorithm is-means very simple. In this research, applying the technique of clustering using the k-means algorithm as a method of solving the problem of national development imbalances between regions in Central Java Province based on macroeconomic indicators. The data sample used is secondary data obtained from the Central Java Provincial Statistics Agency regarding macroeconomic indicator data which is part of the publication of the 2019 National Socio-Economic Survey (Susenas) data. score and determine the number of clusters (k) using the elbow method. After the clustering process is carried out, the validation is tested using themethodsBetween-Class Variation (BCV) and Within-Class Variation (WCV). The results showed that detection outlier using z-score normalization showed no outliers. In addition, the results of the clustering test obtained a ratio value that was not high, namely 0.011%. There are two district/city clusters in Central Java Province which have economic similarities based on the variables used, namely the first cluster with a high economic level consisting of 13 districts/cities and theclustersecondwith a low economic level consisting of 22 districts/cities. And in the cluster second, namely, between low economies, the authors grouped districts/cities based on similarities to macroeconomic indicators such as 20 districts of Gross Regional Domestic Product, with a Poverty Depth Index of 19 districts, with 5 districts in Human Development, and as many as Open Unemployment Rate. 10 districts.

Keywords: clustering, K-Means algorithm, macroeconomic indicators, inequality, national development

Procedia PDF Downloads 142
15 Methodology for the Multi-Objective Analysis of Data Sets in Freight Delivery

Authors: Dale Dzemydiene, Aurelija Burinskiene, Arunas Miliauskas, Kristina Ciziuniene

Abstract:

Data flow and the purpose of reporting the data are different and dependent on business needs. Different parameters are reported and transferred regularly during freight delivery. This business practices form the dataset constructed for each time point and contain all required information for freight moving decisions. As a significant amount of these data is used for various purposes, an integrating methodological approach must be developed to respond to the indicated problem. The proposed methodology contains several steps: (1) collecting context data sets and data validation; (2) multi-objective analysis for optimizing freight transfer services. For data validation, the study involves Grubbs outliers analysis, particularly for data cleaning and the identification of statistical significance of data reporting event cases. The Grubbs test is often used as it measures one external value at a time exceeding the boundaries of standard normal distribution. In the study area, the test was not widely applied by authors, except when the Grubbs test for outlier detection was used to identify outsiders in fuel consumption data. In the study, the authors applied the method with a confidence level of 99%. For the multi-objective analysis, the authors would like to select the forms of construction of the genetic algorithms, which have more possibilities to extract the best solution. For freight delivery management, the schemas of genetic algorithms' structure are used as a more effective technique. Due to that, the adaptable genetic algorithm is applied for the description of choosing process of the effective transportation corridor. In this study, the multi-objective genetic algorithm methods are used to optimize the data evaluation and select the appropriate transport corridor. The authors suggest a methodology for the multi-objective analysis, which evaluates collected context data sets and uses this evaluation to determine a delivery corridor for freight transfer service in the multi-modal transportation network. In the multi-objective analysis, authors include safety components, the number of accidents a year, and freight delivery time in the multi-modal transportation network. The proposed methodology has practical value in the management of multi-modal transportation processes.

Keywords: multi-objective, analysis, data flow, freight delivery, methodology

Procedia PDF Downloads 163
14 Efficient Reuse of Exome Sequencing Data for Copy Number Variation Callings

Authors: Chen Wang, Jared Evans, Yan Asmann

Abstract:

With the quick evolvement of next-generation sequencing techniques, whole-exome or exome-panel data have become a cost-effective way for detection of small exonic mutations, but there has been a growing desire to accurately detect copy number variations (CNVs) as well. In order to address this research and clinical needs, we developed a sequencing coverage pattern-based method not only for copy number detections, data integrity checks, CNV calling, and visualization reports. The developed methodologies include complete automation to increase usability, genome content-coverage bias correction, CNV segmentation, data quality reports, and publication quality images. Automatic identification and removal of poor quality outlier samples were made automatically. Multiple experimental batches were routinely detected and further reduced for a clean subset of samples before analysis. Algorithm improvements were also made to improve somatic CNV detection as well as germline CNV detection in trio family. Additionally, a set of utilities was included to facilitate users for producing CNV plots in focused genes of interest. We demonstrate the somatic CNV enhancements by accurately detecting CNVs in whole exome-wide data from the cancer genome atlas cancer samples and a lymphoma case study with paired tumor and normal samples. We also showed our efficient reuses of existing exome sequencing data, for improved germline CNV calling in a family of the trio from the phase-III study of 1000 Genome to detect CNVs with various modes of inheritance. The performance of the developed method is evaluated by comparing CNV calling results with results from other orthogonal copy number platforms. Through our case studies, reuses of exome sequencing data for calling CNVs have several noticeable functionalities, including a better quality control for exome sequencing data, improved joint analysis with single nucleotide variant calls, and novel genomic discovery of under-utilized existing whole exome and custom exome panel data.

Keywords: bioinformatics, computational genetics, copy number variations, data reuse, exome sequencing, next generation sequencing

Procedia PDF Downloads 239
13 Collaborative Data Refinement for Enhanced Ionic Conductivity Prediction in Garnet-Type Materials

Authors: Zakaria Kharbouch, Mustapha Bouchaara, F. Elkouihen, A. Habbal, A. Ratnani, A. Faik

Abstract:

Solid-state lithium-ion batteries have garnered increasing interest in modern energy research due to their potential for safer, more efficient, and sustainable energy storage systems. Among the critical components of these batteries, the electrolyte plays a pivotal role, with LLZO garnet-based electrolytes showing significant promise. Garnet materials offer intrinsic advantages such as high Li-ion conductivity, wide electrochemical stability, and excellent compatibility with lithium metal anodes. However, optimizing ionic conductivity in garnet structures poses a complex challenge, primarily due to the multitude of potential dopants that can be incorporated into the LLZO crystal lattice. The complexity of material design, influenced by numerous dopant options, requires a systematic method to find the most effective combinations. This study highlights the utility of machine learning (ML) techniques in the materials discovery process to navigate the complex range of factors in garnet-based electrolytes. Collaborators from the materials science and ML fields worked with a comprehensive dataset previously employed in a similar study and collected from various literature sources. This dataset served as the foundation for an extensive data refinement phase, where meticulous error identification, correction, outlier removal, and garnet-specific feature engineering were conducted. This rigorous process substantially improved the dataset's quality, ensuring it accurately captured the underlying physical and chemical principles governing garnet ionic conductivity. The data refinement effort resulted in a significant improvement in the predictive performance of the machine learning model. Originally starting at an accuracy of 0.32, the model underwent substantial refinement, ultimately achieving an accuracy of 0.88. This enhancement highlights the effectiveness of the interdisciplinary approach and underscores the substantial potential of machine learning techniques in materials science research.

Keywords: lithium batteries, all-solid-state batteries, machine learning, solid state electrolytes

Procedia PDF Downloads 36