Search results for: Missing values
2435 A Testbed for the Experiments Performed in Missing Value Treatments
Authors: Dias de J. C. Lilian, Lobato M. F. Fábio, de Santana L. Ádamo
Abstract:
The occurrence of missing values in databases is a serious problem for Data Mining tasks, degrading data quality and the accuracy of analyses. In this context, the area has shown a lack of standardization in experiments on missing value treatments, which makes it difficult to evaluate and compare different studies because no common parameters are used. This paper proposes a testbed intended to facilitate the implementation of experiments and to provide unbiased parameters, using available datasets and suitable performance metrics, in order to optimize the evaluation and comparison of state-of-the-art missing value treatments.
Keywords: Data imputation, data mining, missing values treatment, testbed.
Downloads 1512
2434 A Distance Function for Data with Missing Values and Its Application
Authors: Loai AbdAllah, Ilan Shimshoni
Abstract:
Missing values in data are common in real-world applications. Since the performance of many data mining algorithms depends critically on being given a good metric over the input space, we decided in this paper to define a distance function for unlabeled datasets with missing values. We use the Bhattacharyya distance, which measures the similarity of two probability distributions, to define our new distance function. According to this distance, the distance between two points without missing attribute values is simply the Mahalanobis distance. When, on the other hand, the value of one of the coordinates is missing, the distance is computed according to the distribution of the missing coordinate. Our distance is general and can be used as part of any algorithm that computes the distance between data points. Because its performance depends strongly on the chosen distance measure, we opted for the k nearest neighbor (kNN) classifier to evaluate the ability of our distance to accurately reflect object similarity. We experimented on standard numerical datasets from different fields in the UCI repository. On these datasets we simulated missing values and compared the performance of the kNN classifier using our distance to three other basic methods. Our experiments show that kNN using our distance function outperforms kNN using the other methods. Moreover, the runtime of our method is only slightly higher than that of the other methods.
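To make the idea of a distance over incomplete points concrete, here is a minimal sketch. It is a simplified stand-in and not the Bhattacharyya-based function of the paper: it assumes independent coordinates (a diagonal covariance), and it replaces a squared difference involving a missing value with its expected value under the coordinate's mean and variance. The function names and the `means`/`variances` inputs are hypothetical.

```python
import math

def expected_sq_diff(a, b, mean, var):
    """Expected squared difference for one coordinate; None marks a missing value."""
    if a is not None and b is not None:
        return (a - b) ** 2
    if a is None and b is None:
        # Two independent draws from the coordinate's distribution.
        return 2.0 * var
    known = a if a is not None else b
    # E[(known - Y)^2] for Y with the given mean and variance.
    return (known - mean) ** 2 + var

def incomplete_distance(p, q, means, variances):
    """Variance-scaled distance; equals a diagonal Mahalanobis distance
    when neither point has missing values (assumption: no covariances)."""
    total = 0.0
    for a, b, m, v in zip(p, q, means, variances):
        total += expected_sq_diff(a, b, m, v) / v
    return math.sqrt(total)

d_complete = incomplete_distance([1.0, 2.0], [4.0, 6.0], [0.0, 0.0], [1.0, 1.0])
d_missing = incomplete_distance([1.0, None], [1.0, 3.0], [0.0, 1.0], [1.0, 4.0])
```

Such a function can be dropped into any neighbor-based method in place of a plain Euclidean distance, which is the kind of use the abstract describes for kNN.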
Keywords: Missing values, Distance metric, Bhattacharyya distance.
Downloads 2750
2433 Distances over Incomplete Diabetes and Breast Cancer Data Based on Bhattacharyya Distance
Authors: Loai AbdAllah, Mahmoud Kaiyal
Abstract:
Missing values in real-world datasets are a common problem. Many algorithms have been developed to deal with this problem; most of them replace the missing values with a fixed value computed from the observed values. In our work, we used a distance function based on the Bhattacharyya distance, which measures the similarity of two probability distributions, to measure the distance between objects with missing values. The proposed distance distinguishes between known and unknown values: the distance between two known values is the Mahalanobis distance, whereas when one of them is missing, the distance is computed based on the distribution of the known values for the coordinate that contains the missing value. This method was integrated with Wikaya, a digital health company developing a platform that helps to improve the prevention of chronic diseases such as diabetes and cancer. For Wikaya’s recommendation system to work, the distance between users needs to be measured. Since there are missing values in the collected data, there is a need for a distance function that measures distances between incomplete user profiles. To evaluate the accuracy of the proposed distance function in reflecting the actual similarity between different objects when some of them contain missing values, we integrated it within the framework of the k nearest neighbors (kNN) classifier, since its computation is based only on the similarity between objects. To validate this, we ran the algorithm over diabetes and breast cancer datasets, which are standard benchmark datasets from the UCI repository. Our experiments show that the kNN classifier using our proposed distance function outperforms kNN using other existing methods.
Keywords: Missing values, distance metric, Bhattacharyya distance.
Downloads 780
2432 Adjusted Ratio and Regression Type Estimators for Estimation of Population Mean when some Observations are missing
Authors: Nuanpan Nangsue
Abstract:
Ratio and regression type estimators have been used by previous authors to estimate a population mean for the principal variable from samples in which both auxiliary x and principal y variable data are available. However, missing data are a common problem in statistical analyses with real data. Ratio and regression type estimators have also been used for imputing values of missing y data. In this paper, six new ratio and regression type estimators are proposed for imputing values for any missing y data and estimating a population mean for y from samples with missing x and/or y data. A simulation study has been conducted to compare the six ratio and regression type estimators with a previous estimator of Rueda. Two population sizes N = 1,000 and 5,000 have been considered with sample sizes of 10% and 30% and with correlation coefficients between population variables X and Y of 0.5 and 0.8. In the simulations, 10 and 40 percent of sample y values and 10 and 40 percent of sample x values were randomly designated as missing. The new ratio and regression type estimators give similar mean absolute percentage errors that are smaller than the Rueda estimator for all cases. The new estimators give a large reduction in errors for the case of 40% missing y values and sampling fraction of 30%.
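As a small illustration of ratio-type imputation in general, the sketch below fills each missing y value with the observed ratio of sums multiplied by the paired x value, then estimates the mean of y. This is the classical ratio imputation idea, not any of the six estimators proposed in the paper; it handles only missing y (not missing x), and the function name and data are hypothetical.

```python
def ratio_impute(x, y):
    """Fill missing y values (None) with r * x, where r is the ratio of the
    sum of observed y to the sum of the x values paired with them."""
    pairs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    ratio = sum(yi for _, yi in pairs) / sum(xi for xi, _ in pairs)
    return [yi if yi is not None else ratio * xi for xi, yi in zip(x, y)]

# Hypothetical sample: auxiliary variable x fully observed, one y missing.
x = [10.0, 20.0, 30.0, 40.0]
y = [21.0, None, 59.0, 80.0]

y_filled = ratio_impute(x, y)
mean_y = sum(y_filled) / len(y_filled)  # estimate of the mean of y after imputation
```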
Keywords: Auxiliary variable, missing data, ratio and regression type estimators.
Downloads 1731
2431 Feature Selection Approaches with Missing Values Handling for Data Mining - A Case Study of Heart Failure Dataset
Authors: N.Poolsawad, C.Kambhampati, J. G. F. Cleland
Abstract:
In this paper, we investigated the characteristics of a clinical dataset with respect to feature selection and classification measurements that deal with the missing values problem, and we proposed appropriate techniques to achieve the aim of the activity: this research aims to find features that have a high effect on mortality and the mortality time frame. We quantify the complexity of a clinical dataset. According to the complexity of the dataset, we proposed a data mining process to cope with its complexity (missing values, high dimensionality, and the prediction problem) by using methods of missing value replacement, feature selection, and classification. The experimental results will be extended to develop a prediction model for cardiology.
Keywords: feature selection, missing values, classification, clinical dataset, heart failure.
Downloads 3209
2430 Comparison of Imputation Techniques for Efficient Prediction of Software Fault Proneness in Classes
Authors: Geeta Sikka, Arvinder Kaur Takkar, Moin Uddin
Abstract:
Missing data is a persistent problem in almost all areas of empirical research. The missing data must be treated very carefully, as data plays a fundamental role in every analysis. Improper treatment can distort the analysis or generate biased results. In this paper, we compare and contrast various imputation techniques on missing data sets and make an empirical evaluation of these methods so as to construct quality software models. Our empirical study is based on NASA's two public datasets, KC4 and KC1. The actual data sets of 125 cases and 2107 cases respectively, without any missing values, were considered. The data sets are used to create Missing at Random (MAR) data. Listwise Deletion (LD), Mean Substitution (MS), Interpolation, Regression with an error term and Expectation-Maximization (EM) approaches were used to compare the effects of the various techniques.
Keywords: Missing data, Imputation, Missing Data Techniques.
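Three of the simpler techniques named above can be sketched in a few lines. This is a minimal illustration in plain Python with `None` marking a missing value; the function names and data layout are assumptions, and the paper's own experimental setup (tools, MAR generation, regression and EM steps) is not reproduced here.

```python
def listwise_deletion(rows):
    """Drop every row (case) that contains at least one missing value."""
    return [row for row in rows if None not in row]

def mean_substitution(values):
    """Replace each missing value with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

def linear_interpolation(values):
    """Fill interior gaps on a straight line between the nearest observed
    neighbours; leading and trailing gaps are left as None."""
    out = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for a, b in zip(known, known[1:]):
        for i in range(a + 1, b):
            out[i] = values[a] + (values[b] - values[a]) * (i - a) / (b - a)
    return out
```

Listwise deletion shrinks the sample, mean substitution keeps the sample size but flattens variance, and interpolation only makes sense when the cases have a meaningful order.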
Downloads 1666
2429 Use of Bayesian Network in Information Extraction from Unstructured Data Sources
Authors: Quratulain N. Rajput, Sajjad Haider
Abstract:
This paper applies Bayesian Networks to support information extraction from unstructured, ungrammatical, and incoherent data sources for semantic annotation. A tool has been developed that combines ontologies, machine learning, information extraction, and probabilistic reasoning techniques to support the extraction process. Data acquisition is performed with the aid of knowledge specified in the form of an ontology. Due to the variable size of information available on different data sources, it is often the case that the extracted data contains missing values for certain variables of interest. It is desirable in such situations to predict the missing values. The methodology presented in this paper first learns a Bayesian network from the training data and then uses it to predict missing data and to resolve conflicts. Experiments have been conducted to analyze the performance of the presented methodology. The results look promising, as the methodology achieves a high degree of precision and recall for information extraction and reasonably good accuracy for predicting missing values.
Keywords: Information Extraction, Bayesian Network, ontology, Machine Learning
Downloads 2230
2428 Deadline Missing Prediction for Mobile Robots through the Use of Historical Data
Authors: Edwaldo R. B. Monteiro, Patricia D. M. Plentz, Edson R. De Pieri
Abstract:
Mobile robotics is gaining an increasingly important role in modern society. Several tasks that are potentially dangerous or laborious for humans are assigned to mobile robots, which are increasingly capable. Many of these tasks need to be performed within a specified period, i.e., they must meet a deadline. Missing the deadline can result in financial and/or material losses. Mechanisms for predicting the missing of deadlines are fundamental because corrective actions can be taken to avoid or minimize the losses resulting from missing the deadline. In this work we propose a simple but reliable deadline missing prediction mechanism for mobile robots through the use of historical data, and for experiments and simulations we use the Pioneer 3-DX robot, one of the most popular robots in academia.
Keywords: Deadline missing, historical data, mobile robots, prediction mechanism.
Downloads 1808
2427 Prediction Modeling of Alzheimer’s Disease and Its Prodromal Stages from Multimodal Data with Missing Values
Authors: M. Aghili, S. Tabarestani, C. Freytes, M. Shojaie, M. Cabrerizo, A. Barreto, N. Rishe, R. E. Curiel, D. Loewenstein, R. Duara, M. Adjouadi
Abstract:
A major challenge in medical studies, especially those that are longitudinal, is the problem of missing measurements which hinders the effective application of many machine learning algorithms. Furthermore, recent Alzheimer's Disease studies have focused on the delineation of Early Mild Cognitive Impairment (EMCI) and Late Mild Cognitive Impairment (LMCI) from cognitively normal controls (CN) which is essential for developing effective and early treatment methods. To address the aforementioned challenges, this paper explores the potential of using the eXtreme Gradient Boosting (XGBoost) algorithm in handling missing values in multiclass classification. We seek a generalized classification scheme where all prodromal stages of the disease are considered simultaneously in the classification and decision-making processes. Given the large number of subjects (1631) included in this study and in the presence of almost 28% missing values, we investigated the performance of XGBoost on the classification of the four classes of AD, NC, EMCI, and LMCI. Using 10-fold cross validation technique, XGBoost is shown to outperform other state-of-the-art classification algorithms by 3% in terms of accuracy and F-score. Our model achieved an accuracy of 80.52%, a precision of 80.62% and recall of 80.51%, supporting the more natural and promising multiclass classification.
Keywords: eXtreme Gradient Boosting, missing data, Alzheimer disease, early mild cognitive impairment, late mild cognitive impairment, multiclass classification, ADNI, support vector machine, random forest.
Downloads 957
2426 Studies of Rule Induction by STRIM from the Decision Table with Contaminated Attribute Values from Missing Data and Noise — In the Case of Critical Dataset Size —
Authors: Tetsuro Saeki, Yuichi Kato, Shoutarou Mizuno
Abstract:
STRIM (Statistical Test Rule Induction Method) has been proposed as a method to effectively induct if-then rules from the decision table which is considered as a sample set obtained from the population of interest. Its usefulness has been confirmed by simulation experiments specifying rules in advance, and by comparison with conventional methods. However, scope for future development remains before STRIM can be applied to the analysis of real-world data sets. The first requirement is to determine the size of the dataset needed for inducting true rules, since finding statistically significant rules is the core of the method. The second is to examine the capacity of rule induction from datasets with contaminated attribute values created by missing data and noise, since real-world datasets usually contain such contaminated data. This paper examines the first problem theoretically, in connection with the rule length. The second problem is then examined in a simulation experiment, utilizing the critical size of dataset derived from the first step. The experimental results show that STRIM is highly robust in the analysis of datasets with contaminated attribute values, and hence is applicable to real-world data
Keywords: Rule induction, decision table, missing data, noise.
Downloads 1462
2425 Imputation Technique for Feature Selection in Microarray Data Set
Authors: Younies Mahmoud, Mai Mabrouk, Elsayed Sallam
Abstract:
Analyzing DNA microarray data sets is a great challenge for bioinformaticians because of the complexity of applying statistical and machine learning techniques. The challenge is doubled if the microarray data sets contain missing data, which happens regularly, because these techniques cannot deal with missing data. One of the most important data analysis processes on a microarray data set is feature selection. This process finds the most important genes that affect a certain disease. In this paper, we introduce a technique for imputing the missing data in microarray data sets while performing feature selection.
Keywords: DNA microarray, feature selection, missing data, bioinformatics.
Downloads 2790
2424 Rotation Invariant Fusion of Partial Image Parts in Vista Creation using Missing View Regeneration
Authors: H. B. Kekre, Sudeep D. Thepade
Abstract:
The automatic construction of large, high-resolution image vistas (mosaics) is an active area of research in the fields of photogrammetry [1,2], computer vision [1,4], medical image processing [4], computer graphics [3] and biometrics [8]. Image stitching is one of the possible options to get image mosaics. Vista creation in image processing is used to construct an image with a larger field of view than could be obtained with a single photograph. It refers to transforming and stitching multiple images into a new aggregate image without any visible seam or distortion in the overlapping areas. The vista creation process aligns two partial images over each other and blends them together. Image mosaics allow one to compensate for differences in viewing geometry. Thus they can be used to simplify tasks by simulating the condition in which the scene is viewed from a fixed position with a single camera. While obtaining partial images, geometric anomalies such as rotation and scaling are bound to happen. To nullify the effect of rotation of partial images on the process of vista creation, we propose a rotation invariant vista creation algorithm in this paper. Rotation of partial image parts in the proposed method of vista creation may introduce some missing regions in the vista. To correct this error, that is, to fill the missing region, we have further used an image inpainting method on the created vista. This missing view regeneration method also overcomes the problem of missing views [31] in a vista due to cropping, irregular boundaries of partial image parts and errors in digitization [35]. The method of missing view regeneration generates the missing view of a vista using the information present in the vista itself.
Keywords: Vista, Overlap Estimation, Rotation Invariance, Missing View Regeneration.
Downloads 1721
2423 Missing Link Data Estimation with Recurrent Neural Network: An Application Using Speed Data of Daegu Metropolitan Area
Authors: JaeHwan Yang, Da-Woon Jeong, Seung-Young Kho, Dong-Kyu Kim
Abstract:
In terms of ITS, information on link characteristics is an essential factor for planning or operation. In practice, however, not every link has sensors installed on it. A link that has no data is called a “Missing Link”. The purpose of this study is to impute data for these missing links. To obtain these data, this study applies a machine learning method. With machine learning, and especially with deep learning, missing link data can be estimated from the data of links that do have sensors. For the deep learning process, this study uses a “Recurrent Neural Network” to handle time-series road data. As input data, Dedicated Short-range Communications (DSRC) data from Dalgubul-daero in the Daegu Metropolitan Area were fed into the learning process. The neural network structure takes 17 links with present data as input, has 2 hidden layers, and outputs data for 1 missing link. As a result, the forecasted data for the target link show about 94% accuracy compared with actual data.
Keywords: Data Estimation, link data, machine learning, road network.
Downloads 1503
2422 Recovery of Missing Samples in Multi-channel Oversampling of Multi-banded Signals
Authors: J. M. Kim, K. H. Kwon
Abstract:
We show that in a two-channel sampling series expansion of band-pass signals, any finitely many missing samples can always be recovered via oversampling in a larger band-pass region. We also obtain an analogous result for multi-channel oversampling of harmonic signals.
Keywords: oversampling, multi-channel sampling, recovery of missing samples, band-pass signal, harmonic signal
Downloads 1282
2421 Comparison of Machine Learning Techniques for Single Imputation on Audiograms
Authors: Sarah Beaver, Renee Bryce
Abstract:
Audiograms detect hearing impairment, but missing values pose problems. This work explores imputations in an attempt to improve accuracy. This work implements Linear Regression, Lasso, Linear Support Vector Regression, Bayesian Ridge, K Nearest Neighbors (KNN), and Random Forest machine learning techniques to impute audiogram frequencies ranging from 125 Hz to 8000 Hz. The data contain patients who had or were candidates for cochlear implants. Accuracy is compared across two different Nested Cross-Validation k values. Over 4000 audiograms were used from 800 unique patients. Additionally, training on data combines and compares left and right ear audiograms versus single ear side audiograms. The Root Mean Square Error (RMSE) values for the best Random Forest models range from 4.74 to 6.37, and their R² values range from .91 to .96. The RMSE values for the best KNN models range from 5.00 to 7.72, and their R² values range from .89 to .95. The best imputation models achieved R² values between .89 and .96 and RMSE values of less than 8 dB. We also show that classification predictive models were two percent more accurate with our imputation models than with constant imputations.
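As an illustration of one of the listed techniques, the following is a minimal nearest-neighbour imputation sketch in plain Python. It is not the authors' pipeline: the function name, the choice of k, the Euclidean distance over observed columns, and the toy threshold values are all assumptions made for the example.

```python
import math

def knn_impute(rows, k=2):
    """Fill None entries with the column mean over the k nearest complete rows.
    Distance is Euclidean over the columns observed in the incomplete row."""
    complete = [r for r in rows if None not in r]
    filled = []
    for r in rows:
        if None not in r:
            filled.append(list(r))
            continue
        observed = [j for j, v in enumerate(r) if v is not None]
        nearest = sorted(
            complete,
            key=lambda c: math.sqrt(sum((c[j] - r[j]) ** 2 for j in observed)),
        )[:k]
        filled.append([
            v if v is not None else sum(c[j] for c in nearest) / len(nearest)
            for j, v in enumerate(r)
        ])
    return filled

# Hypothetical hearing thresholds (dB) at three frequencies per audiogram.
audiograms = [
    [20, 25, 30],
    [25, 30, 35],
    [60, 65, 70],
    [22, 27, None],  # last frequency missing
]
imputed = knn_impute(audiograms, k=2)
```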
Keywords: Machine Learning, audiograms, data imputations, single imputations.
Downloads 155
2420 Categorical Missing Data Imputation Using Fuzzy Neural Networks with Numerical and Categorical Inputs
Authors: Pilar Rey-del-Castillo, Jesús Cardeñosa
Abstract:
There are many situations where input feature vectors are incomplete and methods to tackle the problem have been studied for a long time. A commonly used procedure is to replace each missing value with an imputation. This paper presents a method to perform categorical missing data imputation from numerical and categorical variables. The imputations are based on Simpson's fuzzy min-max neural networks where the input variables for learning and classification are just numerical. The proposed method extends the input to categorical variables by introducing new fuzzy sets, a new operation and a new architecture. The procedure is tested and compared with others using opinion poll data.
Keywords: Classifier, imputation techniques, fuzzy systems, fuzzy min-max neural networks.
Downloads 1778
2419 A Large Dataset Imputation Approach Applied to Country Conflict Prediction Data
Authors: Benjamin D. Leiby, Darryl K. Ahner
Abstract:
This study demonstrates an alternative stochastic imputation approach for large datasets when preferred commercial packages struggle to iterate due to numerical problems. A large country conflict dataset motivates the search to impute missing values well over a common threshold of 20% missingness. The methodology capitalizes on correlation while using model residuals to provide the uncertainty in estimating unknown values. Examination of the methodology provides insight toward choosing linear or nonlinear modeling terms. Static tolerances common in most packages are replaced with tailorable tolerances that exploit residuals to fit each data element. The methodology evaluation includes observing computation time, model fit, and the comparison of known values to replaced values created through imputation. Overall, the country conflict dataset illustrates promise with modeling first-order interactions, while presenting a need for further refinement that mimics predictive mean matching.
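The general idea of stochastic regression imputation, using a fitted model's residuals to carry uncertainty into the imputed values, can be sketched as follows. This is a generic single-predictor illustration and not the authors' methodology: the function name, the simple least-squares line, and drawing the noise term from the empirical residuals are assumptions.

```python
import random

def stochastic_regression_impute(x, y, seed=0):
    """Impute missing y (None) as a fitted line's prediction plus a residual
    drawn at random from the residuals of the observed cases."""
    rng = random.Random(seed)
    obs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    n = len(obs)
    mean_x = sum(xi for xi, _ in obs) / n
    mean_y = sum(yi for _, yi in obs) / n
    sxx = sum((xi - mean_x) ** 2 for xi, _ in obs)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in obs) / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in obs]
    return [
        yi if yi is not None else intercept + slope * xi + rng.choice(residuals)
        for xi, yi in zip(x, y)
    ]

# Hypothetical data with one missing response.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, None, 8.2, 9.8]
y_filled = stochastic_regression_impute(x, y, seed=42)
```

Unlike deterministic regression imputation, the added residual keeps imputed values from sitting exactly on the fitted line, so the variability of the completed data is not artificially reduced.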
Keywords: Correlation, country conflict, imputation, stochastic regression.
Downloads 415
2418 Development of a Performance Measurement System for Forwarders
Authors: K. Schmidt, Z. Miodrag, C. Geiger
Abstract:
Performance Measurement is still a difficult task for forwarding companies. This is caused on the one hand by missing resources and on the other hand by missing tools. The research project “Management Information System for Logistics Service Providers” aims to close the gap between the solutions needed and those available. Core of the project is the development
Keywords: Forwarder, Logistics, Management Information, Performance Measurement.
Downloads 1313
2417 On Deterministic Chaos: Disclosing the Missing Mathematics from the Lorenz-Haken Equations
Authors: Belkacem Meziane
Abstract:
The original 3D Lorenz-Haken equations, which describe laser dynamics, are converted into two second-order differential equations, out of which the so far missing mathematics is extracted. Leaning on high-order trigonometry, important outcomes are pulled out: a fundamental result attributes chaos to forbidden periodic solutions inside some precisely delimited region of the control parameter space that governs self-pulsing.
Keywords: chaos, Lorenz-Haken equations, laser dynamics, nonlinearities
Downloads 609
2416 Imputing Missing Data in Electronic Health Records: A Comparison of Linear and Non-Linear Imputation Models
Authors: Alireza Vafaei Sadr, Vida Abedi, Jiang Li, Ramin Zand
Abstract:
Missing data is a common challenge in medical research and can lead to biased or incomplete results. When the data bias leaks into models, it further exacerbates health disparities; biased algorithms can lead to misclassification and reduced resource allocation and monitoring as part of prevention strategies for certain minorities and vulnerable segments of patient populations, which in turn further reduce data footprint from the same population – thus, a vicious cycle. This study compares the performance of six imputation techniques grouped into Linear and Non-Linear models, on two different real-world electronic health records (EHRs) datasets, representing 17864 patient records. The mean absolute percentage error (MAPE) and root mean squared error (RMSE) are used as performance metrics, and the results show that the Linear models outperformed the Non-Linear models in terms of both metrics. These results suggest that sometimes Linear models might be an optimal choice for imputation in laboratory variables in terms of imputation efficiency and uncertainty of predicted values.
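The two performance metrics named above have standard definitions, shown here as a minimal sketch for comparing imputed values against held-out true values. The function names and sample numbers are illustrative, and MAPE as written assumes no true value is zero.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error, in the units of the variable."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical held-out laboratory values and their imputations.
true_values = [100.0, 200.0]
imputed_values = [110.0, 180.0]
error_pct = mape(true_values, imputed_values)
error_abs = rmse(true_values, imputed_values)
```

MAPE is scale-free, which helps when comparing variables measured in different units, while RMSE penalizes large individual errors more heavily.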
Keywords: EHR, Machine Learning, imputation, laboratory variables, algorithmic bias.
Downloads 169
2415 Attribute Analysis of Quick Response Code Payment Users Using Discriminant Non-negative Matrix Factorization
Authors: Hironori Karachi, Haruka Yamashita
Abstract:
Recently, quick response (QR) code systems have been growing in popularity. Many companies are introducing new QR code payment services, and the services compete with each other to increase their number of users. To increase the number of users, we should grasp how the demographic information, usage information, and value of users differ between services. In this study, we analyze real-world data provided by Nomura Research Institute, including the demographic data of users and information on their usage of two services, LINE Pay and PayPay. Nonnegative Matrix Factorization (NMF) is widely used for analyzing such data and interpreting their features; however, the target data have a missing data problem. EM-algorithm NMF (EMNMF) completes unknown values so that the features of data given in matrix form can be understood. Moreover, for comparing the results of NMF analyses of two matrices, Discriminant NMF (DNMF) shows the difference in user features between the two matrices. In this study, we combine EMNMF and DNMF and analyze the target data. As the interpretation, we show the difference in the features of users between LINE Pay and PayPay.
Keywords: Data science, non-negative matrix factorization, missing data, quality of services.
Downloads 452
2414 Reusing Assessments Tests by Generating Arborescent Test Groups Using a Genetic Algorithm
Authors: Ovidiu Domşa, Nicolae Bold
Abstract:
Using Information and Communication Technologies (ICT) notions in education and three basic processes of education (teaching, learning and assessment) can bring benefits to the pupils and the professional development of teachers. In this matter, we refer to these notions as concepts taken from the informatics area and apply them to the domain of education. These notions refer to genetic algorithms and arborescent structures, used in the specific process of assessment or evaluation. This paper uses these kinds of notions to generate subtrees from a main tree of tests related between them by their degree of difficulty. These subtrees must contain the highest number of connections between the nodes and the lowest number of missing edges (which are subtrees of the main tree) and, in the particular case of the non-existence of a subtree with no missing edges, the subtrees which have the lowest (minimal) number of missing edges between the nodes, where a node is a test and an edge is a direct connection between two tests which differs by one degree of difficulty. The subtrees are represented as sequences. The tests are the same (a number coding a test represents that test in every sequence) and they are reused for each sequence of tests.
Keywords: Chromosome, genetic algorithm, subtree, test.
Downloads 713
2413 Array Signal Processing: DOA Estimation for Missing Sensors
Authors: Lalita Gupta, R. P. Singh
Abstract:
Array signal processing involves signal enumeration and source localization. Array signal processing is centered on the ability to fuse temporal and spatial information captured via sampling signals emitted from a number of sources at the sensors of an array in order to carry out a specific estimation task: source characteristics (mainly localization of the sources) and/or array characteristics (mainly array geometry) estimation. Array signal processing is a part of signal processing that uses sensors organized in patterns or arrays, to detect signals and to determine information about them. Beamforming is a general signal processing technique used to control the directionality of the reception or transmission of a signal. Using Beamforming we can direct the majority of signal energy we receive from a group of array. Multiple signal classification (MUSIC) is a highly popular eigenstructure-based estimation method of direction of arrival (DOA) with high resolution. This Paper enumerates the effect of missing sensors in DOA estimation. The accuracy of the MUSIC-based DOA estimation is degraded significantly both by the effects of the missing sensors among the receiving array elements and the unequal channel gain and phase errors of the receiver.
Keywords: Array Signal Processing, Beamforming, ULA, Direction of Arrival, MUSIC
Downloads 3019
2412 Molecular Docking on Recomposed versus Crystallographic Structures of Zn-Dependent Enzymes and their Natural Inhibitors
Authors: Tudor Petreuş, Andrei Neamţu, Cristina Dascălu, Paul Dan Sîrbu, Carmen E. Cotrutz
Abstract:
Matrix metalloproteinases (MMP) are a class of structurally and functionally related enzymes involved in altering the natural elements of the extracellular matrix. Most of the MMP structures are crystallographically determined and published in the Worldwide Protein Data Bank, either isolated, in full structure, or bound to natural or synthetic inhibitors. This study proposes an algorithm to replace missing crystallographic structures in the PDB database. We have compared the results of a chosen docking algorithm with a known crystallographic structure in order to validate the reconstruction of enzyme sites where crystallographic data are missing.
Keywords: matrix metalloproteinases, molecular docking, structure superposition, surface complementarity.
Downloads 1608
2411 An Obesity Index Derived from Waist and Hip Circumferences Well-Matched with Other Indices in Children with Obesity
Authors: Mustafa M. Donma, Orkide Donma
Abstract:
Indices derived from anthropometric measurements [waist-to-hip ratio (WHR)] or body fat mass compositions [trunk-to-leg fat ratio (TLFR)] are used for the evaluation of obesity. The best index for clinical practice is still being investigated. The aim of this study is to derive an index that best discriminates children with normal body mass index (N-BMI) from obese (OB) children. A total of 83 children participated in the study. Groups 1 and 2 comprised 42 children with N-BMI and 41 OB children, whose age- and sex-adjusted BMI percentile values ranged between 15-85 and 95-99, respectively. The institutional ethics committee approved the study protocol. Informed consent forms were completed by the parents of the participants. Anthropometric measurements [weight, height (Ht), waist circumference (WC), hip circumference (HC) and neck circumference (NC)] were taken. BMI, WHR, (WC+HC)/2, WC/Ht, (WC/HC)/Ht and WC*NC were calculated. Bioelectrical impedance analysis was performed to obtain the body’s fat compartments in terms of total fat, trunk fat, leg fat and arm fat masses. TLFR, trunk-to-appendicular fat ratio (TAFR), (trunk fat+leg fat)/2 ((TF+LF)/2), fat mass index (FMI) and diagnostic obesity notation model assessment-II (D2I) index values were calculated. Statistical analysis was performed. Significantly higher values of (WC+HC)/2, (TF+LF)/2, D2I and FMI were observed in the OB group than in the N-BMI group. Significant correlations were found between BMI and WC, (WC+HC)/2, (TF+LF)/2, TLFR, TAFR, D2I and FMI in both groups. Similar correlations were obtained for WC. (WC+HC)/2 was correlated with TLFR, TAFR, (TF+LF)/2, D2I and FMI in the N-BMI group. In the OB group, the correlations were the same except those with TLFR and TAFR. These correlations were not present with WHR. Correlations were observed between TLFR as well as TAFR and BMI, WC, (WC+HC)/2, (TF+LF)/2, D2I and FMI in the N-BMI group. In the OB group, correlations between TLFR or TAFR and BMI, WC and (WC+HC)/2 were missing. None was noted with WHR. In conclusion, the only correlation valid in both groups was the one between (TF+LF)/2 and (WC+HC)/2, which was suggested as a link between fat-based and anthropometric indices. (WC+HC)/2, but not WHR, was much more suitable as an anthropometric obesity index.
Keywords: Children, hip circumference, obesity, waist circumference.
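The anthropometric indices listed in the abstract are simple arithmetic combinations of the measurements, which a short sketch can make concrete. The function name, the units (kilograms, metres for BMI, centimetres elsewhere) and the sample values below are assumptions for illustration. The abstract does not state the units used, in particular for Ht in WC/Ht and (WC/HC)/Ht.

```python
def anthropometric_indices(weight_kg, height_m, wc_cm, hc_cm, nc_cm):
    """Compute the anthropometric indices named in the abstract.

    Units are assumed, not taken from the paper: weight in kg, height in m
    for BMI, and centimetres for height and circumferences elsewhere.
    """
    height_cm = height_m * 100.0
    return {
        "BMI": weight_kg / height_m ** 2,           # body mass index, kg/m^2
        "WHR": wc_cm / hc_cm,                       # waist-to-hip ratio
        "(WC+HC)/2": (wc_cm + hc_cm) / 2.0,         # mean of waist and hip
        "WC/Ht": wc_cm / height_cm,                 # waist-to-height ratio
        "(WC/HC)/Ht": (wc_cm / hc_cm) / height_cm,  # WHR scaled by height (cm assumed)
        "WC*NC": wc_cm * nc_cm,                     # waist times neck circumference
    }


# Hypothetical child: 30 kg, 1.25 m, WC 60 cm, HC 70 cm, NC 28 cm.
indices = anthropometric_indices(30.0, 1.25, 60.0, 70.0, 28.0)
for name, value in indices.items():
    print(f"{name}: {value:.4f}")
```

The fat-based indices such as TLFR and TAFR are ratios of the same kind, but their inputs come from bioelectrical impedance analysis and are not sketched here.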
Downloads 427
2410 Review and Experiments on SDMSCue
Authors: Ashraf Anwar
Abstract:
In this work, I present a review of Sparse Distributed Memory for Small Cues (SDMSCue), a variant of Sparse Distributed Memory (SDM) that is capable of handling small cues. I then conduct and present some cognitive experiments on SDMSCue to test its cognitive soundness compared to SDM. Small cues are input cues that are presented to memory for reading associations but have many missing parts or fields. The original SDM fails to handle such cues, and SDMSCue overcomes this pitfall. The main idea in SDMSCue is the repeated projection of the semantic space onto smaller subspaces that are selected based on the input cue length and pattern. This process allows for Read/Write operations using an input cue that is missing a large portion. SDMSCue is augmented with the use of genetic algorithms for memory allocation and initialization. I claim that SDM functionality is a subset of SDMSCue functionality.
Keywords: Artificial intelligence, recall, recognition, SDM, SDMSCue.
Downloads 1371
2409 Value Analysis of Islamic Banking and Conventional Banking to Measure Value Co-creation
Authors: Amna Javed, Hisashi Masuda, Youji Kohda
Abstract:
This study examines the value analysis in Islamic and conventional banking services in Pakistan. Many scholars have focused on the co-creation of values in services, but mainly on economic values rather than non-economic ones.
Keywords: Economic values, Islamic banking, Non-economic values, Value system.
Downloads 3256
2408 Neural Network Imputation in Complex Survey Design
Authors: Safaa R. Amer
Abstract:
Missing data pose many analysis challenges. In the case of a complex survey design, in addition to dealing with missing data, researchers need to account for the sampling design to achieve useful inferences. Methods for incorporating sampling weights in neural network imputation were investigated to account for complex survey designs. An estimate of variance that accounts for the imputation uncertainty as well as the sampling design using neural networks will be provided. A simulation study was conducted to compare estimation results based on complete case analysis, multiple imputation using Markov chain Monte Carlo, and neural network imputation. Furthermore, a public-use dataset was used as an example to illustrate neural network imputation under a complex survey design.
Keywords: Complex survey, estimate, imputation, neural networks, variance.
Downloads 1971
2407 Remaining Useful Life Prediction Using Elliptical Basis Function Network and Markov Chain
Authors: Yi Yu, Lin Ma, Yong Sun, Yuantong Gu
Abstract:
This paper presents a novel method for remaining useful life prediction using the Elliptical Basis Function (EBF) network and a Markov chain. The EBF structure is trained by a modified Expectation-Maximization (EM) algorithm in order to take into account the missing covariate set. No explicit extrapolation is needed for internal covariates, while a Markov chain is constructed to represent the evolution of external covariates in the study. The estimated external and the unknown internal covariates constitute an incomplete covariate set, which is then used and analyzed by the EBF network to provide survival information of the asset. The case study shows that the method slightly underestimates the remaining useful life of an asset, which is a desirable result for early maintenance decisions and resource planning.
Keywords: Elliptical Basis Function Network, Markov Chain, Missing Covariates, Remaining Useful Life
Downloads 1661
2406 The Potential Involvement of Platelet Indices in Insulin Resistance in Morbid Obese Children
Authors: Orkide Donma, Mustafa M. Donma
Abstract:
The association between insulin resistance (IR) and hematological parameters has long been a matter of interest. Within this context, body mass index (BMI), red blood cells, white blood cells and platelets have been part of this discussion. Platelet-related parameters associated with IR may be useful indicators for the identification of IR. Platelet indices such as mean platelet volume (MPV), platelet distribution width (PDW) and plateletcrit (PCT) are being questioned for their possible association with IR. The aim of this study was to investigate the association between platelet (PLT) count as well as PLT indices and the surrogate indices used to determine IR in morbidly obese (MO) children. A total of 167 children participated in the study. Three groups were constituted, with 34, 97 and 36 children in the normal BMI (N-BMI), MO and metabolic syndrome (MetS) groups, respectively. Sex- and age-dependent BMI-based percentile tables prepared by the World Health Organization were used for the definition of morbid obesity. MetS criteria were determined. BMI values, homeostatic model assessment for IR (HOMA-IR), alanine transaminase-to-aspartate transaminase ratio (ALT/AST) and diagnostic obesity notation model assessment laboratory (DONMA-lab) index values were computed. PLT count and indices were analyzed using an automated hematology analyzer. Data were collected for statistical analysis using SPSS for Windows. Arithmetic mean and standard deviation were calculated. Mean values of PLT-related parameters in the control and study groups were compared by one-way ANOVA followed by Tukey post hoc tests to determine whether a significant difference exists among the groups. Correlation analyses between PLT parameters and IR indices were performed. A p-value < 0.05 was accepted as statistically significant. Increased values were detected for PLT (p < 0.01) and PCT (p > 0.05) in the MO group compared to those observed in children with N-BMI. Significant increases for PLT (p < 0.01) and PCT (p < 0.05) were observed in the MetS group in comparison with the values obtained in children with N-BMI (p < 0.01). Significantly lower MPV and PDW values were obtained in the MO group compared to the control group (p < 0.01). HOMA-IR (p < 0.05), DONMA-lab index (p < 0.001) and ALT/AST (p < 0.001) values in the MO and MetS groups were significantly increased compared to the N-BMI group. On the other hand, DONMA-lab index values also differed between the MO and MetS groups (p < 0.001). In the MO group, PLT was negatively correlated with MPV and PDW values. These correlations were not observed in the N-BMI group. None of the IR indices exhibited a correlation with PLT and PLT indices in the N-BMI group. HOMA-IR showed significant correlations with both PLT and PCT in the MO group. All three IR indices were well correlated with each other in all groups. These findings point out the missing link between IR and PLT activation. In conclusion, PLT and PCT may be related to IR in addition to their identities as hemostasis markers during morbid obesity. Our findings suggest that the DONMA-lab index appears to be the best surrogate marker for IR due to its discriminative feature between morbid obesity and MetS.
Keywords: Children, insulin resistance, metabolic syndrome, plateletcrit, platelet indices.
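Two of the surrogate IR indices named in the abstract can be sketched as plain arithmetic. The abstract does not say which HOMA-IR formula was used, so the commonly cited form (fasting glucose in mmol/L times fasting insulin in µU/mL, divided by 22.5) is assumed here. The function names and sample values are hypothetical, and the DONMA-lab index is omitted because its definition is not given.

```python
def homa_ir(fasting_glucose_mmol_l, fasting_insulin_uu_ml):
    """HOMA-IR in its commonly cited form (assumed; not stated in the abstract)."""
    return fasting_glucose_mmol_l * fasting_insulin_uu_ml / 22.5


def alt_ast_ratio(alt_u_l, ast_u_l):
    """Alanine transaminase-to-aspartate transaminase ratio."""
    return alt_u_l / ast_u_l


# Hypothetical laboratory values, for illustration only.
example_homa = homa_ir(5.0, 9.0)
example_ratio = alt_ast_ratio(30.0, 24.0)
print(example_homa, example_ratio)
```

When glucose is reported in mg/dL, the equivalent divisor is 405 instead of 22.5.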
Downloads 673