Search results for: incomplete data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24257

24257 Survival Data with Incomplete Missing Categorical Covariates

Authors: Madaki Umar Yusuf, Mohd Rizam B. Abubakar

Abstract:

Survival data with censoring and incomplete covariate information are a common occurrence in many studies in which the outcome is survival time. When the missing covariates are categorical, a useful technique for obtaining parameter estimates is the EM algorithm by the method of weights. The survival outcome is modeled within the class of generalized linear models, and this method requires estimating the parameters of the covariate distribution. In this paper, we consider clinical trial data with five covariates, four of which have missing values, in data that are subject to censoring.
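
As a rough illustration of the method of weights, the sketch below runs EM for a single binary missing covariate under an exponential survival likelihood (a deliberate simplification of the paper's Weibull/GLM setting); the grid-search M-step and all variable names are assumptions made for brevity.

```python
import numpy as np

def exp_loglik(beta, t, event, x):
    # log-likelihood of an exponential survival model with hazard exp(beta * x);
    # 'event' is 1 for an observed failure, 0 for right-censoring
    rate = np.exp(beta * x)
    return event * np.log(rate) - rate * t

def em_method_of_weights(t, event, x, n_iter=50):
    # x: binary covariate with np.nan where missing
    beta, p = 0.0, 0.5                      # initial slope and P(x = 1)
    miss = np.isnan(x)
    for _ in range(n_iter):
        # E-step: weight each candidate covariate value by its posterior
        # probability given the survival outcome and current parameters
        w1 = p * np.exp(exp_loglik(beta, t, event, 1.0))
        w0 = (1 - p) * np.exp(exp_loglik(beta, t, event, 0.0))
        post = np.where(miss, w1 / (w1 + w0), x)
        # M-step: maximize the weighted complete-data log-likelihood
        # (a coarse grid search keeps the sketch short)
        grid = np.linspace(-3, 3, 601)
        scores = [np.sum(post * exp_loglik(b, t, event, 1.0)
                         + (1 - post) * exp_loglik(b, t, event, 0.0))
                  for b in grid]
        beta = float(grid[int(np.argmax(scores))])
        p = float(post.mean())              # update the covariate distribution
    return beta, p
```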

Keywords: EM algorithm, incomplete categorical covariates, ignorable missing data, missing at random (MAR), Weibull Distribution

Procedia PDF Downloads 374
24256 Managing Incomplete PSA Observations in Prostate Cancer Data: Key Strategies and Best Practices for Handling Loss to Follow-Up and Missing Data

Authors: Madiha Liaqat, Rehan Ahmed Khan, Shahid Kamal

Abstract:

Multiple imputation with delta adjustment is a versatile and transparent technique for addressing univariate missing data in the presence of various missing mechanisms. This approach allows for the exploration of sensitivity to the missing-at-random (MAR) assumption. In this review, we outline the delta-adjustment procedure and illustrate its application for assessing the sensitivity to deviations from the MAR assumption. By examining diverse missingness scenarios and conducting sensitivity analyses, we gain valuable insights into the implications of missing data on our analyses, enhancing the reliability of our study's conclusions. In our study, we focused on assessing logPSA, a continuous biomarker in incomplete prostate cancer data, to examine the robustness of conclusions against plausible departures from the MAR assumption. We introduced several approaches for conducting sensitivity analyses, illustrating their application within the pattern mixture model (PMM) under the delta adjustment framework. This proposed approach effectively handles missing data, particularly loss to follow-up.
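
A minimal sketch of the delta-adjustment step for a continuous variable such as logPSA: impute under MAR with a normal regression model, then shift the imputed values by delta to probe departures from MAR. The linear imputation model and all names are illustrative assumptions, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

def delta_adjusted_mi(y, X, deltas=(-1.0, -0.5, 0.0, 0.5, 1.0), m=20):
    # y: outcome with np.nan where missing; X: fully observed covariate
    miss = np.isnan(y)
    A = np.column_stack([np.ones((~miss).sum()), X[~miss]])
    coef, *_ = np.linalg.lstsq(A, y[~miss], rcond=None)   # MAR model
    resid_sd = np.std(y[~miss] - A @ coef)
    pooled = {}
    for delta in deltas:
        estimates = []
        for _ in range(m):                                # m imputations
            y_imp = y.copy()
            Am = np.column_stack([np.ones(miss.sum()), X[miss]])
            draw = Am @ coef + rng.normal(0, resid_sd, miss.sum())
            y_imp[miss] = draw + delta                    # the delta shift
            estimates.append(y_imp.mean())
        pooled[delta] = float(np.mean(estimates))         # pooled estimate
    return pooled  # stable estimates across deltas suggest robustness to MNAR
```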

Keywords: loss to follow-up, incomplete response, multiple imputation, sensitivity analysis, prostate cancer

Procedia PDF Downloads 52
24255 Distances over Incomplete Diabetes and Breast Cancer Data Based on Bhattacharyya Distance

Authors: Loai AbdAllah, Mahmoud Kaiyal

Abstract:

Missing values in real-world datasets are a common problem. Many algorithms have been developed to deal with it, and most of them replace the missing values with a fixed value computed from the observed values. In our work, we used a distance function based on the Bhattacharyya distance, which measures the similarity of two probability distributions, to measure the distance between objects with missing values. The proposed distance distinguishes between known and unknown values: the distance between two known values is the Mahalanobis distance, while if one of them is missing, the distance is computed from the distribution of the known values for the coordinate that contains the missing value. This method was integrated with Wikaya, a digital health company developing a platform that helps to improve the prevention of chronic diseases such as diabetes and cancer. For Wikaya's recommendation system to work, distances between users need to be measured, and since the collected data contain missing values, a distance function over incomplete user profiles is required. To evaluate how accurately the proposed distance function reflects the actual similarity between objects when some of them contain missing values, we integrated it within the framework of the k-nearest neighbors (kNN) classifier, since its computation is based only on the similarity between objects. To validate this, we ran the algorithm over the diabetes and breast cancer datasets, standard benchmark datasets from the UCI repository. Our experiments show that the kNN classifier using our proposed distance function outperforms kNN using other existing methods.
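
The sketch below mirrors the coordinate-wise treatment described: a (univariate) Mahalanobis-style term when both values are known, and a fallback based on the observed distribution of that coordinate when one value is missing. The exact fallback formula, the constant for the both-missing case, and the function names are illustrative assumptions; plugging such a metric into a kNN classifier reproduces the style of evaluation in the abstract.

```python
import numpy as np

def incomplete_distance(a, b, mean, var):
    # a, b: feature vectors with np.nan for missing entries;
    # mean, var: per-coordinate statistics of the observed values
    d = 0.0
    for j in range(len(a)):
        a_known, b_known = not np.isnan(a[j]), not np.isnan(b[j])
        if a_known and b_known:
            d += (a[j] - b[j]) ** 2 / var[j]        # Mahalanobis-style term
        elif a_known or b_known:
            known = a[j] if a_known else b[j]
            # expected squared distance to a value drawn from the observed
            # distribution of coordinate j (assumption for illustration)
            d += ((known - mean[j]) ** 2 + var[j]) / var[j]
        else:
            d += 2.0                                # both missing: constant
    return float(np.sqrt(d))
```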

Keywords: missing values, incomplete data, distance, incomplete diabetes data

Procedia PDF Downloads 183
24254 A Pragmatic Reading of the Verb "Kana" and Its Meanings

Authors: Manal M. H. Said Najjar

Abstract:

Arab grammarians stood at variance with regard to the definition of kana (which might equal was/were, the past form of "be", in English). Kana was considered a verb, a particle, or a quasi-verb by different scholars; others saw it as an auxiliary verb, while some categorized kana as one of the incomplete verbs (Afa'al naqisa) based on two different claims: first, a considerable group of grammarians saw kana as fie'l naqis, or an incomplete verb, since it indicates time but not the event or action itself; second, kana requires a predicate (xabar) to complete its meaning, i.e., it cannot suffice with a noun alone in the nominal sentence. This study argues that categorizing the verb kana as fie'l naqis, or an incomplete verb, is inaccurate and confusing, since the term "incomplete" does not agree with its characteristics, meanings, and temporal indications. Moreover, interpreting kana as a past-tense verb is also inaccurate. Kana كان (derived from the absolute action of being, كون) is unique and the most comprehensive verb, encompassing all tenses of the past, present, and future within the dimensions of continuity and eternity of all possible actions under "being".

Keywords: pragmatics, kana, context, Arab grammarians, meaning, fie’l naqis

Procedia PDF Downloads 57
24253 A PROMETHEE-BELIEF Approach for Multi-Criteria Decision Making Problems with Incomplete Information

Authors: H. Moalla, A. Frikha

Abstract:

Multi-criteria decision aid methods consider decision problems where numerous alternatives are evaluated on several criteria. These methods were designed for perfect information; in practice, however, this requirement is clearly too strict. The imperfect data provided by more or less reliable decision makers usually affect decision results, since any decision is closely linked to the quality and availability of information. In this paper, a PROMETHEE-BELIEF approach is proposed to support multi-criteria decisions based on incomplete information. The approach handles problems with an incomplete decision matrix and unknown weights within the PROMETHEE method. On the basis of belief function theory, it first determines the distributions of belief masses from PROMETHEE's net flows and then calculates weights. Subsequently, it aggregates the mass distributions associated with each criterion using Murphy's modified combination rule in order to infer a global belief structure. The final action ranking is obtained via the pignistic probability transformation. A real-world case study concerning the location of a treatment center for infectious-risk healthcare waste in central Tunisia illustrates the detailed process of the PROMETHEE-BELIEF approach.
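
Before the belief-function stage, PROMETHEE's outranking flows have to be computed. A minimal sketch with the usual preference criterion and made-up weights is shown below; the belief-mass construction, Murphy's combination rule, and the pignistic transform are omitted.

```python
import numpy as np

scores = np.array([        # alternatives x criteria, all to be maximized
    [7.0, 3.0, 5.0],
    [5.0, 6.0, 4.0],
    [6.0, 5.0, 7.0],
])
weights = np.array([0.5, 0.3, 0.2])   # assumed known here for illustration

n = len(scores)
pi = np.zeros((n, n))                 # aggregated preference of a over b
for a in range(n):
    for b in range(n):
        pref = (scores[a] > scores[b]).astype(float)  # usual criterion
        pi[a, b] = weights @ pref

phi_plus = pi.sum(axis=1) / (n - 1)   # positive (leaving) flow
phi_minus = pi.sum(axis=0) / (n - 1)  # negative (entering) flow
net_flow = phi_plus - phi_minus       # PROMETHEE II net flows
print("ranking, best first:", np.argsort(-net_flow))
```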

Keywords: belief function theory, incomplete information, multiple criteria analysis, PROMETHEE method

Procedia PDF Downloads 130
24252 A Review of Methods for Handling Missing Data in the Form of Dropouts in Longitudinal Clinical Trials

Authors: A. Satty, H. Mwambi

Abstract:

Much research based on clinical trial data is characterized by the unavoidable problem of dropout as a result of missing or erroneous values. This paper reviews various techniques for addressing dropout problems in longitudinal clinical trials. The fundamental concepts of dropout patterns and mechanisms are discussed. The study presents five general techniques for handling dropout: (1) deletion methods; (2) imputation-based methods; (3) data augmentation methods; (4) likelihood-based methods; and (5) MNAR-based methods. Under each technique, several methods that are commonly used to deal with dropout are presented, including a review of the existing literature in which we examine the effectiveness of these methods in the analysis of incomplete data. Two application examples are presented to study the potential strengths or weaknesses of some of the methods under certain dropout mechanisms as well as to assess the sensitivity of the modelling assumptions; a toy contrast of two of the reviewed techniques is sketched below.
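
The sketch applies complete-case deletion and last-observation-carried-forward (LOCF) to a tiny longitudinal dataset with dropout; the data are fabricated for illustration, and LOCF is chosen only because it is easy to show, not because the paper endorses it.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "id":    [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "visit": [1, 2, 3, 1, 2, 3, 1, 2, 3],
    "score": [10, 12, 14, 9, 11, np.nan, 8, np.nan, np.nan],  # dropout
})

complete_cases = df.dropna()                           # (1) deletion
locf = df.copy()
locf["score"] = locf.groupby("id")["score"].ffill()    # (2) imputation (LOCF)

print(complete_cases.groupby("visit")["score"].mean()) # late visits overstate
print(locf.groupby("visit")["score"].mean())           # LOCF freezes dropouts
```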

Keywords: incomplete longitudinal clinical trials, missing at random (MAR), imputation, weighting methods, sensitivity analysis

Procedia PDF Downloads 378
24251 [Keynote Talk]: Evidence Fusion in Decision Making

Authors: Mohammad Abdullah-Al-Wadud

Abstract:

In the current era of automation and artificial intelligence, systems increasingly depend on the decision-making capabilities of machines. Such systems and applications range from simple classifiers to sophisticated surveillance systems based on traditional sensors and related equipment, which are becoming more common in the internet of things (IoT) paradigm. However, the available data for such problems are usually imprecise and incomplete, which leads to uncertainty in decisions made by traditional probability-based classifiers. This calls for a robust fusion framework to combine the available information sources with some degree of certainty. The theory of evidence provides such a method for combining evidence from different (possibly unreliable) sources or observers. This talk will address the employment of the Dempster-Shafer theory of evidence in some practical applications.
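
A minimal sketch of Dempster's rule of combination, the core of the Dempster-Shafer theory the talk addresses. Mass functions are dicts over focal elements (frozensets); the two-sensor surveillance example is invented for illustration.

```python
def dempster_combine(m1, m2):
    # Dempster's rule: multiply masses, pool intersections, renormalize
    # by 1 - K, where K is the total mass on empty intersections (conflict).
    combined, conflict = {}, 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + wa * wb
            else:
                conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict: sources cannot be combined")
    return {s: w / (1.0 - conflict) for s, w in combined.items()}

# two (possibly unreliable) sensors over the frame {"intruder", "clear"}
m1 = {frozenset({"intruder"}): 0.6, frozenset({"intruder", "clear"}): 0.4}
m2 = {frozenset({"clear"}): 0.3, frozenset({"intruder", "clear"}): 0.7}
print(dempster_combine(m1, m2))
```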

Keywords: decision making, dempster-shafer theory, evidence fusion, incomplete data, uncertainty

Procedia PDF Downloads 390
24250 The Various Legal Dimensions of Genomic Data

Authors: Amy Gooden

Abstract:

When human genomic data is considered, this is often done through only one dimension of the law, or the interplay between the various dimensions is not considered, thus providing an incomplete picture of the legal framework. This research considers and analyzes the various dimensions in South African law applicable to genomic sequence data – including property rights, personality rights, and intellectual property rights. The effective use of personal genomic sequence data requires the acknowledgement and harmonization of the rights applicable to such data.

Keywords: artificial intelligence, data, law, genomics, rights

Procedia PDF Downloads 109
24249 Structural Behavior of Incomplete Box Girder Bridges Subjected to Unpredicted Loads

Authors: E. H. N. Gashti, J. Razzaghi, K. Kujala

Abstract:

In general, codes and regulations consider seismic loads only for the completed structures of bridges, while evaluation of the incomplete structures of bridges under these loads, especially those constructed by the free cantilever method, is also of great importance. Hence, this research studied the behavior of the incomplete structure of a common bridge type (the box girder bridge) during the construction phase under vertical seismic loads. The paper then provides suitable guidelines and solutions to withstand this destructive phenomenon. The results showed that the use of preventive methods can significantly reduce the stresses resulting from vertical seismic loads in box cross-sections to an acceptable range recommended by design codes.

Keywords: box girder bridges, prestress loads, free cantilever method, seismic loads, construction phase

Procedia PDF Downloads 310
24248 Zero Cross-Correlation Codes Based on Balanced Incomplete Block Design: Performance Analysis and Applications

Authors: Garadi Ahmed, Boubakar S. Bouazza

Abstract:

A Zero Cross-Correlation (C, w) code is a family of binary sequences of length C and constant Hamming weight w in which the cross-correlation between any two sequences equals zero. In this paper, we evaluate the performance of a ZCC code based on a Balanced Incomplete Block Design (BIBD) for a Spectral Amplitude Coding Optical Code Division Multiple Access (SAC-OCDMA) system using direct detection. The BER obtained is better than 10⁻⁹ for five simultaneous users.
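
A toy check of the defining property: code words with disjoint '1' positions have zero inner product. The 3-user, weight-2 matrix below is hand-made for illustration and is not a BIBD-derived construction.

```python
import numpy as np

C = np.array([            # one ZCC code word per user (C = 6, w = 2)
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 1],
])
for i in range(len(C)):
    for j in range(i + 1, len(C)):
        assert C[i] @ C[j] == 0          # zero cross-correlation
print("Hamming weight per word:", C.sum(axis=1))
```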

Keywords: spectral amplitude coding-optical code-division-multiple-access (SAC-OCDMA), phase induced intensity noise (PIIN), balanced incomplete block design (BIBD), zero cross-correlation (ZCC)

Procedia PDF Downloads 337
24247 Imputation of Incomplete Large-Scale Monitoring Count Data via Penalized Estimation

Authors: Mohamed Dakki, Genevieve Robin, Marie Suet, Abdeljebbar Qninba, Mohamed A. El Agbani, Asmâa Ouassou, Rhimou El Hamoumi, Hichem Azafzaf, Sami Rebah, Claudia Feltrup-Azafzaf, Nafouel Hamouda, Wed a.L. Ibrahim, Hosni H. Asran, Amr A. Elhady, Haitham Ibrahim, Khaled Etayeb, Essam Bouras, Almokhtar Saied, Ashrof Glidan, Bakar M. Habib, Mohamed S. Sayoud, Nadjiba Bendjedda, Laura Dami, Clemence Deschamps, Elie Gaget, Jean-Yves Mondain-Monval, Pierre Defos Du Rau

Abstract:

In biodiversity monitoring, large datasets are becoming more and more widely available and are increasingly used globally to estimate species trends and conservation status. These large-scale datasets challenge existing statistical analysis methods, many of which are not adapted to their size, incompleteness and heterogeneity. The development of scalable methods to impute missing data in incomplete large-scale monitoring datasets is crucial to balance sampling in time or space and thus better inform conservation policies. We developed a new method based on penalized Poisson models to impute and analyse incomplete monitoring data in a large-scale framework. The method allows parameterization of (a) space and time factors, (b) the main effects of predictor covariates, as well as (c) space-time interactions. It also benefits from robust statistical and computational capability in large-scale settings. The method was tested extensively on both simulated and real-life waterbird data, with the findings revealing that it outperforms six existing methods in terms of missing data imputation errors. Applying the method to 16 waterbird species, we estimated their long-term trends for the first time at the entire North African scale, a region where monitoring data suffer from many gaps in space and time series. This new approach opens promising perspectives to increase the accuracy of species-abundance trend estimations. We made it freely available in the R package 'lori' (https://CRAN.R-project.org/package=lori) and recommend its use for large-scale count data, particularly in citizen science monitoring programmes.
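
The authors' implementation is the R package 'lori'; as a language-consistent stand-in, the sketch below illustrates only the core idea of imputing counts with a penalized Poisson model using site and year effects. The synthetic data, the ridge penalty, and the omission of covariates and low-rank site-year interactions are all simplifications.

```python
import numpy as np
from sklearn.linear_model import PoissonRegressor
from sklearn.preprocessing import OneHotEncoder

rng = np.random.default_rng(1)
n_sites, n_years = 20, 10
site = np.repeat(np.arange(n_sites), n_years)
year = np.tile(np.arange(n_years), n_sites)
lam = np.exp(1.0 + 0.1 * year + rng.normal(0, 0.3, n_sites)[site])
counts = rng.poisson(lam).astype(float)
observed = rng.random(len(counts)) > 0.3       # ~30% of counts missing

X = OneHotEncoder(sparse_output=False).fit_transform(
    np.column_stack([site, year]))             # site and year effects
model = PoissonRegressor(alpha=1.0)            # penalized Poisson model
model.fit(X[observed], counts[observed])

imputed = counts.copy()
imputed[~observed] = model.predict(X[~observed])   # fill the gaps
```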

Keywords: biodiversity monitoring, high-dimensional statistics, incomplete count data, missing data imputation, waterbird trends in North-Africa

Procedia PDF Downloads 116
24246 Structural Damage Detection via Incomplete Model Data Using Output Data Only

Authors: Ahmed Noor Al-qayyim, Barlas Özden Çağlayan

Abstract:

Structural failure is caused mainly by damage that often occurs in structures. Many researchers focus on obtaining efficient tools to detect damage in structures at an early stage. Over the past decades, a subject that has received considerable attention in the literature is damage detection via variations in the dynamic characteristics or response of structures. This study presents a new damage identification technique that detects the damage location for an incompletely modeled structural system using output data only. The method indicates the damage from free vibration test data using the "Two Points Condensation (TPC)" technique, which creates a set of matrices by reducing the structural system to two-degree-of-freedom systems. The current stiffness matrices are obtained by optimizing the equation of motion using the measured test data and are compared with the original (undamaged) stiffness matrices; large percentage changes in the matrices' coefficients indicate the location of the damage. The TPC technique is applied to the experimental data of a simply supported steel beam model structure after inducing a thickness change in one element. Two cases are considered, and the method detects the damage and determines its location accurately in both. In addition, the results illustrate that these changes in the stiffness matrix can be a useful tool for continuous monitoring of structural safety using ambient vibration data, and its efficiency suggests that the technique can also be used for large structures.
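
The comparison step lends itself to a very small sketch: flag damage where the identified stiffness coefficients deviate strongly from the baseline. The 2x2 matrices mimic the two-degree-of-freedom condensed systems the TPC technique builds; the numbers and the 10% threshold are assumptions.

```python
import numpy as np

K_baseline = np.array([[12.0, -6.0],    # undamaged condensed stiffness
                       [-6.0, 12.0]])   # (illustrative units, e.g. kN/mm)
K_current = np.array([[12.1, -6.0],     # identified from measured data
                      [-6.0,  9.5]])

change = 100 * np.abs(K_current - K_baseline) / np.abs(K_baseline)
damaged = change > 10.0                 # assumed percentage-change threshold
print(np.round(change, 1))
print("damage indicated at:", np.argwhere(damaged))
```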

Keywords: damage detection, optimization, signals processing, structural health monitoring, two points–condensation

Procedia PDF Downloads 334
24245 The Effect of Size and Tumor Depth on Histological Clearance Margins of Basal Cell Carcinomas

Authors: Martin Van, Mohammed Javed, Sarah Hemington-Gorse

Abstract:

Aim: Our aim was to determine the effect of size and tumor depth of basal cell carcinomas (BCCs) on surgical margin clearance. Methods: A retrospective study was conducted at the Welsh Centre for Burns and Plastic Surgery (WCBPS), Morriston Hospital, between 1 January and 31 July 2016. Only patients with BCC confirmed on histopathological analysis were included. Patient data including anatomical region treated, lesion size, histopathological clearance margins and histological sub-types were recorded. An independent t-test was performed to determine statistical significance. Results: A total of 228 BCCs were excised in 160 patients. Eleven lesions (4.8%) were incompletely excised, with the nose area having the highest rate of incomplete excision. The mean diameter of incompletely excised lesions was 11.4 mm vs. 11.5 mm for completely excised lesions (p=0.959), and the mean histological depth of incompletely excised lesions was 4.1 mm vs. 2.5 mm for completely excised BCCs (p < 0.05). Conclusions: BCC tumor depth > 4.1 mm was associated with a high rate of incomplete margin clearance. Hence, in prospective patients, a BCC tumor depth > 4 mm on tissue biopsy should alert the surgeon to a potentially higher risk of incomplete excision of the lesion.
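
The significance test in the abstract is a standard independent t-test; the sketch below re-creates it on synthetic depth samples drawn around the reported means (2.5 mm vs. 4.1 mm) purely for illustration; these are not the study data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
depth_complete = rng.normal(2.5, 1.2, 217)     # completely excised lesions
depth_incomplete = rng.normal(4.1, 1.5, 11)    # incompletely excised lesions

t, p = stats.ttest_ind(depth_complete, depth_incomplete, equal_var=False)
print(f"t = {t:.2f}, p = {p:.4f}")             # Welch's t-test
```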

Keywords: basal cell carcinoma, excision margins, plastic surgery, treatment

Procedia PDF Downloads 209
24244 A Machine Learning Model for Dynamic Prediction of Chronic Kidney Disease Risk Using Laboratory Data, Non-Laboratory Data, and Metabolic Indices

Authors: Amadou Wurry Jallow, Adama N. S. Bah, Karamo Bah, Shih-Ye Wang, Kuo-Chung Chu, Chien-Yeh Hsu

Abstract:

Chronic kidney disease (CKD) is a major public health challenge with high prevalence, rising incidence, and serious adverse consequences. Developing effective risk prediction models is a cost-effective approach to predicting and preventing complications of chronic kidney disease (CKD). This study aimed to develop an accurate machine learning model that can dynamically identify individuals at risk of CKD using various kinds of diagnostic data, with or without laboratory data, at different follow-up points. Creatinine is a key component used to predict CKD. These models will enable affordable and effective screening for CKD even with incomplete patient data, such as the absence of creatinine testing. This retrospective cohort study included data on 19,429 adults provided by a private research institute and screening laboratory in Taiwan, gathered between 2001 and 2015. Univariate Cox proportional hazard regression analyses were performed to determine the variables with high prognostic values for predicting CKD. We then identified interacting variables and grouped them according to diagnostic data categories. Our models used three types of data gathered at three points in time: non-laboratory, laboratory, and metabolic indices data. Next, we used subgroups of variables within each category to train two machine learning models (Random Forest and XGBoost). Our machine learning models can dynamically discriminate individuals at risk for developing CKD. All the models performed well using all three kinds of data, with or without laboratory data. Using only non-laboratory-based data (such as age, sex, body mass index (BMI), and waist circumference), both models predict chronic kidney disease as accurately as models using laboratory and metabolic indices data. Our machine learning models have demonstrated the use of different categories of diagnostic data for CKD prediction, with or without laboratory data. The machine learning models are simple to use and flexible because they work even with incomplete data and can be applied in any clinical setting, including settings where laboratory data is difficult to obtain.
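
A sketch of the non-laboratory model described: predicting CKD risk from age, sex, BMI, and waist circumference alone with a random forest. Synthetic data stand in for the cohort; the feature set follows the abstract, everything else is assumed.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([
    rng.integers(20, 80, n),       # age (years)
    rng.integers(0, 2, n),         # sex
    rng.normal(25, 4, n),          # BMI
    rng.normal(88, 12, n),         # waist circumference (cm)
])
risk = 0.04 * (X[:, 0] - 20) + 0.15 * (X[:, 2] - 25)
y = (risk + rng.normal(0, 1, n) > 2.0).astype(int)   # synthetic CKD label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
```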

Keywords: chronic kidney disease, glomerular filtration rate, creatinine, novel metabolic indices, machine learning, risk prediction

Procedia PDF Downloads 70
24243 Electrochemical Regeneration of GIC Adsorbent in a Continuous Electrochemical Reactor

Authors: S. N. Hussain, H. M. A. Asghar, H. Sattar, E. P. L. Roberts

Abstract:

Arvia™ introduced a novel technology consisting of adsorption followed by electrochemical regeneration with a graphite intercalation compound (GIC) adsorbent, taking place in a single unit. The adsorbed species may lead to the formation of intermediate by-products due to incomplete mineralization during electrochemical regeneration. Therefore, the investigation of breakdown products due to incomplete oxidation is of great concern for the commercial application of this process. In the present paper, the formation of chlorinated breakdown products during a continuous process of adsorption and electrochemical regeneration based on a graphite intercalation compound adsorbent has been investigated.

Keywords: GIC, adsorption, electrochemical regeneration, chlorophenols

Procedia PDF Downloads 272
24242 Data Disorders in Healthcare Organizations: Symptoms, Diagnoses, and Treatments

Authors: Zakieh Piri, Shahla Damanabi, Peyman Rezaii Hachesoo

Abstract:

Introduction: Healthcare organizations, like other organizations, suffer from a number of disorders such as Business Sponsor Disorder, Business Acceptance Disorder, Cultural/Political Disorder, Data Disorder, etc. As quality in healthcare mostly depends on the quality of data, we aimed to identify data disorders and their symptoms in two teaching hospitals. Methods: Using a self-constructed questionnaire, we asked 20 questions related to the quality and usability of patient data stored in patient records. The research population consisted of 150 managers, physicians, nurses, and medical record staff who were working at the time of the study. We also asked their views about the symptoms of and treatments for any data disorders they mentioned in the questionnaire, and analyzed the answers using qualitative methods. Results: After classifying the answers, we found six main data disorders: incomplete data, missed data, late data, blurred data, manipulated data, and illegible data. The majority of participants believed in their own important roles in the treatment of data disorders, while others attributed the disorders to health system problems. Discussion: As clinicians play important roles in producing data, they can easily identify symptoms and disorders of patient data. Health information managers can also play important roles in the early detection of data disorders through proactive monitoring and periodic check-ups of data.

Keywords: data disorders, quality, healthcare, treatment

Procedia PDF Downloads 400
24241 A Systematic Review of the Methodological and Reporting Quality of Case Series in Surgery

Authors: Riaz A. Agha, Alexander J. Fowler, Seon-Young Lee, Buket Gundogan, Katharine Whitehurst, Harkiran K. Sagoo, Kyung Jin Lee Jeong, Douglas G. Altman, Dennis P. Orgill

Abstract:

Introduction: Case series are an important and common study type. Currently, no guideline exists for reporting case series, and there is evidence of key data being missed from such reports. We propose to develop a reporting guideline for case series using a methodologically robust technique. The first step in this process is a systematic review of literature relevant to the reporting deficiencies of case series. Methods: A systematic review of methodological and reporting quality in surgical case series was performed. The electronic search strategy was developed by an information specialist and included MEDLINE, EMBASE, Cochrane Methods Register, Science Citation Index and Conference Proceedings Citation Index, from the start of indexing until 5th November 2014. Independent screening, eligibility assessments and data extraction were performed. Included articles were analyzed for five areas of deficiency: failure to use standardized definitions; missing or selective data; transparency or incomplete reporting; whether alternate study designs were considered; and other issues. Results: The database search identified 2,205 records. Through the process of screening and eligibility assessments, 92 articles met the inclusion criteria. The frequencies of the methodological and reporting issues identified were: failure to use standardized definitions (57%), missing or selective data (66%), transparency or incomplete reporting (70%), whether alternate study designs were considered (11%) and other issues (52%). Conclusion: The methodological and reporting quality of surgical case series needs improvement. Our data show that clear evidence-based guidelines for the conduct and reporting of case series may be useful to those planning or conducting them.

Keywords: case series, reporting quality, surgery, systematic review

Procedia PDF Downloads 334
24240 A Neural Network Based Clustering Approach for Imputing Multivariate Values in Big Data

Authors: S. Nickolas, Shobha K.

Abstract:

The treatment of incomplete data is an important step in data pre-processing. Missing values create a noisy environment in all applications, and they are an unavoidable problem in big data management and analysis. Numerous techniques, like discarding rows with missing values, mean imputation, expectation maximization, neural networks with evolutionary algorithms or optimized techniques, and hot deck imputation, have been introduced by researchers for handling missing data. Among these, imputation techniques play a positive role in filling in missing values when it is necessary to use all records in the data rather than discard records with missing values. In this paper, we propose a novel artificial neural network based clustering algorithm, Adaptive Resonance Theory-2 (ART2), for the imputation of missing values in mixed-attribute data sets. ART2 can recognize learned models quickly and adapt to new objects rapidly. It carries out model-based clustering using competitive learning and a self-stabilizing mechanism in dynamic environments without supervision. The proposed approach not only imputes the missing values but also provides information about handling outliers.
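
ART2 itself involves competitive learning with a vigilance test; as a compact stand-in, the sketch below uses k-means purely to illustrate the cluster-then-impute pipeline (cluster on provisionally filled data, then fill each record's gaps from its cluster's statistics). It does not reproduce ART2's mechanics.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
X[rng.random(X.shape) < 0.1] = np.nan             # 10% of values missing

col_means = np.nanmean(X, axis=0)
X_filled = np.where(np.isnan(X), col_means, X)    # provisional fill
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X_filled)

X_imputed = X.copy()
for k in range(5):
    members = labels == k
    centroid = np.nanmean(X[members], axis=0)     # per-cluster column means
    rows, cols = np.where(np.isnan(X) & members[:, None])
    X_imputed[rows, cols] = centroid[cols]        # impute from own cluster
```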

Keywords: ART2, data imputation, clustering, missing data, neural network, pre-processing

Procedia PDF Downloads 245
24239 Use of In-line Data Analytics and Empirical Model for Early Fault Detection

Authors: Hyun-Woo Cho

Abstract:

Automatic process monitoring schemes are designed to give early warnings of unusual process events or abnormalities as soon as possible. To this end, various techniques have been developed and utilized in industrial processes, including multivariate statistical methods, representations in reduced spaces, and kernel-based nonlinear techniques. This work presents a nonlinear empirical monitoring scheme for batch-type production processes with incomplete process measurement data. While normal operation data are easy to obtain, unusual fault data occur infrequently and are thus difficult to collect. In this work, noise filtering steps are added in order to enhance monitoring performance by eliminating irrelevant information from the data. The performance of the monitoring scheme was demonstrated using batch process data, and the results showed that monitoring performance, in terms of the fault detection success rate, improved significantly.
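
A linear sketch of the monitoring workflow (the paper's scheme is kernel-based and adds noise filtering): build an empirical model on normal operation data and raise an alarm when a new sample's squared prediction error (SPE) exceeds a limit learned from normal data. All numbers are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
normal = rng.normal(size=(200, 10))            # normal operation data
mu, sd = normal.mean(axis=0), normal.std(axis=0)
Z = (normal - mu) / sd
U, S, Vt = np.linalg.svd(Z, full_matrices=False)
P = Vt[:3].T                                   # retain 3 principal components

def spe(x):
    z = (x - mu) / sd
    resid = z - z @ P @ P.T                    # part not explained by model
    return float(resid @ resid)

limit = np.quantile([spe(x) for x in normal], 0.99)   # empirical 99% limit
fault = rng.normal(size=10) + np.r_[np.zeros(7), 4 * np.ones(3)]
print("alarm:", spe(fault) > limit)            # True -> early fault warning
```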

Keywords: batch process, monitoring, measurement, kernel method

Procedia PDF Downloads 291
24238 Examining the Skills of Establishing Number and Space Relations of Science Students with the 'Integrative Perception Test'

Authors: Ni̇sa Yeni̇kalayci, Türkan Aybi̇ke Akarca

Abstract:

The ability to establish number and space relations, one of the basic scientific process skills, is used when transforming a two-dimensional object into a three-dimensional image or when expressing the symmetry axes of an object. This research aims to determine the ability of science students to establish number and space relations. It was carried out with a total of 90 students studying in the first semester of the Science Education program of a state university in Turkey's Black Sea Region in the fall semester of the 2017-2018 academic year. An 'Integrative Perception Test (IPT)' was designed by the researchers to collect the data. For the IPT, course books and workbooks specific to the field of science were scanned, and visual items with a symmetrical structure from the 'Physics - Chemistry - Biology' sub-fields were selected and listed. During the application, students were expected to imagine and draw the missing half of visual items that were presented incomplete. The data obtained from the test, which contains 30 images or pictures in total (f Physics = 10, f Chemistry = 10, f Biology = 10), were analyzed descriptively based on the drawings created by the students as 'complete (2 points), incomplete/wrong (1 point), empty (0 points)'. For teaching new concepts to younger age groups, images or pictures showing symmetrical structures and similar applications can also be used.

Keywords: integrative perception, number and space relations, science education, scientific process skills

Procedia PDF Downloads 126
24237 Supervised Learning for Cyber Threat Intelligence

Authors: Jihen Bennaceur, Wissem Zouaghi, Ali Mabrouk

Abstract:

The major aim of cyber threat intelligence (CTI) is to provide sophisticated knowledge about cybersecurity threats to ensure internal and external safeguards against modern cyberattacks. Inaccurate, incomplete, outdated, and low-value threat intelligence is the main problem, so data analysis based on AI algorithms is one of the emerging solutions to overcome threat information-sharing issues. In this paper, we propose a supervised machine learning based algorithm to improve threat information sharing by providing a sophisticated classification of cyber threats and data. Extensive simulations evaluate the accuracy, precision, recall, f1-score, and support to validate the designed algorithm and to compare it with several supervised machine learning algorithms.
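
A sketch of the evaluation the abstract describes: train a supervised classifier on (here synthetic) threat indicators and report precision, recall, f1-score, and support per class. Class names and features are invented for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# synthetic stand-in for labeled CTI feature vectors
X, y = make_classification(n_samples=1000, n_features=20, n_classes=3,
                           n_informative=6, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te),
                            target_names=["benign", "suspicious", "malicious"]))
```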

Keywords: threat information sharing, supervised learning, data classification, performance evaluation

Procedia PDF Downloads 109
24236 Predicting Seoul Bus Ridership Using Artificial Neural Network Algorithm with Smartcard Data

Authors: Hosuk Shin, Young-Hyun Seo, Eunhak Lee, Seung-Young Kho

Abstract:

Currently, with the installation of the Bus Information System (BIS), passengers in Seoul can avoid riding crowded buses. The BIS provides three levels of on-board ridership information (spacious, normal, and crowded). However, the system has a flaw: because it reports only real-time conditions, it can give the user incomplete information. For example, a bus may arrive at a station displayed as crowded on the BIS, yet many passengers may alight at the stop where the user is waiting, in which case the information for that station should instead show normal or spacious. To fix this problem, this study predicts the bus ridership level using smart card data to provide more accurate information about the passenger ridership level on the bus. An Artificial Neural Network (ANN) is an interconnected group of nodes modeled on the human brain, and forecasting has been one of the major applications of ANNs owing to the data-driven, self-adaptive nature of the algorithm. According to the results, the ANN algorithm was stable and robust with a fairly small error ratio, so the results were rational and reasonable.
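
A sketch of the ridership-level classifier: an ANN mapping smart card derived features (tap-ins, tap-outs, hour of day) to the three BIS levels. The features, thresholds, and synthetic data are assumptions for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n = 5000
boardings = rng.poisson(8, n)                  # smart card tap-ins at a stop
alightings = rng.poisson(6, n)                 # smart card tap-outs
hour = rng.integers(5, 24, n)
onboard = np.clip(rng.poisson(20, n) + boardings - alightings, 0, None)
y = np.digitize(onboard, [15, 30])             # 0 spacious, 1 normal, 2 crowded

X = np.column_stack([boardings, alightings, hour])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
ann = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500,
                    random_state=0).fit(X_tr, y_tr)
print("accuracy:", ann.score(X_te, y_te))
```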

Keywords: smartcard data, ANN, bus, ridership

Procedia PDF Downloads 137
24235 Modelling the Indonesian Government Securities Yield Curve Using Nelson-Siegel-Svensson and Support Vector Regression

Authors: Jamilatuzzahro, Rezzy Eko Caraka

Abstract:

The yield curve is the plot of the yield to maturity of zero-coupon bonds against maturity. In practice, the yield curve is not observed but must be extracted from observed bond prices for a set of (usually) incomplete maturities. Many methodologies and theories exist for analyzing the yield curve. We use two methods, the Nelson-Siegel-Svensson (NSS) method and the support vector regression (SVR) method, to construct and compare our zero-coupon yield curves. The objectives of this research were: (i) to study the adequacy of the NSS model and SVR for Indonesian government bond data, and (ii) to choose the best optimization or estimation method for the NSS model and SVR. To meet these objectives, the research proceeded by the following steps: data preparation, data cleaning or filtering, modeling, and model evaluation.
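
The NSS yield function has a closed form, so the construction step can be sketched directly: evaluate the curve and fit its six parameters to observed yields by least squares. The maturity-yield pairs below are made up for illustration, not Indonesian bond data.

```python
import numpy as np
from scipy.optimize import curve_fit

def nss_yield(tau, b0, b1, b2, b3, l1, l2):
    # Nelson-Siegel-Svensson zero-coupon yield at maturity tau (years)
    t1, t2 = tau / l1, tau / l2
    f1 = (1 - np.exp(-t1)) / t1
    f2 = (1 - np.exp(-t2)) / t2
    return b0 + b1 * f1 + b2 * (f1 - np.exp(-t1)) + b3 * (f2 - np.exp(-t2))

maturities = np.array([0.5, 1, 2, 3, 5, 7, 10, 15, 20, 30])
yields = np.array([5.8, 6.0, 6.3, 6.5, 6.8, 7.0, 7.2, 7.4, 7.5, 7.6])  # %

params, _ = curve_fit(nss_yield, maturities, yields,
                      p0=[7.5, -1.5, -1.0, 1.0, 2.0, 8.0], maxfev=10000)
print(dict(zip(["b0", "b1", "b2", "b3", "l1", "l2"], np.round(params, 3))))
```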

Keywords: support vector regression, Nelson-Siegel-Svensson, yield curve, Indonesian government

Procedia PDF Downloads 212
24234 An Audit of Local Guidance Compliance for Stereotactic Core Biopsy for DCIS in the Breast Screening Programme

Authors: Aisling Eves, Andrew Pieri, Ross McLean, Nerys Forester

Abstract:

Background: The breast unit local guideline recommends that 12 cores should be used in a stereotactic-guided biopsy to diagnose DCIS. Twelve cores are regarded as providing good diagnostic value without removing more breast tissue than necessary. This study aimed to determine compliance with the guideline and investigated how the number of cores affected the re-excision rate and size discrepancies. Methods: This single-centre retrospective cohort study included 72 consecutive breast-screening patients with <15 mm DCIS on the radiological report who underwent stereotactic-guided core biopsy and subsequent surgical excision. Clinical, radiological, and histological data were collected over 5 years, and the ASCO guideline for margin involvement of <2 mm was used to guide the need for re-excision. Results: Forty-six (63.9%) patients had <12 cores taken, and 26 (36.1%) patients had ≥12 cores taken. Only six (8.3%) patients had exactly 12 cores taken in their stereotactic biopsy. Incomplete surgical excision was seen in 17 patients overall (23.6%), and of these patients, twelve (70.6%) had fewer than 12 cores taken (p=0.55 for the difference between groups). Mammogram and biopsy underestimated the size of the DCIS in this subgroup by a median of 15 mm (range: 6-135 mm). Re-excision was required in 9 patients (12.5%), and five patients (6.9%) were found to have invasive ductal carcinoma on excision (80% had <12 cores, p=0.43). Discussion: There is poor compliance with the breast unit local guideline, and higher rates of re-excision were seen in patients who did not have ≥12 cores taken. Taking ≥12 cores resulted in fewer missed invasive cancers and lower incomplete excision and re-excision rates.

Keywords: stereotactic core biopsy, DCIS, breast screening, re-excision rates, core biopsy

Procedia PDF Downloads 93
24233 Incomplete Existing Algebra to Support Mathematical Computations

Authors: Ranjit Biswas

Abstract:

The existing subject of Algebra is incomplete for supporting the mathematical computations being done by scientists of all areas: mathematics, physics, statistics, chemistry, space science, cosmology, etc., even starting from the era of the great Einstein. A huge hidden gap in the subject 'Algebra' is unearthed. All scientists today, including mathematicians, physicists, chemists, statisticians, cosmologists, space scientists, and economists, even starting from the great Einstein, are lucky that they got results without facing any contradictions or computational errors. Most surprising is that the results of all scientists, including Nobel Prize winners, were also verified by experiments. But in this paper, it is rigorously argued that they have all simply been lucky. An algebraist can define an infinite number of new algebraic structures. The objective of the work in this paper is not just to define a distinct algebraic structure for its own sake, but to recognize and identify a major gap in the subject 'Algebra' that has so far lain hidden in its vast existing literature, and to fix this gap. Consequently, a different algebraic structure called 'Region' is introduced, and its properties are studied.

Keywords: region, ROR, RORR, region algebra

Procedia PDF Downloads 14
24232 Rehabilitation Robot in Primary Walking Pattern Training for SCI Patient at Home

Authors: Taisuke Sakaki, Toshihiko Shimokawa, Nobuhiro Ushimi, Koji Murakami, Yong-Kwun Lee, Kazuhiro Tsuruta, Kanta Aoki, Kaoru Fujiie, Ryuji Katamoto, Atsushi Sugyo

Abstract:

Recently, attention has been focused on incomplete spinal cord injuries (SCI) to the central spine caused by pressure on parts of the white matter conduction pathway, such as the pyramidal tract. In this paper, we focus on a training robot designed to assist with primary walking-pattern training. The target patient for this training robot is relearning the basic functions of the usual walking pattern; it is meant especially for those with incomplete-type SCI to the central spine, who are capable of standing by themselves but not of performing walking motions. From the perspective of human engineering, we monitored the operator's actions on the robot and investigated the movement of the joints of the lower extremities, the circumference of the lower extremities, and exercise intensity with the machine. The concept of the device was to provide mild training without any sudden changes in heart rate or blood pressure, which will be particularly useful for the elderly and disabled. The mechanism of the robot is kept simple and lightweight with the expectation that it will be used at home.

Keywords: training, rehabilitation, SCI patient, welfare, robot

Procedia PDF Downloads 397
24231 Imputing Missing Data in Electronic Health Records: A Comparison of Linear and Non-Linear Imputation Models

Authors: Alireza Vafaei Sadr, Vida Abedi, Jiang Li, Ramin Zand

Abstract:

Missing data is a common challenge in medical research and can lead to biased or incomplete results. When data bias leaks into models, it further exacerbates health disparities: biased algorithms can lead to misclassification and reduced resource allocation and monitoring as part of prevention strategies for certain minorities and vulnerable segments of patient populations, which in turn further reduces the data footprint from the same populations – a vicious cycle. This study compares the performance of six imputation techniques, grouped into linear and non-linear models, on two different real-world electronic health records (EHRs) datasets representing 17,864 patient records. The mean absolute percentage error (MAPE) and root mean squared error (RMSE) are used as performance metrics, and the results show that the linear models outperformed the non-linear models on both metrics. These results suggest that linear models can sometimes be the optimal choice for imputing laboratory variables in terms of imputation efficiency and the uncertainty of predicted values.
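
A sketch of the evaluation loop the abstract implies: hide a fraction of known laboratory values, impute them with one linear and one non-linear model, and score both with MAPE and RMSE. The synthetic data and the choice of a random forest as the non-linear model are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 5))
y = 10.0 + X @ np.array([1.0, -0.5, 0.3, 0.0, 0.2]) + rng.normal(0, 0.5, n)

mask = rng.random(n) < 0.2                     # pretend 20% are missing
linear = LinearRegression().fit(X[~mask], y[~mask])
forest = RandomForestRegressor(random_state=0).fit(X[~mask], y[~mask])

def mape(t, p): return float(np.mean(np.abs((t - p) / t)) * 100)
def rmse(t, p): return float(np.sqrt(np.mean((t - p) ** 2)))

for name, model in [("linear", linear), ("random forest", forest)]:
    pred = model.predict(X[mask])
    print(f"{name}: MAPE {mape(y[mask], pred):.2f}%  "
          f"RMSE {rmse(y[mask], pred):.3f}")
```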

Keywords: EHR, machine learning, imputation, laboratory variables, algorithmic bias

Procedia PDF Downloads 46
24230 Bayesian Analysis of Topp-Leone Generalized Exponential Distribution

Authors: Najrullah Khan, Athar Ali Khan

Abstract:

The Topp-Leone distribution was introduced by Topp and Leone in 1955. In this paper, an attempt has been made to fit the Topp-Leone generalized exponential (TPGE) distribution. A real survival data set is used for illustration. Implementation is done using R and JAGS, and appropriate illustrations are made. R and JAGS codes are provided to implement the censoring mechanism using both optimization and simulation tools. The main aim of this paper is to describe and illustrate the Bayesian modelling approach to the analysis of survival data, with emphasis placed on the modeling of data and the interpretation of the results. Crucial to this is an understanding of the nature of the incomplete, or 'censored', data encountered. Analytic approximation and simulation tools are both covered, but most of the emphasis is on Markov chain Monte Carlo methods, including the independent Metropolis algorithm, which is currently the most popular technique. For analytic approximation, among various optimization algorithms, the trust region method is found to be the best. The TPGE model is used to analyze the lifetime data in the Bayesian paradigm, and results are evaluated on the above-mentioned real survival data set. The analytic approximation and simulation methods are implemented using several software packages. It is clear from our findings that simulation tools provide better results than those obtained by asymptotic approximation.
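
The paper works in R and JAGS; as a language-consistent toy, the sketch below runs a random-walk Metropolis sampler for censored survival data under an exponential likelihood with a gamma prior, only to illustrate the simulation-based side of the workflow. The TPGE model itself is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.array([5.0, 8.0, 12.0, 3.0, 9.0, 15.0, 7.0, 11.0])  # survival times
event = np.array([1, 1, 0, 1, 1, 0, 1, 1])                 # 0 = censored

def log_post(lam):
    if lam <= 0:
        return -np.inf
    loglik = np.sum(event * np.log(lam) - lam * t)   # censored exponential
    return loglik + np.log(lam) - lam                # Gamma(2, 1) prior

chain, lam = [], 0.1
for _ in range(5000):
    prop = lam + rng.normal(0, 0.02)                 # random-walk proposal
    if np.log(rng.random()) < log_post(prop) - log_post(lam):
        lam = prop                                   # accept
    chain.append(lam)
print("posterior mean rate:", np.mean(chain[1000:]))  # discard burn-in
```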

Keywords: Bayesian Inference, JAGS, Laplace Approximation, LaplacesDemon, posterior, R Software, simulation

Procedia PDF Downloads 498
24229 Ponticuli of Atlas Vertebra: A Study in South Coastal Region of Andhra Pradesh

Authors: Hema Lattupalli

Abstract:

Introduction: A bony bridge sometimes extends from the lateral mass of the atlas to the posteromedial margin of the vertebral artery groove; it is termed the posterior bridge of the atlas or posterior ponticulus, and the foramen it forms is called the arcuate foramen or retroarticulare superior. Another bony bridge sometimes extends laterally from the lateral mass to the posterior root of the transverse foramen, forming an additional groove for the vertebral artery above and behind the foramen transversarium; this is called the lateral bridge or ponticulus lateralis. When both posterior and lateral bridges are present together, they are called posterolateral ponticuli. Aim and Objectives: The aim of the present study is to detect the presence of such bridges or ponticuli (lateral, posterior, and posterolateral), as reported by earlier investigators, in atlas vertebrae. Material and Methods: The study was done on 100 atlas vertebrae collected over a period of 2 years from the Department of Anatomy, Narayana Medical College, Nellore, and from SVIMS, Tirupati. The parameters studied included the presence of ponticuli, whether they were complete or incomplete, and whether they occurred on the right or left side. The vertebrae were observed for all these parameters, and the results were documented and photographed. Results: Ponticuli were observed in 25 (25%) of the atlas vertebrae. Posterior ponticuli were found in 16 (16%), lateral in 01 (01%) and posterolateral in 08 (08%) of the atlas vertebrae. Complete ponticuli were present in 09 (09%) and incomplete ponticuli in 16 (16%) of the atlas vertebrae. Bilateral ponticuli were seen in 10 (10%) and unilateral ponticuli in 15 (15%) of the atlas vertebrae. Right-side ponticuli were seen in 04 (04%) and left-side ponticuli in 05 (05%) of the atlas vertebrae. Interpretation and Conclusion: In the present study, complete posterior ponticuli were more frequent than complete lateral ponticuli, and bilateral incomplete posterior ponticuli were also frequent. The present study shows that knowledge of the normal anatomy and variations of the atlas vertebra is essential for neurosurgeons, as utmost care is needed when performing surgeries of the craniovertebral region. This provides additional information for anatomists, neurosurgeons, and radiologists, and adds to the literature.

Keywords: atlas vertebra, ponticuli, posterior arch, arcuate foramen

Procedia PDF Downloads 342
24228 Rapid Assessment of the Ability of Forest Vegetation in Kulonprogo to Store Carbon Using Multispectral Satellite Imagery and a Vegetation Index

Authors: Ima Rahmawati, Nur Hafizul Kalam

Abstract:

The rapid development of the industrial and economic sectors in various countries has raised greenhouse gas (GHG) emissions. Greenhouse gases, dominated by carbon dioxide (CO2) and methane (CH4) in the atmosphere, cause the Earth's surface temperature to keep increasing. The increase is caused by the incomplete combustion of fossil fuels such as petroleum and coal, as well as by a high rate of deforestation. Yogyakarta Special Province, a year-round tourist destination, has great potential for increased greenhouse gas emissions, mainly from incomplete combustion. One effort to reduce the concentration of these gases in the atmosphere is to protect and strengthen the existing forests in the Province of Yogyakarta, especially the forest in Kulonprogo, maintaining its greenness so that it can absorb and store carbon maximally. Remote sensing technology can be used to determine the ability of forests to absorb carbon, which is related to vegetation density. The purpose of this study is to determine the biomass density of forest vegetation and the ability of forests to store carbon through a photo-interpretation and Geographic Information System approach. The remote sensing imagery used in this study is a Landsat 8 OLI scene recorded in 2015. Landsat 8 OLI imagery has a 30-meter spatial resolution for its multispectral bands and can give a general overview of the carbon stored at each vegetation density. The method applies a vegetation index transformation combined with allometric calculations of field data, followed by regression analysis. The results are model maps of the density and carbon-storage capability of forest vegetation in Kulonprogo, Yogyakarta.
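
The vegetation-index step is simple to show. Below is a sketch of NDVI for Landsat 8 OLI (band 5 = near-infrared, band 4 = red) followed by the allometric regression linking the index to field-measured carbon; the plot values are hypothetical.

```python
import numpy as np

def ndvi(nir, red):
    # NDVI = (NIR - Red) / (NIR + Red); values near 1 mean dense vegetation
    nir, red = nir.astype(float), red.astype(float)
    return (nir - red) / (nir + red + 1e-10)   # small term avoids 0/0

# hypothetical field plots: NDVI at plot locations vs. measured carbon (t/ha)
plot_ndvi = np.array([0.35, 0.48, 0.61, 0.72, 0.80])
plot_carbon = np.array([22.0, 41.0, 63.0, 88.0, 104.0])
slope, intercept = np.polyfit(plot_ndvi, plot_carbon, 1)

# apply the fitted relation to every pixel to map the carbon stock
nir_band = np.random.default_rng(0).uniform(0.2, 0.6, (100, 100))
red_band = np.random.default_rng(1).uniform(0.05, 0.3, (100, 100))
carbon_map = slope * ndvi(nir_band, red_band) + intercept
```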

Keywords: remote sensing, carbon, Kulonprogo, forest vegetation, vegetation index

Procedia PDF Downloads 361