Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 24455

Search results for: categorical data

24455 Clustering Categorical Data Using the K-Means Algorithm and the Attribute’s Relative Frequency

Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami

Abstract:

Clustering is a well-known data mining technique used in pattern recognition and information retrieval. The initial dataset to be clustered can contain either categorical or numeric data, and each type of data has its own specific clustering algorithm. In this context, two algorithms have been proposed: k-means for clustering numeric datasets and k-modes for categorical datasets. A frequently encountered problem in data mining applications is clustering categorical datasets, which are common in practice. One way to cluster categorical values is to transform the categorical attributes into numeric measures and apply the k-means algorithm directly instead of k-modes. In this paper, we experiment with an approach based on this idea, transforming the categorical values into numeric ones using the relative frequency of each modality in the attributes. The proposed approach is compared with a previously proposed method based on transforming the categorical datasets into binary values. The scalability and accuracy of the two methods are evaluated experimentally. The obtained results show that the proposed method outperforms the binary method in all cases.
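The encoding step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy data, and the choice of the first k points as initial centroids are all assumptions made for illustration. Each categorical value is replaced by its relative frequency within its column, and a plain Lloyd-style k-means is then run on the resulting numbers.

```python
from collections import Counter


def relative_frequency_encode(rows):
    """Replace each categorical value by its relative frequency in its column."""
    n = len(rows)
    columns = list(zip(*rows))
    freqs = [{v: c / n for v, c in Counter(col).items()} for col in columns]
    return [[freqs[j][v] for j, v in enumerate(row)] for row in rows]


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means. Using the first k points as initial
    centroids is an arbitrary choice made for this sketch."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical toy dataset with two categorical attributes.
rows = [
    ("red", "small"), ("blue", "large"), ("red", "small"),
    ("red", "small"), ("blue", "large"), ("red", "small"),
]
encoded = relative_frequency_encode(rows)
labels, centroids = kmeans(encoded, k=2)
print(labels)
```

One property of this encoding worth keeping in mind is that two different values of the same attribute that happen to occur equally often receive the same number and become indistinguishable to k-means.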

Keywords: clustering, unsupervised learning, pattern recognition, categorical datasets, knowledge discovery, k-means

Procedia PDF Downloads 243

24454 Anomaly Detection Based Fuzzy K-Mode Clustering for Categorical Data

Authors: Murat Yazici

Abstract:

Anomalies are irregularities found in data that do not adhere to a well-defined standard of normal behavior. The identification of outliers or anomalies in data has been a subject of study within the statistics field since the 1800s. Over time, a variety of anomaly detection techniques have been developed in several research communities. Cluster analysis can be used to detect anomalies. It is the process of grouping data into clusters so that the objects within a cluster are as similar as possible, while objects in different clusters are dissimilar. Many traditional clustering algorithms have limitations in dealing with datasets containing categorical properties. To detect anomalies in categorical data, a fuzzy clustering approach can be used to advantage. The fuzzy k-mode (FKM) clustering algorithm, one of the fuzzy clustering approaches, is an extension of the k-means algorithm and has been reported for clustering datasets with categorical values. It is a form of clustering in which each point can be associated with more than one cluster. In this paper, anomaly detection is performed on two simulated datasets using the FKM clustering algorithm. The significance of the study is that, in contrast to numerous anomaly detection algorithms, the FKM clustering algorithm allows anomalies to be determined together with their degree of abnormality. According to the results, the FKM clustering algorithm showed good performance in anomaly detection on data containing either one anomaly or more than one anomaly.
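The idea that each point can be associated with more than one cluster can be sketched with the membership formula commonly used in fuzzy k-modes, with simple matching (Hamming) dissimilarity between a point and each cluster mode. The abstract does not say how the abnormality degree is derived from the memberships, so the sketch below stops at the memberships; the function names, the fuzziness exponent `alpha=2.0`, and the example modes are assumptions.

```python
def hamming(a, b):
    """Simple matching dissimilarity: number of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))


def fuzzy_memberships(point, modes, alpha=2.0):
    """Membership of `point` in each cluster, given the cluster modes.

    alpha > 1 is the fuzziness exponent (assumed value for this sketch).
    """
    dists = [hamming(point, m) for m in modes]
    if 0 in dists:
        # The point coincides with a mode: crisp membership in that cluster.
        first = dists.index(0)
        return [1.0 if i == first else 0.0 for i in range(len(modes))]
    exponent = 1.0 / (alpha - 1.0)
    return [
        1.0 / sum((d_l / d_h) ** exponent for d_h in dists)
        for d_l in dists
    ]


# Hypothetical modes of two clusters over three categorical attributes.
modes = [("a", "x", "p"), ("b", "y", "q")]

near_first = fuzzy_memberships(("a", "x", "q"), modes)   # closer to the first mode
on_mode = fuzzy_memberships(("a", "x", "p"), modes)      # identical to the first mode
far_from_all = fuzzy_memberships(("c", "z", "r"), modes)  # equally far from both
print(near_first, on_mode, far_from_all)
```

A point that is equally far from every mode gets flat memberships, which is one signal that it does not clearly belong to any cluster.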

Keywords: fuzzy k-mode clustering, anomaly detection, noise, categorical data

Procedia PDF Downloads 34

24453 Survival Data with Incomplete Missing Categorical Covariates

Authors: Madaki Umar Yusuf, Mohd Rizam B. Abubakar

Abstract:

Censored survival data with incomplete covariate data are a common occurrence in many studies in which the outcome is survival time. When the missing covariates are categorical, a useful technique for obtaining parameter estimates is the EM algorithm by the method of weights. The survival outcome is modeled within the class of generalized linear models, and this method requires the estimation of the parameters of the distribution of the covariates. In this paper, we present some clinical trials with five covariates, four of which have some missing values, which clearly shows that they were fully censored data.

Keywords: EM algorithm, incomplete categorical covariates, ignorable missing data, missing at random (MAR), Weibull Distribution

Procedia PDF Downloads 386

24452 Using Genetic Algorithms and Rough Set Based Fuzzy K-Modes to Improve Centroid Model Clustering Performance on Categorical Data

Authors: Rishabh Srivastav, Divyam Sharma

Abstract:

We propose an algorithm to cluster categorical data, named ‘Genetic algorithm initialized rough set based fuzzy K-Modes for categorical data’. We propose an amalgamation of the simple K-modes algorithm, the rough and fuzzy set based K-modes, and the genetic algorithm to form a new algorithm, which, we hypothesise, will provide better centroid model clustering results than existing standard algorithms. In the proposed algorithm, the initialization and updating of modes is done by the use of genetic algorithms, while the membership values are calculated using rough set and fuzzy logic.

Keywords: categorical data, fuzzy logic, genetic algorithm, K modes clustering, rough sets

Procedia PDF Downloads 229

24451 Determination Power and Sample Size Zero-Inflated Negative Binomial Dependent Death Rate of Age Model (ZINBD): Regression Analysis Mortality Acquired Immune Deficiency Syndrome (AIDS)

Authors: Mohd Asrul Affendi Bin Abdullah

Abstract:

Sample size calculation is especially important for zero-inflated models because a large sample size is required to detect a significant effect with this model. This paper shows how to present a percentage power approximation for categorical covariates, which is then extended to zero-inflated models. The Wald test was chosen to determine the power and sample size for the AIDS death rate because it is frequently used due to its approachability and is the natural basis for several major recent contributions to sample size calculation. Power calculation can be conducted when covariates are used in modeling ‘excess zero’ data and with categorical covariates. Analysis of an AIDS death rate study is used for this paper. The aim of this study is to determine the power for a sample size (N = 945) of categorical death rate data based on the parameter estimates in the simulation of the study.

Keywords: power sample size, Wald test, standardize rate, ZINBDR

Procedia PDF Downloads 423

24450 Qualitative Data Analysis for Health Care Services

Authors: Taner Ersoz, Filiz Ersoz

Abstract:

This study was designed to enable the application of a multivariate technique in the interpretation of categorical data for measuring health care services satisfaction in Turkey. The data were collected from a total of 17726 respondents. The establishment of the sample group and collection of the data were carried out by a joint team from the Ministry of Health and the Turkish Statistical Institute (Turk Stat) of Turkey. Multiple correspondence analysis (MCA) was used on the data of 2882 respondents who answered the questionnaire in full. The multiple correspondence analysis indicated that, in the evaluation of health services, females, public employees, younger and more highly educated individuals were more concerned and more likely to complain than males, private sector employees, older and less educated individuals. Overall, 53% of the respondents were pleased with the improvements in health care services in the past three years. This study demonstrates the public consciousness in health services and health care satisfaction in Turkey. It was found that most of the respondents were pleased with the improvements in health care services over the past three years. Awareness of health service quality increases with education levels. Older individuals and males would appear to have lower expectations of health services.

Keywords: multiple correspondence analysis, multivariate categorical data, health care services, health satisfaction survey

Procedia PDF Downloads 218

24449 Performance of the CMIP5 Models in Simulation of the Present and Future Precipitation over the Lake Victoria Basin

Authors: M. A. Wanzala, L. A. Ogallo, F. J. Opijah, J. N. Mutemi

Abstract:

The usefulness and limitations of climate information are due to uncertainty inherent in the climate system. For any given region to have sustainable development, it is important to integrate climate information into its socio-economic strategic plans. The overall objective of the study was to assess the performance of the Coupled Model Inter-comparison Project (CMIP5) models over the Lake Victoria Basin. The datasets used included observed point station data, gridded rainfall data from the Climate Research Unit (CRU), and hindcast data from eight CMIP5 models. The methodology included trend analysis, spatial analysis, correlation analysis, Principal Component Analysis (PCA) regression analysis, and categorical statistical skill scores. Analysis of the trends in the observed rainfall records indicated an increase in rainfall variability in both space and time for all the seasons. The spatial patterns of the output from the MPI, MIROC, EC-EARTH and CNRM models were closest to the observed rainfall patterns.

Keywords: categorical statistics, coupled model inter-comparison project, principal component analysis, statistical downscaling

Procedia PDF Downloads 352

24448 The Univalence Principle: Equivalent Mathematical Structures Are Indistinguishable

Authors: Michael Shulman, Paige North, Benedikt Ahrens, Dmitris Tsementzis

Abstract:

The Univalence Principle is the statement that equivalent mathematical structures are indistinguishable. We prove a general version of this principle that applies to all set-based, categorical, and higher-categorical structures defined in a non-algebraic and space-based style, as well as models of higher-order theories such as topological spaces. In particular, we formulate a general definition of indiscernibility for objects of any such structure, and a corresponding univalence condition that generalizes Rezk’s completeness condition for Segal spaces and ensures that all equivalences of structures are levelwise equivalences. Our work builds on Makkai’s First-Order Logic with Dependent Sorts, but is expressed in Voevodsky’s Univalent Foundations (UF), extending previous work on the Structure Identity Principle and univalent categories in UF. This enables indistinguishability to be expressed simply as identification, and yields a formal theory that is interpretable in classical homotopy theory, but also in other higher topos models. It follows that Univalent Foundations is a fully equivalence-invariant foundation for higher-categorical mathematics, as intended by Voevodsky.

Keywords: category theory, higher structures, inverse category, univalence

Procedia PDF Downloads 132

24447 Syllogistic Reasoning with 108 Inference Rules While Case Quantities Change

Authors: Mikhail Zarechnev, Bora I. Kumova

Abstract:

A syllogism is a deductive inference scheme used to derive a conclusion from a set of premises. In a categorical syllogism, there are only two premises, and every premise and conclusion is given in the form of a quantified relationship between two objects. The different orderings of objects in the premises give a classification known as figures. We have shown that the ordered combinations of 3 generalized quantifiers with a certain figure provide a total of 108 syllogistic moods, which can be considered as different inference rules. The classical syllogistic system allows human thought to be modeled, and reasoning with syllogistic structures has always attracted the attention of cognitive scientists. Since automated reasoning is considered part of the learning subsystem of AI agents, the syllogistic system can be applied for this approach. Another application of the syllogistic system is related to inference mechanisms in Semantic Web applications. In this paper, we propose a mathematical model and algorithm for syllogistic reasoning. The model of iterative syllogistic reasoning in the case of continuous flows of incoming data, based on case-based reasoning, and possible applications of the proposed system are also discussed.
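One reading of the count of 108 is that each of the two premises and the conclusion takes one of 3 quantifiers, and each such triple is combined with one of the 4 classical figures, giving 3 × 3 × 3 × 4 = 108. The sketch below enumerates the moods under that assumption; the quantifier labels `Q1` to `Q3` are placeholders, since the abstract does not name the generalized quantifiers.

```python
from itertools import product

# Placeholder labels: the abstract does not name the three quantifiers.
QUANTIFIERS = ("Q1", "Q2", "Q3")
# The four classical figures (arrangements of terms in the premises).
FIGURES = (1, 2, 3, 4)

moods = [
    (major, minor, conclusion, figure)
    for major, minor, conclusion in product(QUANTIFIERS, repeat=3)
    for figure in FIGURES
]
print(len(moods))
```

Each tuple stands for one candidate inference rule: a quantifier for the major premise, one for the minor premise, one for the conclusion, and the figure.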

Keywords: categorical syllogism, case-based reasoning, cognitive architecture, inference on the semantic web, syllogistic reasoning

Procedia PDF Downloads 399

24446 A Comparison of Caesarean Section Indications and Characteristics in 2009 and 2020 in a Saudi Tertiary Hospital

Authors: Sarah K. Basudan, Ragad I. Al Jazzar, Zeinah Sulaihim, Hanan M. Al-Kadri

Abstract:

Background: Cesarean section rates have been increasing in recent years, with a wide range of etiologies contributing to this rise. This study aimed to assess the indications, outcomes, and complications in Riyadh, Saudi Arabia. Methods: A retrospective cohort study was conducted at King Abdulaziz Medical City. The study includes two cohorts, G1 (2009) and G2 (2020), comprising women who met the inclusion criteria. The data were transferred to SPSS (Statistical Package for the Social Sciences) version 24 for analysis. Initial descriptive statistics were run for all variables, including numerical and categorical data. The numerical data were reported as median and standard deviation, and categorical data were reported as frequencies and percentages. Results: The data were collected from 399 women who were divided into two groups, G1 (199) and G2 (200). The mean age of all participants was 32 ± 6; G1 and G2 differed significantly in mean age, at 30 ± 6 and 34 ± 5, respectively (p < 0.001), which indicates delayed fertility by four years. Moreover, a breech presentation was less likely to occur in G2 (OR 0.64, CI: 0.21-0.62, p<0.001). Nonetheless, maternal causes such as repeated C-sections and maternal medical conditions were more likely to happen in G2 (OR 1.5, CI: 1.04-2.38, p=0.03) and (OR 5.4, CI: 1.12-23.9, p=0.01), respectively. Furthermore, postpartum hemorrhage showed an increase of 12% in G2 (OR 5.4, CI: 2.2-13.4, p<0.001). G2 was more likely to be admitted to the neonatal intensive care unit (NICU) (OR 16, CI: 7.4-38.7) and to special care baby (SCB) (OR 7.2, CI: 3.9-13.1), both with a p-value<0.001 compared to regular nursery admission. Conclusion: Multiple factors contribute to the increase in the C-section rate in a Saudi tertiary hospital. The factors were suggested to be previous C-sections, abnormal fetal heart rate, malpresentation, and maternal or fetal medical conditions.
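For readers unfamiliar with the "OR, CI" figures reported above, the sketch below shows one standard way an odds ratio and its 95% confidence interval can be obtained from a 2×2 table, using the log (Woolf) method. The abstract does not state how its values were computed in SPSS, and the counts used here are invented for illustration only; they are not taken from the study.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald-type CI on the log scale for a 2x2 table.

    a: outcome present in group 2, b: outcome absent in group 2,
    c: outcome present in group 1, d: outcome absent in group 1.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z * se_log_or)
    high = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, low, high


# Hypothetical counts, NOT from the study.
or_value, ci_low, ci_high = odds_ratio_ci(a=20, b=80, c=10, d=90)
print(round(or_value, 2), round(ci_low, 2), round(ci_high, 2))
```

An interval that excludes 1 corresponds to a statistically significant association at the chosen level, and the point estimate always lies inside its own interval.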

Keywords: cesarean sections, maternal indications, maternal complications, neonatal condition

Procedia PDF Downloads 61

24445 Measurement Errors and Misclassifications in Covariates in Logistic Regression: Bayesian Adjustment of Main and Interaction Effects and the Sample Size Implications

Authors: Shahadut Hossain

Abstract:

Measurement errors in continuous covariates and/or misclassifications in categorical covariates are common in epidemiological studies. Regression analysis ignoring such mismeasurements seriously biases the estimated main and interaction effects of covariates on the outcome of interest. Thus, adjustments for such mismeasurements are necessary. In this research, we propose a Bayesian parametric framework for eliminating deleterious impacts of covariate mismeasurements in logistic regression. The proposed adjustment method is unified and thus can be applied to any generalized linear and non-linear regression models. Furthermore, adjustment for covariate mismeasurements requires validation data usually in the form of either gold standard measurements or replicates of the mismeasured covariates on a subset of the study population. Initial investigation shows that adequacy of such adjustment depends on the sizes of main and validation samples, especially when prevalences of the categorical covariates are low. Thus, we investigate the impact of main and validation sample sizes on the adjusted estimates, and provide a general guideline about these sample sizes based on simulation studies.

Keywords: measurement errors, misclassification, mismeasurement, validation sample, Bayesian adjustment

Procedia PDF Downloads 395

24444 Acute Hepatitis A Outbreak in Men Who Have Sex with Men in a Medical Center in Northern Taiwan

Authors: Yu-Tzu Hsu, Alice Wu, Hsiang-Kuang Tseng

Abstract:

Introduction: Hepatitis A virus causes acute hepatitis and is usually transmitted by the fecal-oral route through food contamination, which is more prevalent in areas with poor hygienic practices. However, we describe a hepatitis A outbreak associated with fecal-oral transmission through sexual behavior in men who have sex with men (MSM) in Northern Taiwan. Methods: We retrospectively collected patients with acute HAV infection in MacKay Memorial Hospital, Taipei, Taiwan between July 2015 and November 2016. Demographic data (age, gender, onset time and infection risk), laboratory data (GOT, GPT, bilirubin, HIV status, HBsAg, HCV antibody and syphilis), clinical symptoms and history of foreign travel were analyzed. We compared variables between the HIV and non-HIV groups. Unless otherwise stated, continuous variables were expressed as mean ± SD, and categorical variables were expressed as number (percentage) for each item. The t test was applied to continuous variables for the comparison between two groups, and the chi-square test was applied to categorical variables for measures of association. Results: We collected 80 cases during the study period. Among them, 54 (67.5%) cases were MSM and 43 (53.8%) cases were HIV positive. The average age was 32.6±7.59 years. The average initial liver function values were 1324 IU/L for AST (GOT), 2100 IU/L for ALT (GPT), and 5.82 mg/dL for bilirubin. We found that seven (8.6%) cases were HBV carriers, five (6.3%) cases were positive for HCV antibody, and 15 (18.6%) cases were co-infected with syphilis. With regard to associated symptoms, 32 (40%) had fever, 46 (57.5%) had nausea, 34 (42.5%) had abdominal discomfort and 46 (57.5%) had general malaise. Compared with non-HIV patients, HIV patients were more likely to be male (p=0.008), to be MSM (p=0.000), to be co-infected with syphilis (p=0.000) and to have slowly improving transaminase liver function (p=0.033, 0.027). Conclusion: The HAV outbreak in Northern Taiwan occurred mainly in the MSM population. Therefore, our cohort data support a policy in Taiwan of providing one free dose of HAV vaccine to this population. Hopefully, the outbreak can be stopped by the free vaccine policy and public education.

Keywords: acute hepatitis A, men who have sex with men, human immunodeficiency virus, vaccine

Procedia PDF Downloads 188

24443 Mixture statistical modeling for predicting mortality human immunodeficiency virus (HIV) and tuberculosis (TB) infection patients

Authors: Mohd Asrul Affendi Bi Abdullah, Nyi Nyi Naing

Abstract:

The purpose of this study was to compare the negative binomial death rate (NBDR) and zero-inflated negative binomial death rate (ZINBDR) models for patients who died with (HIV+TB+) and (HIV+TB−). HIV and TB are a serious worldwide problem in developing countries. Data were analyzed by applying NBDR and ZINBDR to compare which model is better to use. The ZINBDR model is able to account for the disproportionately large number of zeros within the data and is shown to be a consistently better fit than the NBDR model. Hence, the ZINBDR model is a superior fit to the data compared with the NBDR model and provides additional information regarding the death mechanisms of HIV+TB. The ZINBDR model is shown to be a useful tool for the analysis of death rate by age category.
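How a zero-inflated model accounts for extra zeros can be seen from its probability mass function: with probability π a count is a structural zero, and otherwise it follows a negative binomial distribution. The sketch below uses the standard mean/dispersion parameterization and ignores covariates and any rate offset, so it is a simplification rather than the authors' ZINBDR model; the parameter values are arbitrary.

```python
import math


def nb_pmf(y, mu, r):
    """Negative binomial pmf with mean mu and dispersion (size) r."""
    log_p = (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )
    return math.exp(log_p)


def zinb_pmf(y, mu, r, pi):
    """Zero-inflated NB: structural zero with probability pi, else NB."""
    base = (1 - pi) * nb_pmf(y, mu, r)
    return pi + base if y == 0 else base


# Arbitrary illustrative parameters.
mu, r, pi = 2.0, 1.5, 0.3
p_zero_nb = nb_pmf(0, mu, r)
p_zero_zinb = zinb_pmf(0, mu, r, pi)
print(p_zero_nb, p_zero_zinb)
```

With the same mean and dispersion, the zero-inflated version always assigns more probability to zero than the plain negative binomial, which is what lets it fit data with a disproportionately large number of zeros.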

Keywords: zero inflated negative binomial death rate, HIV and TB, AIC and BIC, death rate

Procedia PDF Downloads 409

24442 Detecting Overdispersion for Mortality AIDS in Zero-inflated Negative Binomial Death Rate (ZINBDR) Co-infection Patients in Kelantan

Authors: Mohd Asrul Affedi, Nyi Nyi Naing

Abstract:

Overdispersion is often present in count data, and when it occurs, a negative binomial (NB) model is commonly used to replace the standard Poisson model. For the analysis of count data events such as mortality cases, a Poisson regression model is basically appropriate. However, that model is not appropriate when excess zero values exist; in that case, the zero-inflated negative binomial model is appropriate. In this article, we model mortality cases as the dependent variable by age category. The objective of this study is to determine whether overdispersion exists in the mortality data of AIDS co-infection patients in Kelantan.
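A quick descriptive check for overdispersion is the variance-to-mean ratio of the counts: under a Poisson model the two are equal, so a ratio well above 1 suggests overdispersion. The abstract does not say which diagnostic the authors used, and the counts below are invented for illustration.

```python
from statistics import mean, variance

# Hypothetical mortality counts per age category (not the study data).
counts = [0, 0, 0, 0, 1, 0, 2, 0, 7, 0, 0, 12]

# Ratio near 1 is consistent with Poisson; well above 1 suggests overdispersion.
dispersion_index = variance(counts) / mean(counts)
# Share of zero counts, a rough indicator of possible zero inflation.
zero_share = counts.count(0) / len(counts)
print(round(dispersion_index, 2), round(zero_share, 2))
```

This is only a first look; a formal assessment would compare fitted models rather than raw moments.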

Keywords: negative binomial death rate, overdispersion, zero-inflation negative binomial death rate, AIDS

Procedia PDF Downloads 449

24441 Prevalence of Knee Pain and Risk Factors and Its Impact on Functional Impairment among Saudi Adolescents

Authors: Ali H. Alyami, Hussam Darraj, Faisal Hakami, Mohammed Awaf, Sulaiman Hamdi, Nawaf Bakri, Abdulaziz Saber, Khalid Hakami, Almuhanad Alyami, Mohammed Khashab

Abstract:

Introduction: Adolescents frequently self-report pain, according to epidemiological research. The knee is one of the sites where the pain is most common. Musculoskeletal disorders are one of the main factors contributing to the number of years people spend disabled, and they carry substantial personal, societal, and economic burdens globally. Adolescents may have knee pain due to an abrupt, traumatic injury or an insidious, slowly building onset that neither the adolescent nor the parent is aware of. Objectives: The present study’s authors aimed to estimate the prevalence of knee pain in Saudi adolescents. Methods: This cross-sectional survey, carried out from June to November 2022, included 676 adolescents aged 10 to 18. Data are presented as frequencies and percentages for categorical variables. Analysis of variance (ANOVA) was used to compare means between groups, while the chi-square test was used for the comparison of categorical variables. Statistical significance was set at P < 0.05. Result: Adolescents were invited to take part in the study. Of the 676 participants, 57.5% were girls and 42.5% were males, and 68.8% were aged between 15 and 18. The prevalence of knee pain was considerably high among females (26%), while it was 19.2% among males. Moreover, age was a significant predictor of knee pain, as was BMI. Conclusion: Our study noted a high rate of knee pain among adolescents, so we need to raise awareness about risk factors. Adolescent knee pain can be prevented with conservative methods and some minor lifestyle/activity modifications.

Keywords: knee pain, prevalence of knee pain, exercise training, physical activity

Procedia PDF Downloads 88

24440 Direct Phoenix Identification and Antimicrobial Susceptibility Testing from Positive Blood Culture Broths

Authors: Waad Al Saleemi, Badriya Al Adawi, Zaaima Al Jabri, Sahim Al Ghafri, Jalila Al Hadhramia

Abstract:

Objectives: Using standard lab methods, a positive blood culture requires a minimum of two days (two occasions of overnight incubation) to obtain a final identification (ID) and antimicrobial susceptibility testing (AST) report. In this study, we aimed to evaluate the accuracy and precision of identification and antimicrobial susceptibility testing of an alternative method (direct method) that will reduce the turnaround time by 24 hours. This method involves the direct inoculation of positive blood culture broths into the Phoenix system using serum separation tubes (SST). Method: This prospective study included monomicrobial-positive blood cultures obtained from January 2022 to May 2023 in SQUH. Blood cultures containing a mixture of organisms, fungi, or anaerobic organisms were excluded from this study. The result of the new “direct method” under study was compared with the current “standard method” used in the lab. The accuracy and precision were evaluated for the ID and AST using Clinical and Laboratory Standards Institute (CLSI) recommendations. The categorical agreement, essential agreement, and the rates of very major errors (VME), major errors (ME), and minor errors (MIE) for both gram-negative and gram-positive bacteria were calculated. Passing criteria were set according to CLSI. Result: The results of ID and AST were available for a total of 158 isolates. Of 77 isolates of gram-negative bacteria, 71 (92%) were correctly identified at the species level. Of 70 isolates of gram-positive bacteria, 47 (67%) isolates were correctly identified. For gram-negative bacteria, the essential agreement of the direct method was ≥92% when compared to the standard method, while the categorical agreement was ≥91% for all tested antibiotics. The precision of ID and AST was noted to be 100% for all tested isolates. For gram-positive bacteria, the essential agreement was >93%, while the categorical agreement was >92% for all tested antibiotics except moxifloxacin. Many antibiotics were noted to have an unacceptably high rate of very major errors, including penicillin, cotrimoxazole, clindamycin, ciprofloxacin, and moxifloxacin. However, no error was observed in the results of vancomycin, linezolid, and daptomycin. Conclusion: The direct method of ID and AST for positive blood cultures using SST is reliable for gram-negative bacteria. It will significantly decrease the turnaround time and will facilitate antimicrobial stewardship.
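The categorical agreement and error rates named in this abstract can be computed from paired susceptibility categories (S, I, R) as in the sketch below. It follows the commonly used definitions: a very major error is a resistant isolate reported as susceptible (denominator: resistant isolates by the reference method), a major error is a susceptible isolate reported as resistant (denominator: susceptible isolates by the reference method), and a minor error is a disagreement involving an intermediate result. The function name and the example pairs are hypothetical and are not data from the study.

```python
def ast_agreement(pairs):
    """pairs: (reference_category, direct_category), each 'S', 'I' or 'R'."""
    total = len(pairs)
    n_resistant = sum(ref == "R" for ref, _ in pairs)
    n_susceptible = sum(ref == "S" for ref, _ in pairs)
    agree = sum(ref == new for ref, new in pairs)
    very_major = sum(ref == "R" and new == "S" for ref, new in pairs)
    major = sum(ref == "S" and new == "R" for ref, new in pairs)
    minor = sum(ref != new and "I" in (ref, new) for ref, new in pairs)
    return {
        "categorical_agreement": agree / total,
        # Guard against empty denominators for small test sets.
        "very_major_error_rate": very_major / n_resistant if n_resistant else 0.0,
        "major_error_rate": major / n_susceptible if n_susceptible else 0.0,
        "minor_error_rate": minor / total,
    }


# Hypothetical results for one antibiotic: (standard method, direct method).
pairs = [("S", "S")] * 6 + [("R", "R")] * 2 + [("R", "S"), ("S", "I")]
rates = ast_agreement(pairs)
print(rates)
```

Computing the very major error rate against resistant isolates only is what makes it sensitive even when resistant isolates are a small share of the sample.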

Keywords: bloodstream infection, oman, direct ast, blood culture, rapid identification, antimicrobial susceptibility, phoenix, direct inoculation

Procedia PDF Downloads 44

24439 Emotion-Convolutional Neural Network for Perceiving Stress from Audio Signals: A Brain Chemistry Approach

Authors: Anup Anand Deshmukh, Catherine Soladie, Renaud Seguier

Abstract:

Emotion plays a key role in many applications, such as healthcare, where it is used to gather patients’ emotional behavior. Unlike typical ASR (Automated Speech Recognition) problems, which focus on 'what was said', it is equally important to understand 'how it was said.' Certain emotions are given more importance due to their effectiveness in understanding human feelings. In this paper, we propose an approach that models human stress from audio signals. The research challenge in speech emotion detection is finding the appropriate set of acoustic features corresponding to an emotion. Another difficulty lies in defining the very meaning of emotion and being able to categorize it in a precise manner. Supervised Machine Learning models, including state-of-the-art Deep Learning classification methods, rely on the availability of clean and labelled data. One of the problems in affective computation is the limited amount of annotated data. The existing labelled emotion datasets are highly subject to the perception of the annotator. We address the first issue of feature selection by exploiting the use of traditional MFCC (Mel-Frequency Cepstral Coefficients) features in a Convolutional Neural Network. Our proposed Emo-CNN (Emotion-CNN) architecture treats speech representations in a manner similar to how CNNs treat images in a vision problem. Our experiments show that Emo-CNN consistently and significantly outperforms the popular existing methods over multiple datasets. It achieves 90.2% categorical accuracy on the Emo-DB dataset. We claim that Emo-CNN is robust to speaker variations and environmental distortions. The proposed approach achieves 85.5% speaker-dependent categorical accuracy on the SAVEE (Surrey Audio-Visual Expressed Emotion) dataset, beating the existing CNN-based approach by 10.2%. To tackle the second problem of subjectivity in stress labels, we use Lovheim’s cube, which is a 3-dimensional projection of emotions. Monoamine neurotransmitters are a type of chemical messenger in the brain that transmits signals on perceiving emotions. The cube aims at explaining the relationship between these neurotransmitters and the positions of emotions in 3D space. The learnt emotion representations from the Emo-CNN are mapped to the cube using three-component PCA (Principal Component Analysis), which is then used to model human stress. This proposed approach not only circumvents the need for labelled stress data but also complies with the psychological theory of emotions given by Lovheim’s cube. We believe that this work is the first step towards creating a connection between Artificial Intelligence and the chemistry of human emotions.

Keywords: deep learning, brain chemistry, emotion perception, Lovheim's cube

Procedia PDF Downloads 133

24438 Stress, Anxiety and Its Associated Factors Within the Transgender Population of Delhi: A Cross-Sectional Study

Authors: Annie Singh, Ishaan Singh

Abstract:

Background: Transgender people are people who have a gender identity different from their sex assigned at birth. Their gender behaviour doesn’t match their body anatomy. The community faces discrimination due to their gender identity all across the world. The term transgender is an umbrella term for many people who do not conform to their biological identity; note that the term transgender is different from gender dysphoria, which is a DSM-5 disorder defined as problems faced by an individual due to their non-conforming gender identity. Transgender people have been a part of Indian culture for ages yet have continued to face exclusion and discrimination in society. This has led to the low socio-economic status of the community. Various studies done across the world have established the role of discrimination, harassment and exclusion in the development of psychological disorders. The study aims to assess the frequency of stress and anxiety in the transgender population and understand the various factors affecting the same. Methodology: A cross-sectional survey of self-consenting transgender individuals above the age of 18 residing in Delhi was done to assess their socioeconomic status and experiential ecology. Recruitment of participants was done with the help of NGOs. The survey was constructed using GAD-7 and PSS-10, two well-known scales, to assess stress and anxiety levels. Medians, means and ranges are used for reporting continuous data wherever required, while frequencies and percentages are used for categorical data. For associations and comparison between groups in categorical data, the Chi-square test was used, while the Kruskal-Wallis H test was employed for associations involving multiple ordinal groups. SPSS v28.0 was used to perform the statistical analysis for this study. Results: The survey showed that the frequency of stress and anxiety is high in the transgender population. The demographic survey indicates a low socio-economic background. 44% of participants reported facing discrimination on a daily basis; the frequency of discrimination is higher in transwomen than in transmen. Stress and anxiety levels are similar among both transmen and transwomen. Only 34.5% of participants said they had receptive family or friends. The majority of participants (72.7%) reported a positive or neutral experience with healthcare workers. The prevalence of discrimination is significantly lower in the more highly educated groups. Analysis of the data shows a positive impact of acceptance and reception on mental health, while discrimination is correlated with higher levels of stress and anxiety. Conclusion: The widespread transphobia and discrimination faced by the transgender community have culminated in high levels of stress and anxiety in the transgender population, which vary according to multiple socio-demographic factors. Educating people about the LGBT community and the formation of support groups, policies and laws are required to establish trust and promote integration.

Keywords: transgender, gender, stress, anxiety, mental health, discrimination, exclusion

Procedia PDF Downloads 97

24437 Q-Map: Clinical Concept Mining from Clinical Documents

Authors: Sheikh Shams Azam, Manoj Raju, Venkatesh Pagidimarri, Vamsi Kasivajjala

Abstract:

Over the past decade, there has been a steep rise in data-driven analysis in major areas of medicine, such as clinical decision support systems, survival analysis, patient similarity analysis, image analytics, etc. Most of the data in the field are well-structured and available in numerical or categorical formats which can be used for experiments directly. But on the opposite end of the spectrum, there exists a wide expanse of data that is intractable for direct analysis owing to its unstructured nature. Such data can be found in the form of discharge summaries, clinical notes and procedural notes, which are in human-written narrative format and have neither a relational model nor a standard grammatical structure. An important step in the utilization of these texts for such studies is to transform and process the data to retrieve structured information from the haystack of irrelevant data using information retrieval and data mining techniques. To address this problem, the authors present Q-Map in this paper, a simple yet robust system that can sift through massive datasets with unregulated formats to retrieve structured information aggressively and efficiently. It is backed by an effective mining technique based on a string matching algorithm that is indexed on curated knowledge sources and is both fast and configurable. The authors also briefly examine its comparative performance with MetaMap, one of the most reputed tools for medical concept retrieval, and present the advantages the former displays over the latter.

Keywords: information retrieval, unified medical language system, syntax based analysis, natural language processing, medical informatics

Procedia PDF Downloads 114

24436 Application Difference between Cox and Logistic Regression Models

Authors: Idrissa Kayijuka

Abstract:

Logistic regression and Cox regression (proportional hazards) models are currently employed in the analysis of prospective epidemiologic research looking into risk factors for chronic diseases, and the theoretical relationship between the two models has been studied. By definition, the Cox regression model, also called the Cox proportional hazards model, is a procedure used to model data on the time leading up to an event where censored cases exist, whereas the logistic regression model is mostly applicable in cases where the independent variables consist of numerical as well as nominal values while the resultant variable is binary (dichotomous). The arguments and findings of many researchers have focused on an overview of Cox and logistic regression models and their different applications in different areas. In this work, the analysis is done on secondary data whose source is SPSS exercise data on breast cancer with a sample size of 1121 women, where the main objective is to show the application difference between the Cox regression model and the logistic regression model based on factors that cause women to die due to breast cancer. Some analysis was done manually, i.e. on lymph node status, and SPSS software was used to analyze the mentioned data. This study found that there is an application difference between Cox and logistic regression models: the Cox regression model is used if one wishes to analyze data that also include the follow-up time, whereas the logistic regression model analyzes data without follow-up time. They also have different measures of association: the hazard ratio and the odds ratio for Cox and logistic regression models, respectively. A similarity between the two models is that they are both applicable to predicting the outcome of a categorical variable, i.e. a variable that can take only a restricted number of categories.
In conclusion, the Cox regression model differs from logistic regression by assessing a rate instead of a proportion. The two models can be applied in many other studies since they are suitable methods for analyzing data, but the Cox regression model is the more recommended.
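The rate-versus-proportion distinction drawn in this abstract can be illustrated with a minimal Python sketch. It is not the authors' SPSS analysis: the function names and all counts below are hypothetical, and the crude rate ratio is only a rough stand-in for a hazard ratio (the two coincide only under constant hazards).

```python
def odds_ratio(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Odds ratio from event counts; follow-up time is ignored,
    which mirrors what logistic regression works with."""
    odds_exposed = events_exposed / (n_exposed - events_exposed)
    odds_unexposed = events_unexposed / (n_unexposed - events_unexposed)
    return odds_exposed / odds_unexposed


def rate_ratio(events_exposed, time_exposed, events_unexposed, time_unexposed):
    """Ratio of event rates per unit of follow-up time,
    a crude analogue of the hazard ratio used in Cox regression."""
    rate_exposed = events_exposed / time_exposed
    rate_unexposed = events_unexposed / time_unexposed
    return rate_exposed / rate_unexposed


# Hypothetical counts, not taken from the study.
print(odds_ratio(30, 100, 15, 100))      # compares proportions of deaths
print(rate_ratio(30, 400.0, 15, 450.0))  # compares deaths per person-year
```

With identical event counts, the two measures can differ because the second one also uses how long each group was followed.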

Keywords: logistic regression model, Cox regression model, survival analysis, hazard ratio

Procedia PDF Downloads 435

24435 Time Series Simulation by Conditional Generative Adversarial Net

Authors: Rao Fu, Jie Chen, Shutian Zeng, Yiping Zhuang, Agus Sudjianto

Abstract:

Generative Adversarial Net (GAN) has proved to be a powerful machine learning tool in image data analysis and generation. In this paper, we propose to use Conditional Generative Adversarial Net (CGAN) to learn and simulate time series data. The conditions include both categorical and continuous variables with different auxiliary information. Our simulation studies show that CGAN has the capability to learn different types of normal and heavy-tailed distributions, as well as dependent structures of different time series. It also has the capability to generate conditional predictive distributions consistent with training data distributions. We also provide an in-depth discussion on the rationale behind GAN and the neural networks as hierarchical splines to establish a clear connection with existing statistical methods of distribution generation. In practice, CGAN has a wide range of applications in market risk and counterparty risk analysis: it can be applied to learn historical data and generate scenarios for the calculation of Value-at-Risk (VaR) and Expected Shortfall (ES), and it can also predict the movement of the market risk factors. We present a real data analysis, including backtesting, to demonstrate that CGAN can outperform Historical Simulation (HS), a popular method in market risk analysis to calculate VaR. CGAN can also be applied in economic time series modeling and forecasting. In this regard, we have included an example of hypothetical shock analysis for economic models and the generation of potential CCAR scenarios by CGAN at the end of the paper.
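For readers unfamiliar with the Historical Simulation baseline mentioned above, the following is a minimal Python sketch of how VaR and ES can be read off an empirical profit-and-loss sample. It does not reproduce the paper's CGAN; the function name, the `tail_fraction` parameter, and the sample data are assumptions made for illustration, and real implementations differ in how they interpolate quantiles.

```python
def historical_var_es(pnl, tail_fraction=0.05):
    """Return (VaR, ES) as positive loss figures from a list of P&L values.

    VaR is taken as the smallest loss inside the worst `tail_fraction`
    of observations; ES is the mean loss over that same tail.
    """
    if not pnl:
        raise ValueError("pnl must contain at least one observation")
    losses = sorted((-x for x in pnl), reverse=True)  # largest loss first
    n_tail = max(1, round(len(losses) * tail_fraction))
    tail = losses[:n_tail]
    value_at_risk = tail[-1]
    expected_shortfall = sum(tail) / len(tail)
    return value_at_risk, expected_shortfall


# Hypothetical daily P&L observations (20 days), not data from the paper.
sample_pnl = [-10, -8, -5, -3, -1, 0, 1, 2, 3, 4,
              5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
print(historical_var_es(sample_pnl, tail_fraction=0.1))
```

A scenario generator such as the one the abstract describes would supply simulated P&L values in place of the historical sample; the tail calculation itself stays the same.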

Keywords: conditional generative adversarial net, market and credit risk management, neural network, time series

Procedia PDF Downloads 123

24434 The Impact of Innovation Efficiency on the Production of New Knowledge: A Manufacturing Firm Level Perspective

Authors: Vasilios Kanellopoulos

Abstract:

The present paper examines the effect of innovation efficiency on the production of new knowledge from a firm level perspective. It draws on the Greek version of the Community Innovation Survey (CIS 2012-2014 microdata) and covers 1274 firms in manufacturing, which constitutes the main sector of examination. It assumes a knowledge production function (KPF) and finds that R&D spillovers related to the expenditures on innovation activities, internal R&D, external R&D, skilled labor, and the expenditures in the acquisition of machinery have a positive and significant effect on the production of new knowledge when OLS techniques are applied. However, innovation efficiency, derived from a Banker and Morey (1986) data envelopment analysis (DEA) with categorical variables, has a statistically insignificant impact on the production of new knowledge measured by the firm’s turnover.

Keywords: firms, innovation efficiency, production of new knowledge, R&D spillovers

Procedia PDF Downloads 121

24433 Factors Affecting Cesarean Section among Women in Qatar Using Multiple Indicator Cluster Survey Database

Authors: Sahar Elsaleh, Ghada Farhat, Shaikha Al-Derham, Fasih Alam

Abstract:

Background: Cesarean section (CS) delivery is one of the major concerns in both developing and developed countries. The rate of CS deliveries is on the rise globally, and especially in Qatar. Many socio-economic, demographic, clinical and institutional factors play an important role in cesarean sections. This study aims to investigate factors affecting the prevalence of CS among women in Qatar using UNICEF’s Multiple Indicator Cluster Survey (MICS) 2012 database. Methods: The study focused on the women’s questionnaire of the MICS, which was successfully distributed to 5699 participants. Following the study inclusion and exclusion criteria, a final sample of 761 women aged 19-49 years who had given birth at least once in their lifetime before the survey was included. A number of socio-economic, demographic, clinical and institutional factors, identified through literature review and available in the data, were considered for the analyses. Bivariate and multivariate logistic regression models, along with multi-level modeling to investigate clustering effects, were used to identify the factors that affect CS prevalence in Qatar. Results: The bivariate analyses showed that a number of categorical factors are statistically significantly associated with the dependent variable (CS). In the multivariate logistic regression, only three categorical factors, ‘age of women’, ‘place at delivery’ and ‘baby weight’, appeared to significantly affect CS among women in Qatar. Although the MICS dataset is based on a cluster survey, an exploratory multi-level analysis did not show any clustering effect, i.e. no significant variation in results at the higher level (households), suggesting that all analyses at the lower level (individual respondent) are valid without any significant bias in results.
Conclusion: The study found a statistically significant association between the dependent variable (CS delivery) and age of women, frequency of TV watching, assistance at birth and place of birth. These results need to be interpreted cautiously; however, they can be used as an evidence base for further research on cesarean section delivery in Qatar.

Keywords: cesarean section, factors, multiple indicator cluster survey, MICS database, Qatar

Procedia PDF Downloads 99

24432 Community Perception and Knowledge on Oral Cancer Screening Methods in Kuwait

Authors: Lavanya Dharmendran, Shenuka Singh, Sona Baburathanam

Abstract:

The aim of the study is to understand the level of awareness in a community of a specific region of Kuwait regarding oral cancer and its screening methods so as to enhance the uptake of oral cancer screening methods. This is a cross-sectional study comprising 100 adult participants residing in the governorate of Farwaniya, Kuwait. Participants above 18 years of age, of both genders, will be selected using convenience sampling. Data collection consists of a self-administered questionnaire. The questionnaire comprises three sections, which assess the participants’ knowledge, attitudes and practices regarding oral cancer and screening methods. Data will be analyzed using the Humphris Oral Cancer Knowledge Scale. Inferential statistics will be done using the Chi-Square or Fisher’s exact test for categorical data. A level of p<.05 will be established as being significant. All ethical considerations, such as respect for personal confidentiality and informed consent, will be applied in this study. This study revealed that although respondents were aware of the term oral cancer, more than half of the study participants were unaware of the symptoms associated with this condition. Smoking and alcohol were identified as risk factors for oral cancer, but the majority of participants did not identify the Human Papilloma Virus (HPV) as an added risk factor. This suggests a greater need for dental practitioners to include educational strategies in routine dental visits to ensure greater awareness of oral cancer.
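Fisher’s exact test, named above as the alternative to the Chi-Square test for categorical data, can be sketched in a few lines of Python for a 2x2 table. This is an illustrative implementation under the usual two-sided convention (summing all tables no more probable than the observed one); the function name and the example counts are assumptions, and nothing here is taken from the study’s data.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Add up every table that is no more likely than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical table: rows = aware / unaware, columns = screened / not screened.
print(fisher_exact_two_sided(3, 1, 1, 3))
```

The exact test is usually preferred over the Chi-Square approximation when expected cell counts are small, which is common in a sample of 100 split across several categories.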

Keywords: oral cancer, oral screening, oral public health, oral health

Procedia PDF Downloads 56

24431 Traumatic Brain Injury in Cameroon: A Prospective Observational Study in a Level 1 Trauma Centre

Authors: Franklin Chu Buh, Irene Ule Ngole Sumbele, Andrew I. R. Maas, Mathieu Motah, Jogi V. Pattisapu, Eric Youm, Basil Kum Meh, Firas H. Kobeissy, Kevin W. Wang, Peter J. A. Hutchinson, Germain Sotoing Taiwe

Abstract:

Introduction: Studying TBI characteristics and their relation to outcomes can identify initiatives to improve TBI prevention and care. The objective of this study was to define the features and outcomes of TBI patients seen over a 1-year period in a level-I trauma center in Cameroon. Methods: Data on demographics, causes, injury mechanisms, clinical aspects, and discharge status were prospectively collected over a period of 12 months. The Glasgow Outcome Scale-Extended (GOSE) and the Quality of Life Questionnaire after Brain Injury (QoLIBRI) were used to evaluate outcomes 6 months after TBI. Categorical variables were described as frequencies and percentages. Comparisons between 2 categorical variables were done using Pearson's Chi-square test or Fisher's exact test. Results: A total of 160 TBI patients participated in the study. The age group 15-45 years (78%; 125) was most represented. Males were more affected (90%; 144). Low educational level was recorded in 122 (76%) cases. Road traffic incidents (RTI) were the main cause of TBI (85%), with professional bike riders being frequently involved (27%, 43/160). Assaults (7.5%) and falls (2.5%) represent the second and third most common causes of TBI in Cameroon, respectively. Only 15 patients were transported to the hospital by ambulance, and 14 of these were from a referring hospital. CT imaging was performed in 78% (125/160) of cases; an intracranial traumatic abnormality was identified in 77/125 (64%) cases. Financial constraints were the main reason for not performing a CT scan on 35 patients. A total of 46 (33%) patients were discharged against medical advice (DAMA) due to financial constraints. Mortality was 14% (22/160) but disproportionately high in patients with severe TBI (46%). DAMA patients had poor outcomes on the QoLIBRI. Only 4 patients received post-injury physiotherapy services.
Conclusion: TBI in Cameroon mainly results from RTIs and commonly affects young adult males, and low educational or socioeconomic status and commercial bike riding appear to be predisposing factors. Lack of pre-hospital care, financial constraints limiting both CT-scanning and medical care, and lack of acute physiotherapy services likely influenced care and outcomes adversely.

Keywords: characteristics, traumatic brain injury, outcome, disparities in care, prospective study

Procedia PDF Downloads 104

24430 Estimation of Transition and Emission Probabilities

Authors: Aakansha Gupta, Neha Vadnere, Tapasvi Soni, M. Anbarsi

Abstract:

Protein secondary structure prediction is one of the most important goals pursued by bioinformatics and theoretical chemistry; it is highly important in medicine and biotechnology. Some aspects of protein functions and genome analysis can be predicted by secondary structure prediction. This is used to help annotate sequences, classify proteins, identify domains, and recognize functional motifs. In this paper, we represent protein secondary structure as a mathematical model. To extract and predict the protein secondary structure from the primary structure, we require a set of parameters. Any constants appearing in the model are specified by these parameters, which also provide a mechanism for efficient and accurate use of data. There are many algorithms for estimating these model parameters, of which the most popular is the Expectation Maximization (EM) algorithm. These model parameters are estimated from protein datasets such as RS126 using the Bayesian probabilistic method (the data set being categorical). This paper can then be extended to compare the efficiency of the EM algorithm with that of other algorithms for estimating the model parameters, which will in turn lead to an efficient component for protein secondary structure prediction. Further, this paper provides scope to use these parameters for predicting the secondary structure of proteins using machine learning techniques like neural networks and fuzzy logic. The ultimate objective will be to obtain greater accuracy than previously achieved.
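To make the notions of transition and emission probabilities concrete, here is a minimal Python sketch for the simplest case, where each residue already carries a secondary-structure label. It uses plain maximum-likelihood counting and is therefore not the EM algorithm the abstract refers to (EM is needed when the state labels are hidden). The function name, the state letters H/E/C, and the toy training pair are assumptions for illustration only.

```python
from collections import Counter, defaultdict


def estimate_hmm_parameters(pairs):
    """Estimate transition and emission probabilities by counting.

    pairs: list of (residues, states) strings of equal length, e.g.
    an amino-acid sequence and its per-residue structure labels.
    """
    transition_counts = defaultdict(Counter)
    emission_counts = defaultdict(Counter)
    for residues, states in pairs:
        for residue, state in zip(residues, states):
            emission_counts[state][residue] += 1
        for previous, following in zip(states, states[1:]):
            transition_counts[previous][following] += 1

    def normalise(table):
        # Turn each row of counts into a probability distribution.
        return {
            key: {symbol: n / sum(counter.values())
                  for symbol, n in counter.items()}
            for key, counter in table.items()
        }

    return normalise(transition_counts), normalise(emission_counts)


# Toy, made-up example: four residues labelled helix, helix, coil, strand.
training = [("AAGV", "HHCE")]
transitions, emissions = estimate_hmm_parameters(training)
print(transitions)
print(emissions)
```

In an EM setting, the hard counts above would be replaced by expected counts computed from the current parameter estimates, and the two steps would be repeated until convergence.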

Keywords: model parameters, expectation maximization algorithm, protein secondary structure prediction, bioinformatics

Procedia PDF Downloads 457

24429 The Impact of Civilian Syrian War on Human Wellbeing as Inflected by Depression General Status Among Patients Treated in Royal Medical Services, Jordan

Authors: Zeyad Suleiman Bataineh

Abstract:

Introduction: civilian wars are associated with severe humanitarian effects that include loss of individuals and properties. Psychological dimensions, including depression, are also affected. Objectives: the main objectives of the present study were to investigate the depression level among Syrian patients who visited internal medicine clinics and other related variables. Methods and subjects: this study was conducted based on a cross-sectional study design. A total of 175 patients were involved. Patients were asked to fill in a questionnaire assessing the level of depression, which included demographic variables such as gender, age, educational level, and social status. The Beck Aaron scale for depression was used. Participation in this study was voluntary, and all patients were informed about their rights to withdraw from the study without being negatively affected. Data for all participants were entered into an Excel spreadsheet. SPSS version 21 was used to analyze data. Data were described as means and standard deviations for continuous variables, and as frequencies and percentages for categorical variables. The relationships between variables were evaluated using the independent t test and the One Way ANOVA test. Significance was considered at α≤0.05. Results: Depression was found in 152 (87%) of participants. The majority of participants with depression had moderate to severe depression. Depression was significantly associated with gender, age, educational level, and social status (p<0.05). Conclusion: psychological rehabilitation is required for patients who experienced civilian wars.

Keywords: mental health, depression, health system, psychological dimension

Procedia PDF Downloads 107

24428 Prevalence and Characteristics of Torus Palatinus among Western Indonesian Population

Authors: Raka Aldy Nugraha, Kiwah Andanni, Aditya Indra Pratama, Aswin Guntara

Abstract:

Background: Torus palatinus is a bony protuberance in the hard palate. Sex and race are considered influencing factors for the development of torus palatinus. Hence, the objective of this study was to determine the prevalence and characteristics of torus palatinus and its correlation with sex and ethnicity among the Western Indonesian population. Methods: We conducted a descriptive and analytical study employing a cross-sectional design in 274 new students of Universitas Indonesia. Data were collected using a consecutive sampling method through questionnaire-filling and direct oral examination. Subjects with a racial background other than indigenous Indonesian Mongol were excluded from this study. Data were statistically analyzed using the chi-square test for categorical variables, whereas a logistic regression model was employed to assess the correlation between variables of interest and the prevalence of torus palatinus. Results: Torus palatinus was found in 212 subjects (77.4%), mostly small in size (< 3 mm) and single in number, with percentages of 50.5% and 90.6%, respectively. The prevalence of torus palatinus was significantly higher in women (OR 2.88; 95% CI: 1.53-5.39; p = 0.001), dominated by medium-sized and single tori. There was no significant correlation between ethnicity and the occurrence of torus palatinus among the Western Indonesian population. Conclusion: Torus palatinus was prevalent among the Western Indonesian population. It showed a significant positive correlation with sex, but not with ethnicity.
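An odds ratio with a 95% confidence interval, as reported in the results above, can be approximated from a 2x2 table with the Wald method on the log scale. The Python sketch below shows that calculation; it is an unadjusted estimate rather than the authors' logistic regression output, and the function name and the counts are hypothetical, since the abstract does not give the underlying cell counts.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a, b: group 1 with / without the finding
    c, d: group 2 with / without the finding
    z:    normal quantile (1.96 gives an approximate 95% interval)
    """
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * standard_error)
    upper = math.exp(math.log(estimate) + z * standard_error)
    return estimate, lower, upper


# Hypothetical counts, not taken from the study.
print(odds_ratio_wald_ci(20, 10, 10, 20))
```

An interval that excludes 1, like the one reported for sex in this abstract, corresponds to a statistically significant association at the matching level.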

Keywords: characteristic, ethnicity, Indonesia, mongoloid, prevalence, sex, Torus palatinus

Procedia PDF Downloads 247

24427 Interface between Personal Values and Social Entrepreneurship in Social Projects That Develop Sports Practice

Authors: Leticia Lengler, Jefferson Oliveira, Vania Estivalete, Jordana Marques Kneipp

Abstract:

The context of social, economic and environmental transformations has driven innumerable changes in the organizational environment, influencing the social interactions that occur in this scenario. In this sense, social entrepreneurship emerges as a unique opportunity to challenge, question, and rethink certain concepts and traditional theories widely discussed in relation to entrepreneurship. The interest in studying personal values has, in turn, been based on the idea that they might be predictors of the behavior of individuals. As an attempt to relate personal values to the characteristics of social entrepreneurs, this study aims to investigate the salient values and the social entrepreneurship perceptions that occur in two social projects responsible for developing sports skills among students. For purposes of analysis, it is intended to consider: (i) a description of both social projects and their respective institutions, considering their history and relevance in the context; (ii) analysis of the personal values of the idealizers and teachers responsible for the projects; (iii) identification of the characteristics of social entrepreneurship manifested in the two projects; and (iv) discussion of similarities and disparities of the categories identified among the participants of the projects. Therefore, this study will carry out a qualitative analysis of interviews with 10 participants from each social project (named Projeto Remar/ASENA and Projeto Mãos Dadas/JUDÔ SANTA MARIA): 2 project coordinators, 2 students, 2 parents of students, 2 physical education interns and 2 businessmen who established a partnership with each project. The data will be collected through semi-structured interviews lasting around 30 minutes each, which will be recorded, transcribed and later analyzed through categorical analysis.
Categorical analysis was chosen because it is the best alternative when one wants to study values, opinions, attitudes and beliefs through qualitative data. In the present research, the pre-analysis phase consisted of organizing the material collected during the research with the Remar and Mãos Dadas projects, and a dynamic reading of this material, seeking to identify the characteristics of social entrepreneurship and the values addressed in the study. In the analytical description phase, a more in-depth analysis of the material collected in the research will be carried out. The third phase, referred to as referential interpretation or treatment of the results obtained, will make it possible to verify the homogeneity and the heterogeneity among the participants' perceptions of the projects. Some preliminary results from the first interviews revealed that the projects are guided by values such as cooperation, respect, well-being and nature preservation. These values are linked to the social entrepreneurship perception of the project managers, who established their activities on behalf of the local community.

Keywords: personal values, social entrepreneurship, social projects, sports participants

Procedia PDF Downloads 346

24426 Predictive Analytics of Bike Sharing Rider Parameters

Authors: Bongs Lainjo

Abstract:

The evolution and escalation of bike-sharing programs (BSP) continue unabated. Since the sixties, many countries have introduced different models and strategies of BSP. These include variations ranging from dockless models to electronic real-time monitoring systems. Reasons for using BSP include recreation, errands, work, etc. There is every indication that more complex and innovative rider-friendly systems are yet to be introduced. The objective of this paper is to analyze current variables established by different operators and streamline them, identifying the most compelling ones using analytics. Given the contents of available databases, there is a lack of uniformity and of a common standard on what is required and what is not. Two factors appear to be common: user type (registered and unregistered) and duration of each trip. This article uses historical data provided by one operator based in the greater Washington, District of Columbia, USA area. Several variables, including categorical and continuous data types, were screened. Eight out of 18 were considered acceptable and contribute significantly to determining a useful and reliable predictive model. Bike-sharing systems have become popular in recent years all around the world. Although this trend has resulted in many studies on public cycling systems, there have been few previous studies on the factors influencing public bicycle travel behavior. A bike-sharing system is a computer-controlled system in which individuals can borrow bikes for a fee or free for a limited period. This study has identified unprecedented, useful, and pragmatic parameters required for improving BSP ridership dynamics.

Keywords: sharing program, historical data, parameters, ridership dynamics, trip duration

Procedia PDF Downloads 120
