Search results for: multivariate failure-time data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24830

24170 Knowledge Discovery and Data Mining Techniques in Textile Industry

Authors: Filiz Ersoz, Taner Ersoz, Erkin Guler

Abstract:

This paper addresses issues in the textile industry using data mining techniques. Data mining was applied to garment stitching data obtained from a textile company. The techniques applied to the data were the CHAID algorithm, the CART algorithm, regression analysis, and artificial neural networks. Classification-based analyses were used to build a decision model of production per person and to identify the variables affecting production. The results show that as daily working time increases, production per person decreases. In addition, of the relationships examined, total daily working time shows the strongest, negative relationship with production per person.

Keywords: data mining, textile production, decision trees, classification

Procedia PDF Downloads 338
24169 Dietary Vitamin D Intake and the Bladder Cancer Risk: A Pooled Analysis of Prospective Cohort Studies

Authors: Iris W. A. Boot, Anke Wesselius, Maurice P. Zeegers

Abstract:

Diet may play an essential role in the aetiology of bladder cancer (BC). Vitamin D is involved in various biological functions which have the potential to prevent BC development. Besides, vitamin D also influences the uptake of calcium and phosphorus, thereby possibly indirectly influencing the risk of BC. The aim of the present study was to investigate the relation between vitamin D intake and BC risk. Individual dietary data were pooled from three cohort studies. Food item intake was converted to daily intakes of vitamin D, calcium and phosphorus. Pooled multivariate hazard ratios (HRs), with corresponding 95% confidence intervals (CIs), were obtained using Cox regression models. Analyses were adjusted for gender, age and smoking status (Model 1), and additionally for the food groups fruit, vegetables and meat (Model 2). Dose–response relationships (Model 1) were examined using a nonparametric test for trend. In total, 2,871 cases and 522,364 non-cases were included in the analyses. The present study showed an overall increased BC risk for high dietary vitamin D intake (HR: 1.14, 95% CI: 1.03-1.26). A similarly increased BC risk with high vitamin D intake was observed among women and for the non-muscle invasive BC subtype (HR: 1.41, 95% CI: 1.15-1.72, and HR: 1.13, 95% CI: 1.01-1.27, respectively). High calcium intake decreased the BC risk among women (HR: 0.81, 95% CI: 0.67-0.97). A combined inverse effect on BC risk was observed for low vitamin D intake and high calcium intake (HR: 0.67, 95% CI: 0.48-0.93), while a positive effect was observed for high vitamin D intake in combination with low, moderate and high phosphorus (HR: 1.31, 95% CI: 1.09-1.59, HR: 1.17, 95% CI: 1.01-1.36, HR: 1.16, 95% CI: 1.03-1.31, respectively).
Combining all nutrients showed a decreased BC risk for low vitamin D intake, high calcium and moderate phosphorus intake (HR: 0.37, 95% CI: 0.18-0.75), and an increased BC risk for moderate intake of all the nutrients (HR: 1.18, 95% CI: 1.02-1.38), for high vitamin D and low calcium and phosphorus intake (HR: 1.28, 95% CI: 1.01-1.62), and for moderate vitamin D and calcium and high phosphorus intake (HR: 1.27, 95% CI: 1.01-1.59). No significant dose-response relationships were observed. The findings of this study show an increased BC risk for high dietary vitamin D intake and a decreased risk for high calcium intake. Besides, the study highlights the importance of examining the effect of a nutrient in combination with complementary nutrients for risk assessment. Future research should focus on nutrients in a wider context and in nutritional patterns.
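
As a rough illustration of how a hazard ratio and its 95% confidence interval relate to a Cox regression coefficient, the sketch below converts a log-hazard coefficient and its standard error into an HR with a Wald-type interval. The abstract does not state how its intervals were computed, so the Wald form is an assumption, and the function names are hypothetical. The standard error is back-calculated from the reported overall estimate (HR 1.14, 95% CI 1.03-1.26) purely for demonstration.

```python
import math

def hazard_ratio_ci(beta, se, z=1.96):
    """Return (HR, lower, upper) for a Cox coefficient beta with standard error se."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def se_from_ci(lower, upper, z=1.96):
    """Recover the standard error of log(HR) from a reported Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

# Reported overall estimate for high vitamin D intake: HR 1.14 (95% CI 1.03-1.26).
# Assumption: the interval is a symmetric Wald interval on the log scale.
se = se_from_ci(1.03, 1.26)
hr, lo, hi = hazard_ratio_ci(math.log(1.14), se)
print(f"HR {hr:.2f} (95% CI {lo:.2f}-{hi:.2f}), SE(log HR) = {se:.3f}")
```

An interval that excludes 1.0, as this one does, corresponds to a statistically significant association at the 5% level under this approximation.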

Keywords: bladder cancer, nutritional oncology, pooled cohort analysis, vitamin D

Procedia PDF Downloads 71
24168 TP53 Mutations in Molecular Subtypes of Breast Cancer in Young Pakistani Patients

Authors: Nadia Naseem, Farwa Batool, Nasir Mehmood, AbdulHannan Nagi

Abstract:

Background: The incidence and mortality of breast cancer vary significantly in geographically distinct populations. In Pakistan, breast cancer has shown an increase in incidence in young females and is characterized by more aggressive behavior. The tumor suppressor TP53 gene is a crucial genetic factor that plays a significant role in breast carcinogenesis. This study investigated TP53 mutations in molecular subtypes of both node-negative and node-positive breast cancer in young Pakistani patients. Material and Methods: p53, Estrogen Receptor (ER), Progesterone Receptor (PR), Her-2 neu and Ki 67 expressions were analyzed immunohistochemically in a series of 75 node-negative (A) and 75 node-positive (B) young (aged 19-40 years) breast cancer patients diagnosed between 2014 and 2017 at two leading hospitals of Punjab, Pakistan. Tumor tissue specimens and peripheral blood samples were examined for TP53 mutations by direct sequencing of the gene (exons 4-9). The relation of TP53 mutations to these markers and clinicopathological data was investigated. Results: The mean age of the patients was 32.4 ± 9.1 years (SD). Invasive breast carcinoma was the most frequent histological variant (A=92%, B=94.6%). Grade 3 carcinoma was the commonest grade (A=72%, B=81.3%). Triple negative cases (ER-, PR-, Her-2) formed most of the molecular subtypes (A=44%, B=50.6%). A total of 17.2% (A: 6.6%, B: 10.6%) of patients showed TP53 mutations. Mutations were significantly more frequent in triple negative cases (A: 74.8%, B: 62.2%) compared to HER2-positive patients (P < 0.0001). In the multivariate analysis of the whole patient group, the independent prognosticators were triple negative status (P=0.021), TP53 overexpression by IHC (P=0.001) and advanced-stage disease (P=0.007). No statistically significant correlation between TP53 mutations and clinicopathological parameters was found (P < 0.05).
Conclusions: TP53 mutations are infrequent in breast carcinoma in the young Pakistani population, and there was no significant correlation between p53 mutation and early-onset disease. In our resource-constrained setting, immunohistochemically detected TP53 expression can be useful for predicting mutations at a younger age in our population.

Keywords: immunohistochemistry (IHC), invasive breast carcinoma (IBC), Pakistan, TP53

Procedia PDF Downloads 139
24167 Investigation of Delivery of Triple Play Data in GE-PON Fiber to the Home Network

Authors: Ashima Anurag Sharma

Abstract:

Optical fiber based networks can deliver performance that supports the increasing demand for high-speed connections. One of the new technologies to have emerged in recent years is the Passive Optical Network. This paper demonstrates the simultaneous delivery of triple play services (data, voice, and video). A comparison between various data rates is presented. It is demonstrated that as the data rate increases, the number of users that can be supported decreases because of the increase in bit error rate.

Keywords: BER, PON, TDMPON, GPON, CWDM, OLT, ONT

Procedia PDF Downloads 515
24166 Microarray Gene Expression Data Dimensionality Reduction Using PCA

Authors: Fuad M. Alkoot

Abstract:

Different experimental technologies such as microarray sequencing have been proposed to generate high-resolution genetic data, in order to understand the complex dynamic interactions between complex diseases and the biological system components of genes and gene products. However, the generated samples have a very large dimension, reaching into the thousands, which hinders all attempts to design a classifier system that can identify diseases based on such data. Additionally, the high overlap in the class distributions makes the task more difficult. The data we experiment with were generated for the identification of autism. They include 142 samples, which is small compared to the large dimension of the data. Classifier systems trained on this data yield very low classification rates that are almost equivalent to a guess. We aim to reduce the data dimension and improve it for classification. Here, we experiment with applying a multistage PCA to the genetic data to reduce its dimensionality. Results show a significant improvement in the classification rates, which increases the possibility of building an automated system for autism detection.
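
To make the dimensionality reduction step concrete, here is a minimal single-stage PCA sketch in plain Python. It is not the authors' multistage method, which the abstract does not detail: it centers the data, builds the covariance matrix, finds the first principal component by power iteration, and projects each sample onto it. The tiny `samples` matrix and all function names are invented for illustration.

```python
import math

def center(rows):
    """Subtract the per-feature mean from every sample."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered data."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def first_component(cov, iters=200):
    """Dominant eigenvector of cov via power iteration (unit length)."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def project(centered, v):
    """Score of each sample along direction v."""
    return [sum(a * b for a, b in zip(r, v)) for r in centered]

# Hypothetical data: 4 samples, 3 features (feature 2 is roughly 2x feature 1).
samples = [[2.0, 4.1, 1.0], [3.0, 6.0, 1.1], [4.0, 7.9, 0.9], [5.0, 10.1, 1.0]]
centered = center(samples)
pc1 = first_component(covariance(centered))
scores = project(centered, pc1)  # each sample reduced from 3 values to 1
```

With thousands of features and only 142 samples, a practical implementation would more likely use an SVD routine from a numerical library; the power iteration here only keeps the example dependency-free.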

Keywords: PCA, gene expression, dimensionality reduction, classification, autism

Procedia PDF Downloads 548
24165 Data Science-Based Key Factor Analysis and Risk Prediction of Diabetic

Authors: Fei Gao, Rodolfo C. Raga Jr.

Abstract:

This research proposal aims to ascertain the major risk factors for diabetes and to design a predictive model for risk assessment. The project aims to improve early detection and management of diabetes by utilizing data science techniques, which may improve patient outcomes and healthcare efficiency. Using the Diabetes Health Indicators Dataset from Kaggle as the research data, the phase relation values of each attribute were used to analyze and choose the attributes that might influence the examiner's survival probability. We compare and evaluate eight machine learning algorithms. Our investigation begins with comprehensive data preprocessing, including feature engineering and dimensionality reduction, aimed at enhancing data quality. The dataset, comprising health indicators and medical data, serves as a foundation for training and testing these algorithms. A rigorous cross-validation process is applied, and we assess their performance using five key metrics: accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC-ROC). After analyzing the data characteristics, we investigate their impact on the likelihood of diabetes and develop corresponding risk indicators.
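
Four of the five metrics named above can be computed directly from the counts of a binary confusion matrix. The sketch below shows one way to do this; the labels are made up for illustration and the function name is hypothetical. AUC-ROC is left out because it needs predicted scores rather than hard labels.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical ground truth and predictions for eight people (1 = diabetic).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

In a cross-validation setting these values would be computed per fold and then averaged.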

Keywords: diabetes, risk factors, predictive model, risk assessment, data science techniques, early detection, data analysis, Kaggle

Procedia PDF Downloads 58
24164 A Methodology to Integrate Data in the Company Based on the Semantic Standard in the Context of Industry 4.0

Authors: Chang Qin, Daham Mustafa, Abderrahmane Khiat, Pierre Bienert, Paulo Zanini

Abstract:

Nowadays, companies face many challenges in the process of digital transformation, which can be a complex and costly undertaking. Digital transformation involves the collection and analysis of large amounts of data, which can create challenges around data management and governance. Furthermore, it is also challenging to integrate data from multiple systems and technologies. Despite these pains, companies still pursue digitalization because, by embracing advanced technologies, they can improve efficiency, quality, decision-making, and customer experience while also creating different business models and revenue streams. This paper focuses on the issue that data is stored in data silos with different schemas and structures. The conventional approaches to addressing this issue involve utilizing data warehousing, data integration tools, data standardization, and business intelligence tools. However, these approaches primarily focus on the grammar and structure of the data and neglect the importance of semantic modeling and semantic standardization, which are essential for achieving data interoperability. Here, the challenge of data silos in Industry 4.0 is addressed by developing a semantic modeling approach compliant with Asset Administration Shell (AAS) models as an efficient standard for communication in Industry 4.0. The paper highlights how our approach can facilitate the data mapping process and semantic lifting according to existing industry standards such as ECLASS and other industrial dictionaries. It also incorporates the Asset Administration Shell technology to model and map the company’s data and utilizes a knowledge graph for data storage and exploration.

Keywords: data interoperability in industry 4.0, digital integration, industrial dictionary, semantic modeling

Procedia PDF Downloads 81
24163 Big Data Analytics and Data Security in the Cloud via Fully Homomorphic Encryption

Authors: Waziri Victor Onomza, John K. Alhassan, Idris Ismaila, Noel Dogonyaro Moses

Abstract:

This paper describes the problem of building secure computational services for encrypted information in cloud computing without decrypting the encrypted data; it therefore addresses the need for a computational encryption model that can enhance the security of big data with respect to the privacy, confidentiality and availability of users' data. The cryptographic model applied for computation on the encrypted data is the Fully Homomorphic Encryption scheme. We contribute theoretical presentations of high-level computational processes, based on number theory and algebra, that can easily be integrated and leveraged in cloud computing, together with the detailed mathematical concepts underlying the fully homomorphic encryption models. This contribution supports the full implementation of a cryptographic security algorithm based on big data analytics.

Keywords: big data analytics, security, privacy, bootstrapping, homomorphic, homomorphic encryption scheme

Procedia PDF Downloads 368
24162 Evaluation of Some Trace Elements in Biological Samples of Egyptian Viral Hepatitis Patients under Nutrition Therapy

Authors: Tarek Elnimr, Reda Morsy, Assem El Fert, Aziza Ismail

Abstract:

Hepatitis is an inflammation of the liver. The condition can be self-limiting or can progress to fibrosis, cirrhosis or liver cancer. Infection with a hepatitis virus ranges in severity from a mild illness lasting a few weeks to a serious, lifelong illness. A growing body of evidence indicates that many trace elements play important roles in a number of carcinogenic processes that proceed through various mechanisms. To examine the status of trace elements during the development of hepatic carcinoma, we determined the iron, copper, zinc and selenium levels in some biological samples of patients at different stages of viral hepatic disease. We observed significant changes in the iron, copper, zinc and selenium levels in the biological samples of patients with hepatocellular carcinoma, relative to those of healthy controls. The mean hair, nail, RBC, serum and whole blood copper levels in patients with hepatitis virus were significantly higher than those of the control group. In contrast, the mean iron, zinc, and selenium levels in patients with hepatitis virus were significantly lower than those of the control group. On the basis of this study, we identified the impact of natural supplements on improving the treatment of viral liver damage, using the levels of some trace elements, such as iron, copper, zinc and selenium, which might serve as biomarkers for increased survival and reduced disease progression. Most of the elements revealed diverse and random distribution in the samples of the donor groups. The correlation study pointed out significant disparities in the mutual relationships among the trace elements in the patients and controls. Principal component analysis and cluster analysis of the element data manifested diverse apportionment of the selected elements in the scalp hair, nail and blood components of the patients compared with the healthy counterparts.

Keywords: hepatitis, hair, nail, blood components, trace element, nutrition therapy, multivariate analysis, correlation, ICP-MS

Procedia PDF Downloads 393
24161 Prevalence and Risk Factors of Diabetes and Its Association with Com-Morbidities among South Indian Women

Authors: Balasaheb Bansode

Abstract:

Diabetes is a major component of non-communicable diseases and a pathway to multi-morbidity. The South Indian states have almost completed the demographic transition in India. The study presents the prevalence of diabetes and its association with co-morbidities among South Indian women. The study is based on the fourth round of the National Family Health Survey (NFHS-4), conducted in 2015-16. Univariate, bivariate and multivariate analysis techniques were used to find the association of risk factors and comorbidities with diabetes. The results reveal that the prevalence of diabetes is high among South Indian women. The study shows that women with diabetes are more likely to be diagnosed with the comorbidities hypertension and anemia. The factors responsible for co-morbidities in South India are the changing demographic situation, socioeconomic status, overweight and substance use. Awareness of diabetes prevention and management should be increased through health education, disease management programmes, trained peers, community health workers and community-based programmes.

Keywords: diabetes, risk factors, comorbidities, women

Procedia PDF Downloads 173
24160 Protecting Privacy and Data Security in Online Business

Authors: Bilquis Ferdousi

Abstract:

With the exponential growth of the online business, the threat to consumers’ privacy and data security has become a serious challenge. This literature review-based study focuses on a better understanding of those threats and what legislative measures have been taken to address those challenges. Research shows that people are increasingly involved in online business using different digital devices and platforms, although this practice varies based on age groups. The threat to consumers’ privacy and data security is a serious hindrance in developing trust among consumers in online businesses. There are some legislative measures taken at the federal and state level to protect consumers’ privacy and data security. The study was based on an extensive review of current literature on protecting consumers’ privacy and data security and legislative measures that have been taken.

Keywords: privacy, data security, legislation, online business

Procedia PDF Downloads 91
24159 Flowing Online Vehicle GPS Data Clustering Using a New Parallel K-Means Algorithm

Authors: Orhun Vural, Oguz Bayat, Rustu Akay, Osman N. Ucan

Abstract:

This study presents a new parallel approach to clustering GPS data. The evaluation compares the execution times of various clustering algorithms on GPS data. The paper proposes a parallel, neighborhood-based K-means algorithm to make clustering faster. The proposed parallelization approach assumes that each GPS data point represents a vehicle and that vehicles close to each other communicate after the vehicles are clustered. The approach was examined on continuously changing GPS data of different sizes and compared with the serial K-means algorithm and other serial clustering algorithms. The results demonstrate that the proposed parallel K-means algorithm works much faster than the other clustering algorithms.
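
The abstract does not specify the neighborhood-based scheme, so the sketch below shows only the generic data-parallel form of K-means: the points are split into chunks, each worker assigns its chunk to the nearest centroid, and the centroids are then recomputed from the combined labels. The coordinates, function names and the use of a thread pool are all assumptions made for illustration. Threads keep the example short, but a real speedup in Python would typically need processes or native code.

```python
from concurrent.futures import ThreadPoolExecutor

def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda k: (point[0] - centroids[k][0]) ** 2
                             + (point[1] - centroids[k][1]) ** 2)

def assign_chunk(chunk, centroids):
    """Assignment step for one chunk of points; run by one worker."""
    return [nearest(p, centroids) for p in chunk]

def parallel_kmeans(points, centroids, workers=2, iters=10):
    size = max(1, -(-len(points) // workers))  # ceiling division
    chunks = [points[i:i + size] for i in range(0, len(points), size)]
    labels = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(iters):
            # Assignment step, distributed over the workers.
            parts = pool.map(assign_chunk, chunks, [centroids] * len(chunks))
            labels = [lab for part in parts for lab in part]
            # Update step, done serially on the combined labels.
            updated = []
            for k, old in enumerate(centroids):
                members = [p for p, lab in zip(points, labels) if lab == k]
                if members:
                    updated.append((sum(p[0] for p in members) / len(members),
                                    sum(p[1] for p in members) / len(members)))
                else:
                    updated.append(old)  # keep an empty cluster's centroid
            if updated == centroids:
                break  # converged
            centroids = updated
    return centroids, labels

# Hypothetical vehicle positions (latitude, longitude) in two distant groups.
points = [(41.00, 29.00), (41.01, 29.02), (41.02, 29.01),
          (39.90, 32.85), (39.92, 32.86), (39.91, 32.84)]
centroids, labels = parallel_kmeans(points, [points[0], points[3]])
```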

Keywords: parallel k-means algorithm, parallel clustering, clustering algorithms, clustering on flowing data

Procedia PDF Downloads 210
24158 An Analysis of Privacy and Security for Internet of Things Applications

Authors: Dhananjay Singh, M. Abdullah-Al-Wadud

Abstract:

The Internet of Things is a concept of a large-scale ecosystem of wireless actuators. The actuators are defined as the things in the IoT, that is, those which contribute or produce data for the ecosystem. However, ubiquitous data collection, data security, privacy preservation, large-volume data processing, and intelligent analytics are some of the key challenges for IoT technologies. To address the security requirements, challenges and threats in the IoT, we discuss a message authentication mechanism for IoT applications. Finally, we discuss a data encryption mechanism for authenticating messages before they propagate into IoT networks.
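
The abstract does not describe the authors' mechanism, so as a generic illustration of message authentication the sketch below uses HMAC-SHA256 from the Python standard library. The shared key, the sensor payload and the function names are hypothetical. A device attaches a tag to each message, and the receiver recomputes the tag to check that the message was not altered and came from a key holder.

```python
import hashlib
import hmac

def sign(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 authentication tag for message."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Check a tag in constant time to avoid timing side channels."""
    return hmac.compare_digest(sign(key, message), tag)

# Hypothetical pre-shared key and sensor reading; not for production use.
shared_key = b"demo-key-not-for-production"
reading = b'{"sensor": "t1", "temp_c": 21.5}'
tag = sign(shared_key, reading)

accepted = verify(shared_key, reading, tag)                 # untouched message
rejected = verify(shared_key, reading + b" tampered", tag)  # modified message
```

HMAC provides integrity and authenticity but not confidentiality; encrypting the payload would be a separate step.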

Keywords: Internet of Things (IoT), message authentication, privacy, security

Procedia PDF Downloads 367
24157 Cognitive Science Based Scheduling in Grid Environment

Authors: N. D. Iswarya, M. A. Maluk Mohamed, N. Vijaya

Abstract:

A grid is an infrastructure that allows large distributed data sets from multiple locations to be deployed toward a common goal. Scheduling data-intensive applications becomes challenging because the data sets are very large. Only two solutions exist to tackle this issue. First, computation that requires huge data sets can be transferred to the data site. Second, the required data sets can be transferred to the computation site. In the former scenario, the computation cannot be transferred since the servers are storage/data servers with little or no computational capability. Hence, the second scenario is considered for further exploration. During scheduling, transferring huge data sets from one site to another requires more network bandwidth. To mitigate this issue, this work focuses on incorporating cognitive science into scheduling. Cognitive science is the study of the human brain and its related activities. Current research mainly focuses on incorporating cognitive science into various computational modeling techniques. In this work, the problem-solving approach of the human brain is studied and incorporated into data-intensive scheduling in grid environments. Here, a cognitive engine (CE) is designed and deployed at various grid sites. The intelligent agents in the CE help analyze requests and create the knowledge base. Depending on the link capacity, a decision is taken whether to transfer the data sets or to partition them. The agents predict the next request so that the requesting site can be served with data sets in advance. This reduces the data availability time and data transfer time. The replica catalog and metadata catalog created by the agents assist in the decision-making process.

Keywords: data grid, grid workflow scheduling, cognitive artificial intelligence

Procedia PDF Downloads 382
24156 Heritage and Tourism in the Era of Big Data: Analysis of Chinese Cultural Tourism in Catalonia

Authors: Xinge Liao, Francesc Xavier Roige Ventura, Dolores Sanchez Aguilera

Abstract:

With the development of the Internet, the study of tourism behavior has rapidly expanded from the traditional physical market to the online market. Data on the Internet is characterized by dynamic changes, and new data appear all the time. In recent years, a large volume of data has been generated by forums, blogs, and other sources, which have expanded over time and space; together they constitute large-scale Internet data, known as Big Data. This data of technological origin, which derives from the use of devices and the activity of multiple users, is becoming a source of great importance for the study of geography and the behavior of tourists. The study will focus on cultural heritage tourist practices in the context of Big Data. The research will explore the characteristics and behavior of Chinese tourists in relation to the cultural heritage of Catalonia. Geographical information, target image, and perceptions in user-generated content will be studied through analysis of data from Weibo, the largest blog-based social network in China. Through the analysis of the behavior of heritage tourists in the Big Data environment, this study aims to understand the practices (activities, motivations, perceptions) of cultural tourists and, in turn, their needs and preferences, in order to better guide the sustainable development of tourism in heritage sites.

Keywords: Barcelona, Big Data, Catalonia, cultural heritage, Chinese tourism market, tourists’ behavior

Procedia PDF Downloads 124
24155 Towards A Framework for Using Open Data for Accountability: A Case Study of A Program to Reduce Corruption

Authors: Darusalam, Jorish Hulstijn, Marijn Janssen

Abstract:

The media have revealed a variety of corruption cases in regional and local governments all over the world. Many governments have pursued anti-corruption reforms and created systems of checks and balances. Citizens face three types of corruption: administrative corruption, collusion and extortion. Accountability is one of the benchmarks for building transparent government. The public sector is required to report the results of the programs that have been implemented so that citizens can judge whether an institution has been working economically, efficiently and effectively. Open Data offers solutions for implementing good governance in organizations that want to be more transparent. In addition, Open Data can create transparency and accountability to the community. The objective of this paper is to build a framework of open data for accountability to combat corruption. This paper will investigate the relationship between open data and accountability as part of anti-corruption initiatives. This research will investigate the impact of open data implementation on public organizations.

Keywords: open data, accountability, anti-corruption, framework

Procedia PDF Downloads 316
24154 Developing Motorized Spectroscopy System for Tissue Scanning

Authors: Tuba Denkceken, Ayse Nur Sarı, Volkan Ihsan Tore, Mahmut Denkceken

Abstract:

The aim of the presented study was to develop a newly motorized spectroscopy system. Our system is composed of probe and motor parts. The probe part consists of bioimpedance and fiber optic components that include two platinum wires (each 25 micrometer in diameter) and two fiber cables (each 50 micrometers in diameter) respectively. Probe was examined on tissue phantom (polystyrene microspheres with different diameters). In the bioimpedance part of the probe current was transferred to the phantom and conductivity information was obtained. Adjacent two fiber cables were used in the fiber optic part of the system. Light was transferred to the phantom by fiber that was connected to the light source and backscattered light was collected with the other adjacent fiber for analysis. It is known that the nucleus expands and the nucleus-cytoplasm ratio increases during the cancer progression in the cell and this situation is one of the most important criteria for evaluating the tissue for pathologists. The sensitivity of the probe to particle (nucleus) size in phantom was tested during the study. Spectroscopic data obtained from our system on phantom was evaluated by multivariate statistical analysis. Thus the information about the particle size in the phantom was obtained. Bioimpedance and fiber optic experiments results which were obtained from polystyrene microspheres showed that the impedance value and the oscillation amplitude were increasing while the size of particle was enlarging. These results were compatible with the previous studies. In order to motorize the system within the motor part, three driver electronic circuits were designed primarily. In this part, supply capacitors were placed symmetrically near to the supply inputs which were used for balancing the oscillation. Female capacitors were connected to the control pin. Optic and mechanic switches were made. Drivers were structurally designed as they could command highly calibrated motors. 
It was considered important to keep the drivers’ dimensions as small as possible (4.4x4.4x1.4 cm). Then three miniature step motors were connected to each other along with three drivers. Since spectroscopic techniques are quantitative methods, they yield more objective results than traditional ones. In the next part of this study, we plan to obtain spectroscopic data containing optical and impedance information from normal, low-metastatic and high-metastatic breast cancer cell cultures. If high sensitivity is achieved in differentiating cells, it might be possible to scan large tissue surface areas in a short time with small steps. Thanks to the motorized feature of the system, no region of the tissue will be missed, so that cancerous parts of the tissue can be diagnosed meticulously. This work is supported by The Scientific and Technological Research Council of Turkey (TÜBİTAK) through 3001 project (115E662).

Keywords: motorized spectroscopy, phantom, scanning system, tissue scanning

Procedia PDF Downloads 186
24153 A Retrospective Cohort Study on an Outbreak of Gastroenteritis Linked to a Buffet Lunch Served during a Conference in Accra

Authors: Benjamin Osei Tutu, Sharon Annison

Abstract:

On 21st November, 2016, an outbreak of foodborne illness occurred after a buffet lunch served during a stakeholders’ consultation meeting held in Accra. An investigation was conducted to characterise the affected people, determine the etiologic food, the source of contamination and the etiologic agent, and to implement appropriate public health measures to prevent future occurrences. A retrospective cohort study was conducted via telephone interviews, using a structured questionnaire developed from the buffet menu. A case was defined as any person suffering from symptoms of foodborne illness, e.g. diarrhoea and/or abdominal cramps, after eating food served during the stakeholder consultation meeting in Accra on 21st November, 2016. The exposure status of all the members of the cohort was assessed by taking the food history of each respondent during the telephone interview. The data obtained were analysed using Epi Info 7. An environmental risk assessment was conducted to ascertain the source of the food contamination. Risks of foodborne infection from the foods eaten were determined using attack rates and odds ratios. Data were obtained from 54 people who consumed food served during the stakeholders’ meeting. Out of this population, 44 people reported symptoms of food poisoning, representing an overall attack rate of 81.45%. The peak incubation period was seven hours, with minimum and maximum incubation periods of four and 17 hours, respectively. The commonly reported symptoms were diarrhoea (97.73%, 43/44), vomiting (84.09%, 37/44) and abdominal cramps (75.00%, 33/44). From the incubation period, duration of illness and the symptoms, toxin-mediated food poisoning was suspected. The environmental risk assessment of the implicated catering facility indicated a lack of time/temperature control, inadequate knowledge of food safety among workers and sanitation issues. Only a limited number of food samples were received for microbiological analysis.
Multivariate analysis indicated that illness was significantly associated with the consumption of the snacks served (OR 14.78, P < 0.001). No stool, blood or etiologic food samples were available for organism isolation; however, the suspected etiologic agent was Staphylococcus aureus or Clostridium perfringens. The outbreak was probably due to the consumption of an unwholesome snack (tuna sandwich or chicken). The contamination and/or growth of the etiologic agent in the snack may have been due to a breakdown in cleanliness, time/temperature control and good food handling practices. Training of food handlers in basic food hygiene and safety is recommended.
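
The two measures used in the investigation, attack rate and odds ratio, are simple to compute from counts. The sketch below applies the attack rate formula to the reported 44 ill out of 54 respondents, which gives roughly 81.5%. The 2x2 table used for the odds ratio is invented for illustration and is not the study's data; the reported OR of 14.78 came from a multivariate analysis that cannot be reproduced from the abstract. Function names are hypothetical.

```python
def attack_rate(cases, population):
    """Percentage of the exposed population that became ill."""
    return 100.0 * cases / population

def odds_ratio(a, b, c, d):
    """Crude odds ratio from a 2x2 table.

    a: ate the item and ill      b: ate the item and well
    c: did not eat it and ill    d: did not eat it and well
    """
    return (a * d) / (b * c)

# Counts reported in the abstract: 44 ill among 54 respondents.
overall = attack_rate(44, 54)

# Hypothetical counts for a single food item, for illustration only.
example_or = odds_ratio(a=30, b=10, c=5, d=15)
```

An odds ratio well above 1 for one food item, with low ratios for the others, is what points investigators toward the likely vehicle of the outbreak.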

Keywords: Accra, buffet, conference, C. perfringens, cohort study, food poisoning, gastroenteritis, office workers, Staphylococcus aureus

Procedia PDF Downloads 217
24152 Patient Understanding of Health Information: Implications for Organizational Health Literacy in Germany

Authors: Florian Tille, Heide Weishaar, Bernhard Gibis, Susanne Schnitzer

Abstract:

Introduction: The quality of patient-doctor communication and of written health information is central to organizational health literacy (HL). Whether patients understand their doctors’ explanations and textual material on health, however, is understudied. This study identifies the overall levels of patient understanding of health information and its associations with patients’ social characteristics in outpatient health care in Germany. Materials & Methods: This analysis draws on data collected via a 2017 national health survey with a sample of 6,105 adults. Quality of communication was measured for consultations with general practitioners (GPs) and specialists (SPs) via the Ask Me 3 program questions, and through a question on written health material. Correlations with social characteristics were explored employing bivariate and multivariate logistic regression analyses. Results: Over 90% of all respondents reported that they had understood their doctors’ explanations during the last consultation. Failed understanding was strongly correlated with patients’ very poor health (Odds Ratio [OR]: 5.19; 95% confidence interval [CI]: 2.23–12.10; ref. excellent/very good health), current health problem (OR: 6.54, CI: 1.70–25.12; ref. preventive examination) and age 65 years and above (OR: 2.97, CI: 1.10–8.00; ref. 18 to 34 years). Fewer patients answered that they understood written material well (86.7% for the last visit at a GP, 89.7% at an SP). Understanding written material poorly was highly associated with basic education (OR: 4.20, CI: 2.76–6.39; ref. higher education) and age 65 years and above (OR: 2.66, CI: 1.43–4.96). Discussion: Overall ratings of oral patient-doctor communication and written communication of health information are high. Yet, a considerable share of patients report not understanding their doctors and understanding the written health-related material poorly.
Interventions that can contribute to improving organizational HL in outpatient care in Germany include HL training for doctors, reducing system barriers to easily accessible health information for patients, and combining oral and written means of health communication. Conclusion: This work adds to the study of organizational HL in Germany. To increase patient understanding of health-relevant information and thereby possibly reduce health disparities, it is suggested that the communication needs of persons in different age groups, with basic education, and in very poor health be met in particular.

Keywords: health survey, organizational health literacy, patient-doctor communication, social characteristics, outpatient care, Ask Me 3

Procedia PDF Downloads 155
24151 Syndromic Surveillance Framework Using Tweets Data Analytics

Authors: David Ming Liu, Benjamin Hirsch, Bashir Aden

Abstract:

Syndromic surveillance aims to detect or predict disease outbreaks through the analysis of medical data sources. Using social media data such as tweets for syndromic surveillance is becoming increasingly popular, aided by open platforms for data collection and by the advantages of microblogging text and mobile geographic location features. In this paper, a Syndromic Surveillance Framework with a machine learning kernel is presented using tweet data analytics. Influenza and three cities of the United Arab Emirates (Abu Dhabi, Al Ain and Dubai) are used as the test disease and trial areas. Hospital case data provided by the Health Authority of Abu Dhabi (HAAD) are used for correlation purposes. In our model, a Latent Dirichlet allocation (LDA) engine is adapted to perform supervised learning classification, and N-fold cross-validation confusion matrices are given as the simulation results, with an overall system recall of 85.595% achieved.

Keywords: Syndromic surveillance, Tweets, Machine Learning, data mining, Latent Dirichlet allocation (LDA), Influenza

Procedia PDF Downloads 102
24150 Analysis of Urban Population Using Twitter Distribution Data: Case Study of Makassar City, Indonesia

Authors: Yuyun Wabula, B. J. Dewancker

Abstract:

In the past decade, social networking apps have grown very rapidly. Geolocation data is one of the important features of social media and can record the user's location coordinates in the real world. This paper proposes the use of geolocation data from the Twitter social media application to gain knowledge about urban dynamics, especially human mobility behavior. It aims to explore the relation between Twitter geolocation data and the presence of people in the urban area. First, the study analyzes the spread of people in particular areas within the city using Twitter data. Second, we match and categorize existing places based on visits by the same individuals. Then, we combine the Twitter data from the tracking result with the questionnaire data to capture the Twitter user profile. To do that, we used distribution frequency analysis to learn the visitors’ percentage. To validate the hypothesis, we compare it with the local population statistics and land use mapping released by the city planning department of the Makassar local government. The results show that there is a correlation between Twitter geolocation and questionnaire data. Thus, integrating Twitter data and survey data can reveal the profile of social media users.

Keywords: geolocation, Twitter, distribution analysis, human mobility

Procedia PDF Downloads 301
24149 Analysis and Rule Extraction of Coronary Artery Disease Data Using Data Mining

Authors: Rezaei Hachesu Peyman, Oliyaee Azadeh, Salahzadeh Zahra, Alizadeh Somayyeh, Safaei Naser

Abstract:

Coronary Artery Disease (CAD) is a major cause of disability in adults and a main cause of death in developed countries. In this study, data mining techniques including Decision Trees, Artificial Neural Networks (ANNs), and Support Vector Machine (SVM) are used to analyze CAD data. Data from 4948 patients who had suffered from heart diseases were included in the analysis. CAD is the target variable, and 24 input or predictor variables are used for the classification. The performance of these techniques is compared in terms of sensitivity, specificity, and accuracy. The most significant factor influencing CAD is chest pain. Elderly males (age > 53) have a high probability of being diagnosed with CAD. The SVM algorithm is the most useful for evaluating and predicting CAD patients as compared to non-CAD ones. Applying data mining techniques to coronary artery disease data is a good method for investigating the existing relationships between variables.

Keywords: classification, coronary artery disease, data-mining, knowledge discovery, extract

Procedia PDF Downloads 643
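
Editorial note on the entry above: the three comparison metrics it names (sensitivity, specificity, accuracy) have standard definitions in terms of confusion-matrix counts. The sketch below is only an illustration of those definitions; the function name and the counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # share of actual positives detected
    specificity = tn / (tn + fp)                 # share of actual negatives detected
    accuracy = (tp + tn) / (tp + fp + tn + fn)   # share of all cases classified correctly
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only (not from the CAD data set).
sens, spec, acc = classification_metrics(tp=80, fp=10, tn=90, fn=20)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

Reporting all three together matters when classes are imbalanced, since accuracy alone can look high while sensitivity is poor.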
24148 Sensor Data Analysis for a Large Mining Major

Authors: Sudipto Shanker Dasgupta

Abstract:

One of the largest mining companies wanted to look at health analytics for their driverless trucks. These trucks were the key to their supply chain logistics. The automated trucks had multi-level sub-assemblies which would send out sensor information. The use case was to capture the sensor signals from the truck subcomponents and analyze the health of the trucks from a repair and replacement perspective. Open source software was used to stream the data into a clustered Hadoop setup in the Amazon Web Services cloud, and Apache Spark SQL was used to analyze the data. Real-time analytics on 300 million records was achieved on a 10-node Amazon setup (32 cores, 64 GB RAM). To check the scalability of the system, the cluster was increased to a 100-node setup. This talk will highlight how open source software was used to achieve the above use case and share insights on high data throughput in a cloud setup.

Keywords: streaming analytics, data science, big data, Hadoop, high throughput, sensor data

Procedia PDF Downloads 395
24147 Data-Centric Anomaly Detection with Diffusion Models

Authors: Sheldon Liu, Gordon Wang, Lei Liu, Xuefeng Liu

Abstract:

Anomaly detection, also referred to as one-class classification, plays a crucial role in identifying product images that deviate from the expected distribution. This study introduces Data-centric Anomaly Detection with Diffusion Models (DCADDM), presenting a systematic strategy for data collection and further diversifying the data with image generation via diffusion models. The algorithm addresses data collection challenges in real-world scenarios and points toward data augmentation with the integration of generative AI capabilities. The paper explores the generation of normal images using diffusion models. The experiments demonstrate that with 30% of the original normal image size, modeling in an unsupervised setting with state-of-the-art approaches can achieve equivalent performances. With the addition of generated images via diffusion models (10% equivalence of the original dataset size), the proposed algorithm achieves better or equivalent anomaly localization performance.

Keywords: diffusion models, anomaly detection, data-centric, generative AI

Procedia PDF Downloads 72
24146 Regulation on the Protection of Personal Data Versus Quality Data Assurance in the Healthcare System Case Report

Authors: Elizabeta Krstić Vukelja

Abstract:

Digitization of personal data is a consequence of the development of information and communication technologies that create a new work environment with many advantages and challenges, but also potential threats to privacy and personal data protection. Regulation (EU) 2016/679 of the European Parliament and of the Council is becoming a law and obligation that should address the issues of personal data protection and information security. The existence of the Regulation leads to the conclusion that national legislation in the field of virtual environment, protection of the rights of EU citizens and processing of their personal data is insufficiently effective. In the health system, special emphasis is placed on the processing of special categories of personal data, such as health data. The healthcare industry is recognized as a particularly sensitive area in which a large amount of medical data is processed, the digitization of which enables quick access and quick identification of the health insured. The protection of the individual requires quality IT solutions that guarantee the technical protection of personal categories. However, the real problems are the technical and human nature and the spatial limitations of the application of the Regulation. Some conclusions will be drawn by analyzing the implementation of the basic principles of the Regulation on the example of the Croatian health care system and comparing it with similar activities in other EU member states.

Keywords: regulation, healthcare system, personal data protection, quality data assurance

Procedia PDF Downloads 28
24145 Parallel Vector Processing Using Multi Level Orbital DATA

Authors: Nagi Mekhiel

Abstract:

Many applications use vector operations by applying a single instruction to multiple data that map to different locations in conventional memory. Transferring data from memory is limited by access latency and bandwidth, which affects the performance gain of vector processing. We present a memory system that makes all of its content available to processors in time, so that processors need not access the memory; we force each location to be available to all processors at a specific time. The data move in different orbits to become available to other processors in higher orbits at different times. We use this memory to apply parallel vector operations to data streams at the first orbit level. Data processed in the first level move to the upper orbit one data element at a time, allowing a processor in that orbit to apply another vector operation to deal with the serial code limitations inherent in all parallel applications and to interleave it with lower-level vector operations.

Keywords: Memory Organization, Parallel Processors, Serial Code, Vector Processing

Procedia PDF Downloads 255
24144 Reconstructability Analysis for Landslide Prediction

Authors: David Percy

Abstract:

Landslides are a geologic phenomenon that affects a large number of inhabited places and are constantly being monitored and studied for the prediction of future occurrences. Reconstructability analysis (RA) is a methodology for extracting informative models from large volumes of data that works exclusively with discrete data. While RA has been used extensively in medical applications and social science, we are introducing it to the spatial sciences through applications like landslide prediction. Since RA works exclusively with discrete data, such as soil classification or bedrock type, working with continuous data, such as porosity, requires that these data be binned for inclusion in the model. RA constructs models of the data which pick out the most informative elements, independent variables (IVs), from each layer that predict the dependent variable (DV), landslide occurrence. Each layer included in the model retains its classification data as a primary encoding of the data. Unlike other machine learning algorithms that force the data into one-hot encoding schemes, RA works directly with the data as it is encoded, with the exception of continuous data, which must be binned. The usual physical and derived layers are included in the model, and testing our results against other published methodologies, such as neural networks, yields accuracy that is similar but with the advantage of a completely transparent model. The result of an RA session with a data set is a report on every combination of variables and the associated probability of landslide events occurring. In this way, every combination of informative states can be examined.

Keywords: reconstructability analysis, machine learning, landslides, raster analysis

Procedia PDF Downloads 48
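
Editorial note on the entry above: the abstract states that continuous layers such as porosity must be binned before they can enter a discrete-only model. A minimal sketch of such a binning step follows, using Python's standard `bisect` module. The function name, the cut points and the sample values are hypothetical; the abstract does not say how the authors chose their bins.

```python
from bisect import bisect_right


def bin_value(value, edges):
    """Map a continuous value to a discrete bin index, given sorted bin edges.

    Values below the first edge fall in bin 0; values at or above the
    last edge fall in bin len(edges).
    """
    return bisect_right(edges, value)


# Hypothetical porosity cut points and samples, for illustration only.
porosity_edges = [0.10, 0.20, 0.30]
samples = [0.05, 0.10, 0.27, 0.41]

bins = [bin_value(v, porosity_edges) for v in samples]
print(bins)
```

The resulting integer codes can then be treated like any other categorical layer, such as soil classification or bedrock type.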
24143 Spatio-Temporal Variation of Gaseous Pollutants and the Contribution of Particulate Matters in Chao Phraya River Basin, Thailand

Authors: Samart Porncharoen, Nisa Pakvilai

Abstract:

Elevated levels of air pollutants in regional atmospheric environments are a significant problem that affects human health in Thailand, particularly in the Chao Phraya River Basin. Of concern are issues surrounding ambient air pollution such as particulate matter and gaseous pollutants, and more specifically air pollution along the river. Therefore, a spatio-temporal study of air pollution in this real environment can yield more accurate air quality data for making formalized environmental policy in river basins. To inform such a policy, a study was conducted from January to December 2015 to continually collect measurements of various pollutants in both urban and regional locations in the Chao Phraya River Basin. This study investigated the air pollutants in many diverse environments along the Chao Phraya River Basin, Thailand, in 2015. Multivariate analysis techniques such as Principal Component Analysis (PCA) and path analysis were utilised to classify air pollution in the surveyed locations. Measurements were collected in both urban and rural areas to see if significant differences existed between the two locations in terms of air pollution levels. The meteorological parameters of various particulates were collected continually from a Thai Pollution Control Department monitoring station from January to December 2015. Of interest to this study were the readings of SO2, CO, NOx, O3, and PM10. Results showed daily arithmetic mean concentrations of SO2, CO, NOx, O3, and PM10 of 3±1 ppb, 0.5±0.5 ppm, 30±21 ppb, 19±16 ppb, and 40±20 µg/m3, respectively, in urban locations (Bangkok). During the same time period, the readings for the same measurements in rural areas (Ayutthaya) were 1±0.5 ppb, 0.1±0.05 ppm, 25±17 ppb, 30±21 ppb, and 35±10 µg/m3, respectively. This shows that Bangkok is a highly polluted environment dominated by emissions from vehicles. 
Further, results were analysed to ascertain whether significant seasonal variation existed in the measurements. It was found that levels of both gaseous pollutants and particulate matter were higher in the dry season than in the wet season. More broadly, the results show that pollutant levels were highest in locations along the Chao Phraya River Basin known to have a large number of vehicles and biomass burning. This correlation suggests that the principal pollutants came from these anthropogenic sources. This study contributes to the body of knowledge surrounding ambient air pollution such as particulate matter and gaseous pollutants, and more specifically air pollution along the Chao Phraya River Basin. Further, this study is one of the first to utilise continuous mobile monitoring along a river in order to gain accurate measurements during a data collection period. Overall, the results of this study can be used for making formalized environmental policy in river basins in order to reduce the physical effects on human health.

Keywords: air pollution, Chao Phraya river basin, meteorology, seasonal variation, principal component analysis

Procedia PDF Downloads 270
24142 Data Analytics in Hospitality Industry

Authors: Tammy Wee, Detlev Remy, Arif Perdana

Abstract:

In recent years, data analytics has become the buzzword in the hospitality industry. The hospitality industry is another example of a data-rich industry that has yet to fully benefit from the insights of data analytics. Effective use of data analytics can change how hotels operate, market and position themselves competitively in the hospitality industry. However, at the moment, the data obtained by individual hotels remain under-utilized. This is a preliminary study on data analytics in the hospitality industry, using an in-depth face-to-face interview at one hotel as the start of a multi-level research project. The main case study of this research, hotel A, is an international chain-brand hotel that has been systematically collecting data on its own customers for the past five years. The data collection points run from the moment a guest books a room until the guest leaves the hotel premises, and include room reservation, spa booking, and catering. Although hotel A has been gathering data intelligence on its customers for some time, it has yet to utilize the data to its fullest potential, and it is aware of its limitations as well as the potential of data analytics. Currently, the utilization of data analytics in hotel A is limited to the area of customer service improvement, namely enhancing the personalization of service for each individual customer. Hotel A is able to utilize the data to improve and enhance its service, which in turn encourages repeat customers. According to hotel A, 50% of its guests returned to the hotel, and 70% extended nights because of the personalized service. Apart from using data analytics to enhance customer service, hotel A also uses the data in marketing. Hotel A uses data analytics to predict or forecast changes in consumer behavior and demand by tracking its guests’ booking preferences, payment preferences and demand shifts between properties. 
However, hotel A admitted that the data it has been collecting were not fully utilized due to two challenges. The first challenge of using data analytics in hotel A is that the data are not clean. At the moment, the data collected for one guest profile are meaningful only for one department in the hotel but meaningless for another. Cleaning up the data and setting correct standards for usage by different departments are some of the main concerns of hotel A. The second challenge is the non-integrated internal system. At the moment, the internal systems used by hotel A do not integrate well with each other, limiting the ability to collect data systematically. Hotel A is considering another system to replace the current one for more comprehensive data collection. Hotel proprietors recognize the potential of data analytics, as reported in this research; however, implementing a system to collect data comes at a cost. This research has identified the current utilization of data analytics and the challenges faced in implementing it.

Keywords: data analytics, hospitality industry, customer relationship management, hotel marketing

Procedia PDF Downloads 164
24141 Realization of a (GIS) for Drilling (DWS) through the Adrar Region

Authors: Djelloul Benatiallah, Ali Benatiallah, Abdelkader Harouz

Abstract:

Geographic Information Systems (GIS) include various methods and computer techniques to model, digitally capture, store, manage, view and analyze geographic information. Geographic information systems characteristically draw on many scientific and technical fields and many methods. In this article, we present a complete and operational geographic information system that follows the theoretical principles of data management and is adapted to spatial data, especially data concerning the monitoring of drinking water supply (DWS) wells in the Adrar region. The system is expected, first, to offer standard features for consulting, updating and editing beneficiary and geographical data and, second, to provide specific functionality for contractor-entered data, parameterized calculations and statistics.

Keywords: GIS, DWS, drilling, Adrar

Procedia PDF Downloads 295