Search results for: multivariate data

24355 A Review Paper on Data Mining and Genetic Algorithm

Authors: Sikander Singh Cheema, Jasmeen Kaur

Abstract:

This paper summarizes the concept of data mining and one of its important processes, knowledge discovery in databases (KDD). It examines data mining based on genetic algorithms and surveys ways to implement it. The paper also conducts a formal review of data mining tasks and genetic algorithms in various fields.

Keywords: data mining, KDD, genetic algorithm, descriptive mining, predictive mining

Procedia PDF Downloads 576

24354 Data-Mining Approach to Analyzing Industrial Process Information for Real-Time Monitoring

Authors: Seung-Lock Seo

Abstract:

This work presents a data-mining-based empirical monitoring scheme for industrial processes with partially unbalanced data. Measurement data from good operations are relatively easy to gather, but during unusual events or faults it is generally difficult to collect process information, and some noisy industrial process data are almost impossible to analyze. In such cases, noise filtering techniques can be used to enhance process monitoring performance on a real-time basis. In addition, pre-processing of raw process data helps to eliminate unwanted variation in industrial process data. In this work, the performance of various monitoring schemes was tested and demonstrated on discrete batch process data. The results showed that monitoring performance improved significantly in terms of the monitoring success rate for the given process faults.

Keywords: data mining, process data, monitoring, safety, industrial processes

Procedia PDF Downloads 380

24353 The Teacher’s Role in Generating and Maintaining the Motivation of Adult Learners of English: A Mixed Methods Study in Hungarian Corporate Contexts

Authors: Csaba Kalman

Abstract:

In spite of the existence of numerous second language (L2) motivation theories, the teacher’s role in motivating learners has remained an under-researched niche to this day. If we narrow down our focus on the teacher’s role on motivating adult learners of English in an English as a Foreign Language (EFL) context in corporate environments, empirical research is practically non-existent. This study fills the above research niche by exploring the most motivating aspects of the teacher’s personality, behaviour, and teaching practices that affect adult learners’ L2 motivation in corporate contexts in Hungary. The study was conducted in a wide range of industries in 18 organisations that employ over 250 people in Hungary. In order to triangulate the research, 21 human resources managers, 18 language teachers, and 466 adult learners of English were involved in the investigation by participating in interview studies, and quantitative questionnaire studies that measured ten scales related to the teacher’s role, as well as two criterion measure scales of intrinsic and extrinsic motivation. The qualitative data were analysed using a template organising style, while descriptive, inferential statistics, as well as multivariate statistical techniques, such as correlation and regression analyses, were used for analysing the quantitative data. The results showed that certain aspects of the teacher’s personality (thoroughness, enthusiasm, credibility, and flexibility), as well as preparedness, incorporating English for Specific Purposes (ESP) in the syllabus, and focusing on the present, proved to be the most salient aspects of the teacher’s motivating influence. 
The regression analyses conducted with the criterion measure scales revealed that 22% of the variance in learners’ intrinsic motivation could be explained by the teacher’s preparedness and appearance, and 23% of the variance in learners’ extrinsic motivation could be attributed to the teacher’s personal branding and incorporating ESP in the syllabus. The findings confirm the pivotal role teachers play in motivating L2 learners independent of the context they teach in; and, at the same time, call for further research so that we can better conceptualise the motivating influence of L2 teachers.

Keywords: adult learners, corporate contexts, motivation, teacher’s role

Procedia PDF Downloads 89

24352 A Survey of Semantic Integration Approaches in Bioinformatics

Authors: Chaimaa Messaoudi, Rachida Fissoune, Hassan Badir

Abstract:

Technological advances in computer science and data analysis continuously provide huge volumes of biological data, which are available on the web. Such advances require powerful data integration techniques to extract pertinent knowledge and information for a specific question. Biomedical exploration of these big data often requires the use of complex queries across multiple autonomous, heterogeneous and distributed data sources. Semantic integration is an active area of research in several disciplines, such as databases, information integration, and ontology. We provide a survey of some approaches and techniques for integrating biological data, focusing on those developed in the ontology community.

Keywords: biological ontology, linked data, semantic data integration, semantic web

Procedia PDF Downloads 430

24351 Drivers of Liking: Probiotic Petit Suisse Cheese

Authors: Helena Bolini, Erick Esmerino, Adriano Cruz, Juliana Paixao

Abstract:

The current concern for health has increased demand for low-calorie ingredients and functional foods such as probiotics. Understanding the reasons that influence food choice, besides being a challenging task, is an important step in the development and/or reformulation of food products. The use of appropriate multivariate statistical techniques, such as External Preference Map (PrefMap), associated with regression by Partial Least Squares (PLS), can help in determining those factors. Thus, this study aimed to determine, through PLS regression analysis, the sensory attributes considered drivers of liking in strawberry-flavored probiotic petit suisse cheeses sweetened with different sweeteners. Five samples at equivalent sweetness, PROB1 (Sucralose 0.0243%), PROB2 (Stevia 0.1520%), PROB3 (Aspartame 0.0877%), PROB4 (Neotame 0.0025%) and PROB5 (Sucrose 15.2%), determined by just-about-right and magnitude estimation methods, and three commercial samples, COM1, COM2 and COM3, were studied. The analysis used data from quantitative descriptive analysis (QDA), performed by 12 experts (highly trained assessors) on 20 descriptor terms, correlated with overall liking data from an acceptance test carried out by 125 consumers on all samples. The results were then submitted to PLS regression using XLSTAT software from Byossistemes. The results showed that three sensory descriptor terms can be considered drivers of liking of probiotic petit suisse cheese samples with added sweeteners (p<0.05). Milk flavor was a sensory characteristic with a positive impact on acceptance, while bitter taste and sweet aftertaste had a negative impact on acceptance of probiotic petit suisse cheeses.
It can be concluded that PLS regression analysis is a practical and useful tool for determining drivers of liking of probiotic petit suisse cheeses sweetened with artificial and natural sweeteners, allowing the food industry to understand and improve its formulations and maximize the acceptability of its products.

Keywords: acceptance, consumer, quantitative descriptive analysis, sweetener

Procedia PDF Downloads 431

24350 Generative AI: A Comparison of Conditional Tabular Generative Adversarial Networks and Conditional Tabular Generative Adversarial Networks with Gaussian Copula in Generating Synthetic Data with Synthetic Data Vault

Authors: Lakshmi Prayaga, Chandra Prayaga, Aaron Wade, Gopi Shankar Mallu, Harsha Satya Pola

Abstract:

Synthetic data generated by Generative Adversarial Networks and Autoencoders is becoming more common as a way to combat the problem of insufficient data for research purposes. However, generating synthetic data is a tedious task requiring extensive mathematical and programming background. Platforms such as the Synthetic Data Vault (SDV) and Mostly AI offer user-friendly tools that are accessible to non-technical professionals for generating synthetic data to augment existing data for further analysis. The SDV also provides additions to the generic GAN, such as the Gaussian copula. We present the results from two synthetic data sets (CTGAN data and CTGAN with Gaussian Copula) generated by the SDV and report the findings. The results indicate that the ROC curves and AUC values for the data generated with the added Gaussian copula layer are much higher than those for the data generated by the CTGAN alone.

Keywords: synthetic data generation, generative adversarial networks, conditional tabular GAN, Gaussian copula

Procedia PDF Downloads 52

24349 The Principle Probabilities of Space-Distance Resolution for a Monostatic Radar and Realization in Cylindrical Array

Authors: Anatoly D. Pluzhnikov, Elena N. Pribludova, Alexander G. Ryndyk

Abstract:

In connection with the problem of target selection against a clutter background, we analyze the influence of the scanning rate on the spatial-temporal signal structure, the generalized multivariate correlation function, and the quality of resolution as the pulse repetition frequency increases. The possibility of space-distance resolution of objects, which is conditioned by the range-to-angle conversion at an increased scanning rate, is substantiated. Calculations for a real cylindrical array at a high scanning rate are presented. The high scanning rate yields a signal-to-noise improvement on the order of 10 dB for space-time signal processing.

Keywords: antenna pattern, array, signal processing, spatial resolution

Procedia PDF Downloads 165

24348 Prevalence and Factors Associated with Multiple Parasitic Infections among Rural Community in Kano State Nigeria

Authors: Salwa S. Dawaki, Init Ithoi, Sa’adatu I. Yelwa

Abstract:

Introduction: Parasitic infections are major public health problems worldwide, particularly in developing countries. Two thirds of the world population is infected, while about 3 billion are at risk of parasitic infections. Most parasitic infections have been shown to occur as multiple infections, especially among poor and rural communities in most countries of the tropical regions. Parasitic infections are endemic in Nigeria, yet multiple infections are rarely reported. The study aimed to estimate the prevalence of, and identify factors associated with, multiple parasitic infections among the rural population in Kano State, Nigeria. Methodology: A cross-sectional survey was conducted from June to August 2013 in rural Kano State, Nigeria. Three samples (stool, urine, and blood) were collected from each of the 551 volunteers, aged between one and ninety years, recruited for the survey. A pre-tested questionnaire was used to obtain epidemiological data. Data were analysed using appropriate descriptive, univariate and multivariate logistic regression methods. Major findings: The participants were 61.7% male and 38.3% female, and 69.0% were adults aged 15 years and above. Overall, 463 (84%) had parasitic infections, of whom 60.9% had multiple infections. A total of 15 parasitic species were recovered, and up to 8 different parasitic species were found concurrently in a single host. Plasmodium was the most common parasite, followed by Blastocystis, Entamoeba species, and hookworms. The presence of an infected family member (P = 0.017; OR = 1.52; 95% CI = 1.08, 2.13) and not wearing shoes outside the home (P = 0.043; OR = 1.50; 95% CI = 1.01, 2.18) were significantly associated with a higher risk of having multiple parasitic infections among the studied population. Conclusion: Parasitic infections pose a public health challenge in the rural community of Kano.
Multiple parasitic infections are highly prevalent, and the presence of an infected family member as well as not wearing proper footwear outside the home increases the risk of infection. Poor hygiene, unfavourable socioeconomic conditions, and culture promote the survival and transmission of parasites. There is a need to implement an integrated approach aimed at controlling or eliminating the infections, with emphasis on public awareness.
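As a reading aid for the odds ratios (OR) and 95% confidence intervals quoted in the abstract above, the following minimal sketch computes an unadjusted OR with a Wald interval from a 2x2 table. It is not the study's method: the abstract reports multivariate logistic regression, and the function name and counts here are hypothetical, chosen only for illustration.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed, with outcome      b: exposed, without outcome
    c: unexposed, with outcome    d: unexposed, without outcome
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study.
or_value, ci_low, ci_high = odds_ratio_ci(20, 10, 10, 20)
print(f"OR = {or_value:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

An interval that excludes 1, as in both results reported in the abstract, corresponds to an association that is significant at roughly the 5% level.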

Keywords: multiple infections, parasitic infections, poor hygiene, risk of infection

Procedia PDF Downloads 159

24347 A Privacy Protection Scheme Supporting Fuzzy Search for NDN Routing Cache Data Name

Authors: Feng Tao, Ma Jing, Guo Xian, Wang Jing

Abstract:

Named Data Networking (NDN) replaces the IP addresses of traditional networks with data names and adopts a dynamic caching mechanism. In the existing mechanism, however, only one-to-one search can be achieved because every piece of data has a unique name. There is a mapping relationship between data content and data name, so if the data name is intercepted by an adversary, the privacy of the data content and the user’s interest can hardly be guaranteed. To solve this problem, this paper proposes a one-to-many fuzzy search scheme based on order-preserving encryption, which reduces query overhead by optimizing the caching strategy. In this scheme, hash values keep the user’s query safe from each node during the search, and likewise protect the privacy of the requested data content.

Keywords: NDN, order-preserving encryption, fuzzy search, privacy

Procedia PDF Downloads 467

24346 Dietary Patterns and Hearing Loss in Older People

Authors: N. E. Gallagher, C. E. Neville, N. Lyner, J. Yarnell, C. C. Patterson, J. E. Gallacher, Y. Ben-Shlomo, A. Fehily, J. V. Woodside

Abstract:

Hearing loss is highly prevalent in older people and can reduce quality of life substantially. Emerging research suggests that potentially modifiable risk factors, including risk factors previously related to cardiovascular disease risk, may be associated with a decreased or increased incidence of hearing loss. This has prompted investigation into the possibility that certain nutrients, foods or dietary patterns may also be associated with incidence of hearing loss. The aim of this study was to determine any associations between dietary patterns and hearing loss in men enrolled in the Caerphilly study. The Caerphilly prospective cohort study began in 1979-1983 with recruitment of 2512 men aged 45-59 years. Dietary data was collected using a self-administered, semi-quantitative, 56-item food frequency questionnaire (FFQ) at baseline (1979-1983), and 7-day weighed food intake (WI) in a 30% sub-sample, while pure-tone unaided audiometric threshold was assessed at 0.5, 1, 2 and 4 kHz, between 1984 and 1988. Principal components analysis (PCA) was carried out to determine a posteriori dietary patterns and multivariate linear and logistic regression models were used to examine associations with hearing level (pure tone average (PTA) of frequencies 0.5, 1, 2 and 4 kHz in decibels (dB)) for linear regression and with hearing loss (PTA>25dB) for logistic regression. Three dietary patterns were determined using PCA on the FFQ data- Traditional, Healthy, High sugar/Alcohol avoider. After adjustment for potential confounding factors, both linear and logistic regression analyses showed a significant and inverse association between the Healthy pattern and hearing loss (P<0.001) and linear regression analysis showed a significant association between the High sugar/Alcohol avoider pattern and hearing loss (P=0.04). Three similar dietary patterns were determined using PCA on the WI data- Traditional, Healthy, High sugar/Alcohol avoider. 
After adjustment for potential confounding factors, logistic regression analyses showed a significant and inverse association between the Healthy pattern and hearing loss (P=0.02) and a significant association between the Traditional pattern and hearing loss (P=0.04). A Healthy dietary pattern was found to be significantly inversely associated with hearing loss in middle-aged men in the Caerphilly study. Furthermore, a High sugar/Alcohol avoider pattern (FFQ) and a Traditional pattern (WI) were associated with poorer hearing levels. Consequently, the role of dietary factors in hearing loss remains to be fully established and warrants further investigation.
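The hearing outcomes described in the abstract above (a pure tone average over 0.5, 1, 2 and 4 kHz, with hearing loss defined as PTA > 25 dB) can be sketched as follows. The function names and the example thresholds are hypothetical and are not taken from the Caerphilly data.

```python
PTA_FREQUENCIES_KHZ = (0.5, 1, 2, 4)


def pure_tone_average(thresholds_db):
    """Mean audiometric threshold (dB) over 0.5, 1, 2 and 4 kHz.

    thresholds_db maps frequency in kHz to the measured threshold in dB.
    """
    return sum(thresholds_db[f] for f in PTA_FREQUENCIES_KHZ) / len(PTA_FREQUENCIES_KHZ)


def has_hearing_loss(thresholds_db, cutoff_db=25.0):
    """Binary outcome used for logistic regression: PTA above the cutoff."""
    return pure_tone_average(thresholds_db) > cutoff_db


# Hypothetical participant, for illustration only.
example = {0.5: 20, 1: 25, 2: 30, 4: 45}
print(pure_tone_average(example))  # 30.0
print(has_hearing_loss(example))   # True
```

The continuous PTA is the kind of outcome a linear regression would use, while the thresholded value is the binary outcome for logistic regression.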

Keywords: ageing, diet, dietary patterns, hearing loss

Procedia PDF Downloads 217

24345 Lipidomic Response to Neoadjuvant Chemoradiotherapy in Rectal Cancer

Authors: Patricia O. Carvalho, Marcia C. F. Messias, Salvador Sanchez Vinces, Caroline F. A. Gatinoni, Vitor P. Iordanu, Carlos A. R. Martinez

Abstract:

Lipidomics methods are widely used in the identification and validation of disease-specific biomarkers and in therapy response evaluation. The present study aimed to identify a panel of potential lipid biomarkers to evaluate response to neoadjuvant chemoradiotherapy in rectal adenocarcinoma (RAC). Liquid chromatography–mass spectrometry (LC-MS)-based untargeted lipidomics was used to profile human serum samples from patients with clinical stage T2 or T3 resectable RAC, before and after chemoradiotherapy treatment. A total of 28 blood plasma samples were collected from 14 patients with RAC who were recruited at the São Francisco University Hospital (HUSF/USF). The study was approved by the ethics committee (CAAE 14958819.8.0000.5514). Univariate and multivariate statistical analyses were applied to explore dysregulated metabolic pathways using untargeted lipid profiling and data mining approaches. A total of 36 significantly altered lipids were identified, and the subsequent partial least-squares discriminant analysis model was both cross-validated (R2, Q2) and permutation tested. Lysophosphatidylcholine (LPC) plasmalogens containing palmitoleic and oleic acids, with high variable importance in projection scores, showed a tendency to be lower after completion of chemoradiotherapy. Chemoradiotherapy seems to change plasmanyl-phospholipid levels, indicating that these lipids play an important role in RAC pathogenesis.

Keywords: lipidomics, neoadjuvant chemoradiotherapy, plasmalogens, rectal adenocarcinoma

Procedia PDF Downloads 115

24344 Prognostic and Predictive Value of Tumor-Infiltrating Lymphocytes in Triple Negative Breast Cancer

Authors: Wooseok Byon, Eunyoung Kim, Junseong Kwon, Byung Joo Song, Chan Heun Park

Abstract:

Background/Purpose: Previous preclinical and clinical data suggest that increased lymphocytic infiltration is associated with good prognosis and benefit from immunogenic chemotherapy, especially in triple-negative breast cancer (TNBC). We investigated a single-center experience of TNBC and its relationship with lymphocytic infiltration. Methods: From January 2004 to December 2012, at the Department of Surgery, Kangbuk Samsung Hospital, Sungkyunkwan University School of Medicine, we retrospectively reviewed 897 breast cancer patients for clinical outcomes, clinicopathological characteristics, and breast cancer subtypes. Lymphocytic infiltration of TNBC specimens was reviewed by two pathologists. Statistical analysis of risk factors associated with recurrence was performed. Results: Of a total of 897 patients, 76 had TNBC (8.47%). The mean age of TNBC patients was 50.95 (SD 10.42) years, and the mean follow-up period was 40.06 months. We reviewed 49 slides; 8 patients had recurrent breast cancer (16.32%), and 4 patients died (8.16%). There were 9 lymphocytic predominant breast cancers (LPBC), that is, carcinomas with intratumoral lymphocytes in >60% of tumor cell nests. One LPBC patient had a recurrence and 8 did not. In multivariate logistic regression, the odds ratio of lymphocytic infiltration was 0.59 (p=0.643). Conclusion: In this single-center experience of TNBC, lymphocytic infiltration in tumor cell nests showed a trend toward better prognosis, but it was not statistically significant.

Keywords: tumor-infiltrating lymphocytes, triple negative breast cancer, medical and health sciences

Procedia PDF Downloads 393

24343 Healthcare Big Data Analytics Using Hadoop

Authors: Chellammal Surianarayanan

Abstract:

The healthcare industry is generating large amounts of data driven by various needs such as record keeping, physicians’ prescriptions, medical imaging, sensor data, Electronic Patient Records (EPR), laboratory, pharmacy, etc. Healthcare data is so big and complex that it cannot be managed by conventional hardware and software. The complexity of healthcare big data arises from the large volume of data, the velocity with which the data is accumulated, and its variety, which spans structured, semi-structured and unstructured data. Despite the complexity of big data, if the trends and patterns that exist within it are uncovered and analyzed, higher quality healthcare can be provided at lower cost. Hadoop is an open source software framework for distributed processing of large data sets across clusters of commodity hardware using a simple programming model. The core components of Hadoop include the Hadoop Distributed File System, which offers a way to store large amounts of data across multiple machines, and MapReduce, which offers a way to process large data sets with a parallel, distributed algorithm on a cluster. The Hadoop ecosystem also includes various other tools such as Hive (a SQL-like query language), Pig (a higher level query language for MapReduce), HBase (a columnar data store), etc. This paper analyzes how healthcare big data can be processed and analyzed using the Hadoop ecosystem.
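The MapReduce programming model mentioned in the abstract above can be illustrated with a single-process sketch in plain Python. This does not use Hadoop or its API; it only mimics the map, shuffle and reduce stages that Hadoop would distribute across a cluster. The record fields and function names are hypothetical.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit one (key, 1) pair per patient record."""
    yield (record["department"], 1)


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single count."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence on an in-memory list."""
    pairs = [pair for record in records for pair in map_phase(record)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Hypothetical records, for illustration only.
records = [
    {"patient_id": 1, "department": "radiology"},
    {"patient_id": 2, "department": "pharmacy"},
    {"patient_id": 3, "department": "radiology"},
]
print(run_job(records))  # {'radiology': 2, 'pharmacy': 1}
```

In a real cluster the map and reduce functions run in parallel on different machines, and the shuffle moves data between them over the network.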

Keywords: big data analytics, Hadoop, healthcare data, towards quality healthcare

Procedia PDF Downloads 389

24342 Data Disorders in Healthcare Organizations: Symptoms, Diagnoses, and Treatments

Authors: Zakieh Piri, Shahla Damanabi, Peyman Rezaii Hachesoo

Abstract:

Introduction: Healthcare organizations, like other organizations, suffer from a number of disorders such as Business Sponsor Disorder, Business Acceptance Disorder, Cultural/Political Disorder, Data Disorder, etc. As quality in healthcare mostly depends on the quality of data, we aimed to identify data disorders and their symptoms in two teaching hospitals. Methods: Using a self-constructed questionnaire, we asked 20 questions related to the quality and usability of patient data stored in patient records. The research population consisted of 150 managers, physicians, nurses, and medical record staff who were working at the time of the study. We also asked their views about the symptoms of and treatments for any data disorders they mentioned in the questionnaire. We analyzed the answers using qualitative methods. Results: After classifying the answers, we found six main data disorders: incomplete data, missed data, late data, blurred data, manipulated data, and illegible data. The majority of participants believed in their own important role in the treatment of data disorders, while others attributed the disorders to health system problems. Discussion: As clinicians have important roles in producing data, they can easily identify symptoms and disorders of patient data. Health information managers can also play important roles in early detection of data disorders through proactive monitoring and periodic check-ups of data.

Keywords: data disorders, quality, healthcare, treatment

Procedia PDF Downloads 418

24341 The Effect of Transactional Analysis Group Training on Self-Knowledge and Its Ego States (The Child, Parent, and Adult): A Quasi-Experimental Study Applied to Counselors of Tehran

Authors: Mehravar Javid, Sadrieh Khajavi Mazanderani, Kelly Gleischman, Zoe Andris

Abstract:

The present study was conducted with the aim of investigating the effectiveness of transactional analysis group training on self-knowledge and Its dimensions (self, child, and adult) in counselors working in public and private high schools in Tehran. Counseling has become an important job for society, and there is a need for consultants in organizations. Providing better and more efficient counseling is one of the goals of the education system. The personal characteristics of counselors are important for the success of the therapy. In TA, humans have three ego states, which are named parent, adult, and child, and the main concept in the transactional analysis is self-state, which means a stable feeling and pattern of thinking related to behavioral patterns. Self-knowledge, considered a prerequisite to effective communication, fosters psychological growth, and recognizing it, is pivotal for emotional development, leading to profound insights. The research sample included 30 working counselors (22 women and 8 men) in the academic year 2019-2020 who achieved the lowest scores on the self-knowledge questionnaire. The research method was quasi-experimental with a control group (15 people in the experimental group and 15 people in the control group). The research tool was a self-awareness questionnaire with 29 questions and three subscales (child, parent, and adult Ego state). The experimental group was exposed to transactional analysis training for 10 once-weekly 2-hour sessions; the questionnaire was implemented in both groups (post-test). Multivariate covariance analysis was used to analyze the data. The data showed that the level of self-awareness of counselors who received transactional analysis training is higher than that of counselors who did not receive any training (p<0.01). 
The result obtained from this analysis shows that transactional analysis training is an effective therapy for enhancing self-knowledge and its subscales (Adult ego state, Parent ego state, and Child ego state). Teaching transactional analysis increases self-knowledge and self-realization and helps people to achieve independence and overcome irresponsibility in order to improve intra-personal and interpersonal relationships.

Keywords: ego state, group, transactional analysis, self-knowledge

Procedia PDF Downloads 56

24340 Big Data and Analytics in Higher Education: An Assessment of Its Status, Relevance and Future in the Republic of the Philippines

Authors: Byron Joseph A. Hallar, Annjeannette Alain D. Galang, Maria Visitacion N. Gumabay

Abstract:

One of the unique challenges posed by the twenty-first century to Philippine higher education is the utilization of Big Data. The higher education system in the Philippines is generating burgeoning amounts of data that can be used to produce the information and knowledge needed for accurate data-driven decision making. This study examines the status, relevance and future of Big Data and Analytics in Philippine higher education. The insights gained from the study may be relevant to other developing nations similarly situated to the Philippines.

Keywords: big data, data analytics, higher education, republic of the philippines, assessment

Procedia PDF Downloads 323

24339 Data Management and Analytics for Intelligent Grid

Authors: G. Julius P. Roy, Prateek Saxena, Sanjeev Singh

Abstract:

Two decades ago, power distribution utilities collected data from their customers no more often than once a month. The advent of SmartGrid and AMI has since increased the sampling frequency, leading to a 1000- to 10000-fold increase in data quantity. This notable increase led to the term Big Data being applied in utilities. The power distribution industry is one of the largest handlers of huge and complex data, both for keeping history and for turning the data into meaningful information. The majority of utilities around the globe are adopting SmartGrid technologies at scale and are primarily focusing on the strategic interdependence and synergies of the big data coming from new information sources like AMI and intelligent SCADA. As a result, there is a rising need for new models of data management and a renewed focus on analytics to dissect data into descriptive, predictive and prescriptive subsets. The goal of this paper is to bring load disaggregation into the smart energy toolkit for commercial usage.

Keywords: data management, analytics, energy data analytics, smart grid, smart utilities

Procedia PDF Downloads 764

24338 Comparison of Power Generation Status of Photovoltaic Systems under Different Weather Conditions

Authors: Zhaojun Wang, Zongdi Sun, Qinqin Cui, Xingwan Ren

Abstract:

Based on multivariate statistical analysis theory, this paper uses the principal component analysis method, the Mahalanobis distance analysis method and the fitting method to establish a photovoltaic health model for evaluating the health of photovoltaic panels. First, the photovoltaic panel variable data are classified by weather condition into five categories: sunny, cloudy, rainy, foggy, and overcast. The health of photovoltaic panels in these five types of weather is studied. Secondly, a scatterplot of the relationship between the amount of electricity generated under each kind of weather and the other variables was plotted. It was found that the amount of electricity generated by photovoltaic panels has a significant nonlinear relationship with time. The fitting method was used to fit the relationship between the amount of electricity generated and time, and a nonlinear equation was obtained. Then, the principal component analysis method was used to analyze the independent variables under the five kinds of weather conditions. According to the Kaiser-Meyer-Olkin test, three types of weather (overcast, foggy, and sunny) meet the conditions for factor analysis, while cloudy and rainy weather do not. Through the principal component analysis method, the main components of overcast weather were found to be temperature, AQI, and PM2.5. The main component of foggy weather is temperature, and the main components of sunny weather are temperature, AQI, and PM2.5. Cloudy and rainy weather require analysis of all of their variables, namely temperature, AQI, PM2.5, solar radiation intensity and time. Finally, taking the variable values in sunny weather as observed values and the main components of cloudy, foggy, overcast and rainy weather as sample data, the Mahalanobis distances between the observed values and these sample values were obtained.
A comparative analysis of the degree of deviation of the Mahalanobis distance was carried out to determine the health of the photovoltaic panels under different weather conditions. It was found that the weather conditions, ordered from the smallest to the largest fluctuation in Mahalanobis distance, were: foggy, cloudy, overcast and rainy.
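The Mahalanobis distance step described in the abstract above can be sketched with NumPy as follows. This is a minimal illustration under assumptions, not the authors' SPSS or MATLAB workflow: the function name and the toy sample are hypothetical, and the pseudo-inverse is used only so that the sketch tolerates a singular covariance matrix.

```python
import numpy as np


def mahalanobis_distance(observation, sample):
    """Mahalanobis distance between one observation and a sample.

    observation: 1-D array of variable values (e.g. temperature, AQI).
    sample: 2-D array with one row per measurement and one column per variable.
    """
    sample = np.asarray(sample, dtype=float)
    mean = sample.mean(axis=0)
    # Sample covariance of the columns; atleast_2d covers the one-variable case.
    covariance = np.atleast_2d(np.cov(sample, rowvar=False))
    diff = np.asarray(observation, dtype=float) - mean
    return float(np.sqrt(diff @ np.linalg.pinv(covariance) @ diff))


# Hypothetical two-variable sample, for illustration only.
sample = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
print(mahalanobis_distance([1.0, 1.0], sample))  # distance at the sample mean
print(mahalanobis_distance([3.0, 1.0], sample))
```

Unlike the Euclidean distance, this measure rescales each direction by the sample's variance and correlation, which is why it suits variables on different scales such as temperature and AQI.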

Keywords: fitting, principal component analysis, Mahalanobis distance, SPSS, MATLAB

Procedia PDF Downloads 127

24337 Privacy Preserving Data Publishing Based on Sensitivity in Context of Big Data Using Hive

Authors: P. Srinivasa Rao, K. Venkatesh Sharma, G. Sadhya Devi, V. Nagesh

Abstract:

Privacy Preserving Data Publication is a major concern today because the amount of data published on the internet is increasing day by day. Because of its size, this huge amount of data is called Big Data. This project deals with privacy preservation in the context of Big Data using a data warehousing solution called Hive. We implemented Nearest Similarity Based Clustering (NSB) with bottom-up generalization to achieve (v,l)-anonymity. (v,l)-Anonymity deals with sensitivity vulnerabilities and ensures individual privacy. We also calculate the sensitivity levels by a simple comparison method using the index values, classifying the different levels of sensitivity. The experiments were carried out in the Hive environment to verify the efficiency of the algorithms with Big Data. This framework also supports the execution of existing algorithms without any changes. The model in the paper outperforms existing models.

Keywords: sensitivity, sensitive level, clustering, Privacy Preserving Data Publication (PPDP), bottom-up generalization, Big Data

Procedia PDF Downloads 277

24336 A Fuzzy Kernel K-Medoids Algorithm for Clustering Uncertain Data Objects

Authors: Behnam Tavakkol

Abstract:

Uncertain data mining algorithms use different ways to consider uncertainty in data such as by representing a data object as a sample of points or a probability distribution. Fuzzy methods have long been used for clustering traditional (certain) data objects. They are used to produce non-crisp cluster labels. For uncertain data, however, besides some uncertain fuzzy k-medoids algorithms, not many other fuzzy clustering methods have been developed. In this work, we develop a fuzzy kernel k-medoids algorithm for clustering uncertain data objects. The developed fuzzy kernel k-medoids algorithm is superior to existing fuzzy k-medoids algorithms in clustering data sets with non-linearly separable clusters.
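
To make the ingredients named in this abstract more concrete, the sketch below shows one common way to combine a kernel-induced distance with fuzzy memberships and a medoid update. It is only an illustration under stated assumptions: it works on ordinary (certain) points rather than uncertain data objects, it assumes an RBF kernel with a hypothetical `gamma` parameter, and it uses the standard fuzzy c-means membership formula with fuzzifier `m`. None of these choices is taken from the paper, whose algorithm is not described in the abstract.

```python
from math import exp


def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel k(x, y) = exp(-gamma * ||x - y||^2); an assumed kernel choice."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return exp(-gamma * sq)


def kernel_sq_distance(x, y, gamma=1.0):
    """Squared feature-space distance: k(x,x) - 2 k(x,y) + k(y,y)."""
    return (rbf_kernel(x, x, gamma)
            - 2.0 * rbf_kernel(x, y, gamma)
            + rbf_kernel(y, y, gamma))


def fuzzy_memberships(points, medoids, m=2.0, gamma=1.0):
    """Fuzzy c-means style memberships; each row sums to 1 (requires m > 1)."""
    power = 1.0 / (m - 1.0)
    memberships = []
    for x in points:
        d2 = [kernel_sq_distance(x, c, gamma) for c in medoids]
        if any(d == 0.0 for d in d2):
            # The point coincides with a medoid: give it full membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in d2]
            total = sum(hits)
            memberships.append([h / total for h in hits])
            continue
        memberships.append(
            [1.0 / sum((dj / dk) ** power for dk in d2) for dj in d2]
        )
    return memberships


def update_medoids(points, memberships, n_clusters, m=2.0, gamma=1.0):
    """For each cluster, pick the data point minimizing the weighted kernel cost."""
    medoids = []
    for j in range(n_clusters):
        def cost(candidate, j=j):
            return sum((u[j] ** m) * kernel_sq_distance(x, candidate, gamma)
                       for x, u in zip(points, memberships))
        medoids.append(min(points, key=cost))
    return medoids


# Hypothetical toy data: two well-separated groups of points.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
medoids = [points[0], points[2]]
u = fuzzy_memberships(points, medoids)
new_medoids = update_medoids(points, u, n_clusters=2)
```

Alternating the two functions until the medoids stop changing gives a basic clustering loop. Handling uncertain objects would require a distance or kernel defined between samples or distributions, which this sketch deliberately leaves out.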

Keywords: clustering algorithm, fuzzy methods, kernel k-medoids, uncertain data

Procedia PDF Downloads 198

24335 Democracy Bytes: Interrogating the Exploitation of Data Democracy by Radical Terrorist Organizations

Authors: Nirmala Gopal, Sheetal Bhoola, Audecious Mugwagwa

Abstract:

This paper discusses the continued infringement and exploitation of data by non-state actors for destructive purposes, with an emphasis on radical terrorist organizations. It discusses how terrorist organizations access and use data to further their nefarious agendas. It further examines how cybersecurity, designed as a tool to curb data exploitation, is ineffective in addressing global citizens' concerns about how their data can be kept safe and used for the purpose for which it was acquired. The study interrogates several policies and data protection instruments, such as the Data Protection Act, Cyber Security Policies, Protection of Personal Information (PPI), and the General Data Protection Regulation (GDPR), to understand data use and storage in democratic states. The study outcomes indicate that international cybersecurity and cybercrime legislation, policies, and conventions have not curbed violations of data access and use by radical terrorist groups. The study recommends ways to enhance cybersecurity and reduce cyber risks using democratic principles.

Keywords: cybersecurity, data exploitation, terrorist organizations, data democracy

Procedia PDF Downloads 183

24334 Healthcare Data Mining Innovations

Authors: Eugenia Jilinguirian

Abstract:

In the healthcare industry, data mining is essential because it transforms the field by extracting useful information from large datasets. Data mining is the process of applying advanced analytical methods to large collections of patient records and medical histories in order to identify patterns, correlations, and trends. Healthcare professionals can improve diagnostic accuracy, uncover hidden linkages, and predict disease outcomes by carefully examining these data. Additionally, data mining supports personalized medicine by tailoring treatment to the unique attributes of each patient. This proactive strategy helps allocate resources more efficiently, enhances patient care, and streamlines operations. However, to apply data mining effectively and to use private healthcare information responsibly, issues like data privacy and security must be carefully considered. Data mining continues to be vital in the search for more effective, efficient, and individualized healthcare solutions as technology evolves.

Keywords: data mining, healthcare, big data, individualised healthcare, healthcare solutions, database

Procedia PDF Downloads 52

24333 Summarizing Data Sets for Data Mining by Using Statistical Methods in Coastal Engineering

Authors: Yunus Doğan, Ahmet Durap

Abstract:

Coastal regions are among the places most heavily used, both by natural systems and by the growing population. In coastal engineering, the most valuable data concern wave behavior. The amount of these data becomes very large because observations take place over periods of hours, days, and months. In this study, statistical methods such as wave spectrum analysis methods and standard statistical methods have been used. The goal of this study is to discover profiles of different coastal areas by using these statistical methods and thus to obtain an instance-based data set from the big data for analysis with data mining algorithms. In the experimental studies, six sample data sets of wave behavior, obtained from 20 minutes of observations in Mersin Bay, Turkey, were converted to an instance-based form, and different clustering techniques among data mining algorithms were used to discover similar coastal places. Moreover, this study discusses how this summarization approach can be used in other fields that collect big data, such as medicine.

Keywords: clustering algorithms, coastal engineering, data mining, data summarization, statistical methods

Procedia PDF Downloads 345

24332 Access to Health Data in Medical Records in Indonesia in Terms of Personal Data Protection Principles: The Limitation and Its Implication

Authors: Anny Retnowati, Elisabeth Sundari

Abstract:

This research aims to elaborate the meaning of personal data protection principles for patient access to health data in medical records in Indonesia, and its implications. The method is normative legal research, examining health law in Indonesia regarding the patient's right to access their health data in medical records. The data are analysed qualitatively using the interpretation method to elaborate on how the meaning of personal data protection principles is limited with respect to patients' access to their data in medical records. The results show that patients only have the right to obtain copies of their health data in medical records. There is no right to inspect the records directly at any time. Indonesian health law limits the principle of patients' right to broad access to their health data in medical records. This restriction has implications for the reduction of personal data protection as part of human rights. This research contributes by showing that a limitation of personal data protection may infringe human rights.

Keywords: access, health data, medical records, personal data, protection

Procedia PDF Downloads 70

24331 Conceptualizing the Knowledge to Manage and Utilize Data Assets in the Context of Digitization: Case Studies of Multinational Industrial Enterprises

Authors: Martin Böhmer, Agatha Dabrowski, Boris Otto

Abstract:

The trend of digitization significantly changes the role of data for enterprises. Data turn from an enabler to an intangible organizational asset that requires management and qualifies as a tradeable good. The idea of a networked economy has gained momentum in the data domain as collaborative approaches for data management emerge. Traditional organizational knowledge consequently needs to be extended by comprehensive knowledge about data. The knowledge about data is vital for organizations to ensure that data quality requirements are met and data can be effectively utilized and sovereignly governed. As this specific knowledge has been paid little attention to so far by academics, the aim of the research presented in this paper is to conceptualize it by proposing a “data knowledge model”. Relevant model entities have been identified based on a design science research (DSR) approach that iteratively integrates insights of various industry case studies and literature research.

Keywords: data management, digitization, industry 4.0, knowledge engineering, metamodel

Procedia PDF Downloads 338

24330 Analysis and Forecasting of Bitcoin Price Using Exogenous Data

Authors: J-C. Leneveu, A. Chereau, L. Mansart, T. Mesbah, M. Wyka

Abstract:

Extracting and interpreting information from Big Data will be a major issue for years to come in several sectors such as finance. Currently, numerous methods (such as Technical Analysis) are used to try to understand and anticipate market behavior, with mixed results, because it still seems impossible to predict a financial trend exactly. The increase in data available on the Internet and their diversity represent a great opportunity for the financial world. Indeed, it is possible, along with standard financial data, to focus on exogenous data in order to take more macroeconomic factors into account. Coupling the interpretation of these data with standard methods could yield more precise trend predictions. In this paper, in order to observe the influence of exogenous data on price independently of other usual effects occurring in classical markets, behaviors of Bitcoin users are introduced into a model reconstituting Bitcoin value, which is elaborated and tested for prediction purposes.

Keywords: big data, bitcoin, data mining, social network, financial trends, exogenous data, global economy, behavioral finance

Procedia PDF Downloads 342

24329 Women’s Empowerment on Modern Contraceptive Use in Poor-Rich Segment of Population: Evidence from South Asian Countries

Authors: Muhammad Asim

Abstract:

Background: Less than half of women in South Asia (SA) use any modern contraceptive method, which leads to a huge burden of unintended pregnancies, unsafe abortions, maternal deaths, and socioeconomic loss. Women's empowerment plays a pivotal role in improving various health-seeking behaviours, including contraceptive use. The objective of this study is to explore the association between women's empowerment and modern contraceptive use among rich and poor segments of the population in SA. Methods: We used the most recent, large-scale demographic health survey data of five South Asian countries, namely Afghanistan, Pakistan, Bangladesh, India, and Nepal. The outcome variable was the current use of modern contraceptive methods. The main exposure variable was a combination (interaction) of socio-economic status (SES) and women's level of empowerment (low, medium, and high), where SES was bifurcated into poor and rich, and women's empowerment was divided into three dimensions: decision making, attitude to violence, and social independence. Moreover, an overall women's empowerment indicator was created from these three dimensions. We applied both descriptive statistics and multivariable logistic regression techniques for data analysis. Results: Most of the women possessed a 'medium' level of empowerment across South Asian countries. Within SA, the lowest attitude-to-violence empowerment was found in Afghanistan, and the lowest social independence empowerment was observed in Bangladesh. However, Pakistani women have the lowest decision-making empowerment in the region. The lowest modern contraceptive use (22.1%) was found in Afghanistan and the highest (53.2%) in Bangladesh. The multivariate results show that the overall measure of women's empowerment does not affect modern contraceptive use among poor and rich women in most South Asian countries. However, decision-making empowerment plays a significant role in the use of modern contraceptive methods among both poor and rich women across South Asian countries. Conclusions: The effect of women's empowerment on modern contraceptive use is not consistent across countries or between poor and rich segments of the population. Of the three dimensions of women's empowerment, autonomy of decision making in household affairs emerged as a stronger determinant of the modern contraceptive prevalence rate (mCPR) than social independence and attitude towards violence against women.

Keywords: women empowerment, contraceptive use, South Asia, women autonomy

Procedia PDF Downloads 65

24328 The Role of Identifications in Women Psychopathology

Authors: Mary Gouva, Elena Dragioti, Evangelia Kotrsotsiou

Abstract:

Family identification has the potential to play a decisive role in psychopathology. In this study, we aimed to investigate the impact of family identifications on female psychopathology. A community sample of 101 women (mean age 20.81 years, SD = 0.91, range 20-25) participated in the present study. The participants completed a) the Symptom Check-List Revised (SCL-90) and b) a questionnaire covering socio-demographic information and questions about family identifications. The largest share of women (47.1%) reported that they identified with the father. Age and birth order did not contribute to family identifications (F(5) = 2.188, p = .062 and F(3) = 1.244, p = .299, respectively). Multivariate analysis using MANCOVA found statistically significant associations between family identifications and the domains of psychopathology measured by the SCL-90 (p < .05). Our results highlight the role of identifications, especially with the father, in female psychopathology, and replicate the Freudian perception of the female Oedipus complex.

Keywords: family identification, psychoanalysis, psychopathology, women

Procedia PDF Downloads 303

24327 Women’s Empowerment on Modern Contraceptive Use in Poor-Rich Segment of Population: Evidence From South Asian Countries

Authors: Muhammad Asim

Abstract:

Background: Less than half of women in South Asia (SA) use any modern contraceptive method, which leads to a huge burden of unintended pregnancies, unsafe abortions, maternal deaths, and socioeconomic loss. Women's empowerment plays a pivotal role in improving various health-seeking behaviours, including contraceptive use. The objective of this study is to explore the association between women's empowerment and modern contraceptive use among rich and poor segments of the population in SA. Methods: We used the most recent, large-scale demographic health survey data of five South Asian countries, namely Afghanistan, Pakistan, Bangladesh, India, and Nepal. The outcome variable was the current use of modern contraceptive methods. The main exposure variable was a combination (interaction) of socio-economic status (SES) and women's level of empowerment (low, medium, and high), where SES was bifurcated into poor and rich, and women's empowerment was divided into three dimensions: decision making, attitude to violence, and social independence. Moreover, an overall women's empowerment indicator was created from these three dimensions. We applied both descriptive statistics and multivariable logistic regression techniques for data analysis. Results: Most of the women possessed a 'medium' level of empowerment across South Asian countries. Within SA, the lowest attitude-to-violence empowerment was found in Afghanistan, and the lowest social independence empowerment was observed in Bangladesh. However, Pakistani women have the lowest decision-making empowerment in the region. The lowest modern contraceptive use (22.1%) was found in Afghanistan and the highest (53.2%) in Bangladesh. The multivariate results show that the overall measure of women's empowerment does not affect modern contraceptive use among poor and rich women in most South Asian countries. However, decision-making empowerment plays a significant role in the use of modern contraceptive methods among both poor and rich women across South Asian countries. Conclusions: The effect of women's empowerment on modern contraceptive use is not consistent across countries or between poor and rich segments of the population. Of the three dimensions of women's empowerment, autonomy of decision making in household affairs emerged as a stronger determinant of the modern contraceptive prevalence rate (mCPR) than social independence and attitude towards violence against women.

Keywords: women empowerment, modern contraceptive use, South Asia, women autonomy

Procedia PDF Downloads 65

24326 On the Combination of Patient-Generated Data with Data from a Secure Clinical Network Environment: A Practical Example

Authors: Jeroen S. de Bruin, Karin Schindler, Christian Schuh

Abstract:

With increasingly more mobile health applications appearing due to the popularity of smartphones, the possibility arises that these data can be used to improve the medical diagnostic process, as well as the overall quality of healthcare, while at the same time lowering costs. However, as of yet there have been no reports of a successful combination of patient-generated data from smartphones with data from clinical routine. In this paper, we describe how these two types of data can be combined in a secure way without modification to hospital information systems, and how they can together be used in a medical expert system for automatic nutritional classification and triage.

Keywords: mobile health, data integration, expert systems, disease-related malnutrition

Procedia PDF Downloads 463