Search results for: multivariate data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24454

24124 Missing Link Data Estimation with Recurrent Neural Network: An Application Using Speed Data of Daegu Metropolitan Area

Authors: JaeHwan Yang, Da-Woon Jeong, Seung-Young Kho, Dong-Kyu Kim

Abstract:

In Intelligent Transportation Systems (ITS), information on link characteristics is essential for planning and operation. In practice, however, not every link has sensors installed. A link without data is called a “Missing Link”. The purpose of this study is to impute data for these missing links by applying machine learning. With deep learning in particular, missing link data can be estimated from the data of links that do have sensors. This study uses a Recurrent Neural Network to handle the time-series road data. As input, Dedicated Short-Range Communications (DSRC) data from Dalgubul-daero in the Daegu Metropolitan Area were fed into the learning process. The network takes 17 links with available data as input, has 2 hidden layers, and outputs data for 1 missing link. The estimated data for the target link show about 94% accuracy compared with actual data.
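The network shape described above (17 input links, 2 hidden layers, 1 output link) can be sketched as a forward pass. This is a minimal illustration, not the authors' model: the hidden size of 8, the Elman-style `tanh` recurrence, the random untrained weights, and the names `RecurrentLayer` and `estimate_missing_link` are all assumptions introduced here.

```python
import math
import random


def make_matrix(rows, cols, rng, scale=0.1):
    """Small random weight matrix stored as a list of rows."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class RecurrentLayer:
    """Elman-style layer: h_t = tanh(W_in x_t + W_rec h_{t-1} + b)."""

    def __init__(self, n_in, n_hidden, rng):
        self.w_in = make_matrix(n_hidden, n_in, rng)
        self.w_rec = make_matrix(n_hidden, n_hidden, rng)
        self.bias = [0.0] * n_hidden

    def step(self, x, h_prev):
        a = matvec(self.w_in, x)
        r = matvec(self.w_rec, h_prev)
        return [math.tanh(ai + ri + bi) for ai, ri, bi in zip(a, r, self.bias)]


def estimate_missing_link(sequence, n_hidden=8, seed=0):
    """Run an UNTRAINED 17 -> hidden -> hidden -> 1 network over a sequence.

    sequence: list of time steps, each a list of 17 (scaled) link speeds.
    Returns one value per time step for the single missing link.
    Hidden size and weights are illustrative assumptions.
    """
    rng = random.Random(seed)
    layer1 = RecurrentLayer(17, n_hidden, rng)
    layer2 = RecurrentLayer(n_hidden, n_hidden, rng)
    w_out = make_matrix(1, n_hidden, rng)
    h1 = [0.0] * n_hidden
    h2 = [0.0] * n_hidden
    outputs = []
    for x in sequence:
        h1 = layer1.step(x, h1)   # first hidden layer carries its own state
        h2 = layer2.step(h1, h2)  # second hidden layer is stacked on the first
        outputs.append(matvec(w_out, h2)[0])
    return outputs
```

Because the weights are random, the outputs carry no meaning until the network is trained against observed speeds; the sketch only shows how the 17 present links flow through two recurrent layers to one estimate per time step.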

Keywords: data estimation, link data, machine learning, road network

Procedia PDF Downloads 487
24123 Customer Data Analysis Model Using Business Intelligence Tools in Telecommunication Companies

Authors: Monica Lia

Abstract:

This article presents a customer data analysis model using business intelligence tools for data modelling, data transformation, data visualization and dynamic report building. Customer analysis for business organizations is based on information from the organization's transactional systems. The paper presents how to develop the data model starting from the data that companies hold in their own operational systems. This data can be transformed into useful information about customers using business intelligence tools. In a mature market, understanding the information inside the data and making forecasts for strategic decisions become more important. Business intelligence tools are used in business organizations to support decision-making.

Keywords: customer analysis, business intelligence, data warehouse, data mining, decisions, self-service reports, interactive visual analysis, dynamic dashboards, use case diagram, process modelling, logical data model, data mart, ETL, star schema, OLAP, data universes

Procedia PDF Downloads 397
24122 Tempo-Spatial Pattern of Progress and Disparity in Child Health in Uttar Pradesh, India

Authors: Gudakesh Yadav

Abstract:

Uttar Pradesh is one of the poorest performing states of India in terms of child health. Using data from three rounds of NFHS and two rounds of DLHS, this paper examines tempo-spatial change in child health and care practices in Uttar Pradesh and its regions. Rate-ratio, CI, multivariate, and decomposition analyses were used for the study. Findings demonstrate that child health care practices have improved over time in all regions of the state. However, the western and southern regions registered the lowest progress in child immunization. Nevertheless, there is no decline in the prevalence of diarrhea and ARI over the period, and it remains critically high in the western and southern regions. These regions also performed poorly in giving ORS and in diarrhea and ARI treatment. Public health services are least preferred for diarrhea and ARI treatment. Results from decomposition analysis reveal that rural residence, mother’s illiteracy and wealth contributed most to the low utilization of child health care practices consistently over the period. The study calls for targeted intervention for vulnerable children to accelerate child health care service utilization. Poorly performing regions should be targeted and routinely monitored on poor child health indicators.

Keywords: Acute Respiratory Infection (ARI), decomposition, diarrhea, inequality, immunization

Procedia PDF Downloads 278
24121 Impact of Diabetes Mellitus Type 2 on Clinical In-Stent Restenosis in First Elective Percutaneous Coronary Intervention Patients

Authors: Leonard Simoni, Ilir Alimehmeti, Ervina Shirka, Endri Hasimi, Ndricim Kallashi, Verona Beka, Suerta Kabili, Artan Goda

Abstract:

Background: Diabetes Mellitus type 2, small vessel calibre, stented length of vessel, complex lesion morphology, and prior bypass surgery have been identified as risk factors for In-Stent Restenosis (ISR). However, there are some contradictory results about body mass index (BMI) as a risk factor for ISR. Purpose: We want to identify clinical, lesional and procedural factors that can predict clinical ISR in our patients. Methods: We enrolled 759 patients who underwent first-time elective PCI with Bare Metal Stents (BMS) from September 2011 to December 2013 in our Department of Cardiology and followed them for at least 1.5 years, with a median of 862 days (2 years and 4 months). Only the patients re-admitted with ischemic heart disease underwent control coronary angiography; no routine angiographic control was performed. Patients were categorized into ISR and non-ISR groups, and the groups were compared. Multivariate analysis (binary logistic regression, forward conditional method) was used to identify independent predictive risk factors. P was considered statistically significant when <0.05. Results: ISR compared to non-ISR individuals had a significantly lower BMI (25.7±3.3 vs. 26.9±3.7, p=0.004), higher risk anatomy (LM + 3-vessel CAD) (23% vs. 14%, p=0.03), higher number of stents/person used (2.1±1.1 vs. 1.75±0.96, p=0.004), greater length of stents/person used (39.3±21.6 vs. 33.3±18.5, p=0.01), and a lower use of clopidogrel and ASA (together) (95% vs. 99%, p=0.012). They also had a higher, although not statistically significant, prevalence of Diabetes Mellitus (42% vs. 32%, p=0.072) and a greater number of treated vessels (1.36±0.5 vs. 1.26±0.5, p=0.08). In the multivariate analysis, Diabetes Mellitus type 2 and multiple stents used were independent predictive risk factors for In-Stent Restenosis, OR 1.66 [1.03-2.68], p=0.039, and OR 1.44 [1.16-1.78], p=0.001, respectively. On the other hand, higher BMI and use of clopidogrel and ASA together were protective factors, OR 0.88 [0.81-0.95], p=0.001, and OR 0.2 [0.06-0.72], p=0.013, respectively. Conclusion: Diabetes Mellitus and multiple stents are strong predictive risk factors, whereas the use of clopidogrel and ASA together is a protective factor for clinical In-Stent Restenosis. Paradoxically, high BMI is a protective factor for In-Stent Restenosis, probably related to a larger diameter of vessels and consequently a larger diameter of stents implanted in these patients. Further studies are needed to clarify this finding.
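The reported odds ratios and their intervals can be related to the underlying logistic regression coefficients. The sketch below assumes the bracketed ranges are 95% Wald intervals (symmetric on the log scale), which the abstract does not state; the function name `wald_from_odds_ratio` is hypothetical. Under that assumption it recovers the coefficient, its standard error, and an approximate two-sided p-value.

```python
import math


def wald_from_odds_ratio(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover beta, SE, z and two-sided p from a reported OR and its CI.

    Assumes a Wald interval: exp(beta - z_crit*SE) .. exp(beta + z_crit*SE).
    """
    beta = math.log(odds_ratio)  # logistic coefficient on the log-odds scale
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return beta, se, z, p


# Diabetes Mellitus type 2: OR 1.66 [1.03-2.68] as reported above
beta, se, z, p = wald_from_odds_ratio(1.66, 1.03, 2.68)
```

For the diabetes estimate this gives a p-value a little under 0.04, close to the reported p=0.039; small differences are expected because the published figures are rounded.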

Keywords: body mass index, diabetes mellitus, in-stent restenosis, percutaneous coronary intervention

Procedia PDF Downloads 180
24120 The Effect of Meta-Cognitive Therapy on Meta-Cognitive Defects and Emotional Regulation in Substance Dependence Patients

Authors: Sahra Setorg

Abstract:

The purpose of this study was to determine the effect of meta-cognitive therapy (MCT) on meta-cognitive defects and emotional regulation in industrial substance dependence patients. This quasi-experimental research was conducted with a post-test and two-month follow-up design with control and experimental groups. The statistical population consisted of all industrial substance dependence patients referred to addiction withdrawal clinics in Esfahan city, Iran, in 2013. 45 patients were selected from three clinics through convenience sampling and were randomly divided into two experimental groups (15 crack-dependent patients, 15 amphetamine-dependent patients) and one control group (n=15). The meta-cognitive questionnaire (MCQ) and the difficulties in emotional regulation questionnaire (DERS) were used as pre-test measures, and the experimental groups (crack and amphetamine) received 8 MCT sessions in groups. The data were analyzed via multivariate analysis of covariance in SPSS 18. The results showed that MCT had a significant effect in improving the meta-cognitive defects in crack and amphetamine dependence. This therapy can also increase emotional regulation in both groups (p<0.05). The effect of this therapy was confirmed at the two-month follow-up. According to these findings, meta-cognition is an intermediary and important variable in the prevention, control, and treatment of the new industrial substance dependences.

Keywords: meta-cognitive therapy, meta-cognitive defects, emotional regulation, substance dependence disorder

Procedia PDF Downloads 484
24119 The Effectiveness of Mindfulness Education on Emotional, Psychological, and Social Well-Being in 12th Grade Students in Tehran City

Authors: Fariba Dortaj, H. Bashir Nejad, Akram Dortaj

Abstract:

The aim of the present study is to investigate the effectiveness of mindfulness education on emotional, psychological, and social well-being in 12th grade students in Tehran city. The research method is semi-experimental with a pretest-posttest design and a control group. The statistical population includes all 12th grade students of the 12th district of Tehran city in the academic year 2017 to 2018. From this population, 60 students who had earned low scores in three dimensions of the Subjective Well-Being Questionnaire of Keyes and Magyar-Moe (2003) were selected by random sampling and randomly assigned to two groups, experimental and control. The experimental group then received a mindfulness protocol in 8 sessions of 2 hours. After completion of the sessions, all subjects were re-evaluated. Data were analyzed using multivariate analysis of covariance. The findings showed a significant difference between the experimental and control groups in the emotional well-being aspect, with the components positive emotional affect (P < 0.025, F = 17.80) and negative emotions (P < 0.025, F = 5.41); in the psychological well-being aspect, with the components self-esteem (P < 0.008, F = 25.26), life goal (P < 0.008, F = 38.19), environmental domination (P < 0.008, F = 82.82), relationships with others (P < 0.008, F = 19.12), and personal development (P < 0.008, F = 87.38); and in the social well-being aspect, with the correlation coefficients (P < 0.01, F = 12.21), admission and acceptability (P < 0.01, F = 18.09) and realism (P < 0.01, F = 11.30). It can be said that mindfulness education improves the components of psychological, social and emotional well-being in students.

Keywords: mindfulness, emotional well-being, psychological well-being, social well-being

Procedia PDF Downloads 138
24118 Opening up Government Datasets for Big Data Analysis to Support Policy Decisions

Authors: K. Hardy, A. Maurushat

Abstract:

Policy makers are increasingly looking to make evidence-based decisions. Evidence-based decisions have historically used rigorous methodologies of empirical studies by research institutes, as well as less reliable immediate surveys and polls, often with limited sample sizes. As we move into the era of Big Data analytics, policy makers are looking to different methodologies to deliver reliable empirics in real time. The question is not why these people did this for the last 10 years, but why these people are doing this now and, if this is undesirable, how we can have an impact to promote change immediately. Big data analytics rely heavily on government data that has been released into the public domain. The open data movement promises greater productivity and more efficient delivery of services; however, Australian government agencies remain reluctant to release their data to the general public. This paper considers the barriers to releasing government data as open data, and how these barriers might be overcome.

Keywords: big data, open data, productivity, data governance

Procedia PDF Downloads 343
24117 Time Series Forecasting (TSF) Using Various Deep Learning Models

Authors: Jimeng Shi, Mahek Jain, Giri Narasimhan

Abstract:

Time Series Forecasting (TSF) is used to predict the target variables at a future time point based on learning from previous time points. To keep the problem tractable, learning methods use data from a fixed-length window in the past as an explicit input. In this paper, we study how the performance of predictive models changes as a function of different look-back window sizes and different amounts of time to predict into the future. We also consider the performance of the recent attention-based Transformer models, which have had good success in the image processing and natural language processing domains. In all, we compare four different deep learning methods (RNN, LSTM, GRU, and Transformer) along with a baseline method. The dataset we used is the Beijing Air Quality Dataset from the UCI website, which includes a multivariate time series of many factors measured on an hourly basis for a period of 5 years (2010-14). For each model, we also report on the relationship between the performance and the look-back window sizes and the number of predicted time points into the future. Our experiments suggest that Transformer models have the best performance, with the lowest Mean Absolute Errors (MAE = 14.599, 23.273) and Root Mean Square Errors (RMSE = 23.573, 38.131) for most of our single-step and multi-step predictions. The best size for the look-back window to predict 1 hour into the future appears to be one day, while 2 or 4 days perform best to predict 3 hours into the future.
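The fixed-length look-back window and the two error metrics named above can be illustrated with a short sketch. It is not the authors' pipeline: the function names `make_windows`, `mae`, and `rmse` are hypothetical, the series is univariate for brevity, and MAE is taken here as the mean absolute error.

```python
import math


def make_windows(series, look_back, horizon):
    """Split a series into (past window, future target) training pairs.

    look_back: number of past points given to the model as input.
    horizon:   number of future points the model must predict.
    """
    pairs = []
    last_start = len(series) - look_back - horizon
    for start in range(last_start + 1):
        window = series[start:start + look_back]
        target = series[start + look_back:start + look_back + horizon]
        pairs.append((window, target))
    return pairs


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))
```

With hourly data, `look_back=24, horizon=1` would correspond to using one day of history to predict the next hour, and `look_back=48` or `96` with `horizon=3` to the 2- and 4-day windows mentioned for 3-hour prediction.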

Keywords: air quality prediction, deep learning algorithms, time series forecasting, look-back window

Procedia PDF Downloads 126
24116 A Review on Existing Challenges of Data Mining and Future Research Perspectives

Authors: Hema Bhardwaj, D. Srinivasa Rao

Abstract:

Technology for analysing, processing, and extracting meaningful data from enormous and complicated datasets can be termed "big data." Big data mining and big data analysis are extremely helpful for business activities such as making decisions, building organisational plans, researching the market efficiently, and improving sales, because typical management tools cannot handle such complicated datasets. Big data brings special computational and statistical issues, such as measurement errors, noise accumulation, spurious correlation, and storage and scalability limitations. These unique problems call for new computational and statistical paradigms. This research paper offers an overview of the literature on big data mining and its process, along with its problems and difficulties, with a focus on the unique characteristics of big data. Organizations face several difficulties when undertaking data mining, which has an impact on their decision-making. Every day, terabytes of data are produced, yet only around 1% of that data is actually analyzed. This study presents the idea of data mining and analysis and the knowledge discovery techniques that have recently been created with practical application systems. The article's conclusion also includes a list of issues and difficulties for further research in the area. The report discusses the main big data and data mining challenges facing management.

Keywords: big data, data mining, data analysis, knowledge discovery techniques, data mining challenges

Procedia PDF Downloads 84
24115 Investigation of the Effect of Teaching Thinking and Research Lesson by Cooperative and Traditional Methods on Creativity of Sixth Grade Students

Authors: Faroogh Khakzad, Marzieh Dehghani, Elahe Hejazi

Abstract:

The present study investigates the effect of teaching a Thinking and Research lesson by cooperative and traditional methods on the creativity of sixth-grade students in Piranshahr province. The statistical population includes all the sixth-grade students of Piranshahr province. The sample of this study was selected by convenience sampling from among male elementary schools of Piranshahr. The students were randomly assigned to two groups, a cooperative teaching method group and a traditional teaching method group. The design of the study is quasi-experimental with a control group. To assess students’ creativity, Abedi’s creativity questionnaire was used. Based on Cronbach’s alpha coefficient, the reliability of the flow factor was 0.74, innovation was 0.61, flexibility was 0.63, and expansion was 0.68. To analyze the data, t-tests and univariate and multivariate analysis of covariance were used to evaluate the difference of means and the pretest and posttest scores. The findings showed that the cooperative teaching method does not significantly increase creativity (p > 0.05). Moreover, the cooperative teaching method was found to have a significant effect on the flow factor (p < 0.05), but no significant effect was observed on the innovation and expansion factors (p > 0.05).

Keywords: cooperative teaching method, traditional teaching method, creativity, flow, innovation, flexibility, expansion, thinking and research lesson

Procedia PDF Downloads 290
24114 Prevalence of Malnutrition and Associated Factors among Children Aged 6-59 Months at Hidabu Abote District, North Shewa, Oromia Regional State

Authors: Kebede Mengistu, Kassahun Alemu, Bikes Destaw

Abstract:

Introduction: Malnutrition continues to be a major public health problem in developing countries. It is the most important risk factor for the burden of diseases. It causes about 300,000 deaths per year and is responsible for more than half of all deaths in children. In Ethiopia, child malnutrition is one of the most serious public health problems, and the rate is among the highest in the world. High malnutrition rates in the country pose a significant obstacle to achieving better child health outcomes. Objective: To assess the prevalence of malnutrition and associated factors among children aged 6-59 months at Hidabu Abote district, North Shewa, Oromia. Methods: A community-based cross-sectional study was conducted on 820 children aged 6-59 months from September 8-23, 2012 at Hidabu Abote district. A multistage sampling method was used to select households. Children were selected from each kebele by simple random sampling. Anthropometric measurements and structured questionnaires were used. Data were processed using Epi Info software and exported to SPSS for analysis. Sex, age in months, height, and weight were then transferred with household (HH) numbers to ENA for SMART 2007 software to convert the nutritional data into Z-scores of the indices H/A, W/H and W/A. Bivariate and multivariate logistic regressions were used to identify factors associated with malnutrition. Results: The analysis revealed that 47.6%, 30.9% and 16.7% of children were stunted, underweight and wasted, respectively. The main factors associated with stunting were child age, family monthly income, receiving butter as pre-lacteal feeding, and family planning. Underweight was associated with the number of children in the household and receiving butter as pre-lacteal feeding, whereas only non-treatment of household water was associated with wasting. Conclusion and recommendation: From the findings of this study, it is concluded that malnutrition is still an important problem among children aged 6-59 months. Therefore, special attention should be given to interventions against malnutrition.
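Once Z-scores of the three indices are available, turning them into prevalence figures is a simple thresholding step. The sketch below assumes the conventional cut-off of -2 standard deviations for each index, which the abstract does not state; the function names and the sample values are hypothetical and are not data from the study.

```python
CUTOFF = -2.0  # conventional cut-off in SD units; an assumption, not from the abstract


def classify(haz, waz, whz, cutoff=CUTOFF):
    """Flag one child from height-for-age, weight-for-age, weight-for-height Z-scores."""
    return {
        "stunted": haz < cutoff,      # low H/A
        "underweight": waz < cutoff,  # low W/A
        "wasted": whz < cutoff,       # low W/H
    }


def prevalence(children, cutoff=CUTOFF):
    """Percentage of children flagged on each index.

    children: list of (haz, waz, whz) tuples.
    """
    flags = [classify(haz, waz, whz, cutoff) for haz, waz, whz in children]
    total = len(flags)
    return {
        key: 100.0 * sum(f[key] for f in flags) / total
        for key in ("stunted", "underweight", "wasted")
    }


# Made-up Z-scores for four children, for illustration only
sample = [(-2.5, -1.0, 0.2), (-1.0, -2.1, -2.3), (0.0, 0.0, 0.0), (-3.0, -2.5, -1.0)]
rates = prevalence(sample)
```

A child can be counted under more than one index, which is why the three percentages in the results need not sum to 100.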

Keywords: children, Hidabu Abote district, malnutrition, public health

Procedia PDF Downloads 400
24113 A Systematic Review on Challenges in Big Data Environment

Authors: Rimmy Yadav, Anmol Preet Kaur

Abstract:

Big Data has demonstrated vast potential for streamlining operations, supporting decisions, and spotting business trends in fields such as manufacturing, finance, and information technology. This paper gives a multi-disciplinary overview of the research issues in big data and of its techniques, tools, and systems related to privacy, data storage management, network and energy utilization, fault tolerance, and data representation. In addition, the resulting challenges and opportunities in the Big Data platform are discussed.

Keywords: big data, privacy, data management, network and energy consumption

Procedia PDF Downloads 278
24112 The Comparison of Emotional Regulation Strategies and Psychological Symptoms in Patients with Multiple Sclerosis and Normal Individuals

Authors: Amir Salamatzade, Marhamet HematPour

Abstract:

Due to the increasing importance of psychological factors in the incidence and exacerbation of chronic diseases such as multiple sclerosis, the aim of this study was to determine the difference in emotional regulation strategies and psychological symptoms between patients with multiple sclerosis and normal people. The research method was causal-comparative (ex post facto). The statistical population included all patients with multiple sclerosis referred to the MS Association of Rasht in the first quarter of 2021, approximately 350 people. The study sample included 120 people (60 patients with multiple sclerosis and 60 normal people) who were selected by convenience sampling and completed the emotional regulation questionnaire and the depression, anxiety, and stress questionnaire of Lovibond and Lovibond (1995). Data were analyzed using an independent t-test and multivariate analysis of variance. The results showed a significant difference in the mean of emotional regulation strategies and in the components of emotional reassessment and emotional inhibition between patients with multiple sclerosis and normal individuals (p < 0.01). There was also a significant difference in the mean of psychological symptoms and in the components of depression, anxiety, and stress between the two groups (p < 0.01). Based on this, it can be concluded that patients with multiple sclerosis have lower levels of emotional regulation strategies and higher levels of psychological symptoms than normal individuals.

Keywords: emotional regulation strategies, psychological symptoms, multiple sclerosis, normal individuals

Procedia PDF Downloads 186
24111 Survey on Big Data Stream Classification by Decision Tree

Authors: Mansoureh Ghiasabadi Farahani, Samira Kalantary, Sara Taghi-Pour, Mahboubeh Shamsi

Abstract:

Nowadays, the development of computer technology and its recent applications provide access to new types of data that have not been considered by traditional data analysts. Two particularly interesting characteristics of such data sets are their huge size and streaming nature. Incremental learning techniques have been used extensively to address the data stream classification problem. This paper presents a concise survey of the obstacles and requirements in classifying data streams using decision trees. The most important issue is to maintain a balance between accuracy and efficiency: the algorithm should provide good classification performance with a reasonable response time.

Keywords: big data, data streams, classification, decision tree

Procedia PDF Downloads 489
24110 Robust and Dedicated Hybrid Cloud Approach for Secure Authorized Deduplication

Authors: Aishwarya Shekhar, Himanshu Sharma

Abstract:

Data deduplication is one of the important data compression techniques for eliminating duplicate copies of repeating data, and has been widely used in cloud storage to reduce the amount of storage space and save bandwidth. In this process, duplicate data is expunged, leaving only a single instance of the data to be stored, while an index entry for every piece of data is still maintained. Data deduplication is an approach for minimizing the storage space an organization requires to retain its data. In most companies, the storage systems carry identical copies of numerous pieces of data. Deduplication eliminates these additional copies by saving just one copy of the data and replacing the other copies with pointers that lead back to the primary copy. To avoid this duplication of data and to preserve confidentiality in the cloud, we apply the concept of the hybrid cloud. A hybrid cloud is a fusion of at least one public and one private cloud. As a proof of concept, we implement Java code which provides security and removes all types of duplicated data from the cloud.
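The "one stored copy plus pointers" idea described above can be sketched with a content hash as the pointer. This is a minimal illustration in Python, not the authors' Java implementation: the class name `DedupStore` and its methods are hypothetical, SHA-256 is an assumed choice of fingerprint, and the sketch leaves out the authorization and confidentiality parts of the paper entirely.

```python
import hashlib


class DedupStore:
    """File-level deduplication: identical contents are stored once."""

    def __init__(self):
        self._blobs = {}  # digest -> bytes, the single physical copy
        self._index = {}  # file name -> digest, the pointer kept per file

    def put(self, name, data):
        """Store data under name; return True if new bytes were written."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self._blobs
        if is_new:
            self._blobs[digest] = data
        self._index[name] = digest  # the index entry is kept even for duplicates
        return is_new

    def get(self, name):
        """Follow the pointer back to the stored copy."""
        return self._blobs[self._index[name]]

    def stored_bytes(self):
        """Physical bytes held after deduplication."""
        return sum(len(blob) for blob in self._blobs.values())


store = DedupStore()
first = store.put("report_v1.txt", b"quarterly figures")
second = store.put("report_copy.txt", b"quarterly figures")  # duplicate content
```

Both names remain readable through `get`, but only one copy of the bytes is held, which is where the storage and bandwidth savings come from.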

Keywords: confidentiality, deduplication, data compression, hybridity of cloud

Procedia PDF Downloads 359
24109 A Review of Machine Learning for Big Data

Authors: Devatha Kalyan Kumar, Aravindraj D., Sadathulla A.

Abstract:

Big data are now rapidly expanding in engineering, science, and many other domains. The potential of large or massive data is undoubtedly significant, but making sense of it requires new ways of thinking and new learning techniques to address the various big data challenges. Machine learning is continuously unleashing its power in a wide range of applications. This paper reviews the latest advances in research on machine learning for big data processing. First, it covers the machine learning techniques used in recent studies, such as deep learning, representation learning, transfer learning, active learning, and distributed and parallel learning. It then focuses on the challenges and possible solutions of machine learning for big data.

Keywords: active learning, big data, deep learning, machine learning

Procedia PDF Downloads 407
24108 Strengthening Legal Protection of Personal Data through Technical Protection Regulation in Line with Human Rights

Authors: Tomy Prihananto, Damar Apri Sudarmadi

Abstract:

Indonesia recognizes the right to privacy as a human right. Indonesia provides legal protection against data management activities because the protection of personal data is a part of human rights. This paper aims to describe the regulation of data management in Indonesia. It is descriptive research with a qualitative approach, collecting data from a literature study. The result is a comprehensive arrangement of data regulation that has been set up as a technical requirement of data protection by encryption methods. Arrangements on encryption and on the protection of personal data are mutually reinforcing in the protection of personal data. Indonesia has two important and immediately enacted laws that provide protection for the privacy of information that is part of human rights.

Keywords: Indonesia, protection, personal data, privacy, human rights, encryption

Procedia PDF Downloads 154
24107 Glycemic Control on Self-Efficacy and Self-Care Behaviors among Omani Adults with Type 2 Diabetes

Authors: Melba Sheila D'Souza, Anandhi Amirtharaj, Shreedevi Balachandran

Abstract:

Background: Type 2 diabetes has a significant impact on individuals’ health and well-being. Glycemic control may influence self-efficacy and self-care behaviors, and reduce the risk of complications among adults with type 2 diabetes. Type 2 diabetes carries substantial morbidity and mortality, and 60% of adults have poor self-care. Glycemic control is associated with reported self-efficacy and self-care behavior. Adults with type 2 diabetes who had less information were less likely to practice diabetes self-care. Aim: To examine the relationship of glycemic control, demographic factors, and clinical factors with self-efficacy and self-care behaviors among Omani adults with type 2 diabetes. Methods: A correlational, descriptive study was used. Omani adults with type 2 diabetes (n=140) were recruited from a public hospital in Oman. The data were collected during January-March 2015. Ethical approval was given by the college research and ethics committee, College of Nursing, and the Hospital, Sultan Qaboos University. Data were collected on self-efficacy, self-care behaviors and glycemic control. The study was approved by the Institution Ethics and Research Committee. Bivariate and multivariate analyses were conducted. Results: Most adults had a fasting blood glucose >7.2 mmol/L (90.7%), with the majority demonstrating ‘uncontrolled or poor HbA1c of > 8%’ (65%). Age, duration of diabetes, medication, HbA1c and prevention of activities of living explained 20.6% of the variance in self-care behavior and 31.3% of the variance in self-efficacy. Adults with type 2 diabetes with poor glycemic control were more likely to have poor self-efficacy and poor self-care behaviors. Conclusion: This study confirms that the self-efficacy model predicts self-efficacy and self-care behavior. Higher understanding of diabetes, prevention of normal daily activities, a higher ability to fit diabetes into life in a positive manner and high patient-physician communication were significantly associated with self-efficacy and self-care behaviors. Hence, glycemic control has a strong effect on improving self-efficacy and self-care behaviors such as diet, exercise, medication and foot care among adults with type 2 diabetes. Implications: Based on these findings, individualized self-care management is recommended for better self-efficacy and self-care behaviors among adults with type 2 diabetes.

Keywords: self-efficacy, self-care behaviors, self-care management, glycemic control, type 2 diabetes, nurse

Procedia PDF Downloads 381
24106 The Various Legal Dimensions of Genomic Data

Authors: Amy Gooden

Abstract:

When human genomic data is considered, this is often done through only one dimension of the law, or the interplay between the various dimensions is not considered, thus providing an incomplete picture of the legal framework. This research considers and analyzes the various dimensions in South African law applicable to genomic sequence data – including property rights, personality rights, and intellectual property rights. The effective use of personal genomic sequence data requires the acknowledgement and harmonization of the rights applicable to such data.

Keywords: artificial intelligence, data, law, genomics, rights

Procedia PDF Downloads 116
24105 Big Brain: A Single Database System for a Federated Data Warehouse Architecture

Authors: X. Gumara Rigol, I. Martínez de Apellaniz Anzuola, A. Garcia Serrano, A. Franzi Cros, O. Vidal Calbet, A. Al Maruf

Abstract:

Traditional federated architectures for data warehousing work well when corporations have existing regional data warehouses and there is a need to aggregate data at a global level. Schibsted Media Group has been maturing from a decentralised organisation into a more globalised one and needed to build some of the regional data warehouses for some brands at the same time as the global one. In this paper, we present the architectural alternatives studied and why a custom federated approach was the recommendation taken forward to implementation. Although the data warehouses are logically federated, the implementation uses a single database system, which presented many advantages, such as cost reduction and improved data access for global users, giving consumers of the data a common data model for detailed analysis across different geographies and a flexible layer for local specific needs in the same place.

Keywords: data integration, data warehousing, federated architecture, Online Analytical Processing (OLAP)

Procedia PDF Downloads 216
24104 A Review Paper on Data Mining and Genetic Algorithm

Authors: Sikander Singh Cheema, Jasmeen Kaur

Abstract:

In this paper, the concept of data mining is summarized, along with one of its important processes, Knowledge Discovery in Databases (KDD). Data mining based on genetic algorithms is examined, and ways to carry out data mining with genetic algorithms are surveyed. This paper also conducts a formal review of data mining tasks and genetic algorithms in various fields.

Keywords: data mining, KDD, genetic algorithm, descriptive mining, predictive mining

Procedia PDF Downloads 566
24103 Data-Mining Approach to Analyzing Industrial Process Information for Real-Time Monitoring

Authors: Seung-Lock Seo

Abstract:

This work presents a data-mining empirical monitoring scheme for industrial processes with partially unbalanced data. Measurement data from good operations are relatively easy to gather, but during unusual special events or faults it is generally difficult to collect process information, and some noisy industrial process data are almost impossible to analyze. In such cases, noise filtering techniques can be used to enhance process monitoring performance on a real-time basis. In addition, pre-processing of raw process data helps to eliminate unwanted variation in industrial process data. In this work, the performance of various monitoring schemes was tested and demonstrated for discrete batch process data. The results showed that monitoring performance improved significantly in terms of the monitoring success rate for given process faults.

Keywords: data mining, process data, monitoring, safety, industrial processes

Procedia PDF Downloads 371
24102 Multidimensional Poverty and Child Cognitive Development

Authors: Bidyadhar Dehury, Sanjay Kumar Mohanty

Abstract:

According to the Right to Education Act of India, education is the fundamental right of all children in the 6-14 year age group, irrespective of their status. Using unit-level data from the India Human Development Survey (IHDS), we examined the inter-relationship between the level of poverty and the academic performance of children aged 8-11 years. The level of multidimensional poverty was measured using five dimensions and 10 indicators following the Alkire-Foster approach. The weighted deprivation score was obtained by giving equal weight to each dimension and to each indicator within a dimension. The weighted deprivation score varies from 0 to 1 and was grouped into four categories: non-poor, vulnerable, multidimensional poor and severe multidimensional poor. The academic performance index was constructed from three variables (reading skills, math skills and writing skills) using PCA. Bivariate and multivariate analyses were used. Because the outcome variable was ordinal, the predicted probabilities were calculated using ordinal logistic regression. The predicted probability of a good academic performance index was 0.202 if the child was severe multidimensional poor, 0.235 if the child was multidimensional poor, 0.264 if the child was vulnerable, and 0.316 if the child was non-poor. Hence, as the level of poverty among children decreases from severe multidimensional poor to non-poor, the probability of good academic performance increases.
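The equal-weighting step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the dimension and indicator names are placeholders, the assumption of two indicators per dimension is ours, and the category cutoffs (0.2, 1/3, 0.5) are assumed values in the style of commonly used multidimensional poverty thresholds, since the abstract does not state them.

```python
from typing import Dict, List

# Hypothetical layout: 5 dimensions and 10 indicators.
# The names and the 2-per-dimension split are placeholders, not from the study.
DIMENSIONS: Dict[str, List[str]] = {
    "dim_a": ["a1", "a2"],
    "dim_b": ["b1", "b2"],
    "dim_c": ["c1", "c2"],
    "dim_d": ["d1", "d2"],
    "dim_e": ["e1", "e2"],
}


def indicator_weights(dimensions: Dict[str, List[str]]) -> Dict[str, float]:
    """Give each dimension equal weight, split equally among its indicators."""
    dimension_weight = 1.0 / len(dimensions)
    weights = {}
    for indicators in dimensions.values():
        for name in indicators:
            weights[name] = dimension_weight / len(indicators)
    return weights


def deprivation_score(deprived: Dict[str, bool],
                      dimensions: Dict[str, List[str]] = DIMENSIONS) -> float:
    """Sum the weights of the indicators in which the child is deprived (0 to 1)."""
    weights = indicator_weights(dimensions)
    return sum(w for name, w in weights.items() if deprived.get(name, False))


def poverty_category(score: float, vulnerable: float = 0.2,
                     poor: float = 1 / 3, severe: float = 0.5) -> str:
    """Map a score to one of four groups; the cutoffs are assumed, not from the abstract."""
    if score >= severe:
        return "severe multidimensional poor"
    if score >= poor:
        return "multidimensional poor"
    if score >= vulnerable:
        return "vulnerable"
    return "non-poor"


# Illustrative child deprived in three of the ten indicators.
example_score = deprivation_score({"a1": True, "a2": True, "b1": True})
example_category = poverty_category(example_score)
```

With equal nested weights, every indicator in this layout carries a weight of 0.1, so the score is simply one tenth of the number of deprivations; an uneven split of indicators across dimensions would change the individual weights but not the procedure.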

Keywords: multidimensional poverty, academic performance index, reading skills, math skills, writing skills, India

Procedia PDF Downloads 564
24101 A Survey of Semantic Integration Approaches in Bioinformatics

Authors: Chaimaa Messaoudi, Rachida Fissoune, Hassan Badir

Abstract:

Technological advances in computer science and data analysis are continuously providing huge volumes of biological data, which are available on the web. Such advances require powerful data integration techniques to extract pertinent knowledge and information for a specific question. Biomedical exploration of these big data often requires complex queries across multiple autonomous, heterogeneous and distributed data sources. Semantic integration is an active area of research in several disciplines, such as databases, information integration, and ontology. We provide a survey of some approaches and techniques for integrating biological data, focusing on those developed in the ontology community.

Keywords: biological ontology, linked data, semantic data integration, semantic web

Procedia PDF Downloads 418
24100 A Multilevel Approach of Reproductive Preferences and Subsequent Behavior in India

Authors: Anjali Bansal

Abstract:

Reproductive preferences mainly deal with two questions: when a couple wants children and how many they want. Questions related to these desires are often included in fertility surveys, as they can provide relevant information on subsequent behavior. The aim of the study is to observe whether respondents’ answers to these questions changed over time. We also tried to identify socio-economic and demographic factors associated with the stability (or instability) of fertility preferences. For this purpose, we used data from IHDS1 (2004-05) and the follow-up survey IHDS2 (2011-12) and applied bivariate, multivariate and multilevel repeated-measures analysis to assess the consistency between responses. The analysis shows that women’s preferences change over time: in the bivariate analysis, 52% of women were not consistent in their desired family size, and large inconsistencies were found in the desire to continue childbearing. To get a better view of these inconsistencies, we computed the Intra Class Correlation (ICC), which measures the consistency of individuals’ fertility responses across the two time periods. We also found that the husband’s desire for an additional child, specifically male offspring, contributes to these variations. Our findings lead us to the conclusion that in India, individuals’ fertility preferences changed over a seven-year period, as the Intra Class Correlation is very small, which reflects the variation in individuals’ responses. Concerted efforts should therefore be made to educate people and conduct motivational programs to promote family planning for family welfare.
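To make the ICC idea concrete, the sketch below computes a one-way random-effects ICC for repeated responses from the same individuals. The abstract does not say which ICC form or software the study used, so the choice of the one-way form, the function name, and the example numbers are all assumptions for illustration only.

```python
from typing import List, Sequence


def icc_oneway(responses: Sequence[Sequence[float]]) -> float:
    """One-way random-effects ICC for n individuals with k responses each.

    ICC = (MSB - MSW) / (MSB + (k - 1) * MSW), where MSB and MSW are the
    between-individual and within-individual mean squares.
    """
    n = len(responses)
    k = len(responses[0])
    if n < 2 or k < 2 or any(len(row) != k for row in responses):
        raise ValueError("need at least 2 individuals with the same k >= 2 responses")

    grand_mean = sum(sum(row) for row in responses) / (n * k)
    row_means: List[float] = [sum(row) / k for row in responses]

    ss_between = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_within = sum((x - m) ** 2 for row, m in zip(responses, row_means) for x in row)

    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))

    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("ICC is undefined when all responses are identical")
    return (ms_between - ms_within) / denominator


# Made-up desired family sizes at two survey rounds (not data from the study).
stable = [[1, 1], [2, 2], [3, 3]]      # everyone repeats their earlier answer
unstable = [[1, 3], [3, 1], [2, 2]]    # answers change between rounds
icc_stable = icc_oneway(stable)
icc_unstable = icc_oneway(unstable)
```

A value near 1 means individuals give nearly the same answer in both rounds, while values near zero or below indicate that most of the variation lies within individuals over time.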

Keywords: change, consistency, preferences, over time

Procedia PDF Downloads 143
24099 Generative AI: A Comparison of Conditional Tabular Generative Adversarial Networks and Conditional Tabular Generative Adversarial Networks with Gaussian Copula in Generating Synthetic Data with Synthetic Data Vault

Authors: Lakshmi Prayaga, Chandra Prayaga, Aaron Wade, Gopi Shankar Mallu, Harsha Satya Pola

Abstract:

Synthetic data generated by Generative Adversarial Networks and Autoencoders is becoming more common as a way to combat the problem of insufficient data for research purposes. However, generating synthetic data is a tedious task requiring an extensive mathematical and programming background. Open-source platforms such as the Synthetic Data Vault (SDV) and Mostly AI offer a user-friendly platform that is accessible to non-technical professionals, who can generate synthetic data to augment existing data for further analysis. The SDV also provides for additions to the generic GAN, such as the Gaussian copula. We present the results from two synthetic data sets (CTGAN data and CTGAN with Gaussian Copula) generated by the SDV and report the findings. The results indicate that the ROC curves and AUC values for the data generated with the added Gaussian copula layer are much higher than those for the data generated by the CTGAN.

Keywords: synthetic data generation, generative adversarial networks, conditional tabular GAN, Gaussian copula

Procedia PDF Downloads 35
24098 Internal Mercury Exposure Levels Correlated to DNA Methylation of Imprinting Gene H19 in Human Sperm of Reproductive-Aged Man

Authors: Zhaoxu Lu, Yufeng Ma, Linying Gao, Li Wang, Mei Qiang

Abstract:

Mercury (Hg) is a well-recognized environmental pollutant known for its developmental toxicity and neurotoxicity, which may result in adverse health outcomes. However, the mechanisms underlying the teratogenic effects of Hg are not well understood. Imprinting genes are emerging regulators of fetal development that are subject to the impact of environmental pollutants. In this study, we examined the association between paternal preconception Hg exposure and the alteration of DNA methylation of imprinting genes in human sperm DNA. A total of 618 men aged 22 to 59 were recruited from the Reproductive Medicine Clinic of Maternal and Child Care Service Center and the Urologic Surgery Clinic of Shanxi Academy of Medical Sciences between April 2015 and March 2016. Demographic information was collected using questionnaires. Urinary Hg concentrations were measured using a fully automatic double-channel hydride generation atomic fluorescence spectrometer. Methylation status in the DMRs of the imprinting genes H19, Meg3 and Peg3 in sperm DNA was examined by bisulfite pyrosequencing in 243 participants. Spearman’s rank correlation and multivariate regression analysis were used to assess the correlation between the sperm DNA methylation status of imprinting genes and urinary Hg levels. The median concentration of Hg for participants overall was 9.09 μg/l (IQR: 5.54 - 12.52 μg/l; range = 0 - 71.35 μg/l); no significant difference was found in median concentrations of Hg among various demographic groups (p > 0.05). The proportion of samples that exceeded the intoxication criterion (10 μg/l) for urinary Hg was 42.6%. Spearman’s rank correlation analysis indicated a negative correlation between urinary Hg concentrations and average DNA methylation levels in the DMR of the imprinted gene H19 (rs = -0.330, p = 0.000). However, no such correlation was found for Peg3 and Meg3.
Further, we analyzed the correlation between the methylation level at each CpG site of H19 and the Hg level. The results showed that three of the 7 CpG sites in the H19 DMR, namely CpG2 (rs = -0.138, p = 0.031), CpG4 (rs = -0.369, p = 0.000) and CpG6 (rs = -0.228, p = 0.000), demonstrated a significant negative correlation between methylation levels and urinary Hg levels. After adjusting for age, smoking, drinking, intake of aquatic products and education by multivariate regression analysis, the results showed a similar correlation. In summary, nonoccupational environmental mercury exposure in reproductive-aged men was associated with altered DNA methylation at the DMR of the imprinting gene H19 in sperm, implicating the susceptibility of developing sperm to environmental insults.
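For readers unfamiliar with the rs statistic reported above, the following sketch shows how a Spearman rank correlation can be computed: both variables are converted to ranks (ties share the average rank) and the Pearson correlation of the ranks is taken. The function names and the numbers are illustrative assumptions, not the study's data or code.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(var_x * var_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up values: higher Hg paired with lower methylation (illustrative only).
urinary_hg = [2.1, 5.5, 9.0, 12.5, 30.0]
methylation = [48.0, 47.2, 45.9, 44.0, 41.5]
rs_example = spearman(urinary_hg, methylation)
```

Because the statistic depends only on ranks, it captures monotonic association and is not distorted by the skewed upper tail that concentration measurements often have.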

Keywords: epigenetics, genomic imprinting gene, DNA methylation, mercury, transgenerational effects, sperm

Procedia PDF Downloads 226
24097 A Privacy Protection Scheme Supporting Fuzzy Search for NDN Routing Cache Data Name

Authors: Feng Tao, Ma Jing, Guo Xian, Wang Jing

Abstract:

Named Data Networking (NDN) replaces the IP address of the traditional network with a data name and adopts a dynamic cache mechanism. In the existing mechanism, however, only one-to-one search can be achieved, because every piece of data has a unique name corresponding to it. There is a certain mapping relationship between data content and data name, so if the data name is intercepted by an adversary, the privacy of the data content and the user’s interest can hardly be guaranteed. To solve this problem, this paper proposes a one-to-many fuzzy search scheme based on order-preserving encryption, which reduces the query overhead by optimizing the caching strategy. In this scheme, we use hash values to keep the user’s query safe from each node during the search, and likewise protect the privacy of the requested data content.

Keywords: NDN, order-preserving encryption, fuzzy search, privacy

Procedia PDF Downloads 450
24096 Healthcare Big Data Analytics Using Hadoop

Authors: Chellammal Surianarayanan

Abstract:

The healthcare industry generates large amounts of data driven by various needs such as record keeping, physicians’ prescriptions, medical imaging, sensor data, Electronic Patient Records (EPR), laboratory, pharmacy, etc. Healthcare data is so big and complex that it cannot be managed by conventional hardware and software. The complexity of healthcare big data arises from the large volume of data, the velocity with which the data is accumulated, and its variety, ranging from structured to semi-structured and unstructured data. Despite this complexity, if the trends and patterns that exist within the big data are uncovered and analyzed, higher quality healthcare can be provided at lower cost. Hadoop is an open source software framework for distributed processing of large data sets across clusters of commodity hardware using a simple programming model. The core components of Hadoop include the Hadoop Distributed File System, which offers a way to store large amounts of data across multiple machines, and MapReduce, which offers a way to process large data sets with a parallel, distributed algorithm on a cluster. The Hadoop ecosystem also includes various other tools such as Hive (a SQL-like query language), Pig (a higher level query language for MapReduce), HBase (a columnar data store), etc. This paper analyzes how healthcare big data can be processed and analyzed using the Hadoop ecosystem.
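The MapReduce programming model mentioned above can be illustrated with a small in-process sketch. This is not Hadoop code and uses no Hadoop API: it runs on a single machine, the function names are our own, and the record format (a comma-separated line whose first field is a record type) is a hypothetical example.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_phase(records: Iterable[str],
              mapper: Callable[[str], Iterable[Tuple[str, int]]]
              ) -> Iterator[Tuple[str, int]]:
    """Apply the mapper to every record, emitting (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all emitted values by key, as the framework does between phases."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups: Dict[str, List[int]],
                 reducer: Callable[[str, List[int]], int]) -> Dict[str, int]:
    """Apply the reducer to each key and its grouped values."""
    return {key: reducer(key, values) for key, values in groups.items()}


def count_by_type(record: str) -> Iterator[Tuple[str, int]]:
    """Mapper: emit (record_type, 1) for a line shaped like 'type,patient_id'."""
    record_type = record.split(",")[0]
    yield record_type, 1


def total(key: str, values: List[int]) -> int:
    """Reducer: add up the counts for one record type."""
    return sum(values)


# Hypothetical healthcare records, invented for illustration.
records = ["lab,patient1", "imaging,patient2", "lab,patient3", "pharmacy,patient1"]
counts = reduce_phase(shuffle(map_phase(records, count_by_type)), total)
```

In a real cluster the map and reduce functions run in parallel on different machines and the grouping step moves data across the network, but the mapper and reducer logic keep this same shape.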

Keywords: big data analytics, Hadoop, healthcare data, towards quality healthcare

Procedia PDF Downloads 382
24095 Object-Oriented Multivariate Proportional-Integral-Derivative Control of Hydraulic Systems

Authors: J. Fernandez de Canete, S. Fernandez-Calvo, I. García-Moral

Abstract:

This paper presents and discusses the application of the object-oriented modelling software SIMSCAPE to hydraulic systems, with particular reference to multivariable proportional-integral-derivative (PID) control. As a result, a particular modelling approach of a double cylinder-piston coupled system is proposed and motivated, and the SIMULINK based PID tuning tool has also been used to select the proper controller parameters. The paper demonstrates the usefulness of the object-oriented approach when both physical modelling and control are tackled.

Keywords: object-oriented modeling, multivariable hydraulic system, multivariable PID control, computer simulation

Procedia PDF Downloads 320