Search results for: count data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25021

24571 Comprehensive Study of Data Science

Authors: Asifa Amara, Prachi Singh, Kanishka, Debargho Pathak, Akshat Kumar, Jayakumar Eravelly

Abstract:

Today's generation is totally dependent on technology that uses data as its fuel. The present study covers innovations and developments in data science and gives an idea of how to use the available data efficiently. It will help readers understand the core concepts of data science. The concept of artificial intelligence was introduced by Alan Turing; its main principle was to create an artificial system that can run independently of human-given programs and can function by analyzing data to understand the requirements of its users. Data science comprises business understanding, data analysis, ethical concerns, programming languages, various fields and sources of data, skills, etc. The usage of data science has evolved over the years. In this review article, we cover one part of data science, namely machine learning. Machine learning uses data science for its work. Machines learn through experience, which helps them do any work more efficiently. This article includes a comparative illustration of human understanding and machine understanding, along with advantages, applications, and real-time examples of machine learning. Data science is an important game changer in human life. Since the advent of data science, we have seen its benefits, how it leads to a better understanding of people, and how it caters to individual needs. It has improved business strategies, the services businesses provide, forecasting, the ability to attain sustainable development, etc. This study also focuses on a better understanding of data science, which will help us create a better world.

Keywords: data science, machine learning, data analytics, artificial intelligence

Procedia PDF Downloads 75
24570 The Effect of Low Voltage Direct Current Applications on the Growth of Microalgae Chlorella Vulgaris

Authors: Osman Kök, İlhami Tüzün, Yaşar Aluç

Abstract:

This study was conducted to explore the effect of direct current (DC) applications on the growth of microalgae Chlorella vulgaris KKU71, isolated from highly saline freshwater. Experiments were implemented based upon the cross-combinations of both the intensity and duration of electric applications, generating a full factorial design of 10V, 20V, 30V, and 5s, 30s, 60s, respectively. Growth parameters of cultures were monitored on Optical Density (OD), Cell Count (CC), Chlorophyll-a, b (Chl-a, b), and Total Carotenoids (TCar). All DC-assisted treatments stimulated the growth and thus led to higher values of growth parameters such as OD, CC, Chl-a, and TCar. Monotonically increasing with the intensity and duration of DC applications, wet and dry biomass yields of the harvested algae reached their highest level at 30V-60s in all sets of treatments. In addition, this increase between DC applications was listed as C(control)<10V<20V<30V and C<5s<30s<60s. As a result, direct current applications increased the biomass.
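The full factorial design described above crosses every voltage with every duration. A minimal sketch of how the treatment list could be enumerated follows; the variable names are illustrative and not taken from the paper, and the untreated control (C) is handled separately from the nine crossed treatments.

```python
from itertools import product

# Factor levels as stated in the abstract.
voltages_v = [10, 20, 30]
durations_s = [5, 30, 60]

# Full factorial: every voltage crossed with every duration (3 x 3 = 9 treatments).
treatments = list(product(voltages_v, durations_s))

for volts, seconds in treatments:
    # Label format mirrors the abstract's "30V-60s" notation.
    print(f"{volts}V-{seconds}s")
```

The last entry in the list is the 30V-60s combination that the abstract reports as giving the highest biomass yield.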

Keywords: Chlorella Vulgaris, direct current, growth, biomass

Procedia PDF Downloads 130
24569 Application of Artificial Neural Network Technique for Diagnosing Asthma

Authors: Azadeh Bashiri

Abstract:

Introduction: Lack of proper diagnosis and inadequate treatment of asthma lead to physical and financial complications. This study aimed to use data mining techniques to create an intelligent neural network system for the diagnosis of asthma. Methods: The study population comprised patients who had visited one of the lung clinics in Tehran. Data were analyzed using the SPSS statistical tool, and Pearson's chi-square coefficient was the basis of decision making for data ranking. The neural network was trained using the backpropagation learning technique. Results: Based on the SPSS analysis used to select the top factors, 13 effective factors were selected. In different runs, the data were mixed in various ways, so different models were built for training and testing the networks; in all modes, the network correctly predicted 100% of cases. Conclusion: Applying data mining methods before designing the system structure, in order to reduce the data dimension and choose the optimal data, leads to a more accurate system. Given the nature of medical data, data mining approaches are therefore necessary.

Keywords: asthma, data mining, Artificial Neural Network, intelligent system

Procedia PDF Downloads 264
24568 Immune Responses and Pathological Manifestations in Chicken to Oral Infection with Salmonella typhimurium

Authors: Mudasir Ahmad Syed, Raashid Ahmd Wani, Mashooq Ahmad Dar, Uneeb Urwat, Riaz Ahmad Shah, Nazir Ahmad Ganai

Abstract:

Salmonella enterica serovar Typhimurium (Salmonella Typhimurium) is a primary avian pathogen responsible for severe intestinal pathology in younger chickens and for economic losses. Salmonella Typhimurium is also able to cause infection in humans, characterized by typhoid fever and acute gastro-intestinal disease. A study was conducted to investigate the pathological, histopathological, haemato-biochemical, and immunological changes and the expression kinetics of the NRAMP (natural resistance associated macrophage protein) gene family (NRAMP1 and NRAMP2) in broiler chickens at 0, 1, 3, 5, 7, 9, 11, 13 and 15 days following experimental infection with Salmonella Typhimurium. Infection was induced in birds through the oral route at 2×10⁸ CFU/ml. Clinical symptoms appeared 4 days post infection (dpi), and after one week the birds showed progressive weakness, anorexia, diarrhea and lowering of the head. On postmortem examination, the liver showed congestion, hemorrhage and necrotic foci on the surface, while the spleen, lungs and intestines revealed congestion and hemorrhages. Histopathological alterations were principally observed in the liver in the second week post infection. Changes in the liver comprised congestion, areas of necrosis, and reticular endothelial hyperplasia in association with mononuclear cell and heterophilic infiltration. Hematological studies confirmed a significant decrease (P<0.05) in RBC count, Hb concentration and PCV. White blood cell count showed a significant increase throughout the experimental study. An increase in heterophils was found up to 7 dpi, and a decreasing pattern was observed afterwards. Initial lymphopenia followed by lymphocytosis was found in infected chicks. Biochemical studies showed a significant increase in glucose, AST and ALT concentration and a significant decrease (P<0.05) in total protein and albumin level in the infected group. Immunological studies showed higher titers of IgG in the infected group as compared to the control group.
The real-time gene expression of the NRAMP1 and NRAMP2 genes increased significantly (P<0.05) in the infected group as compared to controls. The peak expression of the NRAMP1 gene was seen in the liver, spleen and caecum of infected birds at 3 dpi, 5 dpi and 7 dpi respectively, while the peak expression of the NRAMP2 gene in the liver, spleen and caecum of infected chickens was seen at 9 dpi, 5 dpi and 9 dpi respectively. This study has a role in diagnostics and prognostics in the poultry industry for the detection of Salmonella infections at early stages of poultry development.

Keywords: biochemistry, histopathology, NRAMP, poultry, real time expression, Salmonella Typhimurium

Procedia PDF Downloads 326
24567 Interpreting Privacy Harms from a Non-Economic Perspective

Authors: Christopher Muhawe, Masooda Bashir

Abstract:

With increased Internet Communication Technology (ICT), the virtual world has become the new normal. At the same time, there is an unprecedented collection of massive amounts of data by both private and public entities. Unfortunately, this increase in data collection has come in tandem with an increase in data misuse and data breaches. Regrettably, the majority of data breach and data misuse claims have been unsuccessful in United States courts for failure to prove direct injury to physical or economic interests. The requirement to express data privacy harms from an economic or physical stance negates the fact that not all data harms are physical or economic in nature. The challenge is compounded by the fact that data breach harms and risks do not attach immediately. This research will use a descriptive and normative approach to show that not all data harms can be expressed in economic or physical terms. Expressing privacy harms purely from an economic or physical harm perspective negates the fact that data insecurity may result in harms that run counter to the functions of privacy in our lives: the promotion of liberty, selfhood, and autonomy, the promotion of human social relations, and the furtherance of a free society. There is no economic value that can be placed on these functions of privacy. The proposed approach addresses data harms from a psychological and social perspective.

Keywords: data breach and misuse, economic harms, privacy harms, psychological harms

Procedia PDF Downloads 187
24566 Design of Speedy, Scanty Adder for Lossy Application Using QCA

Authors: T. Angeline Priyanka, R. Ganesan

Abstract:

Recent trends in microelectronics technology have gradually changed the strategies used in very large scale integration (VLSI) circuits. Complementary Metal Oxide Semiconductor (CMOS) technology has been the industry standard for implementing VLSI devices for the past two decades, but because of scale-down issues, ultra-low dimensions have not been achieved so far. This has paved the way for Quantum Cellular Automata (QCA), one of the many alternative technologies proposed as a replacement to address the fundamental limits that CMOS technology will face in the years to come. In this brief, a new adder that offers high-speed operation while occupying less area is proposed. The adder is designed especially for error-tolerant applications. In the proposed adder, the overall area (cell count) and simulation time are reduced by 88 and 73 percent, respectively. Various results of the proposed adder are shown and described.

Keywords: quantum cellular automata, carry look ahead adder, ripple carry adder, lossy application, majority gate, crossover

Procedia PDF Downloads 545
24565 Machine Learning Analysis of Student Success in Introductory Calculus Based Physics I Course

Authors: Chandra Prayaga, Aaron Wade, Lakshmi Prayaga, Gopi Shankar Mallu

Abstract:

This paper presents the use of machine learning algorithms to predict the success of students in an introductory physics course. Data having 140 rows pertaining to the performance of two batches of students was used. The lack of sufficient data to train robust machine learning models was compensated for by generating synthetic data similar to the real data. CTGAN and CTGAN with Gaussian Copula (Gaussian) were used to generate synthetic data, with the real data as input. To check the similarity between the real data and each synthetic dataset, pair plots were made. The synthetic data was used to train machine learning models using the PyCaret package. For the CTGAN data, the Ada Boost Classifier (ADA) was found to be the ML model with the best fit, whereas the CTGAN with Gaussian Copula yielded Logistic Regression (LR) as the best model. Both models were then tested for accuracy with the real data. ROC-AUC analysis was performed for all the ten classes of the target variable (Grades A, A-, B+, B, B-, C+, C, C-, D, F). The ADA model with CTGAN data showed a mean AUC score of 0.4377, but the LR model with the Gaussian data showed a mean AUC score of 0.6149. ROC-AUC plots were obtained for each Grade value separately. The LR model with Gaussian data showed consistently better AUC scores compared to the ADA model with CTGAN data, except in two cases of the Grade value, C- and A-.
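A mean AUC over several grade classes is commonly obtained by scoring each class against all the others (one-vs-rest) and averaging. The sketch below assumes that reading, which the abstract does not state explicitly. It uses the rank interpretation of AUC, and the toy probabilities, labels, and three-class grade list are hypothetical, not the paper's data.

```python
def binary_auc(scores, labels):
    """AUC as the chance that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return None  # AUC is undefined when one side is empty
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_ovr_auc(prob_rows, y_true, classes):
    """One-vs-rest AUC per class, plus the unweighted mean over defined classes."""
    aucs = {}
    for k, cls in enumerate(classes):
        scores = [row[k] for row in prob_rows]        # predicted P(class k)
        labels = [1 if y == cls else 0 for y in y_true]
        auc = binary_auc(scores, labels)
        if auc is not None:
            aucs[cls] = auc
    return sum(aucs.values()) / len(aucs), aucs


# Hypothetical predicted probabilities for three grade classes.
classes = ["A", "B", "F"]
prob_rows = [
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.2, 0.6, 0.2],
    [0.3, 0.3, 0.4],
    [0.1, 0.2, 0.7],
]
y_true = ["A", "B", "B", "F", "F"]

mean_auc, per_class = mean_ovr_auc(prob_rows, y_true, classes)
print(mean_auc, per_class)
```

A mean AUC below 0.5, as reported for the ADA model, would indicate ranking worse than chance on average under this definition.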

Keywords: machine learning, student success, physics course, grades, synthetic data, CTGAN, gaussian copula CTGAN

Procedia PDF Downloads 36
24564 Data Access, AI Intensity, and Scale Advantages

Authors: Chuping Lo

Abstract:

This paper presents a simple model demonstrating that ceteris paribus countries with lower barriers to accessing global data tend to earn higher incomes than other countries. Therefore, large countries that inherently have greater data resources tend to have higher incomes than smaller countries, such that the former may be more hesitant than the latter to liberalize cross-border data flows to maintain this advantage. Furthermore, countries with higher artificial intelligence (AI) intensity in production technologies tend to benefit more from economies of scale in data aggregation, leading to higher income and more trade as they are better able to utilize global data.

Keywords: digital intensity, digital divide, international trade, scale of economics

Procedia PDF Downloads 56
24563 Secured Transmission and Reserving Space in Images Before Encryption to Embed Data

Authors: G. R. Navaneesh, E. Nagarajan, C. H. Rajam Raju

Abstract:

Nowadays the multimedia data are used to store some secure information. All previous methods allocate a space in image for data embedding purpose after encryption. In this paper, we propose a novel method by reserving space in image with a boundary surrounded before encryption with a traditional RDH algorithm, which makes it easy for the data hider to reversibly embed data in the encrypted images. The proposed method can achieve real time performance, that is, data extraction and image recovery are free of any error. A secure transmission process is also discussed in this paper, which improves the efficiency by ten times compared to other processes as discussed.

Keywords: secure communication, reserving room before encryption, least significant bits, image encryption, reversible data hiding

Procedia PDF Downloads 406
24562 Identity Verification Using k-NN Classifiers and Autistic Genetic Data

Authors: Fuad M. Alkoot

Abstract:

DNA data have been used in forensics for decades. However, current research looks at using DNA as a biometric identity verification modality. The goal is to improve the speed of identification. We use gene data that were initially collected for autism detection to determine whether, and how accurately, these data can be used for identification applications. Our main goal is to find out whether our data preprocessing technique yields data useful as a biometric identification tool. We experiment with using the nearest neighbor classifier to identify subjects. Results show that the optimal classification rate is achieved when the test set is corrupted by normally distributed noise with zero mean and a standard deviation of 1. The classification rate remains close to optimal at higher noise levels, up to a standard deviation of 3. This shows that the data can be used for identity verification with high accuracy using a simple classifier such as the k-nearest neighbor (k-NN).
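The evaluation described above can be sketched as follows: each enrolled subject's vector is corrupted with zero-mean Gaussian noise and then matched back with a nearest neighbor classifier. This is a minimal illustration under stated assumptions, not the author's pipeline; the random stand-in data, the feature count, and the function names are all hypothetical.

```python
import math
import random


def knn_predict(train_x, train_y, query, k=1):
    """Majority vote among the k nearest training vectors (Euclidean distance)."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = {}
    for i in order[:k]:
        votes[train_y[i]] = votes.get(train_y[i], 0) + 1
    return max(votes, key=votes.get)


def accuracy_under_noise(train_x, train_y, sigma, rng):
    """Re-identify each subject from a noisy copy of their own vector."""
    hits = 0
    for x, subject in zip(train_x, train_y):
        noisy = [v + rng.gauss(0.0, sigma) for v in x]  # zero-mean Gaussian noise
        hits += knn_predict(train_x, train_y, noisy) == subject
    return hits / len(train_x)


rng = random.Random(0)
# Hypothetical stand-in for preprocessed gene data: 20 subjects, 50 features each.
train_x = [[rng.uniform(0, 10) for _ in range(50)] for _ in range(20)]
train_y = list(range(20))

for sigma in (0.0, 1.0, 3.0):
    print(sigma, accuracy_under_noise(train_x, train_y, sigma, rng))
```

Sweeping `sigma` in this way gives the kind of accuracy-versus-noise curve the abstract summarizes for standard deviations of 1 and 3.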

Keywords: biometrics, genetic data, identity verification, k nearest neighbor

Procedia PDF Downloads 245
24561 Relationship Between Behavioral Inhibition/Approach System, and Perceived Stress, With Control White Blood Cell In Multiple Sclerosis Patients

Authors: Amin Alvani

Abstract:

Multiple sclerosis (MS) is a chronic, often disabling disease in which the immune system attacks the myelin sheath of neurons in the central nervous system. The present study aimed to investigate the relationship between the behavioral inhibition/approach system (BIS-BAS) and perceived stress (PS) with control white blood cell (WBC). Sixty MS patients (male = 36.7%, female = 63.3%; age range = 15-65) participated in the study and completed the demographic questionnaire, the complete blood count (CBC) test, the behavioral activation and behavioral inhibition scale (BIS-BAS), and the perceived stress questionnaire (PSS-14). The results revealed that Between of BAS-reward responsiveness (BAS-DR) subscale and PS, in more than MS patient (BIS), there are increase WBC.

Keywords: behavioral inhibition/approach system, perceived stress, white blood cell, multiple sclerosis

Procedia PDF Downloads 79
24560 A Review on Intelligent Systems for Geoscience

Authors: R. Palson Kennedy, P. Kiran Sai

Abstract:

This article introduces machine learning (ML) researchers to the hurdles that geoscience problems present, as well as the opportunities for improvement in both ML and geosciences. This article presents a review from the data life cycle perspective to meet that need. Numerous facets of geosciences present unique difficulties for the study of intelligent systems. Geosciences data is notoriously difficult to analyze since it is frequently unpredictable, intermittent, sparse, multi-resolution, and multi-scale. The first half addresses data science’s essential concepts and theoretical underpinnings, while the second section contains key themes and sharing experiences from current publications focused on each stage of the data life cycle. Finally, themes such as open science, smart data, and team science are considered.

Keywords: Data science, intelligent system, machine learning, big data, data life cycle, recent development, geo science

Procedia PDF Downloads 126
24559 Sentiment Analysis of Fake Health News Using Naive Bayes Classification Models

Authors: Danielle Shackley, Yetunde Folajimi

Abstract:

As more people turn to the internet seeking health-related information, there is more risk of finding false, inaccurate, or dangerous information. Sentiment analysis is a natural language processing technique that assigns polarity scores to text, classifying it as positive, neutral, or negative. In this research, we evaluate the weight of a sentiment analysis feature added to fake health news classification models. The dataset consists of existing, reliably labeled health article headlines supplemented with health information about COVID-19 collected from social media sources. We started with data preprocessing and tested various vectorization methods, such as Count and TFIDF vectorization. We implemented three Naive Bayes classifier models: Bernoulli, Multinomial, and Complement. To test the weight of the sentiment analysis feature on the dataset, we created benchmark Naive Bayes classification models without sentiment analysis and then reproduced those same models with the feature added. We evaluated the models using precision and accuracy scores. The initial Bernoulli model performed with 90% precision and 75.2% accuracy, while the model supplemented with sentiment labels performed with 90.4% precision and stayed constant at 75.2% accuracy. Our results show that the addition of sentiment analysis did not improve model precision by a wide margin; while there was no evidence of improvement in accuracy, we observed a 1.9% improvement in the precision score with the Complement model. Future expansion of this work could include replicating the experiment and substituting a deep learning neural network model for Naive Bayes.
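The two vectorization schemes mentioned above differ in how they weight a term: Count vectorization uses raw term counts, while TF-IDF scales each count down for terms that occur in many documents. The sketch below uses the plain textbook form idf = log(N / df), which is an assumption; library implementations typically apply smoothing and normalization, so their values differ. The headlines are hypothetical examples, not items from the study's dataset.

```python
import math
from collections import Counter


def count_vectorize(docs):
    """Return the sorted vocabulary and one raw term-count row per document."""
    vocab = sorted({word for doc in docs for word in doc.lower().split()})
    rows = []
    for doc in docs:
        counts = Counter(doc.lower().split())
        rows.append([counts[word] for word in vocab])
    return vocab, rows


def tfidf_vectorize(docs):
    """Weight each count by log(N / df), the unsmoothed textbook idf."""
    vocab, counts = count_vectorize(docs)
    n_docs = len(docs)
    # Document frequency: how many documents contain each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    return vocab, [[c * w for c, w in zip(row, idf)] for row in counts]


# Hypothetical headlines for illustration only.
headlines = [
    "vaccine cures all disease",
    "new vaccine trial results",
    "miracle cure for all disease",
]

vocab, count_rows = count_vectorize(headlines)
_, tfidf_rows = tfidf_vectorize(headlines)
print(vocab)
print(count_rows[0])
print(tfidf_rows[0])
```

In the first headline, "cures" and "vaccine" both have a count of 1, but "cures" receives the larger TF-IDF weight because it appears in only one of the three headlines.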

Keywords: sentiment analysis, Naive Bayes model, natural language processing, topic analysis, fake health news classification model

Procedia PDF Downloads 87
24558 Probiotics’ Antibacterial Activity on Beef and Camel Minced Meat at Altered Ranges of Temperature

Authors: Rania Samir Zaki

Abstract:

Because of their inhibitory effects, selected probiotic Lactobacilli may be used as antimicrobials against some hazardous microorganisms responsible for spoilage of fresh minced beef (cattle) and minced camel meat. Lactic acid bacteria were isolated from camel meat. These included 10 isolates: 1 Lactobacillus fermenti, 4 Lactobacillus plantarum, 4 Lactobacillus bulgaricus, 3 Lactobacillus acidophilus and 1 Lactobacillus brevis. The most efficient inhibitory organism was Lactobacillus plantarum, which can be used as a probiotic with antibacterial activity. All microbiological analyses were made at time 0, the first day and the second day at altered ranges of temperature [4±2 °C (chilling temperature), 25±2 °C, and 38±2 °C]. Results showed a significant decrease in pH from 6.2 to 5.1 across the different types of meat, in addition to a reduction of Total Bacterial Count, Enterococci, Bacillus cereus and Escherichia coli, together with the stability of Coliforms and the absence of Staphylococcus aureus.

Keywords: antibacterial, camel meat, inhibition, probiotics

Procedia PDF Downloads 288
24557 The Predictive Value of MicroRNA 451 on the Outcome of Imatinib Treatment in Chronic Myeloid Leukemia Patients

Authors: Nehal Adel Khalil, Amel Foad Ketat, Fairouz Elsayed Mohamed Ali, Nahla Abdelmoneim Hamid, Hazem Farag Manaa

Abstract:

Background: Chronic myeloid leukemia (CML) represents 15% of adult leukemias. Imatinib mesylate (IM) is the gold standard treatment for new cases of CML. Treatment with IM results in improvement in the majority of cases; however, about 25% of cases may develop resistance. Sensitive and specific early predictors of IM resistance in CML patients have not been established to date. Aim: To investigate the value of miR-451 as an early predictor of IM resistance in Egyptian CML patients. Methods: The study employed the real-time polymerase chain reaction (qPCR) technique to investigate the leucocytic expression of miR-451 in fifteen newly diagnosed CML patients (group I), fifteen IM-responder CML patients (group II), fifteen IM-resistant CML patients (group III), and fifteen healthy subjects of matched age and sex as a control group (group IV). Response to IM was defined as a BCR-ABL transcript level of < 10% after 3 months of therapy. The following parameters were assessed in subjects of all the studied groups: (1) complete blood count (CBC); (2) plasma level of miRNA 451, measured using real-time polymerase chain reaction (qPCR); (3) BCR-ABL gene mutation in CML, detected using qPCR. Results: The present study revealed that miR-451 was significantly down-regulated in leucocytes of newly diagnosed CML patients as compared to healthy subjects. IM-responder CML patients showed an up-regulation of miR-451 compared with IM-resistant CML patients. Conclusion: According to the data from the present study, it can be concluded that leucocytic miR-451 expression is a useful additional follow-up marker for the response to IM and a promising prognostic biomarker for CML.

Keywords: chronic myeloid leukemia, imatinib resistance, microRNA 451, Polymerase Chain Reaction

Procedia PDF Downloads 286
24556 A Pilot Randomized Controlled Trial of a Physical Activity Intervention in a Low Socioeconomic Population: Focus on Mental Contrasting with Implementation Intentions

Authors: Shaun G. Abbott, Rebecca C. Reynolds, John B. F. de Wit

Abstract:

Low physical activity (PA) levels are a major public health concern in Australia. There is some evidence that PA interventions can increase PA levels via various methods, including online delivery. People of low socioeconomic status (SES) participate in less PA than the rest of the population, partly due to poor self-regulation behaviors associated with socioeconomic characteristics. Interventions that involve a particular method of self-regulation, Mental Contrasting with Implementation Intentions (MCII), have regularly achieved healthy behavior change, but few studies focus on PA behavior outcomes, and no studies have examined the effect of MCII on the PA behaviors of low SES people. In this study, a pilot randomized controlled trial (RCT) will deliver MCII for PA behavior change to individuals of relative disadvantage for the first time. The current pilot study will predict the sample size for a future full RCT and test the hypothesis that sedentary participants from areas of relative socioeconomic disadvantage in Sydney who learn the MCII technique will be more physically active and have improved anthropometry and psychological indicators at the completion of a 12-week intervention compared to baseline and control. Eligible participants of relative socioeconomic disadvantage will be randomly assigned to either the ‘PA Information Plus MCII Intervention Group’ or a ‘PA Information-Only Control Group’. Both groups will attend a baseline and a 12-week face-to-face consultation, where PA, anthropometric and psychological data will be gathered. The intervention group will be guided through an MCII session at the baseline appointment to establish a PA goal to achieve over 12 weeks. Other than these baseline and 12-week consultations, all participant interaction will occur online. All participants will receive a ‘Fitbit’ accelerometer to objectively record PA as a daily step count, along with a PA diary for the duration of the study.
PA data will be recorded on a personalized online spreadsheet. Both groups will receive a standard PA information email at weeks 2, 4, and 8. The intervention group will also receive scripted follow-up online appointments to discuss goal progress. The current pilot study is in the recruitment stage, with findings to be presented at the conference in December if selected.

Keywords: implementation intentions, mental contrasting, motivation, pedometer, physical activity, socioeconomic

Procedia PDF Downloads 298
24555 Study of the Genotoxic Potential of Plant Growth Regulator Ethephon

Authors: Mahshid Hodjat, Maryam Baeeri, Mohammad Amin Rezvanfar, Mohammad Abdollahi

Abstract:

Ethephon is one of the most widely used plant growth regulators in agriculture, and its application has increased in recent years. The toxicity of organophosphate compounds is mostly attributed to their potent inhibition of acetylcholinesterase and their involvement in neurodegenerative disease. Although there are few reports on the butyrylcholinesterase inhibitory role of ethephon, there is still no evidence on the neurotoxicity and genotoxicity of this compound. The aim of the current study is to assess the potential genotoxic effect of ethephon using two genotoxic endpoints, γH2AX expression and the comet assay, in embryonic murine fibroblasts. γH2AX serves as an early and sensitive biomarker for evaluating the genotoxic effects of chemicals. Oxidative stress biomarkers, including intracellular reactive oxygen species, lipid peroxidation and antioxidant capacity, were also examined. The results showed a significant increase in cell proliferation 24 h post-treatment with 10, 40, and 160 µg/ml ethephon. The γH2AX expression and γH2AX foci count per cell were increased at low concentrations of ethephon, concomitant with increased DNA damage at 40 and 160 µg/ml, as illustrated by an increased comet tail moment. A significant increase in lipid peroxidation and ROS formation was observed at 160 µg/ml and higher doses. The results showed that low-dose ethephon promoted cell proliferation while inducing DNA damage, raising the possibility of ethephon mutagenicity. The genotoxic effect of low-dose ethephon might not be related to oxidative damage. However, ethephon was found to increase oxidative stress at higher doses, leading to cellular cytotoxicity. Taken together, the data indicate that ethylene deserves more attention as a plant regulator with potential genotoxicity, for which appropriate control is needed to reduce its usage.

Keywords: ethephon, DNA damage, γH2AX, oxidative stress

Procedia PDF Downloads 300
24554 Data Quality as a Pillar of Data-Driven Organizations: Exploring the Benefits of Data Mesh

Authors: Marc Bachelet, Abhijit Kumar Chatterjee, José Manuel Avila

Abstract:

Data quality is a key component of any data-driven organization. Without data quality, organizations cannot effectively make data-driven decisions, which often leads to poor business performance. Therefore, it is important for an organization to ensure that the data they use is of high quality. This is where the concept of data mesh comes in. Data mesh is an organizational and architectural decentralized approach to data management that can help organizations improve the quality of data. The concept of data mesh was first introduced in 2020. Its purpose is to decentralize data ownership, making it easier for domain experts to manage the data. This can help organizations improve data quality by reducing the reliance on centralized data teams and allowing domain experts to take charge of their data. This paper intends to discuss how a set of elements, including data mesh, are tools capable of increasing data quality. One of the key benefits of data mesh is improved metadata management. In a traditional data architecture, metadata management is typically centralized, which can lead to data silos and poor data quality. With data mesh, metadata is managed in a decentralized manner, ensuring accurate and up-to-date metadata, thereby improving data quality. Another benefit of data mesh is the clarification of roles and responsibilities. In a traditional data architecture, data teams are responsible for managing all aspects of data, which can lead to confusion and ambiguity in responsibilities. With data mesh, domain experts are responsible for managing their own data, which can help provide clarity in roles and responsibilities and improve data quality. Additionally, data mesh can also contribute to a new form of organization that is more agile and adaptable. 
By decentralizing data ownership, organizations can respond more quickly to changes in their business environment, which in turn can help improve overall performance by allowing better insights into business as an effect of better reports and visualization tools. Monitoring and analytics are also important aspects of data quality. With data mesh, monitoring, and analytics are decentralized, allowing domain experts to monitor and analyze their own data. This will help in identifying and addressing data quality problems in quick time, leading to improved data quality. Data culture is another major aspect of data quality. With data mesh, domain experts are encouraged to take ownership of their data, which can help create a data-driven culture within the organization. This can lead to improved data quality and better business outcomes. Finally, the paper explores the contribution of AI in the coming years. AI can help enhance data quality by automating many data-related tasks, like data cleaning and data validation. By integrating AI into data mesh, organizations can further enhance the quality of their data. The concepts mentioned above are illustrated by AEKIDEN experience feedback. AEKIDEN is an international data-driven consultancy that has successfully implemented a data mesh approach. By sharing their experience, AEKIDEN can help other organizations understand the benefits and challenges of implementing data mesh and improving data quality.

Keywords: data culture, data-driven organization, data mesh, data quality for business success

Procedia PDF Downloads 125
24553 Big Data Analysis with RHadoop

Authors: Ji Eun Shin, Byung Ho Jung, Dong Hoon Lim

Abstract:

It is almost impossible to store or analyze exponentially growing big data with traditional technologies. Hadoop is a new technology that makes this possible. The R programming language is by far the most popular statistical tool for big data analysis based on distributed processing with Hadoop technology. With RHadoop, which integrates the R and Hadoop environments, we implemented parallel multiple regression analysis with different sizes of actual data. Experimental results showed that our RHadoop system became much faster as the number of data nodes increased. We also compared the performance of our RHadoop with the lm function and the biglm package available with bigmemory. The results showed that our RHadoop was faster than the other packages owing to parallel processing, with the number of map tasks increasing as the size of the data increases.
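The abstract does not say how the regression is split across map tasks. One common decomposition, assumed here, is for each map task to compute the partial sums X^T X and X^T y on its own chunk of rows, and for the reduce step to add the partials and solve the normal equations once. The sketch below illustrates that idea in plain Python rather than RHadoop; the function names and the synthetic data are hypothetical, and the map step runs sequentially here although each chunk is independent.

```python
def map_chunk(rows):
    """Map step: partial X^T X and X^T y for one chunk of (features, target) rows."""
    p = len(rows[0][0]) + 1  # +1 for the intercept column
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    for features, y in rows:
        x = [1.0] + list(features)
        for i in range(p):
            xty[i] += x[i] * y
            for j in range(p):
                xtx[i][j] += x[i] * x[j]
    return xtx, xty


def reduce_partials(partials):
    """Reduce step: element-wise sum of the partial matrices and vectors."""
    p = len(partials[0][1])
    xtx = [[sum(part[0][i][j] for part in partials) for j in range(p)] for i in range(p)]
    xty = [sum(part[1][i] for part in partials) for i in range(p)]
    return xtx, xty


def solve(a, b):
    """Solve a * beta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    beta = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * beta[j] for j in range(i + 1, n))
        beta[i] = (m[i][n] - tail) / m[i][i]
    return beta


# Hypothetical noise-free data from y = 1 + 2*x1 + 3*x2, split into two chunks.
data = [((x1, x2), 1 + 2 * x1 + 3 * x2) for x1 in range(4) for x2 in range(3)]
chunks = [data[:6], data[6:]]

partials = [map_chunk(chunk) for chunk in chunks]  # independent, so parallelizable
beta = solve(*reduce_partials(partials))
print(beta)
```

Because the partial sums are additive, the result does not depend on how the rows are divided among chunks, which is what lets the number of map tasks grow with the data size.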

Keywords: big data, Hadoop, parallel regression analysis, R, RHadoop

Procedia PDF Downloads 427
24552 A Mutually Exclusive Task Generation Method Based on Data Augmentation

Authors: Haojie Wang, Xun Li, Rui Yin

Abstract:

To address memorization overfitting in the meta-learning MAML algorithm, a method of generating mutually exclusive tasks based on data augmentation is proposed. This method generates a mutex task by mapping one feature of the data to multiple labels, so that the generated mutex task is inconsistent with the data distribution in the initial dataset. Because generating mutex tasks for all data would produce a large amount of invalid data and, in the worst case, lead to exponential growth of computation, this paper also proposes a key data extraction method that extracts only part of the data to generate the mutex task. The experiments show that the method of generating mutually exclusive tasks can effectively address memorization overfitting in the meta-learning MAML algorithm.
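
The abstract does not specify how labels are reassigned or how key data are selected, so the sketch below is an assumption-laden illustration, not the authors' method. It builds mutually exclusive tasks by cyclically shifting class labels, so the same input carries a different label in every task. The function name `make_mutex_tasks` is hypothetical, and `examples` stands for whatever subset a key data extraction step would have selected.

```python
# Sketch only: mutually exclusive ("mutex") tasks via cyclic label shifts.
# Assumption: labels are integers in range(num_classes). The label-shift
# scheme is a stand-in; the paper's actual augmentation is not described.

def make_mutex_tasks(examples, num_classes, num_tasks):
    """examples: list of (x, label) pairs, e.g. the extracted key data.
    Returns num_tasks relabeled copies; task k shifts every label by k + 1."""
    if not 0 < num_tasks < num_classes:
        raise ValueError("need 0 < num_tasks < num_classes for distinct labels")
    tasks = []
    for shift in range(1, num_tasks + 1):
        tasks.append([(x, (y + shift) % num_classes) for x, y in examples])
    return tasks
```

Since each shift is smaller than `num_classes`, no input keeps its original label, and no two generated tasks agree on the label of any input, so a single fixed input-to-label mapping cannot be memorized across tasks.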

Keywords: data augmentation, mutex task generation, meta-learning, text classification

Procedia PDF Downloads 85
24551 Efficient Positioning of Data Aggregation Point for Wireless Sensor Network

Authors: Sifat Rahman Ahona, Rifat Tasnim, Naima Hassan

Abstract:

Data aggregation is a helpful technique for reducing data communication overhead in wireless sensor networks. One of the important tasks in data aggregation is the positioning of the aggregator points. Much work has been done on data aggregation, but efficient positioning of the aggregator points has received little attention. This paper focuses on the positioning, or placement, of the aggregation points in a wireless sensor network. The authors propose an algorithm to select the aggregator positions for a scenario where aggregator nodes are more powerful than sensor nodes.

Keywords: aggregation point, data communication, data aggregation, wireless sensor network

Procedia PDF Downloads 148
24550 Factors Influencing the Use of Psychoactive Substance among Senior Secondary Students in Ibadan South-West Local Government, Oyo State, Nigeria

Authors: Olajumoke Temilola Fatimat, Fasasi Fausat Kikelomo, Ishola Ganiyat Folasayo, Omayeka Mary

Abstract:

Psychoactive substances are chemical substances that affect the normal functioning of the brain and cause changes in behavior, mood, and consciousness. Psychoactive substance abuse constitutes one of the most important risk-taking behaviors among adolescents and young adults in secondary schools. The study, therefore, assessed the factors influencing the use of psychoactive substances among senior secondary students in Ibadan South-West Local Government Area, Oyo State. A descriptive non-experimental design was adopted; purposive and simple random sampling techniques were used to select 330 respondents, while questionnaires were used for data collection. Descriptive statistics (frequency counts and percentages) and inferential statistics (chi-square and analysis of variance) were used for the analysis. The results revealed that the majority of the respondents, 226 (75.3%), had heard of the term substance abuse before; it was also revealed that the majority of the respondents, 67.8%, had good knowledge of psychoactive substances. There was no significant relationship between age and knowledge of psychoactive substances among senior secondary students, with a p-value of 0.199. The outcome of this study indicates that drug abuse is increasing day by day among secondary school students and may have greatly contributed to poor performance in examinations as well as undermining academic ability and performance among students. It was recommended that the school authorities of the secondary schools in Ibadan South-West Local Government Area, and in Oyo State generally, collaborate with health personnel to educate adolescents on psychoactive substance abuse. This is to ensure that adolescents are adequately educated and kept up to date on psychoactive substance abuse.

Keywords: factors, influence, psychoactive substance, secondary school

Procedia PDF Downloads 59
24549 The Impact of Health Tourism on Companies’ Performance: A Cross Country Analysis

Authors: Anna Paola Micheli, Carmelo Intrisano, Anna Maria Calce

Abstract:

This research focused on the capability of health tourism to improve the economic and financial performance of healthcare companies. It is assumed that health tourism companies have better profitability and financial efficiency because, unlike non-health tourism companies, they can also count on cross-border demand. A three-level gap analysis was conducted: the first level concerns health tourism companies located in Italy and in the other EU28 states; the second compares Italian and EU28 non-health tourism companies; the third concerns the Italian system, with a comparison between health tourism and non-health tourism companies. Findings highlighted that Italian healthcare companies have better profitability performance than European ones, but they present weaknesses in their financial position given their illiquidity and excessive leverage. Furthermore, studying the Italian system, we found that health tourism companies are more profitable than non-health tourism companies.

Keywords: financial performance, gap analysis, health tourism, profitability performance, value creation

Procedia PDF Downloads 215
24548 Predicting Open Chromatin Regions in Cell-Free DNA Whole Genome Sequencing Data by Correlation Clustering

Authors: Fahimeh Palizban, Farshad Noravesh, Amir Hossein Saeidian, Mahya Mehrmohamadi

Abstract:

In the past decade, the emergence of liquid biopsy has significantly improved cancer monitoring and detection. Dying cells, including those originating from tumors, shed their DNA into the blood and contribute to a pool of circulating fragments called cell-free DNA. Accordingly, identifying the tissue of origin of these DNA fragments from the plasma can result in faster and more accurate disease diagnosis and more precise treatment protocols. Open chromatin regions are important epigenetic features of DNA that reflect the cell types of origin. Profiling these features by DNase-seq, ATAC-seq, and histone ChIP-seq provides insights into tissue-specific and disease-specific regulatory mechanisms. Several studies in the area of cancer liquid biopsy integrate distinct genomic and epigenomic features for early cancer detection along with tissue-of-origin detection. However, multimodal analysis requires several types of experiments to cover the genomic and epigenomic aspects of a single sample, which leads to considerable cost and time. To overcome these limitations, predicting open chromatin regions (OCRs) from whole genome sequencing (WGS) data is of particular importance. In this regard, we propose a computational approach to predict open chromatin regions, as an important epigenetic feature, from cell-free DNA whole genome sequencing data. To this end, local sequencing depth is fed to our proposed algorithm, which predicts the most probable open chromatin regions from whole genome sequencing data. Our method integrates signal processing with sequencing depth data and includes count normalization, Discrete Fourier Transform conversion, graph construction, graph cut optimization by linear programming, and clustering.
To validate the proposed method, we compared the output of the clustering (open chromatin region+, open chromatin region-) with previously validated open chromatin regions from human blood samples in the ATAC-DB database. The overlap between predicted open chromatin regions and the experimentally validated regions obtained by ATAC-seq in ATAC-DB is greater than 67%, which indicates meaningful prediction. OCRs are mostly located at the transcription start sites (TSS) of genes. We therefore compared the predicted OCRs with the human gene TSS regions obtained from refTSS, which showed concordance of around 52.04% with all genes and ~78% with housekeeping genes. Accurately detecting open chromatin regions from plasma cell-free DNA-seq data is a very challenging computational problem due to several confounding factors, such as technical and biological variations. Although this approach is in its infancy, there has already been an attempt to apply it, which led to a tool named OCRDetector, with some restrictions such as the need for high-depth cfDNA WGS data, prior information about OCR distribution, and the consideration of multiple features. In contrast, we implemented graph signal clustering based on a single depth feature in an unsupervised manner, which resulted in faster performance and decent accuracy. Overall, we investigated the epigenomic pattern of a cell-free DNA sample from a new computational perspective that can be used along with other tools to investigate genetic and epigenetic aspects of a single whole genome sequencing dataset for efficient liquid biopsy-related analysis.
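
The first stages of the pipeline named in the abstract (count normalization, Discrete Fourier Transform, and a similarity measure that a graph could be built on) can be sketched as follows. This is a hedged illustration only: the abstract does not state the normalization scheme, the window size, or how graph edges are weighted, so mean scaling and Pearson correlation are assumptions, and the function names are hypothetical. The graph cut and clustering stages are not shown.

```python
# Sketch only: depth window -> normalized counts -> DFT magnitudes ->
# correlation between two windows (a candidate graph edge weight).
import cmath
import math

def normalize_counts(depths):
    """Scale read-depth counts by the window mean (assumed normalization)."""
    mean = sum(depths) / len(depths)
    if mean == 0:
        return [0.0] * len(depths)
    return [d / mean for d in depths]

def dft_magnitudes(signal):
    """Magnitudes of the Discrete Fourier Transform, computed directly in O(n^2)."""
    n = len(signal)
    return [
        abs(sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]

def pearson(a, b):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def window_similarity(depths_a, depths_b):
    """Correlation between the spectra of two normalized depth windows."""
    spec_a = dft_magnitudes(normalize_counts(depths_a))
    spec_b = dft_magnitudes(normalize_counts(depths_b))
    return pearson(spec_a, spec_b)
```

In a correlation clustering setting, such pairwise similarities would typically serve as signed or weighted edges between genomic windows; how the authors construct their graph is not stated in the abstract.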

Keywords: open chromatin regions, cancer, cell-free DNA, epigenomics, graph signal processing, correlation clustering

Procedia PDF Downloads 138
24547 A NoSQL Based Approach for Real-Time Managing of Robotics's Data

Authors: Gueidi Afef, Gharsellaoui Hamza, Ben Ahmed Samir

Abstract:

This paper deals with the continual growth of data, in response to which new data management solutions have emerged: NoSQL databases. They span several areas such as personalization, profile management, big data in real time, content management, catalogs, customer views, mobile applications, the internet of things, digital communication, and fraud detection. Nowadays, the number of these database management systems is increasing. These systems store data very well, and with the trend of big data, the new storage challenge demands new structures and methods for managing enterprise data. New intelligent machines in the e-learning sector thrive on more data, so smart machines can learn more and faster. Robotics is the use case on which our tests focus. Implementing NoSQL for robotics wrestles all the data robots acquire into usable form, because with ordinary robotics data management we face severe limits in managing and finding exact information in real time. Our proposed approach was demonstrated by experimental studies and a running example used as a use case.

Keywords: NoSQL databases, database management systems, robotics, big data

Procedia PDF Downloads 341
24546 Assessment of E-Learning Facilities in Open and Distance Learning and Information Need by Students

Authors: Sabo Elizabeth

Abstract:

Electronic learning is an increasingly popular learning approach in higher educational institutions due to the vast growth of internet technology. This approach is important in human capital development. An investigation of open distance and e-learning facilities and the information needs of open and distance learning (ODL) students was carried out in Jalingo, Nigeria. Structured questionnaires were administered to 70 registered ODL students of the National Open University of Nigeria (NOUN). Information sourced from the respondents covered demographic, economic, and institutional variables. Data collected for demographic variables were computed as frequency counts and percentages. The effectiveness of ODL facilities and the information needs of open and distance learning students were assessed on a three- or four-point Likert rating scale. Findings indicated that there are more men than women. A large proportion of the respondents are married, and there are more mature students in ODL than youths. A high proportion of the ODL students obtained qualifications higher than the secondary school certificate. The proportion of computer-literate ODL students was high, but a large number of the students do not own a laptop computer. Inadequate e-books, internet gadgets, hard-copy books, and reference materials are factors that limit utilization of e-learning facilities in the study areas. Inadequate computer facilities and power backup caused inconvenience and delay in administering and using e-learning facilities. To a high extent, open and distance learning students needed information on the university timetable and schedule of activities, and on the availability of and access to books (hard copies and e-books) and reference materials. The respondents emphasized that contact with course coordinators via the internet would improve learning and academic performance.

Keywords: open and distance learning, information required, electronic books, internet gadgets, Likert scale test

Procedia PDF Downloads 318
24545 Fuzzy Optimization Multi-Objective Clustering Ensemble Model for Multi-Source Data Analysis

Authors: C. B. Le, V. N. Pham

Abstract:

In modern data analysis, multi-source data appears more and more in real applications. Multi-source data clustering has emerged as an important issue in the data mining and machine learning community. Different data sources provide information about different data. Therefore, multi-source data linking is essential to improve clustering performance. However, in practice multi-source data is often heterogeneous, uncertain, and large. This is considered a major challenge of multi-source data. An ensemble is a versatile machine learning model in which learning techniques can work in parallel with big data. Clustering ensembles have been shown to outperform any standard clustering algorithm in terms of accuracy and robustness. However, most traditional clustering ensemble approaches are based on a single-objective function and single-source data. This paper proposes a new clustering ensemble method for multi-source data analysis. The fuzzy optimized multi-objective clustering ensemble method is called FOMOCE. Firstly, a clustering ensemble mathematical model based on the structure of the multi-objective clustering function, multi-source data, and dark knowledge is introduced. Then, rules for extracting dark knowledge from the input data, clustering algorithms, and base clusterings are designed and applied. Finally, a clustering ensemble algorithm is proposed for multi-source data analysis. The experiments were performed on standard sample data sets. The experimental results demonstrate the superior performance of the FOMOCE method compared to existing clustering ensemble methods and multi-source clustering methods.

Keywords: clustering ensemble, multi-source, multi-objective, fuzzy clustering

Procedia PDF Downloads 177
24544 Effect of Different Contaminants on Mineral Insulating Oil Characteristics

Authors: H. M. Wilhelm, P. O. Fernandes, L. P. Dill, C. Steffens, K. G. Moscon, S. M. Peres, V. Bender, T. Marchesan, J. B. Ferreira Neto

Abstract:

Deterioration of insulating oil is a natural process that occurs during transformer operation. However, this process can be accelerated by some factors, such as oxygen, high temperatures, metals, and moisture, which rapidly reduce the oil's insulating capacity and favor transformer faults. Parts of the building materials of a transformer can be degraded and yield soluble compounds and insoluble particles that shorten the equipment life. Physicochemical tests, dissolved gas analysis (including propane, propylene, and butane), determination of volatile and furanic compounds, and quantitative and morphological analyses of particulates are proposed in this study in order to correlate the degradation of transformer building materials with insulating oil characteristics. The present investigation involves tests simulating medium-temperature overheating by means of an electric resistance wrapped with the following materials immersed in mineral insulating oil: test I) copper, tin, lead, and paper (heated at 350-400 °C for 8 h); test II) only copper (at 250 °C for 11 h); and test III) only paper (at 250 °C for 8 h and at 350 °C for 8 h). A different experiment is the simulation of an electric arc involving copper, using an electric welding machine at two distinct energy settings (low and high). Analysis results showed that dielectric loss was higher in the sample of test I; a higher neutralization index and higher values of hydrogen and hydrocarbons, including propane and butane, were also observed. Test III oil presented a higher particle count; in addition, ferrographic analysis revealed contamination with fibers and carbonized paper. However, these particles had little influence on the oil physicochemical parameters (dielectric loss and neutralization index) and on the gas production, which was very low. Test II oil showed high levels of methane, ethane, and propylene, indicating the effect of metal on oil degradation.
CO2 and CO gases were formed in the highest concentration in test III, as expected. Regarding volatile compounds, in test I acetone, benzene, and toluene were detected, which are oil oxidation products. In test III, methanol was identified due to cellulose degradation, as expected. The electric arc simulation test showed the highest oil oxidation in the presence of copper and at high temperature, since these samples had very high concentrations of hydrogen, ethylene, and acetylene. The particle count was also very high, showing the highest release of copper under such conditions. When comparing high and low energy, the former presented more hydrogen, ethylene, and acetylene. This sample had results more similar to test I, suggesting that the generation of different particles can be the cause of faults such as electric arcs. Ferrography showed more evident copper and exfoliation particles than in other samples. Therefore, in this study, by combining different analytical techniques, it was possible to correlate insulating oil characteristics with possible contaminants, which can lead to transformer failure.

Keywords: Ferrography, gas analysis, insulating mineral oil, particle contamination, transformer failures

Procedia PDF Downloads 210
24543 The Combined Methodology To Detect Onboard Driver Fatigue

Authors: K. Senthil Nathan, P. Rajasekaran

Abstract:

Fatigue is a feeling of extreme physical or mental tiredness. Almost everyone becomes fatigued at some time, but driver fatigue is a serious problem that leads to thousands of automobile crashes each year. The fatigue process is often a change from a state of alertness and vigor to a state of tiredness and weakness. It is not only accompanied by drowsiness but also has a negative impact on mood. There have been studies to detect and quantify fatigue from the measurement of physiological variables such as the electroencephalogram (EEG), electrooculogram (EOG), and electromyogram (EMG). This project involves multimodal sensing of the driver’s drowsiness. At the first level, the eye blink rate is counted. At the second level, we authenticate the results of the eye blink module with a grip sensor: a Flexiforce sensor is placed over the steering wheel. At the third level, driver activities are sensed, and the time elapsed since the driver’s last activity is measured. The activities sensed are changing gear, applying the brake, sounding the horn, and turning the steering wheel. Absence of these activities is also an indicator of fatigue.
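
The three-level check described above can be sketched as a simple rule combiner. This is an illustration under stated assumptions, not the authors' system: the abstract gives no thresholds and does not say how the levels are combined, so every threshold value, the `Readings` fields, and the `assess` function are hypothetical. In particular, the sketch treats any blink rate outside an assumed normal range as suspicious rather than assuming a direction.

```python
# Sketch only: combine blink rate, grip force, and idle time into one alert.
# All thresholds are placeholder assumptions, not values from the paper.
from dataclasses import dataclass

@dataclass
class Readings:
    blinks_per_minute: float       # from the eye blink sensor
    grip_force: float              # steering-wheel force sensor, scaled to 0..1
    seconds_since_activity: float  # since last gear, brake, horn, or steering event

def assess(r, blink_range=(8.0, 30.0), min_grip=0.3, max_idle_s=60.0):
    """Return per-level flags and an overall alert decision."""
    low, high = blink_range
    blink_flag = not (low <= r.blinks_per_minute <= high)  # level 1
    grip_flag = r.grip_force < min_grip                    # level 2
    idle_flag = r.seconds_since_activity > max_idle_s      # level 3
    # Level 2 confirms level 1; prolonged inactivity raises an alert by itself.
    alert = (blink_flag and grip_flag) or idle_flag
    return {"blink": blink_flag, "grip": grip_flag, "idle": idle_flag, "alert": alert}
```

Requiring the grip sensor to confirm the blink module reflects the "authenticate" step in the description; whether inactivity alone should trigger an alert is a design choice made here for illustration.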

Keywords: eye blink sensor, Flexiforce sensor, EEG, EOG, EMG

Procedia PDF Downloads 476
24542 Comparison of Parametric and Bayesian Survival Regression Models in Simulated and HIV Patient Antiretroviral Therapy Data: Case Study of Alamata Hospital, North Ethiopia

Authors: Zeytu G. Asfaw, Serkalem K. Abrha, Demisew G. Degefu

Abstract:

Background: HIV/AIDS remains a major public health problem in Ethiopia and heavily affects people of productive and reproductive age. We aimed to compare the performance of parametric survival analysis and Bayesian survival analysis using simulations and a real dataset application focused on determining predictors of HIV patient survival. Methods: Parametric survival models with exponential, Weibull, log-normal, log-logistic, Gompertz, and generalized gamma distributions were considered. A simulation study was carried out with two different algorithms, using informative and noninformative priors. A retrospective cohort study was implemented for HIV-infected patients under highly active antiretroviral therapy in Alamata General Hospital, North Ethiopia. Results: A total of 320 HIV patients were included in the study, of whom 52.19% were female and 47.81% male. According to Kaplan-Meier survival estimates for the two sex groups, females showed better survival time than their male counterparts. The median survival time of HIV patients was 79 months. During the follow-up period, 89 (27.81%) deaths and 231 (72.19%) censored individuals were registered. The average baseline cluster of differentiation 4 (CD4) cell count for HIV/AIDS patients was 126.01, but after a three-year antiretroviral therapy follow-up the average CD4 cell count was 305.74, which was quite encouraging. Age, functional status, tuberculosis screen, past opportunistic infection, baseline CD4 cell count, World Health Organization clinical stage, sex, marital status, employment status, occupation type, and baseline weight were found to be statistically significant factors for longer survival of HIV patients. The standard errors of all covariates in the Bayesian log-normal survival model were smaller than in the classical one.
Hence, Bayesian survival analysis showed better performance than classical parametric survival analysis when subjective data analysis was performed by considering expert opinions and historical knowledge about the parameters. Conclusions: Thus, the HIV/AIDS patient mortality rate could be reduced through timely antiretroviral therapy with special care for the potential factors. Moreover, the Bayesian log-normal survival model was preferable to the classical log-normal survival model for determining predictors of HIV patient survival.
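
The Kaplan-Meier estimates and median survival time reported above come from the standard product-limit estimator, which can be sketched in a few lines. This is a generic illustration of that estimator, not the authors' code, and it does not cover the parametric or Bayesian models. The function names are hypothetical, and any data passed in would be the caller's own.

```python
# Sketch only: Kaplan-Meier product-limit estimator with right censoring.

def kaplan_meier(times, events):
    """times: follow-up durations; events: 1 for death, 0 for censored.
    Returns [(time, survival probability)] at each distinct death time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored subjects both leave the risk set
    return curve

def median_survival(curve):
    """First time at which estimated survival drops to 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None
```

Censored subjects never lower the curve directly; they only shrink the risk set for later death times, which is why a cohort with 72.19% censoring can still yield a median survival estimate.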

Keywords: antiretroviral therapy (ART), Bayesian analysis, HIV, log-normal, parametric survival models

Procedia PDF Downloads 180