Search results for: linked data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25268

24908 Machine Learning Analysis of Student Success in Introductory Calculus Based Physics I Course

Authors: Chandra Prayaga, Aaron Wade, Lakshmi Prayaga, Gopi Shankar Mallu

Abstract:

This paper presents the use of machine learning algorithms to predict the success of students in an introductory physics course. A dataset of 140 rows describing the performance of two batches of students was used. The lack of sufficient data to train robust machine learning models was compensated for by generating synthetic data similar to the real data. CTGAN and CTGAN with Gaussian Copula (Gaussian) were used to generate synthetic data, with the real data as input. To check the similarity between the real data and each synthetic dataset, pair plots were made. The synthetic data were used to train machine learning models with the PyCaret package. For the CTGAN data, the AdaBoost Classifier (ADA) was found to be the best-fitting ML model, whereas the CTGAN with Gaussian Copula yielded Logistic Regression (LR) as the best model. Both models were then tested for accuracy on the real data. ROC-AUC analysis was performed for all ten classes of the target variable (Grades A, A-, B+, B, B-, C+, C, C-, D, F). The ADA model with CTGAN data showed a mean AUC score of 0.4377, whereas the LR model with the Gaussian data showed a mean AUC score of 0.6149. ROC-AUC plots were obtained for each grade value separately. The LR model with Gaussian data showed consistently better AUC scores than the ADA model with CTGAN data, except for two grade values, C- and A-.
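The per-grade ROC-AUC analysis described above (a one-vs-rest AUC for each grade, then a mean across grades) can be sketched in plain Python. This is an illustrative stand-in using the rank-sum formulation of AUC, not the PyCaret routine the authors used; the grade labels and score values below are hypothetical.

```python
def binary_auc(scores, labels):
    """ROC-AUC via the rank-sum (Mann-Whitney U) formulation."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count half
    return wins / (len(pos) * len(neg))

def macro_ovr_auc(class_scores, true_classes, classes):
    """Macro-average one-vs-rest AUC over all grade classes."""
    aucs = []
    for c in classes:
        scores = [s[c] for s in class_scores]        # score for class c
        labels = [1 if t == c else 0 for t in true_classes]
        aucs.append(binary_auc(scores, labels))
    return sum(aucs) / len(aucs)
```

A mean AUC near 0.5 (as with the ADA/CTGAN model) indicates near-random ranking, which is why the 0.6149 of the LR/Gaussian model is the more useful result.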

Keywords: machine learning, student success, physics course, grades, synthetic data, CTGAN, gaussian copula CTGAN

Procedia PDF Downloads 21
24907 Gut Microbial Dynamics in a Mouse Model of Inflammation-Linked Carcinogenesis as a Result of Diet Supplementation with Specific Mushroom Extracts

Authors: Alvarez M., Chapela M. J., Balboa E., Rubianes D., Sinde E., Fernandez de Ana C., Rodríguez-Blanco A.

Abstract:

The gut microbiota plays an important role, as gut inflammation can contribute to colorectal cancer development; however, this role is still not fully understood, and tools able to prevent this progression are yet to be developed. The main objective of this study was to monitor the effects of a mushroom extracts formulation on the gut microbial community composition of an azoxymethane (AOM)/dextran sodium sulfate (DSS) mouse model of inflammation-linked carcinogenesis. For the in vivo study, 41 adult male mice of the C57BL/6 strain were obtained. In 36 of them, colon carcinogenesis was induced by a single intraperitoneal administration of AOM at a dose of 12.5 mg/kg; the control group animals instead received the same volume of 0.9% saline. DSS is an extremely toxic sulfated polysaccharide that causes chronic inflammation of the colon mucosa, favoring the appearance of severe colitis and the production of tumors induced by AOM. AOM/DSS induction is an interesting platform for chemopreventive intervention studies. Here, the model was used to monitor gut microbiota changes resulting from supplementation with a specific mushroom extracts formulation previously shown to have prebiotic activity. The animals were divided into three groups: (i) cancer + mushroom extracts formulation experimental group, which received the MicoDigest2.0 mushroom extracts formulation developed by Hifas da Terra S.L., dissolved in drinking water at an estimated concentration of 100 mg/ml; (ii) cancer control group, which received normal water without any treatment; and (iii) healthy control group, animals in which cancer was not induced and which received no treatment in drinking water.
This treatment was maintained for a period of three months, after which the animals were sacrificed to obtain tissues that were subsequently analyzed to verify the effects of the mushroom extract formulation. A microbiological analysis was carried out to compare the microbial communities present in the intestines of the mice belonging to each of the study groups. For this, massive sequencing by molecular analysis of the 16S gene was used (Ion Torrent technology). Initially, DNA extraction and metagenomic libraries were prepared using the 16S Metagenomics kit, always following the manufacturer's instructions. This kit amplifies 7 of the 9 hypervariable regions of the 16S gene, which were then sequenced. Finally, the data obtained were compared with a database that makes it possible to determine the degree of similarity of the obtained sequences to a wide range of bacterial genomes. The results showed that, similarly to certain natural compounds preventing colorectal tumorigenesis, the mushroom formulation enriched the Firmicutes and Proteobacteria phyla and depleted Bacteroidetes. Therefore, it was demonstrated that consumption of the mushroom extracts formulation developed could promote recovery of the microbial balance that is disrupted in the mouse model of carcinogenesis. More preclinical and clinical studies are needed to validate this promising approach.

Keywords: carcinogenesis, microbiota, mushroom extracts, inflammation

Procedia PDF Downloads 130
24906 Data Access, AI Intensity, and Scale Advantages

Authors: Chuping Lo

Abstract:

This paper presents a simple model demonstrating that, ceteris paribus, countries with lower barriers to accessing global data tend to earn higher incomes than other countries. Therefore, large countries, which inherently have greater data resources, tend to have higher incomes than smaller countries, such that the former may be more hesitant than the latter to liberalize cross-border data flows in order to maintain this advantage. Furthermore, countries with higher artificial intelligence (AI) intensity in production technologies tend to benefit more from economies of scale in data aggregation, leading to higher income and more trade, as they are better able to utilize global data.

Keywords: digital intensity, digital divide, international trade, economies of scale

Procedia PDF Downloads 42
24905 Secured Transmission and Reserving Space in Images Before Encryption to Embed Data

Authors: G. R. Navaneesh, E. Nagarajan, C. H. Rajam Raju

Abstract:

Nowadays, multimedia data are used to store secure information. All previous methods allocate space in an image for data embedding after encryption. In this paper, we propose a novel method that reserves space, surrounded by a boundary, in the image before encryption using a traditional RDH (reversible data hiding) algorithm, which makes it easy for the data hider to reversibly embed data in the encrypted image. The proposed method achieves real-time performance; that is, data extraction and image recovery are free of any error. A secure transmission process is also discussed in this paper, which improves efficiency tenfold compared to the other processes discussed.
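As an illustration of the general idea only (not the authors' specific RDH algorithm), a minimal sketch of embedding a bit string into the least significant bits of a reserved pixel region, with error-free extraction, might look like this; the pixel values are hypothetical 8-bit intensities.

```python
def embed_bits(pixels, bits):
    """Embed a bit string into the LSBs of the reserved pixel region."""
    if len(bits) > len(pixels):
        raise ValueError("not enough reserved room for the payload")
    out = list(pixels)
    for i, b in enumerate(bits):
        out[i] = (out[i] & ~1) | int(b)  # clear LSB, then set payload bit
    return out

def extract_bits(pixels, n):
    """Recover n embedded bits; extraction is exact by construction."""
    return "".join(str(p & 1) for p in pixels[:n])
```

Because each pixel changes by at most 1 in intensity, the carrier image is visually unchanged, and the hidden bits come back without error.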

Keywords: secure communication, reserving room before encryption, least significant bits, image encryption, reversible data hiding

Procedia PDF Downloads 380
24904 Identity Verification Using k-NN Classifiers and Autistic Genetic Data

Authors: Fuad M. Alkoot

Abstract:

DNA data have been used in forensics for decades. However, current research looks at using DNA as a biometric identity verification modality, with the goal of improving the speed of identification. We aim to use gene data that was initially collected for autism detection and to determine whether, and how accurately, these data can serve identification applications. Mainly, our goal is to find whether our data preprocessing technique yields data useful as a biometric identification tool. We experiment with using the nearest neighbor classifier to identify subjects. Results show that the optimal classification rate is achieved when the test set is corrupted by normally distributed noise with zero mean and a standard deviation of 1, and that the classification rate remains close to optimal as the noise standard deviation increases to 3. This shows that the data can be used for identity verification with high accuracy using a simple classifier such as the k-nearest neighbor (k-NN).
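The noise-corruption experiment described above can be sketched with a minimal pure-Python 1-NN classifier; the feature vectors and labels below are hypothetical, and this is a simplification of the authors' setup.

```python
import math
import random

def nearest_neighbor(train, test_point):
    """1-NN: return the label of the closest training sample (Euclidean)."""
    best, best_d = None, math.inf
    for features, label in train:
        d = math.dist(features, test_point)
        if d < best_d:
            best, best_d = label, d
    return best

def corrupted_accuracy(train, test, sigma, seed=0):
    """Classification rate when test vectors carry N(0, sigma^2) noise."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    hits = 0
    for features, label in test:
        noisy = [x + rng.gauss(0.0, sigma) for x in features]
        hits += nearest_neighbor(train, noisy) == label
    return hits / len(test)
```

Sweeping `sigma` from 0 to 3 and plotting `corrupted_accuracy` would reproduce the shape of the experiment the abstract reports.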

Keywords: biometrics, genetic data, identity verification, k nearest neighbor

Procedia PDF Downloads 228
24903 Mechanical Behavior of 16NC6 Steel Hardened by Burnishing

Authors: Litim Tarek, Taamallah Ouahiba

Abstract:

This work examines the physico-geometrical aspects of the surface layers of 16NC6 steel after burnishing with a hard steel ball. The results show that the optimal effects of burnishing are closely linked to the shape and material of the active part of the device, as well as to the surface plastic deformation capacity of the material being treated. Thus, roughness is improved by more than 70%, and the consolidation rate is increased by 30%. In addition, modeling of the rational traction curves yields a work-hardening coefficient of up to 0.3 in the presence of burnishing.

Keywords: 16NC6 steel, burnishing, hardening, roughness

Procedia PDF Downloads 136
24902 A Review on Intelligent Systems for Geoscience

Authors: R. Palson Kennedy, P. Kiran Sai

Abstract:

This article introduces machine learning (ML) researchers to the hurdles that geoscience problems present, as well as the opportunities for improvement in both ML and the geosciences, and presents a review from the data life cycle perspective to meet that need. Numerous facets of the geosciences present unique difficulties for the study of intelligent systems. Geoscience data are notoriously difficult to analyze since they are frequently unpredictable, intermittent, sparse, multi-resolution, and multi-scale. The first half addresses data science's essential concepts and theoretical underpinnings, while the second half presents key themes and shared experiences from current publications focused on each stage of the data life cycle. Finally, themes such as open science, smart data, and team science are considered.

Keywords: data science, intelligent systems, machine learning, big data, data life cycle, recent developments, geoscience

Procedia PDF Downloads 117
24901 Using Google Distance Matrix Application Programming Interface to Reveal and Handle Urban Road Congestion Hot Spots: A Case Study from Budapest

Authors: Peter Baji

Abstract:

In recent years, a growing body of literature has emphasized the increasingly negative impacts of urban road congestion on the everyday life of citizens. Although there are different public-sector responses for decreasing traffic congestion in urban regions, the most effective public intervention is congestion charging. Because travel is an economic asset, its consumption can be controlled effectively by extra taxes or prices, but this demand-side intervention is often unpopular. Measuring traffic flows with the help of different methods has a long history in the transport sciences, but until recently there was insufficient data for evaluating road traffic flow patterns at the scale of the entire road system of a larger urban area. European cities (e.g., London, Stockholm, Milan) in which congestion charges have already been introduced designated a particular downtown zone for charging, but this protects only the users and inhabitants of the CBD (Central Business District) area. Through the use of Google Maps data as a resource for revealing urban road traffic flow patterns, this paper aims to provide a fairer and smarter congestion pricing method for cities. The case study area contains three bordering districts of Budapest which are linked by one main road. The first district (5th) is the original downtown that is affected by the city's congestion charge plans. The second district (13th) lies in the transition zone and has recently been transformed into a new CBD containing the biggest office zone in Budapest. The third district (4th) is a mainly residential area on the outskirts of the city. The raw data of the research were collected with the help of Google's Distance Matrix API (Application Programming Interface), which provides estimated future traffic data in the form of travel times between freely chosen coordinate pairs.
From the difference between free-flow and congested travel time data, the daily congestion patterns and hot spots are detectable on all measured roads within the area. The results suggest that the distribution of congestion peak times and hot spots is uneven in the examined area; however, there are frequently congested areas that lie outside the downtown, and their inhabitants also need some protection. The conclusion of this case study is that cities can develop a real-time, place-based congestion charge system that encourages car users to avoid frequently congested roads by changing their routes or travel modes. This would be a fairer solution for decreasing the negative environmental effects of urban road transportation than protecting a very limited downtown area.
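A minimal sketch of the kind of query and congestion measure described above might look like the following. The response field names (`duration`, `duration_in_traffic`) follow the public Distance Matrix JSON format as generally documented; treat the parameters and the sample element as assumptions for illustration, not the authors' actual code.

```python
from urllib.parse import urlencode

API = "https://maps.googleapis.com/maps/api/distancematrix/json"

def request_url(origin, destination, departure_time, key):
    """Build a Distance Matrix request; setting departure_time asks the
    API to return traffic-aware travel times alongside free-flow ones."""
    return API + "?" + urlencode({
        "origins": origin,
        "destinations": destination,
        "departure_time": departure_time,  # unix timestamp or "now"
        "key": key,
    })

def congestion_delay(element):
    """Congested minus free-flow travel time for one origin-destination
    element of the JSON response, in seconds."""
    return element["duration_in_traffic"]["value"] - element["duration"]["value"]
```

Repeating such queries for fixed coordinate pairs throughout the day yields the per-road congestion-delay time series from which the hot spots in the study were derived.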

Keywords: Budapest, congestion charge, distance matrix API, application programming interface, pilot study

Procedia PDF Downloads 172
24900 A Method of Representing Knowledge of Toolkits in a Pervasive Toolroom Maintenance System

Authors: A. Mohamed Mydeen, Pallapa Venkataram

Abstract:

To impart quality in acquiring knowledge about a subject, the learning process needs to be pervasive, making use of advances in information and communication systems. However, the pervasive learning paradigms designed so far are of the system-automation type and lack a truly factual pervasive realm. Providing a factual pervasive realm requires subtle ways of teaching and learning with system intelligence. Augmenting pervasive learning with intelligence necessitates an efficient way of representing knowledge so that the system can give the right learning material to the learner. This paper presents a method of representing knowledge for a Pervasive Toolroom Maintenance System (PTMS), in which a learner acquires sublime knowledge about the various kinds of tools kept in the toolroom and which also helps with effective maintenance of the toolroom. First, we explicate the generic model of knowledge representation for PTMS. Second, we expound the knowledge representation for specific cases of toolkits in PTMS. We also present the conceptual view of knowledge representation using an ontology for both the generic and specific cases. Third, we devise the relations for pervasive knowledge in PTMS. Finally, events are identified in PTMS and linked with pervasive data of toolkits based on the relations formulated. The experimental environment and case studies demonstrate accurate and efficient knowledge representation of toolkits in PTMS.
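The linking of toolroom events to pervasive toolkit data via rules (the keywords mention ECA rules) could be sketched as follows; the rule, event names, and context fields here are entirely hypothetical and only illustrate the event-condition-action pattern.

```python
class ECARule:
    """Event-Condition-Action rule linking a toolroom event to an action."""
    def __init__(self, event, condition, action):
        self.event = event          # event name that triggers the rule
        self.condition = condition  # predicate over the event context
        self.action = action        # function run when the condition holds

def fire(rules, event, context):
    """Run every rule registered for `event` whose condition holds,
    returning the action results in rule order."""
    results = []
    for rule in rules:
        if rule.event == event and rule.condition(context):
            results.append(rule.action(context))
    return results
```

A maintenance system built this way reacts to toolkit events (check-out, wear reports, calibration due) with actions derived from the knowledge base, which is the role ECA rules play in the architecture described.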

Keywords: knowledge representation, pervasive computing, agent technology, ECA rules

Procedia PDF Downloads 307
24899 Assessment of Neurodevelopmental Needs in Duchenne Muscular Dystrophy

Authors: Mathula Thangarajh

Abstract:

Duchenne muscular dystrophy (DMD) is a severe form of X-linked muscular dystrophy caused by mutations in the dystrophin gene, resulting in progressive skeletal muscle weakness. Boys with DMD also have significant cognitive disabilities. The intelligence quotient of boys with DMD, compared to peers, is approximately one standard deviation below average. Detailed neuropsychological testing has demonstrated that boys with DMD have a global developmental impairment, with verbal memory and visuospatial skills most significantly affected. Furthermore, the total brain volume and gray matter volume are lower in children with DMD compared to age-matched controls. These results are suggestive of a significant structural and functional compromise to the developing brain as a result of absent dystrophin protein expression. There is also some genetic evidence to suggest that mutations in the 3' end of the DMD gene are associated with more severe neurocognitive problems. Our working hypothesis is that (i) boys with DMD do not make gains in neurodevelopmental skills compared to typically developing children and (ii) women carriers of DMD mutations may have subclinical cognitive deficits. We also hypothesize that there may be an intergenerational vulnerability of cognition, with boys of DMD-carrier mothers being more affected cognitively than boys of non-DMD-carrier mothers. The objectives of this study are to: 1. assess neurodevelopment in boys with DMD at four time points and perform a baseline neuroradiological assessment; 2. assess cognition in the biological mothers of DMD participants at baseline; 3. assess possible correlations between DMD mutations and cognitive measures. This study also explores functional brain abnormalities in people with DMD by examining how regional and global connectivity of the brain underlies executive function deficits in DMD.
Such research can contribute to a better holistic understanding of the cognitive alterations due to DMD and could potentially allow clinicians to create better-tailored treatment plans for the DMD population. There are four study visits for each participant (baseline, 2-4 weeks, 1 year, 18 months). At each visit, the participant completes the NIH Toolbox Cognition Battery, a validated psychometric measure that is recommended by the NIH Common Data Elements for use in DMD. Visits 1, 3, and 4 also involve the administration of the BRIEF-2, ABAS-3, PROMIS/NeuroQoL, PedsQL Neuromuscular Module 3.0, the Draw a Clock Test, and an optional fMRI scan with the N-back matching task. We expect to enroll 52 children with DMD, 52 mothers of children with DMD, and 30 healthy control boys. This study began in 2020, during the height of the COVID-19 pandemic, which delayed recruitment because of travel restrictions. However, we have persevered and continued to recruit new participants for the study. We partnered with the Muscular Dystrophy Association (MDA), which helped advertise the study to interested families. Since then, families from across the country have contacted us about their interest in the study. We plan to continue to enroll a diverse population of DMD participants to contribute toward a better understanding of Duchenne muscular dystrophy.

Keywords: neurology, Duchenne muscular dystrophy, muscular dystrophy, cognition, neurodevelopment, x-linked disorder, DMD, DMD gene

Procedia PDF Downloads 74
24898 Ensemble Methods in Machine Learning: An Algorithmic Approach to Derive Distinctive Behaviors of Criminal Activity Applied to the Poaching Domain

Authors: Zachary Blanks, Solomon Sonya

Abstract:

Poaching presents a serious threat to endangered animal species, environmental conservation, and human life. Additionally, some poaching activity has even been linked to supplying funds to terrorist networks elsewhere in the world. Consequently, agencies dedicated to protecting wildlife habitats face a near-intractable task of adequately patrolling an entire area (spanning several thousand kilometers) given the limited resources, funds, and personnel at their disposal. Thus, agencies need predictive tools that are both high-performing and easily implementable by the user to help in learning how the significant features (e.g., animal population densities, topography, behavior patterns of the criminals within the area) interact with each other, in hopes of abating poaching. This research develops a classification model using machine learning algorithms to aid in forecasting future attacks that is both easy to train and performs well compared to other models. We demonstrate how data imputation methods (specifically predictive mean matching, gradient boosting, and random forest multiple imputation) can be applied to analyze data and create significant predictions across a varied data set. Specifically, we apply these methods to improve the accuracy of the adopted prediction models (logistic regression, support vector machine, etc.). Finally, we assess the performance of the model and the accuracy of our data imputation methods by learning on a real-world data set constituting four years of imputed data and testing on one year of non-imputed data. This paper provides three main contributions. First, we extend work done by the Teamcore and CREATE (Center for Risk and Economic Analysis of Terrorism Events) research groups at the University of Southern California (USC), working in conjunction with the Department of Homeland Security, to apply game theory and machine learning algorithms to develop more efficient ways of reducing poaching.
Second, this research introduces ensemble methods (random forests and stochastic gradient boosting) and applies them to real-world poaching data gathered from Ugandan rain forest park rangers, and we consider the effect of data imputation on both the performance of various algorithms and the general accuracy of the method itself when applied to a dependent variable where a large number of observations are missing. Third, we provide an alternate approach to predicting the probability of observing poaching both by season and by month. The results from this research are very promising. We conclude that by using stochastic gradient boosting to predict observations for non-commercial poaching by season, we are able to produce statistically equivalent results while being orders of magnitude faster in computation time and complexity. Additionally, when predicting potential poaching incidents by individual month rather than by entire seasons, boosting techniques produce a mean area-under-the-curve increase of approximately 3% relative to previous prediction schedules by entire seasons.
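One of the imputation methods named above, predictive mean matching, can be illustrated with a minimal single-predictor sketch. This is a simplification of the multiple-imputation procedure the authors used, and the data below are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def pmm_impute(xs, ys):
    """Predictive mean matching: fill each missing y (None) with the
    observed donor value whose model prediction is closest to the
    missing case's own prediction."""
    obs = [(x, y) for x, y in zip(xs, ys) if y is not None]
    a, b = fit_line([x for x, _ in obs], [y for _, y in obs])
    filled = []
    for x, y in zip(xs, ys):
        if y is None:
            pred = a + b * x
            y = min(obs, key=lambda p: abs((a + b * p[0]) - pred))[1]
        filled.append(y)
    return filled
```

Because imputed values are always drawn from observed donors, PMM never produces impossible values (e.g., a fractional incident count), which is one reason it is popular for data of this kind.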

Keywords: ensemble methods, imputation, machine learning, random forests, statistical analysis, stochastic gradient boosting, wildlife protection

Procedia PDF Downloads 265
24897 Data Quality as a Pillar of Data-Driven Organizations: Exploring the Benefits of Data Mesh

Authors: Marc Bachelet, Abhijit Kumar Chatterjee, José Manuel Avila

Abstract:

Data quality is a key component of any data-driven organization. Without data quality, organizations cannot effectively make data-driven decisions, which often leads to poor business performance. Therefore, it is important for an organization to ensure that the data it uses are of high quality. This is where the concept of data mesh comes in. Data mesh is a decentralized organizational and architectural approach to data management that can help organizations improve the quality of data. The concept of data mesh was first introduced in 2020. Its purpose is to decentralize data ownership, making it easier for domain experts to manage the data. This can help organizations improve data quality by reducing reliance on centralized data teams and allowing domain experts to take charge of their data. This paper discusses how a set of elements, including data mesh, can increase data quality. One of the key benefits of data mesh is improved metadata management. In a traditional data architecture, metadata management is typically centralized, which can lead to data silos and poor data quality. With data mesh, metadata is managed in a decentralized manner, ensuring accurate and up-to-date metadata and thereby improving data quality. Another benefit of data mesh is the clarification of roles and responsibilities. In a traditional data architecture, data teams are responsible for managing all aspects of data, which can lead to confusion and ambiguity in responsibilities. With data mesh, domain experts are responsible for managing their own data, which can help provide clarity in roles and responsibilities and improve data quality. Additionally, data mesh can also contribute to a new form of organization that is more agile and adaptable.
By decentralizing data ownership, organizations can respond more quickly to changes in their business environment, which in turn can help improve overall performance by enabling better business insights through improved reports and visualization tools. Monitoring and analytics are also important aspects of data quality. With data mesh, monitoring and analytics are decentralized, allowing domain experts to monitor and analyze their own data. This helps identify and address data quality problems quickly, leading to improved data quality. Data culture is another major aspect of data quality. With data mesh, domain experts are encouraged to take ownership of their data, which can help create a data-driven culture within the organization. This can lead to improved data quality and better business outcomes. Finally, the paper explores the contribution of AI in the coming years. AI can help enhance data quality by automating many data-related tasks, like data cleaning and data validation. By integrating AI into data mesh, organizations can further enhance the quality of their data. The concepts mentioned above are illustrated by experience feedback from AEKIDEN, an international data-driven consultancy that has successfully implemented a data mesh approach. By sharing its experience, AEKIDEN can help other organizations understand the benefits and challenges of implementing data mesh and improving data quality.

Keywords: data culture, data-driven organization, data mesh, data quality for business success

Procedia PDF Downloads 103
24896 Development of a Novel Score for Early Detection of Hepatocellular Carcinoma in Patients with Hepatitis C Virus

Authors: Hatem A. El-Mezayen, Hossam Darwesh

Abstract:

Background/Aim: Hepatocellular carcinoma (HCC) is often diagnosed at an advanced stage, where effective therapies are lacking. A new scoring system is needed to discriminate HCC patients from those with chronic liver disease. Based on the link between vascular endothelial growth factor (VEGF) and HCC progression, we aimed to develop a novel score based on a combination of VEGF and routine laboratory tests for early prediction of HCC. Methods: VEGF was assayed for the HCC group (123), liver cirrhosis group (210), and control group (50) by enzyme-linked immunosorbent assay (ELISA). Data from all groups were retrospectively analyzed, including alpha-fetoprotein (AFP), international normalized ratio (INR), albumin, platelet count, transaminases, and age. Areas under the ROC curve were used to develop the score. Results: A novel index named the hepatocellular carcinoma-vascular endothelial growth factor score was developed: HCC-VEGF score = 1.26 (numerical constant) + 0.05 × AFP (U/l) + 0.038 × VEGF (ng/ml) + 0.004 × INR − 1.02 × albumin (g/l) − 0.002 × platelet count (×10^9/l). The HCC-VEGF score produced an area under the ROC curve of 0.98 for discriminating HCC patients from liver cirrhosis, with a sensitivity of 91% and specificity of 82% at a cut-off of 4.4 (i.e., values below 4.4 indicate cirrhosis and values above 4.4 indicate HCC). Conclusion: The HCC-VEGF score could replace AFP in HCC screening and follow-up of cirrhotic patients.
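The published score and cut-off can be wrapped in a small helper for illustration. The coefficients and units are taken verbatim from the abstract; the input values in the usage comment are hypothetical, and this sketch is of course not a clinical tool.

```python
def hcc_vegf_score(afp, vegf, inr, albumin, platelets):
    """HCC-VEGF score as given in the abstract.
    afp in U/l, vegf in ng/ml, albumin in g/l, platelets in 10^9/l."""
    return (1.26 + 0.05 * afp + 0.038 * vegf
            + 0.004 * inr - 1.02 * albumin - 0.002 * platelets)

def classify(score, cutoff=4.4):
    """Apply the published cut-off: above 4.4 suggests HCC."""
    return "HCC" if score > cutoff else "cirrhosis"
```

For example, hypothetical values afp=100, vegf=50, inr=1.0, albumin=3.0, platelets=150 give a score of about 4.80, which falls on the HCC side of the 4.4 cut-off.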

Keywords: hepatocellular carcinoma, cirrhosis, HCV, diagnosis, tumor markers

Procedia PDF Downloads 302
24895 Bringing Together Student Collaboration and Research Opportunities to Promote Scientific Understanding and Outreach Through a Seismological Community

Authors: Michael Ray Brunt

Abstract:

China has been the site of some of the most significant earthquakes in history; however, earthquake monitoring has long been the province of universities and research institutions. The China Digital Seismographic Network was initiated in 1983 and improved significantly during 1992-1993. Data from the CDSN are widely used by government and research institutions but are generally not readily accessible to middle and high school students. An educational seismic network in China is needed to provide collaboration and research opportunities for students, engaging students around the country in a scientific understanding of earthquake hazards and risks while promoting community awareness. In 2022, the Tsinghua International School (THIS) Seismology Team, made up of enthusiastic students and facilitated by two experienced teachers, was established. The team's objective is to install seismographs in schools throughout China, thus creating an educational seismic network that shares data from the THIS Educational Seismic Network (THIS-ESN) and facilitates collaboration. The THIS-ESN initiative will enhance education and outreach in China about earthquake risks and hazards, introduce seismology to a wider audience, stimulate interest in research among students, and develop students' programming, data collection, and analysis skills. It will also encourage and inspire young minds to pursue science, technology, engineering, the arts, and math (STEAM) career fields. The THIS-ESN utilizes small, low-cost RaspberryShake seismographs as a powerful tool linked into a global network, giving schools and the public access to real-time seismic data from across China, increasing earthquake monitoring capabilities in the respective areas, and adding to the available data sets regionally and worldwide, helping create a denser seismic network.
The RaspberryShake seismograph is compatible with free seismic data viewing platforms such as SWARM, and the RaspberryShake web programs and mobile apps are designed specifically for teaching seismology and seismic data interpretation, providing opportunities to enhance understanding. The RaspberryShake is powered by an operating system embedded in the Raspberry Pi, which makes it an easy platform for teaching students basic computer communication concepts by utilizing processing tools to investigate, plot, and manipulate data. The THIS Seismology Team believes strongly in creating opportunities for committed students to become part of the seismological community by engaging in the analysis of real-time scientific data with tangible outcomes. Students will feel proud of the important work they are doing to understand the world around them and will become advocates, spreading their knowledge back into their homes and communities and helping to improve overall community resilience. We trust that, in studying the results the seismograph stations yield, students will grasp how subjects like physics and computer science apply in real life, and, by spreading information, we hope students across the country can appreciate how and why earthquakes bear on their lives, develop practical skills in STEAM, and engage in the global seismic monitoring effort. By providing such an opportunity to schools across the country, we are confident that we will be an agent of change for society.

Keywords: collaboration, outreach, education, seismology, earthquakes, public awareness, research opportunities

Procedia PDF Downloads 40
24894 Diagnostic Performance of Tumor Associated Trypsin Inhibitor in Early Detection of Hepatocellular Carcinoma in Patients with Hepatitis C Virus

Authors: Aml M. El-Sharkawy, Hossam M. Darwesh

Abstract:

Background/Aim: Hepatocellular carcinoma (HCC) is often diagnosed at an advanced stage, where effective therapies are lacking. A new scoring system is needed to discriminate HCC patients from those with chronic liver disease. Based on the link between tumor-associated trypsin inhibitor (TATI) and HCC progression, we aimed to develop a novel score based on a combination of TATI and routine laboratory tests for early prediction of HCC. Methods: TATI was assayed for the HCC group (123), liver cirrhosis group (210), and control group (50) by enzyme-linked immunosorbent assay (ELISA). Data from all groups were retrospectively analyzed, including alpha-fetoprotein (AFP), international normalized ratio (INR), albumin, platelet count, transaminases, and age. Areas under the ROC curve were used to develop the score. Results: A novel index named the hepatocellular carcinoma-tumor-associated trypsin inhibitor score was developed: HCC-TATI score = 3.1 (numerical constant) + 0.09 × AFP (U/l) + 0.067 × TATI (ng/ml) + 0.16 × INR − 1.17 × albumin (g/l) − 0.032 × platelet count (×10^9/l). The HCC-TATI score produced an area under the ROC curve of 0.98 for discriminating HCC patients from liver cirrhosis, with a sensitivity of 91% and specificity of 82% at a cut-off of 6.5 (i.e., values below 6.5 indicate cirrhosis and values above 6.5 indicate HCC). Conclusion: The HCC-TATI score could replace AFP in HCC screening and follow-up of cirrhotic patients.

Keywords: hepatocellular carcinoma, cirrhosis, HCV, diagnosis, TATI

Procedia PDF Downloads 313
24893 The Physicochemical Properties of Two Rivers in Eastern Cape South Africa as Relates to Vibrio Spp Density

Authors: Oluwatayo Abioye, Anthony Okoh

Abstract:

In the past few decades, humans have experienced outbreaks of infections caused by pathogenic Vibrio spp., which are commonly found in aquatic milieus. Aside from the well-known Vibrio cholerae, the discovery of other pathogens in this genus has been on the increase. While the dynamics of the occurrence and distribution of Vibrio spp. have been linked to some physicochemical parameters in salt water, data in relation to fresh water are limited. Hence, two rivers of importance in the Eastern Cape, South Africa, were selected for this study. In all, eleven sampling sites were systematically identified, and the relevant physicochemical parameters, as well as Vibrio spp. density, were determined over a period of six months using standard instruments and methods. Results were statistically analysed to determine the key physicochemical parameters that determine the density of Vibrio spp. in the selected rivers. Results: The density of Vibrio spp. across the sampling points ranged from < 1 CFU/mL to 174 × 10⁻² CFU/mL. The physicochemical parameters at some of the sampling points exceeded the recommended standards. The regression analysis showed that Vibrio density in the selected rivers depends on a complex relationship between various physicochemical parameters. Conclusion: This study suggests that Vibrio spp. density in fresh water does not depend only on temperature and salinity, as suggested by earlier studies on salt water, but rather on a complex relationship between several physicochemical parameters.

Keywords: vibrio density, physicochemical properties, pathogen, aquatic milieu

Procedia PDF Downloads 220
24892 Big Data Analysis with RHadoop

Authors: Ji Eun Shin, Byung Ho Jung, Dong Hoon Lim

Abstract:

It is almost impossible to store or analyze big data, which increases exponentially, with traditional technologies; Hadoop is a new technology that makes this possible. The R programming language is by far the most popular statistical tool for big data analysis based on distributed processing with Hadoop. Using RHadoop, which integrates the R and Hadoop environments, we implemented parallel multiple regression analysis on actual data of different sizes. Experimental results showed that our RHadoop system became much faster as the number of data nodes increased. We also compared the performance of our RHadoop implementation with the lm function and the biglm package based on bigmemory. The results showed that our RHadoop was faster than the other packages owing to parallel processing, as the number of map tasks increases with the size of the data.
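The core of parallel multiple regression in a MapReduce setting is that each map task emits the sufficient statistics X'X and X'y for its block of rows, and the reduce step sums them and solves the normal equations. A minimal Python sketch of this idea (a stand-in illustration for the R/Hadoop stack, not the authors' code):

```python
import numpy as np

def map_task(X_chunk, y_chunk):
    # Each map task emits the sufficient statistics for least squares.
    return X_chunk.T @ X_chunk, X_chunk.T @ y_chunk

def reduce_task(partials):
    # The reduce step sums the partial X'X and X'y matrices and solves
    # the normal equations (X'X) beta = X'y.
    XtX = sum(p[0] for p in partials)
    Xty = sum(p[1] for p in partials)
    return np.linalg.solve(XtX, Xty)

# Toy data: 1000 rows, intercept plus two predictors.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(1000), rng.normal(size=(1000, 2))])
beta_true = np.array([1.0, 2.0, -3.0])
y = X @ beta_true + 0.01 * rng.normal(size=1000)

# Split the rows across 4 simulated "data nodes", map, then reduce.
chunks = np.array_split(np.arange(1000), 4)
partials = [map_task(X[idx], y[idx]) for idx in chunks]
beta_hat = reduce_task(partials)
```

Because the sufficient statistics add exactly, the distributed estimate equals the single-machine least-squares fit, which is why the approach scales with the number of map tasks.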

Keywords: big data, Hadoop, parallel regression analysis, R, RHadoop

Procedia PDF Downloads 412
24891 A Mutually Exclusive Task Generation Method Based on Data Augmentation

Authors: Haojie Wang, Xun Li, Rui Yin

Abstract:

In order to solve memorization overfitting in the model-agnostic meta-learning (MAML) algorithm, a method of generating mutually exclusive (mutex) tasks based on data augmentation is proposed. This method generates a mutex task by mapping one feature of the data to multiple labels, so that the generated mutex task is inconsistent with the data distribution in the initial dataset. Because generating mutex tasks for all data would produce a large amount of invalid data and, in the worst case, lead to exponential growth of computation, this paper also proposes a key data extraction method that extracts only part of the data to generate the mutex task. The experiments show that the method of generating mutually exclusive tasks can effectively solve memorization overfitting in the MAML algorithm.
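The mutex-task idea can be sketched minimally: pair each feature with a label other than its original one via a label permutation, so the generated task contradicts the source distribution. This is an illustration of the concept only, not the authors' implementation:

```python
import random

def make_mutex_task(examples, n_classes, seed):
    """Sketch: derive a mutually exclusive task by re-mapping labels so
    every feature is paired with a different label than in the source
    dataset (one label derangement per generated task)."""
    rng = random.Random(seed)
    perm = list(range(n_classes))
    while True:
        rng.shuffle(perm)
        # A derangement guarantees every label actually changes.
        if all(perm[i] != i for i in range(n_classes)):
            break
    return [(x, perm[y]) for x, y in examples]

data = [("doc a", 0), ("doc b", 1), ("doc c", 2)]
task = make_mutex_task(data, n_classes=3, seed=1)
```

Varying the seed yields different mutex tasks from the same data, which is the property the meta-learner needs to avoid memorizing feature-label pairs.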

Keywords: data augmentation, mutex task generation, meta-learning, text classification

Procedia PDF Downloads 70
24890 Efficient Positioning of Data Aggregation Point for Wireless Sensor Network

Authors: Sifat Rahman Ahona, Rifat Tasnim, Naima Hassan

Abstract:

Data aggregation is a helpful technique for reducing the data communication overhead in wireless sensor networks. One of its important tasks is the positioning of the aggregator points. Much work has been done on data aggregation, but the efficient positioning of the aggregation points has received little attention. In this paper, the authors focus on the positioning, or placement, of the aggregation points in a wireless sensor network and propose an algorithm for selecting aggregator positions in a scenario where aggregator nodes are more powerful than sensor nodes.
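The abstract does not detail the proposed placement algorithm; purely as an illustration of the problem, a common baseline is to place k aggregation points at the centroids of sensor clusters, e.g. via a small k-means loop:

```python
import math
import random

def place_aggregators(sensors, k, iters=50, seed=0):
    """Illustrative baseline (not the authors' algorithm): k-means
    placement of k aggregation points among sensor coordinates."""
    rng = random.Random(seed)
    centers = rng.sample(sensors, k)
    for _ in range(iters):
        # Assign every sensor to its nearest aggregation point.
        groups = [[] for _ in range(k)]
        for s in sensors:
            j = min(range(k), key=lambda i: math.dist(s, centers[i]))
            groups[j].append(s)
        # Move each aggregation point to the centroid of its group.
        centers = [
            (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g))
            if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers
```

Placing aggregators at cluster centroids minimizes the average sensor-to-aggregator distance, which is a reasonable proxy for communication cost in this setting.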

Keywords: aggregation point, data communication, data aggregation, wireless sensor network

Procedia PDF Downloads 131
24889 Spatial Econometric Approaches for Count Data: An Overview and New Directions

Authors: Paula Simões, Isabel Natário

Abstract:

This paper reviews a number of theoretical aspects of implementing an explicit spatial perspective in econometrics for modelling non-continuous data in general, and count data in particular. It provides an overview of the several spatial econometric approaches available for modelling data collected with reference to location in space, from classical spatial econometric approaches to recent developments for modelling count data in a Bayesian hierarchical setting. Considerable attention is paid to the inferential framework necessary for structurally consistent spatial econometric count models incorporating spatial lag autocorrelation, to the corresponding estimation and testing procedures under different assumptions, and to the constraints and implications embedded in the various specifications in the literature. This review combines insights from the classical spatial econometrics literature with those from hierarchical modelling and the analysis of spatial data, in order to look for possible new directions in the processing of count data in a spatial hierarchical Bayesian econometric context.
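As one concrete instance of the class of models reviewed, a Bayesian hierarchical Poisson model with a spatial lag can be written roughly as follows (a common form in the literature, not necessarily the exact specification discussed here):

```latex
y_i \mid \lambda_i \sim \operatorname{Poisson}(\lambda_i), \qquad
\log \lambda_i = \rho \sum_{j=1}^{n} w_{ij} \log \lambda_j
  + \mathbf{x}_i^{\top} \boldsymbol{\beta} + \varepsilon_i, \qquad
\varepsilon_i \sim \mathcal{N}(0, \sigma^2),
```

where W = (w_ij) is the row-standardized spatial weights matrix, ρ is the spatial lag autocorrelation parameter, and, in the hierarchical Bayesian setting, priors are placed on ρ, β, and σ².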

Keywords: spatial data analysis, spatial econometrics, Bayesian hierarchical models, count data

Procedia PDF Downloads 562
24888 A NoSQL Based Approach for Real-Time Managing of Robotics's Data

Authors: Gueidi Afef, Gharsellaoui Hamza, Ben Ahmed Samir

Abstract:

This paper deals with the new data management solutions that have emerged to cope with continually growing data: NoSQL databases. They have spread across several areas, such as personalization, profile management, real-time big data, content management, catalogs, customer views, mobile applications, the Internet of Things, digital communication, and fraud detection. These database management systems are on the rise: they store data very well, but with the trend of big data, new storage demands call for new structures and methods for managing enterprise data. The new intelligent machines in the e-learning sector thrive on more data, so smart machines can learn more and faster. Robotics is the use case on which we focus our tests. Implementing NoSQL for robotics helps wrestle all the data robots acquire into usable form, because with ordinary data management for robotics we face severe limits in managing and finding the exact information in real time. Our proposed approach was demonstrated through experimental studies and a running example used as a use case.

Keywords: NoSQL databases, database management systems, robotics, big data

Procedia PDF Downloads 325
24887 An Investigation on Interactions between Social Security with Police Operation and Economics in the Field of Tourism

Authors: Mohammad Mahdi Namdari, Hosein Torki

Abstract:

Security, as an abstract concept, has concerned human beings from the beginning of creation to the present, and certainly will into the future. Accordingly, battles, conflicts, challenges, legal proceedings, crimes, and all issues related to humankind are associated with this concept. Today, when people are interviewed about their lives, the security of their societies and social crime come up as well. Along with security as an infrastructural and vital concept, the economy and related issues, e.g., welfare, per capita income, total government revenue, exports, and imports, are considered another infrastructural and vital concept. These two vital concepts, security and economy, are linked in complex and significant ways. The present study employs an analytical-descriptive research method using documents and statistics from official sources. Discovering and explaining this mutual connection requires profound and extensive research, so managing, developing, and reforming the systems and relationships within the scope of these two concepts is complex and difficult. Tourism, one of the main pillars of the 21st-century economy, is perhaps more closely associated with security and social crime than the other pillars. Like all human activities, the economies of societies, and tourism in particular, depend on security, especially public and social security. On the other hand, true economic development in general, and the growth of the tourism industry in particular, generate and support security, because a dynamic economic infrastructure prevents the formation of centers of crime and illegal activity by providing fair and humane socio-economic development for all segments of society. This relationship illustrates the complexity between the two concepts of economy and security. The police, as a visible, people-oriented organization in the field of security, are directly linked with a community's economy and strongly influence the tourism industry. The notable relationship between security, national crime indices, and economic indicators, especially those related to tourism, confirms the discussion above. Understanding the processes linking security and the economy, two key and vital concepts, is necessary and significant for the sovereignty of governments.

Keywords: economic, police, tourism, social security

Procedia PDF Downloads 299
24886 Fuzzy Optimization Multi-Objective Clustering Ensemble Model for Multi-Source Data Analysis

Authors: C. B. Le, V. N. Pham

Abstract:

In modern data analysis, multi-source data appears more and more in real applications. Multi-source data clustering has emerged as an important issue in the data mining and machine learning community. Different sources provide different information about the data, so linking multi-source data is essential to improving clustering performance. However, in practice, multi-source data is often heterogeneous, uncertain, and large; this is considered a major challenge in multi-source data analysis. Ensembles are versatile machine learning models in which learning techniques can work in parallel on big data, and clustering ensembles have been shown to outperform standard clustering algorithms in terms of accuracy and robustness. However, most traditional clustering ensemble approaches are based on a single-objective function and single-source data. This paper proposes a new clustering ensemble method for multi-source data analysis: the fuzzy-optimized multi-objective clustering ensemble method, called FOMOCE. Firstly, a clustering ensemble mathematical model based on the structure of the multi-objective clustering function, multi-source data, and dark knowledge is introduced. Then, rules for extracting dark knowledge from the input data, clustering algorithms, and base clusterings are designed and applied. Finally, a clustering ensemble algorithm is proposed for multi-source data analysis. The experiments were performed on standard sample data sets. The experimental results demonstrate the superior performance of the FOMOCE method compared to existing clustering ensemble methods and multi-source clustering methods.
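The combination step of a clustering ensemble can be illustrated with the classic co-association (evidence accumulation) scheme: count how often each pair of items is co-clustered across the base partitions, then link pairs that agree more often than a threshold. This is a generic baseline for intuition, not the FOMOCE algorithm itself:

```python
import numpy as np

def co_association(partitions):
    """Fraction of base clusterings that place each pair of items together."""
    n = len(partitions[0])
    C = np.zeros((n, n))
    for labels in partitions:
        lab = np.asarray(labels)
        C += (lab[:, None] == lab[None, :]).astype(float)
    return C / len(partitions)

def consensus(partitions, threshold=0.5):
    """Consensus clustering: connected components of the graph linking
    pairs co-clustered in more than `threshold` of the base partitions."""
    C = co_association(partitions)
    n = C.shape[0]
    labels = [-1] * n
    cur = 0
    for i in range(n):
        if labels[i] != -1:
            continue
        stack, labels[i] = [i], cur
        while stack:
            u = stack.pop()
            for v in range(n):
                if labels[v] == -1 and C[u, v] > threshold:
                    labels[v] = cur
                    stack.append(v)
        cur += 1
    return labels

# Three base clusterings (e.g., one per data source) of six items:
parts = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
]
print(consensus(parts))  # → [0, 0, 0, 1, 1, 1]
```

The base clusterers never need to agree on labels, only on which items belong together, which is what makes the scheme a natural fit for heterogeneous multi-source data.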

Keywords: clustering ensemble, multi-source, multi-objective, fuzzy clustering

Procedia PDF Downloads 150
24885 Synthesis, Characterization and Applications of Novel Hydrogels Based On Chitosan Derivatives

Authors: Mahmoud H. Aboul-Ela, Riham R. Mohamed, Magdy W. Sabaa

Abstract:

Cross-linked hydrogels composed of trimethyl chitosan (TMC) and poly(vinyl alcohol) (PVA) in different weight ratios were synthesized in the presence of glutaraldehyde as a cross-linking agent. The prepared hydrogels were characterized using FTIR, XRD, SEM, and TGA and were investigated as adsorbent materials for some transition metal ions from their aqueous solutions. Moreover, the swellability of the prepared hydrogels was investigated at both acidic and alkaline pH, as well as in simulated body fluid (SBF).

Keywords: trimethyl chitosan, hydrogels, metal uptake, superabsorbent materials

Procedia PDF Downloads 362
24884 Genomic Prediction Reliability Using Haplotypes Defined by Different Methods

Authors: Sohyoung Won, Heebal Kim, Dajeong Lim

Abstract:

Genomic prediction is an effective way to measure the breeding abilities of livestock based on genomic estimated breeding values, which are statistically predicted from genotype data using best linear unbiased prediction (BLUP). Using haplotypes, clusters of linked single nucleotide polymorphisms (SNPs), as markers instead of individual SNPs can improve the reliability of genomic prediction, since the probability of a quantitative trait locus being in strong linkage disequilibrium (LD) with the markers is higher. To use haplotypes efficiently in genomic prediction, optimal ways of defining haplotypes are needed. In this study, 770K SNP chip data were collected from a Hanwoo (Korean cattle) population consisting of 2,506 cattle. Haplotypes were first defined in three different ways using the 770K SNP chip data: based on 1) the length of the haplotype (bp), 2) the number of SNPs, and 3) k-medoids clustering by LD. To compare the methods in parallel, the haplotypes defined by all methods were set to comparable sizes; in each method, haplotypes defined to contain an average of 5, 10, 20, or 50 SNPs were tested. A modified GBLUP method using haplotype alleles as predictor variables was implemented to test the prediction reliability of each haplotype set. The conventional genomic BLUP (GBLUP) method, which uses individual SNPs, was also tested to evaluate the performance of the haplotype sets in genomic prediction. Carcass weight was used as the phenotype. As a result, haplotypes defined by all three methods showed increased reliability compared to conventional GBLUP, with few differences in reliability between the haplotype-defining methods. The reliability of genomic prediction was highest when the average number of SNPs per haplotype was 20 in all three methods, implying that haplotypes of around 20 SNPs can be optimal markers for genomic prediction. When the number of alleles generated by each haplotype-defining method was compared, clustering by LD generated the fewest alleles. Using haplotype alleles for genomic prediction showed better performance, suggesting improved accuracy in genomic selection. The number of predictor variables decreased when the LD-based method was used, while all three haplotype-defining methods showed similar performance. This suggests that defining haplotypes based on LD can reduce computational costs and allows efficient prediction. Finding optimal ways to define haplotypes and using the haplotype alleles as markers can provide improved performance and efficiency in genomic prediction.
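The "haplotype alleles as predictor variables" idea can be sketched as a ridge-regression view of GBLUP on indicator-coded haplotype alleles. This is illustrative only: real GBLUP works with genomic relationship matrices and REML-estimated variance components, and all names below are ours:

```python
import numpy as np

def one_hot_haplotypes(hap_ids, n_alleles):
    """Code each animal's haplotype allele as a 0/1 indicator column —
    the 'haplotype alleles as predictor variables' idea."""
    Z = np.zeros((len(hap_ids), n_alleles))
    Z[np.arange(len(hap_ids)), hap_ids] = 1.0
    return Z

def gblup_ridge(Z, y, lam=1.0):
    """Ridge-regression view of (haplotype-)GBLUP: solve
    (Z'Z + lam*I) u = Z'y for the allele effects u."""
    p = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(p), Z.T @ y)

# Toy data: 8 animals, one locus with 3 haplotype alleles.
hap = np.array([0, 0, 1, 1, 2, 2, 0, 1])
Z = one_hot_haplotypes(hap, 3)
true_effects = np.array([5.0, -2.0, 0.5])
y = Z @ true_effects          # noise-free phenotypes for illustration
u_hat = gblup_ridge(Z, y, lam=1e-6)
pred = Z @ u_hat
```

Note how clustering haplotypes more aggressively (fewer alleles) shrinks the number of columns of Z, which is exactly the computational saving the LD-based definition provides.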

Keywords: best linear unbiased predictor, genomic prediction, haplotype, linkage disequilibrium

Procedia PDF Downloads 115
24883 Modeling Activity Pattern Using XGBoost for Mining Smart Card Data

Authors: Eui-Jin Kim, Hasik Lee, Su-Jin Park, Dong-Kyu Kim

Abstract:

Smart-card data are expected to provide information on activity patterns as an alternative to conventional person trip surveys. The focus of this study is to propose a method that trains on person trip surveys to supplement smart-card data, which do not record the purpose of each trip. We selected only the features available from smart-card data, such as spatiotemporal information on the trip and geographic information system (GIS) data near the stations, to train with the survey data. XGBoost, a state-of-the-art tree-based ensemble classifier, was used to train on data from multiple sources. This classifier uses a more regularized model formalization to control over-fitting and shows very fast execution with good performance. The validation results showed that the proposed method efficiently estimated the trip purpose. GIS data around the station and the duration of stay at the destination were significant features in modeling trip purpose.
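A toy version of this training setup can be sketched with synthetic smart-card-style features and survey-style labels. Everything here is invented for illustration (the feature names and labelling rule are ours), and scikit-learn's GradientBoostingClassifier stands in for XGBoost:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 600
# Hypothetical smart-card features: boarding hour, stay duration at the
# destination (min), and a GIS land-use code near the destination station.
hour = rng.integers(5, 24, n)
stay = rng.integers(10, 600, n)
land_use = rng.integers(0, 3, n)  # 0=residential, 1=office, 2=commercial

# Toy rule standing in for survey-labelled trip purpose:
# long morning stays near offices -> work (1), otherwise non-work (0).
purpose = ((stay > 240) & (hour < 12) & (land_use == 1)).astype(int)

X = np.column_stack([hour, stay, land_use])
X_tr, X_te, y_tr, y_te = train_test_split(X, purpose, random_state=0)
clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
```

The same pattern — fit on survey-labelled trips, predict the purpose of unlabelled smart-card trips — is what the proposed data-fusion method carries out with XGBoost on real features.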

Keywords: activity pattern, data fusion, smart-card, XGboost

Procedia PDF Downloads 218
24882 An Original and Suitable Induction Method of Repeated Hypoxic Stress by Hydralazine to Investigate the Integrity of an in Vitro Contact Co-Culture Blood Brain Barrier Model

Authors: Morgane Chatard, Clémentine Puech, Nathalie Perek, Frédéric Roche

Abstract:

Several neurological disorders are linked to repeated hypoxia. The impact of such repeated hypoxic stress on the endothelial cell function of the blood-brain barrier (BBB) is little studied in the literature. Indeed, studying hypoxic stress in cellular pathways through hypoxia exposure is complex because HIF-1α (hypoxia-inducible factor 1α) has a short half-life. Our study presents an innovative induction method for repeated hypoxic stress, which is more reproducible and allows us to study its impact on an in vitro contact co-culture BBB model. Repeated hypoxic stress was induced by hydralazine (a mimetic agent of the hypoxia pathway) for two hours and repeated over 24 hours. BBB integrity was then assessed by permeability measurements (transendothelial electrical resistance and membrane permeability), tight junction protein expression (cell-ELISA and confocal microscopy), and the expression and activity of efflux transporters. First, this study showed that repeated hypoxic stress leads to BBB dysfunction, illustrated by a significant increase in permeability. This loss of membrane integrity was linked to a significant decrease in tight junction protein expression, facilitating the possible transfer of potentially cytotoxic compounds into the brain. Second, we demonstrated that brain microvascular endothelial cells had set up defence mechanisms: they significantly increased the activity of their efflux transporters, which was associated with a significant increase in their expression. In conclusion, repeated hypoxic stress led to a loss of BBB integrity with a decrease in tight junction proteins. In contrast, the endothelial cells increased the expression of their efflux transporters to fight against cytotoxic compounds crossing into the brain. Unfortunately, enhanced efflux activity could also reduce the delivery of pharmacological drugs to the brain under such hypoxic conditions.

Keywords: BBB model, efflux transporters, repeated hypoxic stress, tight junction proteins

Procedia PDF Downloads 273
24881 Synthesis and Characterization of Chitosan Microparticles for Scaffold Structure and Bioprinting

Authors: J. E. Mendes, T. T. de Barros, O. B. G. de Assis, J. D. C. Pessoa

Abstract:

Chitosan, a natural polysaccharide of β-1,4-linked glucosamine residues, is a biopolymer obtained primarily from the exoskeletons of crustaceans. Interest in polymeric materials increases year by year, and chitosan is one of the most plentiful biomaterials, with a wide range of pharmaceutical, biomedical, industrial, and agricultural applications. Chitosan microparticles were synthesized via the ionotropic gelation of chitosan with sodium tripolyphosphate (TPP) at two chitosan concentrations (0.1 and 0.2%). In this study, it was possible to synthesize and characterize microparticles of this biomaterial, which will be used in future applications in cell anchorage for 3D bioprinting.

Keywords: chitosan microparticles, biomaterial, scaffold, bioprinting

Procedia PDF Downloads 288
24880 A Mutually Exclusive Task Generation Method Based on Data Augmentation

Authors: Haojie Wang, Xun Li, Rui Yin

Abstract:

In order to solve memorization overfitting in the model-agnostic meta-learning (MAML) algorithm, a method of generating mutually exclusive tasks based on data augmentation is proposed. This method generates a mutex task by mapping one feature of the data to multiple labels, so that the generated mutex task is inconsistent with the data distribution in the initial dataset. Because generating mutex tasks for all data would produce a large amount of invalid data and, in the worst case, lead to exponential growth of computation, this paper also proposes a key data extraction method that extracts only part of the data to generate the mutex task. The experiments show that the method of generating mutually exclusive tasks can effectively solve memorization overfitting in the MAML algorithm.

Keywords: mutex task generation, data augmentation, meta-learning, text classification

Procedia PDF Downloads 107
24879 A Web and Cloud-Based Measurement System Analysis Tool for the Automotive Industry

Authors: C. A. Barros, Ana P. Barroso

Abstract:

Any industrial company needs to determine the amount of variation that exists within its measurement process and to guarantee the reliability of its data by studying the performance of its measurement system in terms of linearity, bias, repeatability, reproducibility, and stability. This issue is critical for automotive industry suppliers, who are required to be certified to the IATF 16949:2016 standard (which replaced ISO/TS 16949) of the International Automotive Task Force, defining the requirements of a quality management system for companies in the automotive industry. Measurement System Analysis (MSA) is one of its mandatory tools. Frequently, the measurement systems in companies are not connected to the equipment and do not incorporate the methods proposed by the Automotive Industry Action Group (AIAG). To address these constraints, an R&D project is in progress whose objective is to develop a web and cloud-based MSA tool. This MSA tool incorporates Industry 4.0 concepts, such as Internet of Things (IoT) protocols to ensure the connection with the measuring equipment, cloud computing, artificial intelligence, statistical tools, and advanced mathematical algorithms. This paper presents the preliminary findings of the project. The web and cloud-based MSA tool is innovative because it implements all the statistical tests proposed in the MSA-4 reference manual from AIAG, as well as other emerging methods and techniques. As it is integrated with the measuring devices, it reduces the manual input of data and therefore the errors. The tool ensures the traceability of all performed tests and can be used in quality laboratories and on production lines. Besides, it monitors MSAs over time, allowing both the analysis of deviations in the variation of the measurements performed and the management of measurement equipment and calibrations. To develop the MSA tool, a ten-step approach was implemented. First, a benchmarking analysis of the current competitors and commercial solutions linked to MSA was performed with respect to the Industry 4.0 paradigm. Next, the size of the target market for the MSA tool was analysed. Afterwards, data flow and traceability requirements were analysed in order to implement an IoT data network that interconnects with the equipment, preferably wirelessly. The MSA web solution was designed under UI/UX principles, and a Python API was developed to run the algorithms and the statistical analysis. Continuous validation of the tool by companies is being performed to assure real-time management of the 'big data'. The main results of this R&D project are: the web and cloud-based MSA tool; the Python API; new algorithms for the market; and the UI/UX style guide of the tool. The proposed MSA tool adds value to the state of the art, as it ensures an effective response to the new challenges of measurement systems, which are increasingly critical in production processes. Although the automotive industry triggered the development of this innovative MSA tool, other industries will also benefit from it. Currently, companies from the molds and plastics, chemical, and food industries are already validating it.
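As an example of the statistical core such a tool automates, a Gage R&R (repeatability and reproducibility) computation can be sketched in a simplified ANOVA style. This sketch omits the operator-by-part interaction term and other refinements of the full MSA-4 procedure, and the function names are ours:

```python
import numpy as np

def gage_rr(data):
    """Simplified ANOVA-style Gage R&R (no operator-by-part interaction;
    the full MSA-4 procedure is more involved).
    data[p][o][t] = measurement of part p by operator o, trial t."""
    data = np.asarray(data, dtype=float)
    n_p, n_o, n_t = data.shape
    grand = data.mean()
    # Repeatability (equipment variation): pooled within-cell variance.
    within = data - data.mean(axis=2, keepdims=True)
    ev2 = (within ** 2).sum() / (n_p * n_o * (n_t - 1))
    # Reproducibility (appraiser variation): operator variance component.
    ms_op = n_p * n_t * ((data.mean(axis=(0, 2)) - grand) ** 2).sum() / (n_o - 1)
    av2 = max(ms_op - ev2, 0.0) / (n_p * n_t)
    # Part-to-part variation.
    ms_part = n_o * n_t * ((data.mean(axis=(1, 2)) - grand) ** 2).sum() / (n_p - 1)
    pv2 = max(ms_part - ev2, 0.0) / (n_o * n_t)
    grr2 = ev2 + av2
    total2 = grr2 + pv2
    pct_grr = 100.0 * (grr2 / total2) ** 0.5 if total2 > 0 else 0.0
    return {"repeatability": ev2, "reproducibility": av2,
            "part_to_part": pv2, "%GRR": pct_grr}
```

Feeding such a routine directly from IoT-connected gauges, instead of from manually keyed spreadsheets, is precisely where the proposed tool removes transcription errors.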

Keywords: automotive industry, Industry 4.0, Internet of Things, IATF 16949:2016, measurement system analysis

Procedia PDF Downloads 184