Search results for: linked data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25718

25358 A Method of Representing Knowledge of Toolkits in a Pervasive Toolroom Maintenance System

Authors: A. Mohamed Mydeen, Pallapa Venkataram

Abstract:

The learning process needs to be pervasive in order to improve the quality of knowledge acquisition about a subject by making use of advances in information and communication systems. However, the pervasive learning paradigms designed so far are system automation types, and they lack a factual pervasive realm. Providing a factual pervasive realm requires subtle ways of teaching and learning with system intelligence. Augmenting pervasive learning with intelligence requires an efficient way of representing knowledge for the system so that it gives the right learning material to the learner. This paper presents a method of representing knowledge for a Pervasive Toolroom Maintenance System (PTMS), in which a learner acquires detailed knowledge about the various kinds of tools kept in the toolroom and which also helps with effective maintenance of the toolroom. First, we explicate the generic model of knowledge representation for PTMS. Second, we expound the knowledge representation for specific cases of toolkits in PTMS. We also present the conceptual view of knowledge representation using ontology for both generic and specific cases. Third, we devise the relations for pervasive knowledge in PTMS. Finally, events are identified in PTMS and then linked with pervasive data of toolkits based on the relations formulated. The experimental environment and case studies show the accuracy and efficiency of the knowledge representation of toolkits in PTMS.

Keywords: knowledge representation, pervasive computing, agent technology, ECA rules

Procedia PDF Downloads 327
25357 The Collaboration between Resident and Non-resident Patent Applicants as a Strategy to Accelerate Technological Advance in Developing Nations

Authors: Hugo Rodríguez

Abstract:

Migrations of researchers, scientists, and inventors are a widespread phenomenon in modern times. In some cases, migrants stay linked to research groups in their countries of origin, either out of their own conviction or because of government policies. We examine different linear models of technological development (using the Ordinary Least Squares (OLS) technique) in eight selected countries and find that the collaborations between resident and nonresident patent applicants correlate with different levels of performance of the technological policies in three different scenarios. Therefore, the reinforcement of that link must be considered a powerful tool for technological development.
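As an illustration of the Ordinary Least Squares technique named in this abstract, the following minimal sketch fits a one-variable linear model in closed form. The variable names and the numbers are made up for the example; they are not the study's data, and the abstract does not state which variables or model specifications were used.

```python
def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical inputs: a collaboration measure and a technology performance index.
collaboration = [0.0, 1.0, 2.0, 3.0, 4.0]
performance = [1.0, 3.0, 5.0, 7.0, 9.0]

intercept, slope = ols_fit(collaboration, performance)
print(f"intercept={intercept:.2f}, slope={slope:.2f}")
```

A positive fitted slope in such a model would indicate a positive association between the two series; it does not by itself establish causation.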

Keywords: development, collaboration, patents, technology

Procedia PDF Downloads 120
25356 The Role of Data Protection Officer in Managing Individual Data: Issues and Challenges

Authors: Nazura Abdul Manap, Siti Nur Farah Atiqah Salleh

Abstract:

For decades, the misuse of personal data has been a critical issue. Malaysia has accepted responsibility by implementing the Malaysian Personal Data Protection Act 2010 (PDPA 2010) to secure personal data. After more than a decade, this legislation is set to be revised by the current PDPA 2023 Amendment Bill to align with the world's key personal data protection regulations, such as the European Union General Data Protection Regulation (GDPR). Among the other suggested adjustments is the Data User's appointment of a Data Protection Officer (DPO) to ensure the commercial entity's compliance with the PDPA 2010 criteria. The change is expected to be enacted in parliament fairly soon; nevertheless, based on the experience of the Personal Data Protection Department (PDPD) in implementing the Act, it is projected that there will be a slew of additional concerns associated with the DPO mandate. Consequently, the goal of this article is to highlight the issues that the DPO will encounter and how the Personal Data Protection Department should respond to them. The study result was produced using a qualitative technique based on an examination of the current literature. This research reveals that the DPO is likely to face obstacles, and thus a definite, clear guideline should be in place to aid DPOs in executing their tasks. It is argued that appointing a DPO is a wise measure in ensuring that the legal data security requirements are met.

Keywords: guideline, law, data protection officer, personal data

Procedia PDF Downloads 73
25355 Development of a Novel Score for Early Detection of Hepatocellular Carcinoma in Patients with Hepatitis C Virus

Authors: Hatem A. El-Mezayen, Hossam Darwesh

Abstract:

Background/Aim: Hepatocellular carcinoma (HCC) is often diagnosed at an advanced stage, where effective therapies are lacking. A new scoring system is needed to discriminate HCC patients from those with chronic liver disease. Based on the link between vascular endothelial growth factor (VEGF) and HCC progression, we aimed to develop a novel score based on a combination of VEGF and routine laboratory tests for early prediction of HCC. Methods: VEGF was assayed for the HCC group (123), liver cirrhosis group (210) and control group (50) by Enzyme Linked Immunosorbent Assay (ELISA). Data from all groups were retrospectively analyzed, including α-fetoprotein (AFP), international normalized ratio (INR), albumin and platelet count, transaminases, and age. Areas under the ROC curve were used to develop the score. Results: A novel index named the hepatocellular carcinoma-vascular endothelial growth factor score (HCC-VEGF score) was developed: HCC-VEGF score = 1.26 (numerical constant) + 0.05 × AFP (U L^-1) + 0.038 × VEGF (ng ml^-1) + 0.004 × INR - 1.02 × Albumin (g l^-1) - 0.002 × Platelet count (× 10^9 l^-1). The HCC-VEGF score produced an area under the ROC curve of 0.98 for discriminating HCC patients from liver cirrhosis, with a sensitivity of 91% and a specificity of 82% at a cut-off of 4.4 (i.e., less than 4.4 considered cirrhosis and greater than 4.4 considered HCC). Conclusion: The hepatocellular carcinoma-VEGF score could replace AFP in HCC screening and follow-up of cirrhotic patients.
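The scoring rule reported in this abstract is a linear combination followed by a threshold, which can be sketched as below. The coefficients and the 4.4 cut-off are taken from the abstract; the function names and the input values are hypothetical, and the handling of a score exactly equal to the cut-off is an assumption, since the abstract only specifies "less than" and "greater than". This is an illustration of the arithmetic, not a clinical tool.

```python
HCC_VEGF_CUTOFF = 4.4  # cut-off reported in the abstract


def hcc_vegf_score(afp, vegf, inr, albumin, platelets):
    """Linear HCC-VEGF score using the coefficients quoted in the abstract.

    Units follow the abstract: AFP in U/L, VEGF in ng/ml, albumin in g/l,
    platelet count in 10^9 per litre.
    """
    return (1.26
            + 0.05 * afp
            + 0.038 * vegf
            + 0.004 * inr
            - 1.02 * albumin
            - 0.002 * platelets)


def classify(score, cutoff=HCC_VEGF_CUTOFF):
    """Label a score; treating score == cutoff as 'cirrhosis' is an assumption."""
    return "HCC" if score > cutoff else "cirrhosis"


# Hypothetical input values, chosen only to exercise the formula.
example = hcc_vegf_score(afp=200, vegf=100, inr=1.0, albumin=3.0, platelets=100)
print(round(example, 3), classify(example))
```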

Keywords: Hepatocellular carcinoma, cirrhosis, HCV, diagnosis, tumor markers

Procedia PDF Downloads 319
25354 Data Collection Based on the Questionnaire Survey In-Hospital Emergencies

Authors: Nouha Mhimdi, Wahiba Ben Abdessalem Karaa, Henda Ben Ghezala

Abstract:

The methods identified in data collection are diverse: electronic media, focus group interviews and short-answer questionnaires [1]. The collection of poor-quality data resulting, for example, from poorly designed questionnaires, the absence of good translators or interpreters, and the incorrect recording of data can lead to conclusions that are not supported by the data or that focus only on the average effect of the program or policy. There are several solutions to avoid or minimize the most frequent errors, including obtaining expert advice on the design or adaptation of data collection instruments, or using technologies that allow better "anonymity" in the responses [2]. In this context, we opted to collect good-quality data by conducting a sizeable questionnaire-based survey on hospital emergencies to improve emergency services and alleviate the problems encountered. In this paper, we present our study and detail the steps followed to collect relevant, consistent and practical data.

Keywords: data collection, survey, questionnaire, database, data analysis, hospital emergencies

Procedia PDF Downloads 102
25353 Diagnostic Performance of Tumor Associated Trypsin Inhibitor in Early Detection of Hepatocellular Carcinoma in Patients with Hepatitis C Virus

Authors: Aml M. El-Sharkawy, Hossam M. Darwesh

Abstract:

Background/Aim: Hepatocellular carcinoma (HCC) is often diagnosed at an advanced stage, where effective therapies are lacking. A new scoring system is needed to discriminate HCC patients from those with chronic liver disease. Based on the link between tumor associated trypsin inhibitor (TATI) and HCC progression, we aimed to develop a novel score based on a combination of TATI and routine laboratory tests for early prediction of HCC. Methods: TATI was assayed for the HCC group (123), liver cirrhosis group (210) and control group (50) by Enzyme Linked Immunosorbent Assay (ELISA). Data from all groups were retrospectively analyzed, including α-fetoprotein (AFP), international normalized ratio (INR), albumin and platelet count, transaminases, and age. Areas under the ROC curve were used to develop the score. Results: A novel index named the hepatocellular carcinoma-tumor associated trypsin inhibitor score (HCC-TATI score) was developed: HCC-TATI score = 3.1 (numerical constant) + 0.09 × AFP (U L^-1) + 0.067 × TATI (ng ml^-1) + 0.16 × INR - 1.17 × Albumin (g l^-1) - 0.032 × Platelet count (× 10^9 l^-1). The HCC-TATI score produced an area under the ROC curve of 0.98 for discriminating HCC patients from liver cirrhosis, with a sensitivity of 91% and a specificity of 82% at a cut-off of 6.5 (i.e., less than 6.5 considered cirrhosis and greater than 6.5 considered HCC). Conclusion: The hepatocellular carcinoma-TATI score could replace AFP in HCC screening and follow-up of cirrhotic patients.

Keywords: Hepatocellular carcinoma, cirrhosis, HCV, diagnosis, TATI

Procedia PDF Downloads 331
25352 The Physicochemical Properties of Two Rivers in Eastern Cape South Africa as Relates to Vibrio Spp Density

Authors: Oluwatayo Abioye, Anthony Okoh

Abstract:

In the past few decades, humans have experienced outbreaks of infections caused by pathogenic Vibrio spp, which are commonly found in the aquatic milieu. Aside from the well-known Vibrio cholerae, the discovery of other pathogens in this genus has been on the increase. While the dynamics of occurrence and distribution of Vibrio spp have been linked to some physicochemical parameters in salt water, data in relation to fresh water are limited. Hence, two rivers of importance in the Eastern Cape, South Africa were selected for this study. In all, eleven sampling sites were systematically identified, and relevant physicochemical parameters, as well as Vibrio spp density, were determined over a period of six months using standard instruments and methods. Results were statistically analysed to determine the key physicochemical parameters that influence the density of Vibrio spp in the selected rivers. Results: The density of Vibrio spp across the sampling points ranged from < 1 CFU/mL to 174 x 10-2 CFU/mL. The physicochemical parameters of some of the sampling points were above the recommended standards. The regression analysis showed that Vibrio density in the selected rivers depends on a complex relationship between various physicochemical parameters. Conclusion: This study suggests that Vibrio spp density in fresh water does not depend only on temperature and salinity, as suggested by earlier studies on salt water, but rather on a complex relationship between several physicochemical parameters.

Keywords: vibrio density, physicochemical properties, pathogen, aquatic milieu

Procedia PDF Downloads 248
25351 Federated Learning in Healthcare

Authors: Ananya Gangavarapu

Abstract:

Convolutional Neural Network (CNN) based models are providing diagnostic capabilities on par with medical specialists in many specialty areas. However, collecting medical data for training purposes is very challenging because of increased regulation of data collection and privacy concerns around personal health data. Gathering the data becomes even more difficult if the capture devices are edge-based mobile devices (like smartphones) with feeble wireless connectivity in rural or remote areas. In this paper, I highlight the Federated Learning approach as a way to mitigate data privacy and security issues.
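The abstract does not say which federated algorithm is used. As one common illustration of the idea, the sketch below shows the aggregation step of federated averaging: each client trains locally and sends only model parameters, and the server combines them weighted by local sample counts. All names and numbers are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Combine per-client parameter vectors into one global vector.

    client_weights: list of equal-length lists of floats (one per client).
    client_sizes:   number of local training samples held by each client.
    Raw patient data never leaves the client; only parameters are shared.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Two hypothetical clinics with different amounts of local data.
clinic_a = [1.0, 2.0]
clinic_b = [3.0, 4.0]
global_weights = federated_average([clinic_a, clinic_b], [1, 3])
print(global_weights)
```

Weighting by sample count lets a client with more data pull the global model further toward its local solution, which is the usual design choice when clients hold very unequal amounts of data.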

Keywords: deep learning in healthcare, data privacy, federated learning, training in distributed environment

Procedia PDF Downloads 134
25350 The Utilization of Big Data in Knowledge Management Creation

Authors: Daniel Brian Thompson, Subarmaniam Kannan

Abstract:

The volume of knowledge in the world and within the repositories of organizations has already reached immense proportions and is constantly increasing. To cope with this growth, Big Data implementations and algorithms are used to obtain new or enhanced knowledge for decision-making. The transition from data to knowledge brings transformational changes that provide tangible benefits to those implementing these practices. Today, organizations derive knowledge from observations and intuitions, and this information or data is translated into best practices for knowledge acquisition, generation and sharing. The main intention behind the widespread use of Big Data is to provide information that has been cleaned and analyzed, yielding tangible insights that an organization can apply to its knowledge-creation practices based on facts and figures. The translation of data into knowledge generates value for an organization, allowing it to make firm decisions about adopting best practices. Without a strong foundation of knowledge and Big Data, businesses are not able to grow and improve within the competitive environment.

Keywords: big data, knowledge management, data driven, knowledge creation

Procedia PDF Downloads 104
25349 Survey on Data Security Issues Through Cloud Computing Amongst Sme’s in Nairobi County, Kenya

Authors: Masese Chuma Benard, Martin Onsiro Ronald

Abstract:

Businesses have been using cloud computing more frequently in recent years because they wish to benefit from its advantages. However, employing cloud computing also introduces new security concerns, particularly with regard to data security, potential risks and weaknesses that could be exploited by attackers, and the tactics and strategies that could be used to lessen these risks. This study examines data security issues in cloud computing amongst SMEs in Nairobi County, Kenya. The study used a sample size of 48, and the research approach was mixed methods. The findings show that because the data owner has no control over the cloud merchant's data management procedures, there is no way to ensure that data is handled legally. This implies that the data owner loses control over the data stored in the cloud. Data and information stored in the cloud may face a range of availability issues due to internet outages; this can represent a significant risk to data kept in shared clouds. Integrity, availability, and secrecy are all mentioned.

Keywords: data security, cloud computing, information, information security, small and medium-sized firms (SMEs)

Procedia PDF Downloads 79
25348 Cloud Design for Storing Large Amount of Data

Authors: M. Strémy, P. Závacký, P. Cuninka, M. Juhás

Abstract:

The main goal of this paper is to introduce our design of a private cloud for storing large amounts of data, especially pictures, and to provide a good technological backend for data analysis based on parallel processing and business intelligence. We have tested hypervisors, cloud management tools, storage for storing all data, and Hadoop to provide data analysis on unstructured data. High availability, virtual network management, logical separation of projects and rapid deployment of physical servers to our environment were also needed.

Keywords: cloud, glusterfs, hadoop, juju, kvm, maas, openstack, virtualization

Procedia PDF Downloads 348
25347 Chemical Synthesis of a cDNA and Its Expression Analysis

Authors: Salman Akrokayan

Abstract:

Synthetic cDNA (ScDNA) of granulocyte colony-stimulating factor (G-CSF) was constructed using a DNA synthesizer with the aim of increasing its expression level. The 5' end of the G-CSF coding region of the ScDNA was modified by decreasing the GC content without altering the predicted amino acid sequence. The identity of the resulting protein from ScDNA was confirmed by the highly specific enzyme-linked immunosorbent assay. In conclusion, a synthetic G-CSF cDNA in combination with the recombinant DNA protocol offers a rapid and reliable strategy for synthesizing the target protein. However, the commercial utilization of this methodology requires rigorous validation and quality control.

Keywords: synthetic cDNA, recombinant G-CSF, cloning, gene expression

Procedia PDF Downloads 276
25346 Estimation of Missing Values in Aggregate Level Spatial Data

Authors: Amitha Puranik, V. S. Binu, Seena Biju

Abstract:

Missing data is a common problem in spatial analysis, especially at the aggregate level. Missing values can occur in the covariates, in the response variable, or in both at a given location. Many missing data techniques are available to estimate missing values, but not all of these methods can be applied to spatial data, since the data are autocorrelated. Hence there is a need to develop a method that estimates the missing values in both the response variable and the covariates in spatial data by taking the spatial autocorrelation into account. The present study aims to develop a model to estimate the missing data points at the aggregate level in spatial data by accounting for (a) spatial autocorrelation of the response variable, (b) spatial autocorrelation of covariates and (c) correlation between covariates and the response variable. Estimating the missing values of spatial data requires a model that explicitly accounts for the spatial autocorrelation. The proposed model not only accounts for spatial autocorrelation but also utilizes the correlation that exists between covariates, within covariates and between a response variable and covariates. The precise estimation of the missing data points in spatial data will result in increased precision of the estimated effects of independent variables on the response variable in spatial regression analysis.

Keywords: spatial regression, missing data estimation, spatial autocorrelation, simulation analysis

Procedia PDF Downloads 371
25345 Association Rules Mining and NOSQL Oriented Document in Big Data

Authors: Sarra Senhadji, Imene Benzeguimi, Zohra Yagoub

Abstract:

Big Data refers to recent technology for manipulating voluminous and unstructured data sets over multiple sources. NoSQL emerged to handle the problem of unstructured data. Association rules mining is one of the popular data mining techniques for extracting hidden relationships from transactional databases. The problem of finding association dependencies is well solved with MapReduce. The goal of our work is to reduce the time needed to generate frequent itemsets by using MapReduce and a document-oriented NoSQL database. A comparative study is given to evaluate the performance of our algorithm against the classical Apriori algorithm.
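To make the idea of counting frequent itemsets in a map/reduce style concrete, here is a minimal single-process sketch. It is not the authors' algorithm and it omits Apriori's level-by-level candidate pruning as well as any Hadoop or MongoDB integration; the function names and the transactions are invented for the example.

```python
from collections import Counter
from itertools import combinations


def map_phase(transaction, k):
    """Emit (itemset, 1) for every k-item subset of one transaction."""
    items = sorted(set(transaction))
    return [(itemset, 1) for itemset in combinations(items, k)]


def reduce_phase(pairs):
    """Sum the emitted counts per itemset."""
    counts = Counter()
    for itemset, n in pairs:
        counts[itemset] += n
    return counts


def frequent_itemsets(transactions, k, min_support):
    """Return the k-itemsets that occur in at least min_support transactions."""
    pairs = [pair for t in transactions for pair in map_phase(t, k)]
    counts = reduce_phase(pairs)
    return {s: c for s, c in counts.items() if c >= min_support}


# Hypothetical market-basket transactions.
baskets = [
    ["bread", "milk"],
    ["bread", "milk", "eggs"],
    ["milk", "eggs"],
    ["bread", "milk"],
]
result = frequent_itemsets(baskets, k=2, min_support=2)
print(result)
```

In a real MapReduce job the map calls would run in parallel across data partitions and the framework would group the emitted keys before the reduce step; here both phases run sequentially in one process.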

Keywords: Apriori, Association rules mining, Big Data, Data Mining, Hadoop, MapReduce, MongoDB, NoSQL

Procedia PDF Downloads 154
25344 Using Google Distance Matrix Application Programming Interface to Reveal and Handle Urban Road Congestion Hot Spots: A Case Study from Budapest

Authors: Peter Baji

Abstract:

In recent years, a growing body of literature has emphasized the increasingly negative impacts of urban road congestion on the everyday life of citizens. Although there are different responses from the public sector to decrease traffic congestion in urban regions, the most effective public intervention is using congestion charges. Because travel is an economic asset, its consumption can be controlled effectively by extra taxes or prices, but this demand-side intervention is often unpopular. Measuring traffic flows with the help of different methods has a long history in transport sciences, but until recently, there was not sufficient data for evaluating road traffic flow patterns on the scale of an entire road system of a larger urban area. European cities (e.g., London, Stockholm, Milan) in which congestion charges have already been introduced designated a particular zone in their downtown for paying, but this protects only the users and inhabitants of the CBD (Central Business District) area. Through the use of Google Maps data as a resource for revealing urban road traffic flow patterns, this paper aims to provide a solution for a fairer and smarter congestion pricing method in cities. The case study area of the research contains three bordering districts of Budapest which are linked by one main road. The first district (5th) is the original downtown that is affected by the congestion charge plans of the city. The second district (13th) lies in the transition zone, and it has recently been transformed into a new CBD containing the biggest office zone in Budapest. The third district (4th) is a mainly residential type of area on the outskirts of the city. The raw data of the research was collected with the help of Google’s Distance Matrix API (Application Programming Interface), which provides future estimated traffic data via travel times between freely fixed coordinate pairs.
From the difference between free-flow and congested travel time data, the daily congestion patterns and hot spots are detectable on all measured roads within the area. The results suggest that the distribution of congestion peak times and hot spots is uneven in the examined area; however, there are frequently congested areas which lie outside the downtown, and their inhabitants also need some protection. The conclusion of this case study is that cities can develop a real-time and place-based congestion charge system that forces car users to avoid frequently congested roads by changing their routes or travel modes. This would be a fairer solution for decreasing the negative environmental effects of urban road transportation than protecting a very limited downtown area.
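The detection step described in this abstract, comparing free-flow and congested travel times, can be sketched as follows. The travel times here are invented inputs; the sketch makes no calls to the Distance Matrix API, and the segment names and the 0.5 delay-ratio threshold are assumptions chosen for the example rather than values from the study.

```python
def congestion_delay(free_flow_s, congested_s):
    """Extra travel time in seconds caused by congestion on one segment."""
    return congested_s - free_flow_s


def hot_spots(segments, min_delay_ratio=0.5):
    """Flag segments whose peak delay is at least min_delay_ratio of free-flow time.

    segments: list of (name, free_flow_seconds, {hour: observed_seconds}).
    Returns a list of (name, peak_hour, delay_ratio).
    """
    flagged = []
    for name, free_flow_s, observed in segments:
        # Find the hour of day with the longest observed travel time.
        peak_hour, peak_time = max(observed.items(), key=lambda kv: kv[1])
        ratio = congestion_delay(free_flow_s, peak_time) / free_flow_s
        if ratio >= min_delay_ratio:
            flagged.append((name, peak_hour, round(ratio, 2)))
    return flagged


# Hypothetical travel times (seconds) for two road segments at three hours of day.
segments = [
    ("segment_A", 120, {8: 240, 13: 150, 17: 200}),
    ("segment_B", 100, {8: 110, 13: 105, 17: 120}),
]
flagged = hot_spots(segments)
print(flagged)
```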

Keywords: Budapest, congestion charge, distance matrix API, application programming interface, pilot study

Procedia PDF Downloads 191
25343 Immunization-Data-Quality in Public Health Facilities in the Pastoralist Communities: A Comparative Study Evidence from Afar and Somali Regional States, Ethiopia

Authors: Melaku Tsehay

Abstract:

The Consortium of Christian Relief and Development Associations (CCRDA) and the CORE Group Polio Partners (CGPP) Secretariat have been working with the Global Alliance for Vaccines and Immunization (GAVI) to improve immunization data quality in Afar and Somali Regional States. The main aim of this study was to compare the quality of immunization data before and after the above interventions in health facilities in the pastoralist communities in Ethiopia. To this end, a comparative cross-sectional study was conducted on 51 health facilities. The baseline data was collected in May 2019, and the end-line data in August 2021. The WHO data quality self-assessment tool (DQS) was used to collect data. A significant improvement was seen in the accuracy of the pentavalent vaccine (PT)1 (p = 0.012) data at the health posts (HP), and of PT3 (p = 0.010) and Measles (p = 0.020) data at the health centers (HC). In addition, a highly significant improvement was observed in the accuracy of tetanus toxoid (TT)2 data at HP (p < 0.001). The level of over- or under-reporting was found to be < 8% at the HP and < 10% at the HC for PT3. Data completeness also increased from 72.09% to 88.89% at the HC. Nearly 74% of the health facilities reported their respective immunization data on time, which is much better than the baseline (7.1%) (p < 0.001). These findings may provide some hints for policies and programs targeting the improvement of immunization data quality in the pastoralist communities.
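The keywords mention a verification factor, and the abstract reports levels of over- or under-reporting. As a rough sketch, assuming the verification factor is the ratio of doses recounted from source registers to doses reported upward (the abstract itself does not give the formula), the calculation might look like this. The function names and dose counts are hypothetical.

```python
def verification_factor(recounted_doses, reported_doses):
    """Ratio of doses recounted at the facility to doses reported upward.

    Assumed definition: a value of 1.0 means the report matches the register.
    """
    if reported_doses == 0:
        raise ValueError("reported_doses must be non-zero")
    return recounted_doses / reported_doses


def reporting_discrepancy_pct(recounted_doses, reported_doses):
    """Absolute deviation of the verification factor from 1, in percent."""
    return abs(1.0 - verification_factor(recounted_doses, reported_doses)) * 100.0


def reporting_direction(recounted_doses, reported_doses):
    """Say whether the report overstates or understates the recount."""
    if reported_doses > recounted_doses:
        return "over-reporting"
    if reported_doses < recounted_doses:
        return "under-reporting"
    return "consistent"


# Hypothetical facility: 92 doses found in the register, 100 doses reported.
print(verification_factor(92, 100),
      reporting_discrepancy_pct(92, 100),
      reporting_direction(92, 100))
```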

Keywords: data quality, immunization, verification factor, pastoralist region

Procedia PDF Downloads 101
25342 From Biowaste to Biobased Products: Life Cycle Assessment of VALUEWASTE Solution

Authors: Andrés Lara Guillén, José M. Soriano Disla, Gemma Castejón Martínez, David Fernández-Gutiérrez

Abstract:

The worldwide population is increasing exponentially, which causes a rising demand for food, energy and non-renewable resources. These demands must be addressed from a circular economy point of view. Under this approach, obtaining strategic products from biowaste is crucial for society to maintain its current lifestyle while reducing the environmental and social issues linked to the linear economy. This is the main objective of the VALUEWASTE project. VALUEWASTE is about valorizing urban biowaste into proteins for food and feed and into biofertilizers, closing the loop of this waste stream. In order to achieve this objective, the project validates three value chains, which begin with the anaerobic digestion of the biowaste. From the anaerobic digestion, three by-products are obtained: i) methane that is used by microorganisms, which will be transformed into microbial proteins; ii) digestate that is used by black soldier fly, producing insect proteins; and iii) a nutrient-rich effluent, which will be transformed into biofertilizers. VALUEWASTE is an innovative solution that combines different technologies to valorize the biowaste entirely. However, it is also necessary to demonstrate that the solution is greener than traditional technologies (baseline systems). On one hand, the proteins from microorganisms and insects will be compared with other reference protein production systems (gluten, whey and soybean). On the other hand, the biofertilizers will be compared to the production of mineral fertilizers (ammonium sulphate and synthetic struvite). Therefore, the aim of this study is to demonstrate that biowaste valorization can reduce the environmental impacts linked to both traditional protein manufacturing processes and mineral fertilizers, not only at pilot scale but also at industrial scale. In the present study, both the baseline systems and the VALUEWASTE solution are evaluated through the Environmental Life Cycle Assessment (E-LCA).
The E-LCA is based on the standards ISO 14040 and 14044. The Environmental Footprint methodology was used in this study to evaluate the environmental impacts. The results for the baseline cases show that the food proteins coming from whey have the highest environmental impact on ecosystems compared to the other protein sources: 7.5-fold and 15.9-fold higher than soybean and gluten, respectively. Comparing feed soybean and gluten, soybean has an environmental impact on human health that is 195.1-fold higher. In the case of biofertilizers, synthetic struvite has higher impacts than ammonium sulphate: 15.3-fold (ecosystems) and 11.8-fold (human health). The results shown in the present study will be used as a reference to demonstrate the better environmental performance of the bio-based products obtained through the VALUEWASTE solution. A further contribution of the E-LCA performed in the VALUEWASTE project is its direct implications for investment and policy. On one hand, better environmental performance will help to remove the barriers linked to these kinds of technologies, boosting investment that is backed by the E-LCA. On the other hand, it will be a seed for designing new policies fostering these types of solutions to achieve two of the key targets of the European Community: being self-sustainable and carbon neutral.

Keywords: anaerobic digestion, biofertilizers, circular economy, nutrients recovery

Procedia PDF Downloads 84
25341 Identifying Critical Success Factors for Data Quality Management through a Delphi Study

Authors: Maria Paula Santos, Ana Lucas

Abstract:

Organizations support their operations and decision making on the data they have at their disposal, so the quality of these data is remarkably important and Data Quality (DQ) is currently a relevant issue, the literature being unanimous in pointing out that poor DQ can result in large costs for organizations. The literature review identified and described 24 Critical Success Factors (CSF) for Data Quality Management (DQM) that were presented to a panel of experts, who ordered them according to their degree of importance, using the Delphi method with the Q-sort technique, based on an online questionnaire. The study shows that the five most important CSF for DQM are: definition of appropriate policies and standards, control of inputs, definition of a strategic plan for DQ, organizational culture focused on quality of the data and obtaining top management commitment and support.

Keywords: critical success factors, data quality, data quality management, Delphi, Q-Sort

Procedia PDF Downloads 208
25340 Ensemble Methods in Machine Learning: An Algorithmic Approach to Derive Distinctive Behaviors of Criminal Activity Applied to the Poaching Domain

Authors: Zachary Blanks, Solomon Sonya

Abstract:

Poaching presents a serious threat to endangered animal species, environmental conservation, and human life. Additionally, some poaching activity has even been linked to supplying funds to support terrorist networks elsewhere around the world. Consequently, agencies dedicated to protecting wildlife habitats have a near-intractable task of adequately patrolling an entire area (spanning several thousand kilometers) given the limited resources, funds, and personnel at their disposal. Thus, agencies need predictive tools that are both high-performing and easily implementable by the user to help in learning how the significant features (e.g., animal population densities, topography, behavior patterns of the criminals within the area, etc.) interact with each other in hopes of abating poaching. This research develops a classification model using machine learning algorithms to aid in forecasting future attacks that is both easy to train and performs well when compared to other models. In this research, we demonstrate how data imputation methods (specifically predictive mean matching, gradient boosting, and random forest multiple imputation) can be applied to analyze data and create significant predictions across a varied data set. Specifically, we apply these methods to improve the accuracy of adopted prediction models (Logistic Regression, Support Vector Machine, etc.). Finally, we assess the performance of the model and the accuracy of our data imputation methods by learning on a real-world data set constituting four years of imputed data and testing on one year of non-imputed data. This paper provides three main contributions. First, we extend work done by the Teamcore and CREATE (Center for Risk and Economic Analysis of Terrorism Events) research group at the University of Southern California (USC) working in conjunction with the Department of Homeland Security to apply game theory and machine learning algorithms to develop more efficient ways of reducing poaching.
This research introduces ensemble methods (Random Forests and Stochastic Gradient Boosting) and applies them to real-world poaching data gathered from the Ugandan rain forest park rangers. Next, we consider the effect of data imputation on both the performance of various algorithms and the general accuracy of the method itself when applied to a dependent variable where a large number of observations are missing. Third, we provide an alternate approach to predict the probability of observing poaching both by season and by month. The results from this research are very promising. We conclude that by using Stochastic Gradient Boosting to predict observations for non-commercial poaching by season, we are able to produce statistically equivalent results while being orders of magnitude faster in computation time and complexity. Additionally, when predicting potential poaching incidents by individual month versus entire seasons, boosting techniques produce a mean area under the curve increase of approximately 3% relative to previous prediction schedules by entire seasons.
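The comparison in this abstract is reported as a change in area under the curve. As a reminder of what that metric measures, the sketch below computes the area under the ROC curve directly from its pairwise definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half. The labels and scores are invented, and the sketch does not reproduce the paper's models or data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: 1 for a positive case (e.g. poaching observed), 0 otherwise.
    scores: predicted probabilities or any ranking score.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and model scores.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
auc = roc_auc(labels, scores)
print(auc)
```

The pairwise form is quadratic in the number of cases, which is fine for small illustrations; larger data sets would normally use a rank-based computation.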

Keywords: ensemble methods, imputation, machine learning, random forests, statistical analysis, stochastic gradient boosting, wildlife protection

Procedia PDF Downloads 286
25339 Chikungunya Virus Infection among Patients with Febrile Illness Attending University of Maiduguri Teaching Hospital, Nigeria

Authors: Abdul-Dahiru El-Yuguda, Saka Saheed Baba, Tawa Monilade Adisa, Mustapha Bala Abubakar

Abstract:

Background: Chikungunya (CHIK) virus, a previously anecdotally described arbovirus, is now assuming a worldwide public health burden. CHIK virus infection is characterized by potentially life-threatening and debilitating arthritis in addition to high fever, arthralgia, myalgia, headache and rash. Method: Three hundred and seventy (370) serum samples were collected from outpatients with febrile illness attending University of Maiduguri Teaching Hospital, Nigeria, and were tested for Chikungunya (CHIK) virus IgG and IgM antibodies using Enzyme Linked Immunosorbent Assays (ELISAs). Result: Out of the 370 sera tested, 39 (10.5%) were positive for the presence of CHIK virus antibodies. A total of 24 (6.5%) tested positive for CHIK virus IgM only, none (0.0%) was positive for CHIK virus IgG only, and 15 (4.1%) of the serum samples were positive for both IgG and IgM antibodies. A significant difference (p<0.0001) was observed in the distribution of CHIK virus antibodies in relation to gender. Males had an IgM antibody prevalence of 8.5%, as against 4.6% observed in females. On the other hand, 4.6% of the females were positive for concurrent CHIK virus IgG and IgM antibodies, compared with a prevalence of 3.4% observed in males. Only the age groups ≤ 60 years and the undisclosed age group were positive for the presence of CHIK virus IgG and/or IgM antibodies. No significant difference (p>0.05) was observed in the seasonal prevalence of CHIK virus antibodies among the study subjects. Analysis of the prevalence of CHIK virus antibodies in relation to the clinical presentation of the patients (as observed by clinicians) revealed that headache and fever were the most frequently encountered ailments. Conclusion: The CHIK virus IgM and concurrent IgM and IgG antibody prevalence rates of 6.5% and 4.1% observed in this study indicate a current infection, and the absence of IgG antibody alone shows that the infection is not endemic but sporadic. 
Recommendation: Further studies should be carried out to establish the seasonal prevalence of CHIK virus infection vis-à-vis vector dynamics in the study area. A comprehensive study needs to be carried out on the molecular characterization of the CHIK virus circulating in Nigeria with a view to developing a CHIK virus vaccine.
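
The headline prevalence figures in this abstract can be checked with simple arithmetic. The sketch below recomputes them from the reported counts (370 sera; 24 IgM only, 0 IgG only, 15 both); the variable and function names are illustrative only.

```python
# Counts reported in the abstract.
TOTAL_SERA = 370
counts = {"IgM only": 24, "IgG only": 0, "IgG and IgM": 15}


def prevalence_pct(n, total=TOTAL_SERA):
    """Prevalence as a percentage of all sera tested, to one decimal place."""
    return round(100.0 * n / total, 1)


overall = sum(counts.values())          # sera positive for any CHIK antibody
overall_rate = prevalence_pct(overall)
rates = {name: prevalence_pct(n) for name, n in counts.items()}

print(f"Any antibody: {overall}/{TOTAL_SERA} = {overall_rate}%")
for name, rate in rates.items():
    print(f"{name}: {counts[name]}/{TOTAL_SERA} = {rate}%")
```

The three categories sum to the 39 positive sera, and the rounded percentages match the 10.5%, 6.5%, 0.0% and 4.1% stated in the abstract.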

Keywords: Chikungunya virus, IgM and IgG antibodies, febrile patients, enzyme linked immunosorbent assay

Procedia PDF Downloads 382
25338 Genomic Prediction Reliability Using Haplotypes Defined by Different Methods

Authors: Sohyoung Won, Heebal Kim, Dajeong Lim

Abstract:

Genomic prediction is an effective way to measure the abilities of livestock for breeding based on genomic estimated breeding values, which are statistically predicted from genotype data using best linear unbiased prediction (BLUP). Using haplotypes, clusters of linked single nucleotide polymorphisms (SNPs), as markers instead of individual SNPs can improve the reliability of genomic prediction, since the probability of a quantitative trait locus being in strong linkage disequilibrium (LD) with markers is higher. To use haplotypes efficiently in genomic prediction, optimal ways of defining haplotypes need to be found. In this study, 770K SNP chip data were collected from a Hanwoo (Korean cattle) population consisting of 2506 cattle. Haplotypes were first defined in three different ways using the 770K SNP chip data: based on 1) the length of haplotypes (bp), 2) the number of SNPs, and 3) k-medoids clustering by LD. To compare the methods in parallel, haplotypes defined by all methods were set to have comparable sizes; in each method, haplotypes defined to have an average of 5, 10, 20 or 50 SNPs were tested. A modified GBLUP method using haplotype alleles as predictor variables was implemented to test the prediction reliability of each haplotype set. The conventional genomic BLUP (GBLUP) method, which uses individual SNPs, was also tested to evaluate the performance of the haplotype sets in genomic prediction. Carcass weight was used as the phenotype for testing. As a result, haplotypes defined by all three methods showed increased reliability compared to conventional GBLUP. There were few differences in reliability between the haplotype-defining methods. The reliability of genomic prediction was highest when the average number of SNPs per haplotype was 20 in all three methods, implying that haplotypes including around 20 SNPs can be optimal as markers for genomic prediction. 
When the number of alleles generated by each haplotype-defining method was compared, clustering by LD generated the fewest alleles. Using haplotype alleles for genomic prediction showed better performance, suggesting improved accuracy in genomic selection. The number of predictor variables decreased when the LD-based method was used, while all three haplotype-defining methods showed similar performance. This suggests that defining haplotypes based on LD can reduce computational costs and allow efficient prediction. Finding optimal ways to define haplotypes and using the haplotype alleles as markers can provide improved performance and efficiency in genomic prediction.
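
To make the idea of haplotype alleles as predictor variables more concrete, here is a minimal sketch of the simplest of the three definitions named in this abstract (a fixed number of SNPs per haplotype). It is not the authors' pipeline: the function names and the tiny phased data set are hypothetical, and the LD-based k-medoids definition is not shown.

```python
def blocks_by_snp_count(n_snps, block_size):
    """Split SNP indices 0..n_snps-1 into consecutive (start, end) blocks."""
    return [(start, min(start + block_size, n_snps))
            for start in range(0, n_snps, block_size)]


def haplotype_alleles(haplotypes, block):
    """Distinct haplotype alleles (SNP allele tuples) observed in one block."""
    start, end = block
    return {tuple(h[start:end]) for h in haplotypes}


# Hypothetical phased haplotypes: 4 chromosomes x 6 biallelic SNPs (0/1).
haplotypes = [
    [0, 0, 1, 1, 0, 1],
    [0, 0, 1, 0, 0, 1],
    [1, 0, 1, 1, 0, 1],
    [0, 0, 1, 1, 0, 1],
]

blocks = blocks_by_snp_count(n_snps=6, block_size=3)
allele_counts = [len(haplotype_alleles(haplotypes, b)) for b in blocks]
print(blocks, allele_counts)
```

Each distinct allele in a block would become one predictor column, so the total of `allele_counts` gives the number of predictor variables, which is the quantity compared across methods in the abstract.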

Keywords: best linear unbiased predictor, genomic prediction, haplotype, linkage disequilibrium

Procedia PDF Downloads 137
25337 Bringing Together Student Collaboration and Research Opportunities to Promote Scientific Understanding and Outreach Through a Seismological Community

Authors: Michael Ray Brunt

Abstract:

China has been the site of some of the most significant earthquakes in history; however, earthquake monitoring has long been the province of universities and research institutions. The China Digital Seismographic Network (CDSN) was initiated in 1983 and improved significantly during 1992-1993. Data from the CDSN is widely used by government and research institutions, but it is generally not readily accessible to middle and high school students. An educational seismic network in China is needed to provide collaboration and research opportunities for students and to engage students around the country in the scientific understanding of earthquake hazards and risks while promoting community awareness. In 2022, the Tsinghua International School (THIS) Seismology Team, made up of enthusiastic students and facilitated by two experienced teachers, was established. As a group, the team’s objective is to install seismographs in schools throughout China, thus creating an educational seismic network that shares data from the THIS Educational Seismic Network (THIS-ESN) and facilitates collaboration. The THIS-ESN initiative will enhance education and outreach in China about earthquake risks and hazards, introduce seismology to a wider audience, stimulate interest in research among students, and develop students’ programming, data collection and analysis skills. It will also encourage and inspire young minds to pursue science, technology, engineering, the arts, and math (STEAM) career fields. The THIS-ESN utilizes small, low-cost RaspberryShake seismographs as a powerful tool linked into a global network, giving schools and the public access to real-time seismic data from across China, increasing earthquake monitoring capabilities in the respective areas, and adding to the data sets available regionally and worldwide, helping to create a denser seismic network. 
The RaspberryShake seismograph is compatible with free seismic data viewing platforms such as SWARM, and the RaspberryShake web programs and mobile apps are designed specifically for teaching seismology and seismic data interpretation, providing opportunities to enhance understanding. The RaspberryShake is powered by an operating system embedded in the Raspberry Pi, which makes it an easy platform for teaching students basic computer communication concepts by utilizing processing tools to investigate, plot, and manipulate data. The THIS Seismology Team believes strongly in creating opportunities for committed students to become part of the seismological community by engaging in analysis of real-time scientific data with tangible outcomes. Students will feel proud of the important work they are doing to understand the world around them and become advocates spreading their knowledge back into their homes and communities, helping to improve overall community resilience. We trust that, in studying the results that seismograph stations yield, students will grasp how subjects like physics and computer science apply in real life. By spreading information, we also hope that students across the country can appreciate how and why earthquakes bear on their lives, develop practical skills in STEAM, and engage in the global seismic monitoring effort. By providing such an opportunity to schools across the country, we are confident that we will be an agent of change for society.

Keywords: collaboration, outreach, education, seismology, earthquakes, public awareness, research opportunities

Procedia PDF Downloads 68
25336 Data Mining in Medicine Domain Using Decision Trees and Vector Support Machine

Authors: Djamila Benhaddouche, Abdelkader Benyettou

Abstract:

In this paper, we use data mining to extract biomedical knowledge. In general, complex biomedical data collected in population studies are treated with statistical methods; although these are robust, they are not sufficient in themselves to harness the potential wealth of the data. For this purpose, we used two learning algorithms: Decision Trees and the Support Vector Machine (SVM). These supervised classification methods are used to diagnose thyroid disease. In this context, we propose to promote the study and use of symbolic data mining techniques.

Keywords: biomedical data, learning, classifier, algorithms decision tree, knowledge extraction

Procedia PDF Downloads 549
25335 Analysis of Different Classification Techniques Using WEKA for Diabetic Disease

Authors: Usama Ahmed

Abstract:

Data mining is the process of analyzing data to extract useful predictive information. It is a field of research that addresses various types of problems. In data mining, classification is an important technique for categorizing different kinds of data. Diabetes is one of the most common diseases. This paper implements different classification techniques using the Waikato Environment for Knowledge Analysis (WEKA) on a diabetes dataset and identifies which algorithm is most suitable. The best classification algorithm on the diabetic data is Naïve Bayes, which achieves an accuracy of 76.31% and takes 0.06 seconds to build the model.

Keywords: data mining, classification, diabetes, WEKA

Procedia PDF Downloads 138
25334 Assessment of Reservoir Quality and Heterogeneity in Middle Buntsandstein Sandstones of Southern Netherlands for Deep Geothermal Exploration

Authors: Husnain Yousaf, Rudy Swennen, Hannes Claes, Muhammad Amjad

Abstract:

In recent years, the Lower Triassic Main Buntsandstein sandstones in the southern Netherlands Basins have become a point of interest for their deep geothermal potential. To identify the most suitable reservoir for geothermal exploration, the diagenesis and factors affecting reservoir quality, such as porosity and permeability, are assessed. This is done by combining point-counted petrographic data with conventional core analysis. The depositional environments play a significant role in determining the distribution of lithofacies, cement, clays, and grain sizes. The position in the basin and proximity to the source areas determine the lateral variability of depositional environments. The stratigraphic distribution of depositional environments is linked to both local topography and climate, where high humidity leads to fluvial deposition and periods of high aridity lead to aeolian deposition. The Middle Buntsandstein Sandstones in the southern part of the Netherlands show high porosity and permeability in most sandstone intervals. There are various controls on reservoir quality in the examined sandstone samples. Grain sizes and total quartz content are the primary factors affecting reservoir quality. Conversely, carbonate and anhydrite cement, clay clasts, and intergranular clay represent local controls and cannot be applied on a regional scale. Similarly, enhanced secondary porosity due to feldspar dissolution is locally restricted and minor. The analysis of textural, mineralogical, and petrophysical data indicates that the aeolian and fluvial sandstones represent a heterogeneous reservoir system. The ephemeral fluvial deposits have an average porosity and permeability of <10% and <1mD, respectively, while the aeolian sandstones exhibit values of >18% and >100mD.

Keywords: reservoir quality, diagenesis, porosity, permeability, depositional environments, Buntsandstein, Netherlands

Procedia PDF Downloads 58
25333 Comprehensive Study of Data Science

Authors: Asifa Amara, Prachi Singh, Kanishka, Debargho Pathak, Akshat Kumar, Jayakumar Eravelly

Abstract:

Today's generation is totally dependent on technology that uses data as its fuel. The present study is about innovations and developments in data science and gives an idea of how to use the available data efficiently. This study will help readers understand the core concepts of data science. The concept of artificial intelligence was introduced by Alan Turing; its main principle was to create an artificial system that can run independently of human-given programs and can function by analyzing data to understand the requirements of its users. Data science comprises business understanding, data analysis, ethical concerns, an understanding of programming languages, various fields and sources of data, skills, etc. The usage of data science has evolved over the years. In this review article, we cover one part of data science, namely machine learning. Machine learning uses data science for its work. Machines learn through experience, which helps them to do any work more efficiently. This article includes a comparative study image between human understanding and machine understanding, advantages, applications, and real-time examples of machine learning. Data science is an important game changer in the life of human beings. Since the advent of data science, we have found its benefits and how it leads to a better understanding of people, and how it cherishes individual needs. It has improved business strategies, services provided by them, forecasting, the ability to attend sustainable developments, etc. This study also focuses on a better understanding of data science which will help us to create a better world.

Keywords: data science, machine learning, data analytics, artificial intelligence

Procedia PDF Downloads 75
25332 Interannual Variations in Snowfall and Continuous Snow Cover Duration in Pelso, Central Finland, Linked to Teleconnection Patterns, 1944-2010

Authors: M. Irannezhad, E. H. N. Gashti, S. Mohammadighavam, M. Zarrini, B. Kløve

Abstract:

Climate warming would increase rainfall by shifting the form of precipitation from snow to rain, and would accelerate the disappearance of snow cover by increasing snowpack. Using temperature and precipitation data in a temperature-index snowmelt model, we evaluated the variability of snowfall and continuous snow cover duration (CSCD) during 1944-2010 over Pelso, central Finland. The Mann-Kendall non-parametric test showed that annual precipitation increased by 2.69 mm/year (p<0.05) during the study period, but found no clear trend in annual temperature. Annual rainfall and snowfall increased by 1.67 and 0.78 mm/year (p<0.05), respectively. CSCD was generally about 205 days, from 14 October to 6 May. No clear trend was found in CSCD over Pelso. Spearman’s rank correlation showed the most significant relationships between annual snowfall and the East Atlantic (EA) pattern, and between CSCD and the East Atlantic/West Russia (EA/WR) pattern. Increased precipitation without warming caused both rainfall and snowfall to increase, while having no effect on CSCD.
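
The trend detection mentioned in this abstract relies on the Mann-Kendall non-parametric test. The following is a minimal sketch of the textbook form of that test, assuming a series without tied values (no tie correction is applied) and using the normal approximation for the p-value. The function name and the toy series are hypothetical and are not the Pelso data.

```python
import math


def mann_kendall(series):
    """Mann-Kendall trend test (no tie correction, normal approximation).

    Returns the S statistic, the standardized Z score, and a two-sided p-value.
    """
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)   # sign of each pairwise difference

    var_s = n * (n - 1) * (2 * n + 5) / 18.0   # variance of S without ties
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)          # continuity correction
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return s, z, p_value


# Hypothetical, strictly increasing annual series (e.g. precipitation in mm).
toy_series = [500, 510, 515, 530, 540, 548, 560, 575, 580, 600]
s_stat, z_score, p = mann_kendall(toy_series)
print(s_stat, round(z_score, 3), p < 0.05)
```

The test only indicates whether a monotonic trend is present; a slope estimate such as the mm/year figures quoted above requires a separate estimator, which the abstract does not name.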

Keywords: variations, snowfall, snow cover duration, temperature-index snowmelt model, teleconnection patterns

Procedia PDF Downloads 220
25331 A Web and Cloud-Based Measurement System Analysis Tool for the Automotive Industry

Authors: C. A. Barros, Ana P. Barroso

Abstract:

Any industrial company needs to determine the amount of variation that exists within its measurement process and guarantee the reliability of its data by studying the performance of its measurement system in terms of linearity, bias, repeatability, reproducibility, and stability. This issue is critical for automotive industry suppliers, who are required to be certified to the IATF 16949:2016 standard of the International Automotive Task Force (which replaces ISO/TS 16949), defining the requirements of a quality management system for companies in the automotive industry. Measurement System Analysis (MSA) is one of the mandatory tools. Frequently, the measurement system in companies is not connected to the equipment and does not incorporate the methods proposed by the Automotive Industry Action Group (AIAG). To address these constraints, an R&D project is in progress, whose objective is to develop a web and cloud-based MSA tool. This MSA tool incorporates Industry 4.0 concepts, such as Internet of Things (IoT) protocols to assure the connection with the measuring equipment, cloud computing, artificial intelligence, statistical tools, and advanced mathematical algorithms. This paper presents the preliminary findings of the project. The web and cloud-based MSA tool is innovative because it implements all statistical tests proposed in the MSA-4 reference manual from AIAG as well as other emerging methods and techniques. As it is integrated with the measuring devices, it reduces the manual input of data and therefore the errors. The tool ensures traceability of all performed tests and can be used in quality laboratories and on production lines. In addition, it monitors MSAs over time, allowing both the analysis of deviations from the variation of the measurements performed and the management of measurement equipment and calibrations. To develop the MSA tool, a ten-step approach was implemented. 
First, a benchmarking analysis of current competitors and commercial solutions linked to MSA was performed with respect to the Industry 4.0 paradigm. Next, the size of the target market for the MSA tool was analysed. Afterwards, data flow and traceability requirements were analysed in order to implement an IoT data network that interconnects with the equipment, preferably wirelessly. The MSA web solution was designed under UI/UX principles, and an API in the Python language was developed to run the algorithms and the statistical analysis. Continuous validation of the tool by companies is being performed to assure real-time management of the ‘big data’. The main results of this R&D project are: a web and cloud-based MSA tool; a Python API; algorithms new to the market; and a UI/UX style guide for the tool. The proposed MSA tool adds value to the state of the art as it ensures an effective response to the new challenges of measurement systems, which are increasingly critical in production processes. Although the automotive industry has triggered the development of this innovative MSA tool, other industries would also benefit from it. Currently, companies from the molds and plastics, chemical, and food industries are already validating it.
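
The project's Python API is not shown in this abstract. Purely as an illustration of two of the measurement-system properties it names (bias and repeatability), the sketch below uses a simplified single-appraiser, single-part setting: bias is taken as the mean of repeated readings minus a reference value, and repeatability as the sample standard deviation of those readings. The function name and the readings are hypothetical, and this is not the AIAG MSA-4 procedure.

```python
import statistics


def bias_and_repeatability(measurements, reference):
    """Simplified gauge study for one part measured repeatedly by one appraiser.

    bias          = mean of the readings minus the reference value
    repeatability = sample standard deviation of the readings
    """
    mean_reading = statistics.fmean(measurements)
    bias = mean_reading - reference
    repeatability_sd = statistics.stdev(measurements)
    return bias, repeatability_sd


# Hypothetical repeated readings (mm) of a part with a 10.0 mm reference value.
readings = [10.1, 9.9, 10.2, 10.0, 10.3]
bias, repeatability_sd = bias_and_repeatability(readings, reference=10.0)
print(f"bias = {bias:.3f} mm, repeatability sd = {repeatability_sd:.3f} mm")
```

A full study would also cover reproducibility across appraisers, linearity across the measuring range, and stability over time.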

Keywords: automotive industry, Industry 4.0, Internet of Things, IATF 16949:2016, measurement system analysis

Procedia PDF Downloads 212
25330 Woody Carbon Stock Potentials and Factor Affecting Their Storage in Munessa Forest, Southern Ethiopia

Authors: Mojo Mengistu Gelasso

Abstract:

The tropical forest is considered the most important forest ecosystem for mitigating climate change by sequestering a high amount of carbon. The potential carbon stock of the forest can be influenced by many factors. Therefore, studying these factors is crucial for understanding the determinants that affect the potential for woody carbon storage in the forest. This study was conducted to evaluate the potential for woody carbon stock and how it varies based on plant community types, as well as along altitudinal, slope, and aspect gradients in the Munessa dry Afromontane forest. Vegetation data was collected using systematic sampling. Five line transects were established at 100 m intervals along the altitudinal gradient between two consecutive transect lines. On each transect, 10 quadrats (20 x 20 m), separated by 200 m, were established. The woody carbon was estimated using an appropriate allometric equation formulated for tropical forests. The data was analyzed using one-way ANOVA in R software. The results showed that the total woody carbon stock of the Munessa forest was 210.43 ton/ha. The analysis of variance revealed that woody carbon density varied significantly based on environmental factors, while community types had no significant effect. The highest mean carbon stock was found at middle altitudes (2367-2533 m.a.s.l), lower slopes (0-13%), and west-facing aspects. The Podocarpus falcatus-Croton macrostachyus community type also contributed a higher woody carbon stock, as larger tree size classes and older trees dominated it. Overall, the potential for woody carbon sequestration in this study was strongly associated with environmental variables. Additionally, the uneven distribution of species with larger diameter at breast height (DBH) in the study area might be linked to anthropogenic factors, as the current forest growth indicates characteristics of a secondary forest. 
Therefore, our study suggests that the development and implementation of a sustainable forest management plan is necessary to increase the carbon sequestration potential of this forest and mitigate climate change.

Keywords: Munessa forest, woody carbon stock, environmental factors, climate mitigation

Procedia PDF Downloads 67
25329 Woody Carbon Stock Potentials and Factor Affecting Their Storage in Munessa Forest, Southern Ethiopia

Authors: Mengistu Gelasso Mojo

Abstract:

The tropical forest is considered the most important forest ecosystem for mitigating climate change by sequestering a high amount of carbon. The potential carbon stock of the forest can be influenced by many factors. Therefore, studying these factors is crucial for understanding the determinants that affect the potential for woody carbon storage in the forest. This study was conducted to evaluate the potential for woody carbon stock and how it varies based on plant community types, as well as along altitudinal, slope, and aspect gradients in the Munessa dry Afromontane forest. Vegetation data was collected using systematic sampling. Five line transects were established at 100 m intervals along the altitudinal gradient between two consecutive transect lines. On each transect, 10 quadrats (20 x 20 m), separated by 200 m, were established. The woody carbon was estimated using an appropriate allometric equation formulated for tropical forests. The data was analyzed using one-way ANOVA in R software. The results showed that the total woody carbon stock of the Munessa forest was 210.43 ton/ha. The analysis of variance revealed that woody carbon density varied significantly based on environmental factors, while community types had no significant effect. The highest mean carbon stock was found at middle altitudes (2367-2533 m.a.s.l), lower slopes (0-13%), and west-facing aspects. The Podocarpus falcatus-Croton macrostachyus community type also contributed a higher woody carbon stock, as larger tree size classes and older trees dominated it. Overall, the potential for woody carbon sequestration in this study was strongly associated with environmental variables. Additionally, the uneven distribution of species with larger diameter at breast height (DBH) in the study area might be linked to anthropogenic factors, as the current forest growth indicates characteristics of a secondary forest. 
Therefore, our study suggests that the development and implementation of a sustainable forest management plan is necessary to increase the carbon sequestration potential of this forest and mitigate climate change.

Keywords: Munessa forest, woody carbon stock, environmental factors, climate mitigation

Procedia PDF Downloads 76