Search results for: Information and data requirements
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 32078

30728 Clinical Validation of an Automated Natural Language Processing Algorithm for Finding COVID-19 Symptoms and Complications in Patient Notes

Authors: Karolina Wieczorek, Sophie Wiliams

Abstract:

Introduction: Patient data is often collected in Electronic Health Record (EHR) systems for purposes such as providing care and reporting data. This information can be re-used to validate data models in clinical trials or in epidemiological studies. Manual validation of automated tools is vital to pick up errors in processing and to provide confidence in the output. Mentioning a disease in a discharge letter does not necessarily mean that a patient suffers from this disease. Many letters discuss a diagnostic process or different tests, or consider whether a patient has a certain disease. The COVID-19 dataset in this study used natural language processing (NLP), an automated algorithm which extracts information related to COVID-19 symptoms, complications, and medications prescribed within the hospital. Free-text clinical patient notes are rich sources of information which contain patient data not captured in a structured form, hence the use of named entity recognition (NER) to capture additional information. Methods: Patient data (discharge summary letters) were exported and screened by an algorithm to pick up relevant terms related to COVID-19. A list of 124 Systematized Nomenclature of Medicine (SNOMED) Clinical Terms was provided in Excel with corresponding IDs. Two independent medical student researchers were given this dictionary of SNOMED terms to refer to when screening the notes. They worked on two separate datasets called "A" and "B", respectively. Notes were screened to check that the correct term had been picked up by the algorithm and that negated terms had not been picked up. Results: Implementation of the algorithm in the hospital began on March 31, 2020, and the first EHR-derived extract was generated for use in an audit study on June 04, 2020.
The dataset has contributed to large, priority clinical trials (including the International Severe Acute Respiratory and Emerging Infection Consortium (ISARIC), by bulk upload to REDCap research databases) and to local research and audit studies. Successful sharing of EHR-extracted datasets requires communicating the provenance and quality of the data, including its completeness and accuracy. The validation of the algorithm gave the following results: precision 0.907, recall 0.416, and F-score 0.570. Percentage enhancement with NLP-extracted terms compared to regular data extraction alone was low (0.3%) for relatively well-documented data such as previous medical history but higher (16.6%, 29.53%, 30.3%, and 45.1%) for complications, presenting illness, chronic procedures, and acute procedures, respectively. Conclusions: This automated NLP algorithm is shown to be useful in facilitating patient data analysis and has the potential to be used in more large-scale clinical trials to assess potential study exclusion criteria for participants in the development of vaccines.
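
The reported F-score can be checked against the reported precision and recall. The sketch below assumes the abstract's F-score is the balanced F1 (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name `f_score` is illustrative.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score: weighted harmonic mean of precision and recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Values as reported in the abstract.
reported_precision = 0.907
reported_recall = 0.416

# Assumption: beta = 1, i.e. precision and recall are weighted equally.
f1 = f_score(reported_precision, reported_recall)
print(f"F1 = {f1:.4f}")
```

Under that assumption the result agrees with the reported 0.570 to three decimal places, so the three figures are mutually consistent.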

Keywords: automated, algorithm, NLP, COVID-19

Procedia PDF Downloads 97
30727 A Context-Sensitive Algorithm for Media Similarity Search

Authors: Guang-Ho Cha

Abstract:

This paper presents a context-sensitive media similarity search algorithm. One of the central problems in media search is the semantic gap between the low-level features computed automatically from media data and the human interpretation of them. This is because the notion of similarity is usually based on high-level abstraction, whereas the low-level features sometimes do not reflect human perception. Many media search algorithms have used the Minkowski metric to measure similarity between image pairs. However, such metrics cannot adequately capture the characteristics of the human visual system or the nonlinear relationships in the contextual information given by images in a collection. Our search algorithm tackles this problem by employing a similarity measure and a ranking strategy that reflect the nonlinearity of human perception and the contextual information in a dataset. Similarity search in an image database based on this contextual information shows encouraging experimental results.
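
For reference, the Minkowski metric that the abstract uses as its baseline can be sketched as follows. The feature vectors are made-up placeholders, and the paper's own context-sensitive measure is not described in the abstract, so it is not shown here.

```python
def minkowski_distance(x, y, p=2.0):
    """Minkowski distance of order p between two equal-length feature vectors.

    p = 1 gives the Manhattan distance, p = 2 the Euclidean distance.
    """
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    if p < 1:
        raise ValueError("p must be >= 1 for a valid metric")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


# Hypothetical low-level feature vectors for two images.
image_a = [0.0, 0.0]
image_b = [3.0, 4.0]

euclidean = minkowski_distance(image_a, image_b, p=2)
manhattan = minkowski_distance(image_a, image_b, p=1)
```

A metric of this kind compares each image pair in isolation, which is why it cannot by itself reflect the other images in the collection.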

Keywords: context-sensitive search, image search, similarity ranking, similarity search

Procedia PDF Downloads 360
30726 Discussion of Leadership Styles and Performance Management in MNEs

Authors: Yin-Tsuo Huang

Abstract:

Most leadership theories focus on the leader's development. However, in reality, those who are led are also very important in the leadership process. Development means ensuring that individuals grow in the skills, knowledge, and abilities needed to perform at their highest possible level now and in the future. The topic of the relationships among leadership styles, subordinate maturity, and information distinction was identified because it is a practical problem, drawn from personal experience, that occurs in multinational enterprises. Some questions to be answered through this critical analysis of the literature are: (1) What are the effective leadership styles in the leader-member and member-member relationships? (2) How do the subordinates react to leaders’ managerial style? (3) What are the relationships among leadership styles, subordinate maturity, and resulting information distinction? (4) What kinds of information distinction affect the relationships between leadership styles and subordinate maturity? (5) Where can leaders and subordinates get information, and how? (6) In what areas is leaders’ or subordinates’ knowledge weakest, and how can they get others to provide the information they need? (7) How important is that information to the subordinates? (8) Do the leaders keep too much information from their subordinates because sharing it is inconvenient? The main purpose of this review is to explore the theoretical and empirical literature about the relationships among leadership style, subordinate maturity, and the implications of information distinction in multinational Taiwanese organizations, in order to identify areas of future scholarly inquiry.

Keywords: leadership style, subordinate maturity, information distinction, multinational organization

Procedia PDF Downloads 505
30725 Culture Dimensions of Information Systems Security in Saudi Arabia National Health Services

Authors: Saleh Alumaran, Giampaolo Bella, Feng Chen

Abstract:

The study of organisations’ information security cultures has attracted scholars as well as healthcare services industry to research the topic and find appropriate tools and approaches to develop a positive culture. The vast majority of studies in Saudi national health services are on the use of technology to protect and secure health services information. On the other hand, there is a lack of research on the role and impact of an organisation’s cultural dimensions on information security. This research investigated and analysed the role and impact of cultural dimensions on information security in Saudi Arabia health service. Hypotheses were tested and two surveys were carried out in order to collect data and information from three major hospitals in Saudi Arabia (SA). The first survey identified the main cultural-dimension problems in SA health services and developed an initial information security culture framework model. The second survey evaluated and tested the developed framework model to test its usefulness, reliability and applicability. The model is based on human behaviour theory, where the individual’s attitude is the key element of the individual’s intention to behave as well as of his or her actual behaviour. The research identified six cultural dimensions: Saudi national culture, Saudi health service leadership, employees’ trust, technology, multicultural interactions and employees’ job roles. The research also identified a set of cultural sub-dimensions. These include working values and norms, tribe values and norms, attitudes towards women, power sharing, vision, social interaction, respect and understanding, hospital intra-net, hospital employees’ language(s) used, multi-national culture, communication system, employees’ job satisfaction and job security. 
The research identified that (a) human behaviour towards medical information in SA is one of the main threats to information security and one of the main challenges to the SA health authority, (b) the current IS cultures of SA hospitals fall short in protecting medical information due to the current values and norms towards information security, and (c) Saudi national culture and employees’ job role are the main dimensions playing major roles in the employees’ attitudes, while technology is the least important dimension.

Keywords: cultural dimension, electronic health record, information security, privacy

Procedia PDF Downloads 350
30724 A Decadal Flood Assessment Using Time-Series Satellite Data in Cambodia

Authors: Nguyen-Thanh Son

Abstract:

Floods are among the most frequent and costliest natural hazards. Flood disasters especially affect poor people in rural areas, who are heavily dependent on agriculture and have lower incomes. Cambodia is identified as one of the most climate-vulnerable countries in the world, ranked 13th out of 181 countries most affected by the impacts of climate change. Flood monitoring is thus a strategic priority at national and regional levels because policymakers need reliable spatial and temporal information on flood-prone areas to form successful monitoring programs that reduce possible impacts on the country’s economy and people’s livelihoods. This study aims to develop methods for flood mapping and assessment from MODIS data in Cambodia. We processed the data for the period from 2000 to 2017, following three main steps: (1) data pre-processing to construct smooth time-series vegetation and water surface indices, (2) delineation of flood-prone areas, and (3) accuracy assessment. The results of flood mapping were verified with the ground reference data, indicating an overall accuracy of 88.7% and a Kappa coefficient of 0.77. These results were reaffirmed by close agreement between the flood-mapping area and ground reference data, with a coefficient of determination (R²) of 0.94. The seasonally flooded areas observed for 2010, 2015, and 2016 were remarkably smaller than in other years, mainly attributed to the El Niño weather phenomenon exacerbated by impacts of climate change. Overall, although several sources potentially lowered the mapping accuracy of flood-prone areas, including image cloud contamination, mixed-pixel issues, and low-resolution bias between the mapping results and ground reference data, our methods produced satisfactory results for delineating the spatiotemporal evolution of floods.
The results, in the form of quantitative information on spatiotemporal flood distributions, could be beneficial to policymakers in evaluating their management strategies for mitigating the negative effects of floods on agriculture and people’s livelihoods in the country.
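
The two accuracy figures quoted in this abstract, overall accuracy and the Kappa coefficient, are both derived from a confusion matrix of mapped versus reference classes. The sketch below uses the standard Cohen's kappa definition; the counts are invented for illustration and are not the study's data.

```python
def overall_accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    matrix[i][j] = number of samples with reference class i mapped as class j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(n)) / total
    # Chance agreement: product of row and column marginals per class.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(n)
    ) / (total * total)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts: rows = reference (flood, non-flood), columns = mapped.
confusion = [
    [45, 5],
    [10, 40],
]
accuracy, kappa = overall_accuracy_and_kappa(confusion)
```

Kappa is lower than the raw accuracy because it discounts the agreement expected by chance, which is why the two values are usually reported together.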

Keywords: MODIS, flood, mapping, Cambodia

Procedia PDF Downloads 120
30723 Evaluation of Surface Roughness Condition Using App Roadroid

Authors: Diego de Almeida Pereira

Abstract:

The roughness index of a road is considered the most important parameter of pavement quality, as it is closely related to the comfort and safety of road users. This condition can be established through functional evaluation of pavement surface deviations, measured by the International Roughness Index (IRI), an index that came out of the international evaluation of pavements coordinated by the World Bank and whose current limit value for the acceptance of roads in Brazil is 2.7 m/km. This work makes use of the e.IRI parameter, obtained with the Roadroid app for smartphones running the Android operating system. The application was chosen for its practical user interaction, its own cloud data storage, and the support it gives to universities all around the world. Data were collected over six months, once each month. The study began in March 2018, a season of precipitation that worsens road conditions, which also offered the opportunity to follow the damage and the quality of the interventions performed. About 350 kilometers of sections of four federal highways were analyzed: BR-020, BR-040, BR-060 and BR-070, which connect the Federal District (the area where Brasilia is located) and its surroundings. They were chosen for their economic and tourist importance, two of them being under federal and the other two under private operation. Like much of the road network, the analyzed stretches are paved with Hot Mix Asphalt (HMA). This research thus contrasts the comfort and safety conditions of the roads under private operation, where users pay a fee to the concessionaires to travel on a road that meets the minimum requirements for use, with the quality of service offered on the roads under Federal Government jurisdiction.
Finally, data collected by the National Department of Transport Infrastructure (DNIT) by means of a laser profilometer are contrasted with data obtained with Roadroid, checking the applicability, practicality, and cost-effectiveness of the app, considering its limitations.

Keywords: roadroid, international roughness index, Brazilian roads, pavement

Procedia PDF Downloads 79
30722 Visual Template Detection and Compositional Automatic Regular Expression Generation for Business Invoice Extraction

Authors: Anthony Proschka, Deepak Mishra, Merlyn Ramanan, Zurab Baratashvili

Abstract:

Small and medium-sized businesses receive over 160 billion invoices every year. Since these documents exhibit many subtle differences in layout and text, extracting structured fields such as sender name, amount, and VAT rate from them automatically is an open research question. In this paper, existing work in template-based document extraction is extended, and a system is devised that is able to reliably extract all required fields for up to 70% of all documents in the data set, more than any other previously reported method. The approaches are described for 1) detecting through visual features which template a given document belongs to, 2) automatically generating extraction rules for a given new template by composing regular expressions from multiple components, and 3) computing confidence scores that indicate the accuracy of the automatic extractions. The system can generate templates with as little as one training sample and only requires the ground truth field values instead of detailed annotations such as bounding boxes that are hard to obtain. The system is deployed and used inside a commercial accounting software.
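
The idea of composing an extraction rule from several regular-expression components can be illustrated with a small sketch. The component patterns, the helper `compose_field_regex`, and the sample invoice text are all hypothetical; the abstract does not describe the system's actual components or how they are generated.

```python
import re

# Hypothetical reusable components.
AMOUNT = r"(?P<amount>\d{1,3}(?:[.,]\d{3})*[.,]\d{2})"  # e.g. 1,234.50
CURRENCY = r"(?:EUR|USD|€|\$)"


def compose_field_regex(label: str, value_pattern: str) -> re.Pattern:
    """Build a rule of the form '<label> [:] <value>' from its parts."""
    return re.compile(re.escape(label) + r"\s*:?\s*" + value_pattern)


# Compose: literal anchor + optional currency + amount component.
total_re = compose_field_regex("Total", CURRENCY + r"?\s*" + AMOUNT)

sample_text = "Invoice 17\nTotal: EUR 1,234.50\n"
match = total_re.search(sample_text)
amount = match.group("amount") if match else None
```

Keeping the anchor, currency, and amount as separate parts means a rule for a new template can reuse the value components and change only the label.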

Keywords: data mining, information retrieval, business, feature extraction, layout, business data processing, document handling, end-user trained information extraction, document archiving, scanned business documents, automated document processing, F1-measure, commercial accounting software

Procedia PDF Downloads 126
30721 Incorporating Information Gain in Regular Expressions Based Classifiers

Authors: Rosa L. Figueroa, Christopher A. Flores, Qing Zeng-Treitler

Abstract:

A regular expression consists of a sequence of characters that describes a text pattern. In clinical research, regular expressions are usually created manually by programmers together with domain experts. Lately, there have been several efforts to investigate how to generate them automatically. This article presents a text classification algorithm based on regexes. The algorithm, named REX, was designed and implemented as a simplified method to create regexes that classify Spanish text automatically. In order to classify ambiguous cases, such as when multiple labels are assigned to a testing example, REX includes an information gain method. Two sets of data were used to evaluate the algorithm’s effectiveness in clinical text classification tasks. The results indicate that the regular-expression-based classifier proposed in this work performs statistically better in terms of accuracy and F-measure than Support Vector Machine and Naïve Bayes for both datasets.
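
Information gain, as used to rank how well a feature separates class labels, can be sketched with its textbook definition. Treating "does a given regex match the document" as the binary feature is an assumption made for illustration; the abstract does not say how REX applies the measure.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(labels, feature_present):
    """Reduction in label entropy after splitting on a binary feature."""
    total = len(labels)
    with_feature = [l for l, f in zip(labels, feature_present) if f]
    without_feature = [l for l, f in zip(labels, feature_present) if not f]
    conditional = sum(
        len(subset) / total * entropy(subset)
        for subset in (with_feature, without_feature)
        if subset
    )
    return entropy(labels) - conditional


# Hypothetical training labels and regex match flags.
labels = ["pos", "pos", "neg", "neg"]
regex_a_matches = [True, True, False, False]   # separates the classes fully
regex_b_matches = [True, False, True, False]   # carries no class information

gain_a = information_gain(labels, regex_a_matches)
gain_b = information_gain(labels, regex_b_matches)
```

When several regexes with different labels match the same example, one plausible tie-break is to prefer the label of the regex with the higher gain.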

Keywords: information gain, regular expressions, smith-waterman algorithm, text classification

Procedia PDF Downloads 314
30720 General Architecture for Automation of Machine Learning Practices

Authors: U. Borasi, Amit Kr. Jain, Rakesh, Piyush Jain

Abstract:

Data collection, data preparation, model training, model evaluation, and deployment are all processes in a typical machine learning workflow. Training data needs to be gathered and organised. This often entails collecting a sizable dataset and cleaning it to remove or correct any inaccurate or missing information. Preparing the data for use in the machine learning model requires pre-processing it after it has been acquired. This often entails actions like scaling or normalising the data, handling outliers, selecting appropriate features, reducing dimensionality, etc. This pre-processed data is then used to train a model on some machine learning algorithm. After the model has been trained, it needs to be assessed by determining metrics like accuracy, precision, and recall, utilising a test dataset. Every time a new model is built, both data pre-processing and model training—two crucial processes in the Machine learning (ML) workflow—must be carried out. Thus, there are various Machine Learning algorithms that can be employed for every single approach to data pre-processing, generating a large set of combinations to choose from. Example: for every method to handle missing values (dropping records, replacing with mean, etc.), for every scaling technique, and for every combination of features selected, a different algorithm can be used. As a result, in order to get the optimum outcomes, these tasks are frequently repeated in different combinations. This paper suggests a simple architecture for organizing this largely produced “combination set of pre-processing steps and algorithms” into an automated workflow which simplifies the task of carrying out all possibilities.
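
The "combination set of pre-processing steps and algorithms" described above is a Cartesian product of the options at each stage. The sketch below only enumerates that set; the option names are placeholders, and the paper's operator pool, configuration, and scheduler components are not modelled.

```python
from itertools import product

# Hypothetical options for each stage of the workflow.
missing_value_strategies = ["drop_records", "replace_with_mean"]
scaling_techniques = ["none", "standardise", "min_max"]
algorithms = ["logistic_regression", "decision_tree"]


def enumerate_pipelines(*stages):
    """Return every combination that picks one option from each stage."""
    return list(product(*stages))


pipelines = enumerate_pipelines(
    missing_value_strategies, scaling_techniques, algorithms
)

for missing, scaling, algorithm in pipelines:
    # A real workflow would fit and evaluate a model here.
    print(f"{missing} -> {scaling} -> {algorithm}")
```

With 2, 3, and 2 options the set already has 12 members, and it grows multiplicatively with each added stage, which is the motivation for automating the loop.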

Keywords: machine learning, automation, AUTOML, architecture, operator pool, configuration, scheduler

Procedia PDF Downloads 52
30719 A Novel Probabilistic Spatial Locality of Reference Technique for Automatic Cleansing of Digital Maps

Authors: A. Abdullah, S. Abushalmat, A. Bakshwain, A. Basuhail, A. Aslam

Abstract:

GIS (Geographic Information System) applications require geo-referenced data. These data may be available as databases or in the form of digital or hard-copy agro-meteorological maps. These parameter maps are color-coded, with different regions corresponding to different parameter values, and converting them into a database is not very difficult. However, text and different planimetric elements overlaid on these maps make an accurate image-to-database conversion a challenging problem. The reason is that it is almost impossible to replace exactly what was underneath the text or icons, which points to the need for inpainting. In this paper, we propose a probabilistic inpainting approach that uses the probability of spatial locality of colors in the map to replace overlaid elements with the underlying color. We tested the limits of our proposed technique using non-textual simulated data and compared its text-removal results with those of a popular image editing tool on public domain data, with promising results.
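
A much simplified reading of the spatial-locality idea is that an overlaid pixel most probably had the color that dominates its unmasked neighborhood. The sketch below implements only that heuristic on a small grid of color labels; it is an illustrative assumption, not the authors' probabilistic model, which the abstract does not specify.

```python
from collections import Counter


def inpaint_by_local_mode(grid, mask, radius=1):
    """Replace each masked cell with the most frequent unmasked nearby color.

    grid: 2D list of color labels; mask: 2D list of bools (True = overlay).
    Cells with no unmasked neighbor within the radius are left unchanged.
    """
    rows, cols = len(grid), len(grid[0])
    result = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            # Empirical color frequencies in the window around (r, c).
            counts = Counter(
                grid[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
                if not mask[rr][cc]
            )
            if counts:
                result[r][c] = counts.most_common(1)[0][0]
    return result


# Hypothetical 3x3 map: "T" marks a text pixel on a green/blue boundary.
color_map = [
    ["green", "green", "blue"],
    ["green", "T", "blue"],
    ["green", "green", "blue"],
]
overlay = [
    [False, False, False],
    [False, True, False],
    [False, False, False],
]
cleaned = inpaint_by_local_mode(color_map, overlay)
```

Here the text pixel has five green and three blue neighbors, so it is filled with green.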

Keywords: noise, image, GIS, digital map, inpainting

Procedia PDF Downloads 348
30718 Data Security and Privacy Challenges in Cloud Computing

Authors: Amir Rashid

Abstract:

Cloud computing frameworks enable organizations to cut expenses by outsourcing computation resources on demand. At present, customers of cloud service providers have no means of verifying the privacy and ownership of their information and data. To address this issue, we propose a Trusted Cloud Computing Platform (TCCP). The TCCP enables Infrastructure as a Service (IaaS) providers such as Amazon EC2 to offer a closed box execution environment that guarantees confidential execution of guest virtual machines. It also allows customers to attest to the IaaS provider and determine whether the service is secure before they launch their virtual machines. The TCCP thus guarantees the privacy and trustworthiness of computed data that are outsourced to IaaS service providers. It provides the abstraction of a closed box execution environment for a customer's VM, ensuring that no privileged administrator of the cloud provider can inspect or tamper with its data. Furthermore, before launching the VM, the TCCP allows a customer to reliably and remotely determine whether the provider's backend is running a trusted TCCP. This capability extends attestation to the whole service and hence allows a customer to verify that data operations run in secure mode.

Keywords: cloud security, IaaS, cloud data privacy and integrity, hybrid cloud

Procedia PDF Downloads 295
30717 European Electromagnetic Compatibility Directive Applied to Astronomical Observatories

Authors: Oibar Martinez, Clara Oliver

Abstract:

The Cherenkov Telescope Array Project (CTA) aims to build two different observatories of Cherenkov Telescopes, located in Cerro del Paranal, Chile, and La Palma, Spain. These facilities are used in this paper as a case study to investigate how to apply standard Directives on Electromagnetic Compatibility to astronomical observatories. Cherenkov Telescopes are able to provide valuable information from both Galactic and Extragalactic sources by measuring Cherenkov radiation, which is produced by particles that travel faster than light in the atmosphere. The construction requirements demand compliance with the European Electromagnetic Compatibility Directive. The largest telescopes of these observatories, called Large Scale Telescopes (LSTs), are high-precision instruments with advanced photomultipliers able to detect the faint sub-nanosecond blue light pulses produced by Cherenkov radiation. They have a 23-meter parabolic reflective surface. This surface focuses the radiation on a camera composed of an array of high-speed photosensors which are highly sensitive to radio spectrum pollution. The camera has a field of view of about 4.5 degrees and has been designed for maximum compactness and lowest weight, cost and power consumption. Each pixel incorporates a photosensor able to discriminate single photons and the corresponding readout electronics. The first LST is already commissioned and is intended to be operated as a service to the scientific community. Because of this, it must comply with a series of reliability and functional requirements and must have a Conformité Européenne (CE) marking. This demands compliance with Directive 2014/30/EU on electromagnetic compatibility. The main difficulty in accomplishing this goal lies in the fact that CE marking setups and procedures were implemented for industrial products, whereas no clear protocols have been defined for scientific installations.
In this paper, we aim to answer the question of how the directive should be applied to our installation to guarantee the fulfillment of all the requirements and the proper functioning of the telescope itself. Experts in both Optics and Electromagnetism were needed to make these kinds of decisions and to adapt tests that were designed for equipment of limited dimensions to large scientific plants. An analysis of the elements and configurations most likely to be affected by external interference and those most likely to cause the maximum disturbances was also performed. Obtaining the Conformité Européenne (CE) mark requires knowing what the harmonized standards are and how the specific requirements are elaborated. For this type of large installation, one needs to adapt and develop the tests to be carried out. In addition, throughout this process, certification entities and notified bodies play a key role in preparing and agreeing the required technical documentation. We have focused our attention mostly on the technical aspects of each point. We believe that this contribution will be of interest to other scientists involved in applying industrial quality assurance standards to large scientific plants.

Keywords: CE marking, electromagnetic compatibility, european directive, scientific installations

Procedia PDF Downloads 107
30716 Teachers’ and Parents’ Perceptions of School and Family Partnership Practices of Schools in Mogadishu

Authors: Mohamed Abdullahi Gure, Farhia Ali Abdi

Abstract:

There is almost complete certainty among educators that parental involvement is the remedy for many of the problems facing schools. It is also widely acknowledged that school administrators and teachers have important roles in promoting parental involvement in children’s education. This work aims at examining the views of parents and teachers on school-partnership practices for promoting parental involvement in education in selected primary schools in Mogadishu, Somalia. The study employed a mixed-method approach; data were collected from parents as well as from teachers of the selected schools using survey questionnaires and interviews. A sample of 377 parents and 214 teachers participated in this study. This study used an instrument developed by Epstein and Salinas (1993) to assess the perceptions of parents and teachers about parental involvement. Furthermore, qualitative data were collected through interviews with parents and teachers of the selected schools. The findings of this study show that parents and teachers had similar positive perceptions of school practices for parental involvement. This study is significant for several reasons. It contributes to the limited information on parental involvement in Somalia and thereby fills a gap in the existing empirical literature. It offers information to educators as well as to parents, which will help them understand the issues that relate to parental involvement in education. It is hoped that information from this study will help parents and teachers to understand each other’s ideas on parental involvement and to develop positive working relations that support children in becoming successful in their education.

Keywords: Mogadishu, parents, school-partnership, practices, teachers

Procedia PDF Downloads 147
30715 Evaluation of Urban Parks Based on POI Data: Taking Futian District of Shenzhen as an Example

Authors: Juanling Lin

Abstract:

The construction of urban parks is an important part of eco-city construction, and the intervention of big data provides a more scientific and rational platform for the assessment of urban parks by identifying and correcting the irrationality of urban park planning from the macroscopic level and then promoting the rational planning of urban parks. The study builds an urban park assessment system based on urban road network data and POI data, taking Futian District of Shenzhen as the research object, and utilizes the GIS geographic information system to assess the park system of Futian District in five aspects: park spatial distribution, accessibility, service capacity, demand, and supply-demand relationship. The urban park assessment system can effectively reflect the current situation of urban park construction and provide a useful exploration for realizing the rationality and fairness of urban park planning.

Keywords: urban parks, assessment system, POI, supply and demand

Procedia PDF Downloads 37
30714 Food Effects and Food Choices: Aligning the Two for Better Health

Authors: John Monro, Suman Mishra

Abstract:

Choosing foods for health benefits requires information that accurately represents the relative effectiveness of foods with respect to specific health end points, or with respect to responses leading to health outcomes. At present, consumers must rely on nutrient composition data and on health claims to guide them to healthy food choices. Nutrient information may be of limited usefulness because it does not reflect the effect of food structure and food component interactions – that is, whole food effects. Health claims demand stringent criteria that exclude most foods, even though most foods have properties through which they may contribute to positive health outcomes in a diet. In this presentation, we show how the functional efficacy of foods may be expressed in the same format as nutrients, with weight units, as virtual food components that allow a nutrition information panel to show not only what a food is, but also what it does. Two body responses linked to well-being are considered – glycaemic response and colonic bulk – in order to illustrate the concept. We show how the nutrient information on available carbohydrate and dietary fibre values obtained by food analysis methods fails to provide information on the glycaemic potency or the colonic bulking potential of foods, because of failings in the methods and approach taken to food analysis. It is concluded that a category of food values that represent the functional efficacy of foods is required to accurately guide food choices for health.

Keywords: dietary fibre, glycaemic response, food values, food effects, health

Procedia PDF Downloads 498
30713 System-Driven Design Process for Integrated Multifunctional Movable Concepts

Authors: Oliver Bertram, Leonel Akoto Chama

Abstract:

In today's civil transport aircraft, the design of flight control systems is based on the experience gained from previous aircraft configurations with a clear distinction between primary and secondary flight control functions for controlling the aircraft altitude and trajectory. Significant system improvements are now seen particularly in multifunctional moveable concepts where the flight control functions are no longer considered separate but integral. This allows new functions to be implemented in order to improve the overall aircraft performance. However, the classical design process of flight controls is sequential and insufficiently interdisciplinary. In particular, the systems discipline is involved only rudimentarily in the early phase. In many cases, the task of systems design is limited to meeting the requirements of the upstream disciplines, which may lead to integration problems later. For this reason, approaching design with an incremental development is required to reduce the risk of a complete redesign. Although the potential and the path to multifunctional moveable concepts are shown, the complete re-engineering of aircraft concepts with less classic moveable concepts is associated with a considerable risk for the design due to the lack of design methods. This represents an obstacle to major leaps in technology. This gap in state of the art is even further increased if, in the future, unconventional aircraft configurations shall be considered, where no reference data or architectures are available. This means that the use of the above-mentioned experience-based approach used for conventional configurations is limited and not applicable to the next generation of aircraft. In particular, there is a need for methods and tools for a rapid trade-off between new multifunctional flight control systems architectures. 
To close this gap in the state of the art, an integrated system-driven design process for multifunctional flight control systems of non-classical aircraft configurations is presented. The overall goal of the design process is to find optimal solutions for single or combined target criteria in a fast process from the very large solution space for the flight control system. In contrast to the state of the art, all disciplines are involved for a holistic design in an integrated rather than a sequential process. To emphasize the systems discipline, this paper focuses on the methodology for designing moveable actuation systems in the context of this integrated design process of multifunctional moveables. The methodology includes different approaches for creating system architectures, component design methods, and the process outputs necessary to evaluate the systems. An application example of a reference configuration is used to demonstrate the process and validate the results. For this, new unconventional hydraulic and electrical flight control system architectures are calculated, which result from the higher requirements of multifunctional moveable concepts. In addition to typical key performance indicators such as mass and power requirements, the results regarding the feasibility and wing integration aspects of the system components are examined and discussed. This is intended to show how the systems design can influence and drive the wing and overall aircraft design.

Keywords: actuation systems, flight control surfaces, multi-functional movables, wing design process

Procedia PDF Downloads 141
30712 Recommendations for Data Quality Filtering of Opportunistic Species Occurrence Data

Authors: Camille Van Eupen, Dirk Maes, Marc Herremans, Kristijn R. R. Swinnen, Ben Somers, Stijn Luca

Abstract:

In ecology, species distribution models are commonly implemented to study species-environment relationships. These models increasingly rely on opportunistic citizen science data when high-quality species records collected through standardized recording protocols are unavailable. While these opportunistic data are abundant, uncertainty is usually high, e.g., due to observer effects or a lack of metadata. Data quality filtering is often used to reduce these types of uncertainty in an attempt to increase the value of studies relying on opportunistic data. However, filtering should not be performed blindly. In this study, recommendations are built for data quality filtering of opportunistic species occurrence data that are used as input for species distribution models. Using an extensive database of 5.7 million citizen science records from 255 species in Flanders, the impact on model performance was quantified by applying three data quality filters, and these results were linked to species traits. More specifically, presence records were filtered based on record attributes that provide information on the observation process or post-entry data validation, and changes in the area under the receiver operating characteristic curve (AUC), sensitivity, and specificity were analyzed using the Maxent algorithm with and without filtering. Controlling for sample size enabled us to study the combined impact of data quality filtering, i.e., the simultaneous impact of an increase in data quality and a decrease in sample size. Further, the variation among species in their response to data quality filtering was explored by clustering species based on four traits often related to data quality: commonness, popularity, difficulty, and body size. 
Findings show that model performance is affected by i) the quality of the filtered data, ii) the proportional reduction in sample size caused by filtering and the remaining absolute sample size, and iii) a species’ ‘quality profile’, resulting from a species classification based on the four traits related to data quality. The findings resulted in recommendations on when and how to filter volunteer-generated and opportunistically collected data. This study confirms that correctly processed citizen science data can make a valuable contribution to ecological research and species conservation.

Keywords: citizen science, data quality filtering, species distribution models, trait profiles

Procedia PDF Downloads 195
30711 Extracting Terrain Points from Airborne Laser Scanning Data in Densely Forested Areas

Authors: Ziad Abdeldayem, Jakub Markiewicz, Kunal Kansara, Laura Edwards

Abstract:

Airborne Laser Scanning (ALS) is one of the main technologies for generating high-resolution digital terrain models (DTMs). DTMs are crucial to several applications, such as topographic mapping, flood zone delineation, geographic information systems (GIS), hydrological modelling, and spatial analysis. A laser scanning system generates an irregularly spaced three-dimensional point cloud. Raw ALS data consist mainly of ground points (which represent the bare earth) and non-ground points (which represent buildings, trees, cars, etc.). Removing all the non-ground points from the raw data is referred to as filtering. Filtering heavily forested areas is considered a challenging task, as the canopy stops laser pulses from reaching the terrain surface. This research presents an approach for removing non-ground points from raw ALS data in densely forested areas. Smoothing splines are exploited to interpolate and fit the noisy ALS data. The presented filter utilizes a weight function to allocate a weight to each point of the data. Furthermore, unlike most methods, the presented filtering algorithm is designed to be automatic. Three different forested areas in the United Kingdom are used to assess the performance of the algorithm. The results show that the DTMs generated from the filtered data are accurate (when compared against reference terrain data) and that the performance of the method is stable for all the heavily forested data samples. The average root mean square error (RMSE) value is 0.35 m.
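As an illustration of the accuracy measure quoted above, the following minimal sketch computes an RMSE between DTM heights and reference heights. The function name `rmse` and the sample heights are hypothetical and are not taken from the study.

```python
import math


def rmse(predicted, reference):
    """Root mean square error between two equally long height lists (metres)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical DTM heights and reference terrain heights at the same locations.
dtm_heights = [102.1, 98.4, 105.0, 99.7]
reference_heights = [102.4, 98.0, 105.3, 99.7]
error = rmse(dtm_heights, reference_heights)
print(f"RMSE = {error:.3f} m")
```

Because the errors are squared before averaging, a few large height deviations raise the RMSE more than many small ones.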

Keywords: airborne laser scanning, digital terrain models, filtering, forested areas

Procedia PDF Downloads 135
30710 Data Mining Algorithms Analysis: Case Study of Price Predictions of Lands

Authors: Julio Albuja, David Zaldumbide

Abstract:

Data analysis is an important step before making a financial decision. The aim of this work is to analyze the factors that influence the final price of houses through data mining algorithms. To the best of our knowledge, previous work was reviewed only to compare results. Before the data set was used, the Z-transformation was applied to standardize the data to a common scale. The data were then classified into two groups so that they could be visualized in a readable format. A decision tree was built, and the data are displayed graphically so that the results and the influence of each factor are easy to see. The methods are defined and the results described. Finally, conclusions and recommendations are presented based on the results of our research, making it easier to apply these algorithms to a customized data set.
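The Z-transformation mentioned above can be sketched as follows, assuming it refers to the usual z-score (subtract the mean, divide by the standard deviation). The function name `z_scores` and the sample prices are hypothetical, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to mean 0 and standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        raise ValueError("cannot standardize a constant column")
    return [(v - mu) / sigma for v in values]


# Hypothetical prices, in arbitrary currency units.
prices = [100.0, 150.0, 200.0, 250.0, 300.0]
standardized = z_scores(prices)
print([round(z, 3) for z in standardized])
```

Z-scores put every attribute on a comparable scale but do not bound the values to a fixed interval.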

Keywords: algorithms, data, decision tree, transformation

Procedia PDF Downloads 370
30709 Recreational Nitrous Oxide Use: Increasing Risks and Harms

Authors: Julaine Allan, Jacqui Cameron, Helen Simpson, Kenny Kor

Abstract:

The pleasurable and intoxicating effects of psychoactive substances result in widespread use. However, deaths and injuries from psychoactive substance use, particularly among young people, are a global public health problem. Understanding the benefits and problems associated with different drugs is an important part of creating contextually and physiologically relevant harm reduction strategies. Nitrous oxide use is increasing. A systematic review sought information for harm reduction strategies. The aim of this study was to systematically collate and synthesize the disparate body of research on recreational nitrous oxide use to inform harm reduction approaches tailored for young people. A mixed-methods systematic review combined quantitative data such as prevalence and incidence statistics as well as interpretive data on the experience of N₂O use. Thirty-four studies were included in the final analysis. There was minimal information available to inform policy, health care, or individuals using N₂O. The cultural, contextual, and personal reasons for N₂O use are largely unexplored.

Keywords: substance misuse, nitrous oxide, harms, harm reduction, systematic review

Procedia PDF Downloads 93
30708 Mutual Authentication for Sensor-to-Sensor Communications in IoT Infrastructure

Authors: Shadi Janbabaei, Hossein Gharaee Garakani, Naser Mohammadzadeh

Abstract:

The Internet of Things is a new concept whose emergence has made sensors ubiquitous in human life, so that data are collected, processed, and transmitted by these sensors at all times. In order to establish a secure connection, the first challenge is authentication between sensors. However, this challenge also requires certain features for authentication to be done properly. Anonymity, untraceability, and being lightweight are among the issues that need to be considered. In this paper, we evaluate existing authentication protocols and analyze the security vulnerabilities found in them. An improved lightweight authentication protocol for sensor-to-sensor communications is then presented, which uses a hash function and logical operators. Analysis of the protocol shows that the security requirements are met and that the protocol is resistant to various attacks. Finally, by reducing the number of computationally costly functions, it is argued that the protocol is lighter than before.

Keywords: anonymity, authentication, Internet of Things, lightweight, un-traceability

Procedia PDF Downloads 285
30707 Transformation of the Business Model in an Occupational Health Care Company Embedded in an Emerging Personal Data Ecosystem: A Case Study in Finland

Authors: Tero Huhtala, Minna Pikkarainen, Saila Saraniemi

Abstract:

Information technology has long been used as an enabler of exchange for goods and services. Services are evolving from generic to personalized, and the reverse use of customer data has been discussed in both academia and industry for the past few years. This article presents the results of an empirical case study in the area of preventive health care services. The primary data were gathered in workshops, in which future personal data-based services were conceptualized by analyzing future scenarios from a business perspective. The aim of this study is to understand business model transformation in emerging personal data ecosystems. The work was done as a case study in the context of occupational healthcare. The results have implications for theory and practice, indicating that adopting personal data management principles requires transformation of the business model, which, if successfully managed, may provide access to more resources, the potential to offer better value, and additional customer channels. These advantages correlate with the broadening of the business ecosystem. Expanding the scope of this study to include more actors would improve the validity of the research. The results draw from existing literature and are based on findings from a case study and the economic properties of the healthcare industry in Finland.

Keywords: ecosystem, business model, personal data, preventive healthcare

Procedia PDF Downloads 247
30706 The Management Accountant’s Roles for Creation of Corporate Shared Value

Authors: Prateep Wajeetongratana

Abstract:

This study investigates the management accountant’s roles linked to the creation of corporate shared value, with the aim of enabling more effective decision-making and better meeting the information needs of stakeholders. A mixed-methods approach is employed for data collection, using triangulation for credibility. A quantitative approach is employed to conduct a survey of 200 Thai companies providing annual reports to the Stock Exchange of Thailand. The results reveal that the environmental and social data incorporated in corporate social responsibility (CSR) disclosures are based on the indicators of the Global Reporting Initiative (GRI), at a statistical significance level of 0.01. Environmental and social indicators in CSR are associated with the environmental and social data disclosed in the annual report to address stakeholders’ and the public’s interests; the relationship between environmental and social CSR disclosures and the information in annual reports is statistically significant at the 0.01 level.

Keywords: corporate social responsibility, creating shared value, management accountant’s roles, stock exchange of Thailand

Procedia PDF Downloads 220
30705 Application of Blockchain Technology in Geological Field

Authors: Mengdi Zhang, Zhenji Gao, Ning Kang, Rongmei Liu

Abstract:

Management and application of geological big data is an important part of China's national big data strategy. With the implementation of the national big data strategy, geological big data management is becoming increasingly critical. At present, there are still many technological barriers, as well as conceptual confusion, in many aspects of geological big data management and application, such as data sharing, intellectual property protection, and application technology. Therefore, it is a key task to make better use of new technologies for deeper exploration and wider application of geological big data. In this paper, we first briefly introduce the basic principle of blockchain technology and then analyze the application dilemma of geological data. Based on this analysis, we propose some feasible patterns and scenarios for blockchain applications in geological big data and put forward several suggestions for future work in geological big data management.

Keywords: blockchain, intellectual property protection, geological data, big data management

Procedia PDF Downloads 83
30704 Integrating the Modbus SCADA Communication Protocol with Elliptic Curve Cryptography

Authors: Despoina Chochtoula, Aristidis Ilias, Yannis Stamatiou

Abstract:

Modbus is a protocol that enables communication among devices connected to the same network. This protocol is often deployed to connect sensor and monitoring units to central supervisory servers in Supervisory Control and Data Acquisition (SCADA) systems. These systems monitor critical infrastructures, such as factories, power generation stations, and nuclear power reactors, in order to detect malfunctions and trigger alerts and corrective actions. However, due to their criticality, SCADA systems are vulnerable to attacks that range from simple eavesdropping on operation parameters, exchanged messages, and valuable infrastructure information to malicious modification of vital infrastructure data with the intent of inflicting damage. Thus, the SCADA research community has been actively strengthening SCADA systems with suitable data protection mechanisms based, to a large extent, on cryptographic methods for data encryption, device authentication, and message integrity protection. However, due to the limited computation power of many SCADA sensors and embedded devices, the usual public key cryptographic methods are not appropriate because of their high computational requirements. As an alternative, Elliptic Curve Cryptography has been proposed, which requires smaller key sizes and, thus, less demanding cryptographic operations. Until now, however, no such implementation has been proposed in the SCADA literature, to the best of our knowledge. In order to fill this gap, our methodology focused on integrating Modbus, a frequently used SCADA communication protocol, with Elliptic Curve based cryptography and on developing a server/client application to demonstrate the proof of concept. For the implementation, we deployed two C language libraries, which were suitably modified in order to be successfully integrated: libmodbus (https://github.com/stephane/libmodbus) and ecc-lib (https://www.ceid.upatras.gr/webpages/faculty/zaro/software/ecc-lib/). 
The first library provides a C implementation of the Modbus/TCP protocol, while the second one offers the functionality to develop cryptographic protocols based on Elliptic Curve Cryptography. These two libraries were combined, after suitable modifications and enhancements, to give a modified version of the Modbus/TCP protocol focusing on the security of the data exchanged among the devices and the supervisory servers. The mechanisms we implemented include key generation, key exchange/sharing, message authentication, data integrity check, and encryption/decryption of data. The key generation and key exchange protocols were implemented with the use of Elliptic Curve Cryptography primitives. The keys established by each device are saved in its local memory, are retained during the whole communication session, and are used in encrypting and decrypting exchanged messages as well as in certifying entities and the integrity of the messages. Finally, the modified library was compiled for the Android environment in order to run the server application as an Android app. The client program runs on a regular computer. The communication between these two entities is an example of the successful establishment of an Elliptic Curve Cryptography based, secure Modbus wireless communication session between a portable device acting as a supervisor station and a monitoring computer. Our first performance measurements are also very promising and demonstrate the feasibility of embedding Elliptic Curve Cryptography into SCADA systems, filling a gap in the relevant scientific literature.
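To illustrate the kind of elliptic curve key exchange described above, the following minimal sketch performs a Diffie-Hellman style exchange on a tiny textbook curve. It is not the authors' protocol and does not use the ecc-lib or libmodbus APIs; the curve y² = x³ + 2x + 2 over the integers modulo 17, the base point (5, 1), the secret values, and all function names are illustrative assumptions, and a curve this small offers no security.

```python
# Toy curve parameters (illustrative only, far too small for real use).
P = 17          # field prime
A = 2           # curve coefficient a in y^2 = x^3 + a*x + b
G = (5, 1)      # base point, which has order 19 on this curve


def point_add(p1, p2):
    """Add two curve points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a point plus its inverse gives infinity
    if p1 == p2:
        slope = (3 * x1 * x1 + A) * pow((2 * y1) % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k, point):
    """Compute k * point with the double-and-add method."""
    result = None
    addend = point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


# Hypothetical secrets for a device and a supervisory server.
device_secret, server_secret = 3, 7
device_public = scalar_mult(device_secret, G)
server_public = scalar_mult(server_secret, G)

# Each side combines its own secret with the other's public point.
shared_by_device = scalar_mult(device_secret, server_public)
shared_by_server = scalar_mult(server_secret, device_public)
print(shared_by_device == shared_by_server)
```

Both sides arrive at the same point because scalar multiplication commutes. A real deployment would use a standardized curve and derive symmetric session keys from the shared point.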

Keywords: elliptic curve cryptography, ICT security, modbus protocol, SCADA, TCP/IP protocol

Procedia PDF Downloads 259
30703 Heightening Pre-Service Teachers’ Attitude towards Learning and Metacognitive Learning through Information and Communication Technology: Pre-Service Science Teachers’ Perspective

Authors: Abiodun Ezekiel Adesina, Ijeoma Ginikanwa Akubugwo

Abstract:

Information and Communication Technology (ICT) can heighten pre-service teachers’ attitudes toward learning and metacognitive learning; however, there is a dearth of literature on pre-service teachers’ perception of heightening their attitude toward learning and metacognitive learning. Thus, this study investigates the perception of pre-service science teachers on heightening their attitude towards learning and metacognitive learning through ICT. Two research questions and four hypotheses guided the research. A mixed-methods design of the concurrent triangulation type, integrating qualitative and quantitative approaches, was adopted for the study. The cluster random sampling technique was adopted to select 250 pre-service science teachers in Oyo township. Two self-constructed instruments, the Heightening Pre-service Science Teachers’ Attitude towards Learning and Metacognitive Learning through Information and Communication Technology Scale (HPALMIS, r=.73) and an unstructured interview, were used for data collection. Thematic analysis, frequency counts and percentages, t-tests, and analysis of variance were used for data analysis. The perception level of the pre-service science teachers on heightening their attitude towards learning and metacognitive learning through ICT is above average, with the majority perceiving that ICT can enhance their thinking about their learning. The perception was significant (mean=92.68, SD=10.86, df=249, t=134.91, p<.05). The perception differed significantly by gender (t=2.10, df=248, p<.05), in favour of the female pre-service teachers, and by the time of first ICT use (F(5,244)=9.586, p<.05). Lecturers of science and science-related courses should therefore embrace the use of ICT in heightening pre-service teachers’ attitude towards learning and metacognitive learning. 
Government should organize workshops, seminars, lectures, and symposia along with professional bodies for the science education lecturers to keep abreast of the trending ICT.
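The reported t statistic can be approximately reproduced from the summary values, assuming a one-sample t-test against a reference value of 0. The abstract does not state the test value, so this is an assumption, and the function name `one_sample_t` is hypothetical.

```python
import math


def one_sample_t(sample_mean, sample_sd, n, reference=0.0):
    """t statistic for a one-sample test: (mean - reference) / (sd / sqrt(n))."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - reference) / standard_error


# Summary values quoted in the abstract; the reference value 0 is assumed.
t_value = one_sample_t(92.68, 10.86, 250)
degrees_of_freedom = 250 - 1
print(f"t = {t_value:.2f}, df = {degrees_of_freedom}")
```

The result is close to the reported t=134.91 with df=249; the small difference is consistent with rounding of the published mean and standard deviation.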

Keywords: pre-service teachers’ attitude towards learning, metacognitive learning, ICT, pre-service teachers’ perspectives

Procedia PDF Downloads 94
30702 Disinformation’s Threats to Democracy in Central Africa: Case Studies from Cameroon and Central African Republic

Authors: Simont Toussi

Abstract:

Cameroon and the Central African Republic are bound by the provisions of many regional and international charters, which condemn the manipulation of information, obstacles to accessing reliable information, and the limitation of freedoms of expression and opinion. These two countries also have constitutional guarantees for free speech and access to true and reliable information. However, they are yet to define specific policies and regulations for access to information, disinformation, or misinformation. Yet, certain of these countries’ laws and regulations related to information and communication technologies, criminal procedures, terrorism, or intelligence services contain provisions that instead hinder human rights by condemning false information. Like many other African countries, Cameroon and the Central African Republic face a profound democratic regression, and governments use multiple methods to stifle online discourse and digital rights. Despite the increased uptake of digital tools for political participation, there is a lack of interactivity and adoption of these tools. This leads to a scarcity of information and creates room for the spread of disinformation in the public space, hampering democracy and respect for human rights. This research aims to analyse the adequacy of stakeholders’ responses to disinformation in Cameroon and the Central African Republic in periods of political contestation, such as elections and anti-government protests, and to highlight the nature, perpetrators, strategies, and channels of disinformation, as well as its effects on democratic actors, including civil society, bloggers, government critics, activists, and other human rights defenders. The study follows a qualitative method with a literature review, content analysis, and key informant interviews with stakeholders’ representatives, with an emphasis on crowdsourcing as a method for collecting data and information in the two countries.

Keywords: disinformation, democracy, political manipulation, social media, media, fake news, central Africa, cameroon, misinformation, free speech

Procedia PDF Downloads 103
30701 Global Solar Irradiance: Data Imputation to Analyze Complementarity Studies of Energy in Colombia

Authors: Jeisson A. Estrella, Laura C. Herrera, Cristian A. Arenas

Abstract:

The Colombian electricity sector has been transforming through the introduction of new energy sources for electricity generation, one of them being solar energy, which is being promoted by companies interested in photovoltaic technology. The study of this technology is important for electricity generation in general and for the planning of the sector from the perspective of energy complementarity. The project is situated precisely in this last approach; we are interested in answering concerns about the reliability of the electrical system when climatic phenomena such as El Niño occur, or in defining whether it is viable to replace or expand thermoelectric plants with renewable electricity generation systems. In this regard, some difficulties related to the basic information on renewable energy sources from measured data must first be solved, since these data come from automatic weather stations administered by the Institute of Hydrology, Meteorology and Environmental Studies (IDEAM) and, in the range of study (2005-2019), have significant amounts of missing data. For this reason, the overall objective of the project is to complete the global solar irradiance data sets to obtain time series that will allow the elaboration of energy complementarity analyses in a subsequent project. The filling of the databases will be done through numerical and statistical methods, which are basic techniques for undergraduate students in technical areas who are starting out as researchers.
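As one example of a basic numerical method for filling gaps in a time series, the following minimal sketch applies linear interpolation between the nearest measured values. The abstract does not name the specific methods used, so this choice is an assumption; the function name `interpolate_gaps` and the sample values are hypothetical.

```python
def interpolate_gaps(series):
    """Fill interior None gaps linearly; leading and trailing gaps stay None."""
    filled = list(series)
    known = [i for i, value in enumerate(series) if value is not None]
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            fraction = (i - left) / span
            filled[i] = series[left] + fraction * (series[right] - series[left])
    return filled


# Hypothetical hourly irradiance readings in W/m^2, with missing entries as None.
irradiance = [410.0, None, None, 470.0, 455.0, None, 425.0]
completed = interpolate_gaps(irradiance)
print(completed)
```

Linear interpolation suits short gaps; long gaps in irradiance data would usually call for methods that respect the daily and seasonal cycle.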

Keywords: time series, global solar irradiance, imputed data, energy complementarity

Procedia PDF Downloads 68
30700 "Revolutionizing Geographic Data: CADmapper's Automated Precision in CAD Drawing Transformation"

Authors: Toleen Alaqqad, Kadi Alshabramiy, Suad Zaafarany, Basma Musallam

Abstract:

CADmapper is a significant software tool for transforming geographic data into realistic CAD drawings. It speeds up and simplifies the conversion process by automating it. This allows architects, urban planners, engineers, and geographic information system (GIS) experts to concentrate solely on the imaginative and scientific parts of their projects. While the future incorporation of AI has the potential for further improvements, CADmapper's current capabilities make it an indispensable asset in the business. It covers a combination of 2D and 3D city and urban area models. The user can select a specific square section of the map to view, and the fee is based on the dimensions of the area being viewed. The procedure is straightforward: you choose the area you want, then pick whether or not to include topography and 3D architectural data (if available), and then select the design program or CAD format in which to publish the document. CADmapper also offers more than 200 free broad town plans in DXF format. If you wish to specify a bespoke area, it is free up to 1 km².

Keywords: cadmapper, gdata, 2d and 3d data conversion, automated cad drawing, urban planning software

Procedia PDF Downloads 62
30699 Movie Genre Preference Prediction Using Machine Learning for Customer-Based Information

Authors: Haifeng Wang, Haili Zhang

Abstract:

Most movie recommendation systems have been developed for customers to find items of interest. This work introduces a predictive model usable by small and medium-sized enterprises (SMEs) that need a data-based, analytical approach to stock suitable movies for local audiences and retain more customers. We used classification models to extract features from thousands of customers’ demographic, behavioral, and social information to predict their movie genre preference. In the implementation, a Gaussian kernel support vector machine (SVM) classification model and a logistic regression model were established to extract features from sample data, and their in-sample errors were compared. The out-of-sample errors were also compared under different Vapnik–Chervonenkis (VC) dimensions in the machine learning algorithm to find and prevent overfitting. The Gaussian kernel SVM prediction model can correctly predict movie genre preferences in 85% of positive cases. The accuracy of the algorithm increased to 93% with a smaller VC dimension and less overfitting. These findings advance our understanding of how to use a machine learning approach to predict customers’ preferences with a small data set and to design prediction tools for these enterprises.

Keywords: computational social science, movie preference, machine learning, SVM

Procedia PDF Downloads 254