Search results for: biological data mining
25783 Effects of Data Correlation in a Sparse-View Compressive Sensing Based Image Reconstruction
Authors: Sajid Abas, Jon Pyo Hong, Jung-Ryun Le, Seungryong Cho
Abstract:
Computed tomography and laminography are heavily investigated in a compressive sensing based image reconstruction framework to reduce the dose to patients as well as to radiosensitive devices such as multilayer microelectronic circuit boards. Researchers are actively working on optimizing compressive sensing based iterative image reconstruction algorithms to obtain better quality images. However, the effects of the sampled data’s properties on the reconstructed image’s quality, particularly under insufficiently sampled data conditions, have not been explored in computed laminography. In this paper, we investigated the effects of two data properties, i.e., sampling density and data incoherence, on the reconstructed image obtained by conventional computed laminography and by a recently proposed method called the spherical sinusoidal scanning scheme. We found that in a compressive sensing based image reconstruction framework, the image quality mainly depends upon the data incoherence when the data is uniformly sampled.
Keywords: computed tomography, computed laminography, compressive sensing, low-dose
Procedia PDF Downloads 464
25782 Fuzzy Wavelet Model to Forecast the Exchange Rate of IDR/USD
Authors: Tri Wijayanti Septiarini, Agus Maman Abadi, Muhammad Rifki Taufik
Abstract:
The exchange rate of IDR/USD can serve as an indicator for analyzing the Indonesian economy. The exchange rate is an important factor because it has a large effect on the Indonesian economy overall, so exchange rate data needs to be analyzed. The IDR/USD exchange rate data is decomposed into frequency and time components, which can help the government monitor the Indonesian economy. This method is very effective at identifying the case, gives highly accurate results, and has a simple structure. In this paper, the exchange rate data used is weekly data from December 17, 2010 until November 11, 2014.
Keywords: the exchange rate, fuzzy Mamdani, discrete wavelet transforms, fuzzy wavelet
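The decomposition step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which wavelet family is used, so the Haar wavelet below is an assumption chosen for simplicity, and the `rates` values are made-up numbers rather than the paper's data. One level of the transform splits a series into a smooth approximation (low-frequency trend) and a detail part (high-frequency change between neighbouring weeks).

```python
import math


def haar_dwt(signal):
    """One level of the Haar discrete wavelet transform.

    Returns (approximation, detail) coefficient lists, each half as long
    as the input. The input length must be even.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Invert one Haar level, recovering the original samples."""
    scale = 1.0 / math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) * scale)
        signal.append((a - d) * scale)
    return signal


# Hypothetical weekly exchange-rate values (illustrative only).
rates = [9000.0, 9050.0, 9100.0, 9080.0, 9200.0, 9250.0, 9300.0, 9280.0]
approx, detail = haar_dwt(rates)
restored = haar_idwt(approx, detail)
```

In a fuzzy wavelet model of the kind the keywords suggest, coefficients like `approx` and `detail` would presumably feed the fuzzy inference stage; that coupling is not shown here.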
Procedia PDF Downloads 571
25781 Humanising Digital Healthcare to Build Capacity by Harnessing the Power of Patient Data
Authors: Durhane Wong-Rieger, Kawaldip Sehmi, Nicola Bedlington, Nicole Boice, Tamás Bereczky
Abstract:
Patient-generated health data should be seen as the expression of the experience of patients, including the outcomes reflecting the impact a treatment or service had on their physical health and wellness. We discuss how the healthcare system can reach a place where digital is a determinant of health, where data generated by patients is respected and their contribution to science is acknowledged. We explore the biggest barriers facing this. The International Experience Exchange with Patient Organisation’s Position Paper is based on a global patient survey conducted in Q3 2021 that received 304 responses. Results were discussed and validated by the 15 patient experts and supplemented with literature research. Results are a subset of this. Our research showed that patient communities want to influence how their data is generated, shared, and used. Our study concludes that a reasonable framework is needed to protect the integrity of patient data, minimise abuse, and build trust. Results also demonstrated a need for patient communities to have more influence and control over how health data is generated, shared, and used. The results clearly highlight that the community feels there is a lack of clear policies on sharing data.
Keywords: digital health, equitable access, humanise healthcare, patient data
Procedia PDF Downloads 82
25780 Use of Machine Learning in Data Quality Assessment
Authors: Bruno Pinto Vieira, Marco Antonio Calijorne Soares, Armando Sérgio de Aguiar Filho
Abstract:
Nowadays, a massive amount of information is produced by different data sources, including mobile devices and transactional systems. In this scenario, concerns arise about how to maintain or establish data quality, which is now treated as a product to be defined, measured, analyzed, and improved to meet the needs of consumers, who use these data in decision making and company strategies. Information of low quality can lead to issues that consume time and money, such as missed business opportunities, inadequate decisions, and bad risk management actions. Identifying, evaluating, and selecting data sources of sufficient quality for a given need has become a costly task for users, since the sources do not provide information about their quality. Traditional data quality control methods are based on user experience or business rules, which limits performance and slows down the process, with less than desirable accuracy. Using advanced machine learning algorithms, it is possible to take advantage of computational resources to overcome these challenges and add value for companies and users. In this study, machine learning is applied to data quality analysis on different datasets, seeking to compare the performance of the techniques according to the dimensions of quality assessment. As a result, we could create a ranking of the approaches used, as well as a system that is able to carry out data quality assessment automatically.
Keywords: machine learning, data quality, quality dimension, quality assessment
Procedia PDF Downloads 148
25779 Exploring Data Leakage in EEG Based Brain-Computer Interfaces: Overfitting Challenges
Authors: Khalida Douibi, Rodrigo Balp, Solène Le Bars
Abstract:
In the medical field, applications related to human experiments are frequently linked to small sample sizes, which makes the training of machine learning models quite sensitive and therefore neither very robust nor generalizable. This is notably the case in Brain-Computer Interface (BCI) studies, where the sample size rarely exceeds 20 subjects or a small number of trials. To address this problem, several resampling approaches are often used during the data preparation phase, which is a critical step in a data science analysis process. One of the naive approaches usually applied by data scientists consists of transforming the entire database before the resampling phase. However, this can cause the model’s performance to be incorrectly estimated when making predictions on unseen data. In this paper, we explored the effect of data leakage observed during our BCI experiments for device control through the real-time classification of SSVEPs (Steady State Visually Evoked Potentials). We also studied potential ways to ensure optimal validation of the classifiers during the calibration phase to avoid overfitting. The results show that the scaling step is crucial for some algorithms, and that it should be applied after the resampling phase to avoid data leakage and improve results.
Keywords: data leakage, data science, machine learning, SSVEP, BCI, overfitting
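The ordering the abstract recommends (resample first, scale afterwards) can be sketched as follows. This is an illustrative example under assumptions, not the authors' pipeline: standardization stands in for "the scaling step", contiguous k-fold splitting stands in for "resampling", and all function names are hypothetical. The key point is that the scaler's mean and standard deviation are estimated from the training fold only and then reused on the held-out fold.

```python
import statistics


def fit_scaler(rows):
    """Compute per-feature mean and standard deviation from training rows only."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # A constant column has a standard deviation of 0; use 1.0 to avoid dividing by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds


def apply_scaler(rows, means, stds):
    """Standardize rows with statistics that were fitted elsewhere."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for contiguous folds."""
    fold_size = n_samples // k
    for fold in range(k):
        start = fold * fold_size
        stop = n_samples if fold == k - 1 else start + fold_size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, test_idx


def scaled_folds(rows, k):
    """Split first, then fit the scaler on each training fold only (no leakage)."""
    for train_idx, test_idx in k_fold_indices(len(rows), k):
        train = [rows[i] for i in train_idx]
        test = [rows[i] for i in test_idx]
        means, stds = fit_scaler(train)
        yield apply_scaler(train, means, stds), apply_scaler(test, means, stds)
```

The leaky variant would call `fit_scaler` once on all rows before splitting, which lets held-out samples influence the statistics used to transform the training data.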
Procedia PDF Downloads 153
25778 Enhanced Decolourization and Biodegradation of Textile Azo and Xanthene Dyes by Using Bacterial Isolates
Authors: Gimhani Madhushika Hewayalage, Thilini Ariyadasa, Sanja Gunawardena
Abstract:
In Sri Lanka, the largest contribution to industrial export earnings comes from the textile and apparel industry. However, this industry generates huge quantities of effluent containing unfixed dyes, which increase the effluent’s colour and toxicity and thereby lead to environmental pollution. Therefore, the effluent should be properly treated prior to release into the environment. Biological techniques have now captured much attention as an environmentally friendly and cost-competitive effluent decolourization method due to the drawbacks of physical and chemical treatment techniques. The present study focused on identifying the dye decolourizing potential of several bacterial isolates obtained from the effluent of the local textile industry. Yellow EXF, Red EXF, Blue EXF, Nova Black WNN and Nylosan-Rhodamine-EB dyes were selected for the study to represent different chromophore groups such as azo and xanthene. The rate of decolourization of each dye was investigated by employing distinct bacterial isolates. The bacterial isolate that exhibited effective dye decolourizing potential was identified as Proteus mirabilis using 16S rRNA gene sequencing analysis. The high decolourizing rates of the identified bacterial strain indicate its potential applicability in the treatment of dye-containing wastewaters.
Keywords: azo, bacterial, biological, decolourization, xanthene
Procedia PDF Downloads 252
25777 Nuclear Decay Data Evaluation for 217Po
Authors: S. S. Nafee, A. M. Al-Ramady, S. A. Shaheen
Abstract:
Evaluated nuclear decay data for the 217Po nuclide are presented in the present work. These data include recommended values for the half-life T1/2 and for α-, β⁻-, and γ-ray emission energies and probabilities. Decay data from 221Rn α and 217Bi β⁻ decays are presented. Q(α) has been updated based on the recently published Atomic Mass Evaluation AME2012. In addition, the log ft values were calculated using the Logft program from the ENSDF evaluation package. Moreover, the total internal conversion coefficients have been calculated using the BrIcc program. Meanwhile, recommended values for the multipolarities have been assigned based on recent measurements that yield a better intensity balance at the 254 keV and 264 keV gamma transitions.
Keywords: nuclear decay data evaluation, mass evaluation, total conversion coefficients, atomic mass evaluation
Procedia PDF Downloads 433
25776 Geographic Information System Using Google Fusion Table Technology for the Delivery of Disease Data Information
Authors: I. Nyoman Mahayasa Adiputra
Abstract:
Data in the field of health can be useful for data analysis; one example of health data is disease data. Disease data is usually plotted geographically according to the area where the data was collected, in this case the city of Denpasar, Bali. Disease data reports are still published in tabular form, and disease information has not yet been mapped in GIS form. In this research, disease information in Denpasar city is digitized in the form of a geographic information system, with the district as the smallest administrative area. Denpasar City consists of 4 districts: North Denpasar, East Denpasar, West Denpasar and South Denpasar. We use Google Fusion Table technology for the map digitization process, which makes the work easier for both the administrator and the information recipient. On the administrator side, disease data can be input easily and quickly. On the recipient side, the resulting GIS application can be published as a website-based application so that it can be accessed anywhere and anytime. The results obtained in this study fall into two parts: (1) Geolocation of Denpasar and all of its districts: digitizing the map of Denpasar city produces a polygon geolocation for each district. These results can be reused in subsequent GIS studies that cover the same administrative area. (2) Dengue fever mapping for 2014 and 2015: the disease data used in this study is dengue fever case data from 2014 and 2015, taken from the Denpasar Health Department profile reports of 2015 and 2016. This mapping can be useful for analyzing the spread of dengue hemorrhagic fever in the city of Denpasar.
Keywords: geographic information system, Google fusion table technology, delivery of disease data information, Denpasar city
Procedia PDF Downloads 129
25775 Inclusive Practices in Health Sciences: Equity Proofing Higher Education Programs
Authors: Mitzi S. Brammer
Abstract:
Given that the cultural make-up of programs of study in institutions of higher learning is becoming increasingly diverse, much has been written about cultural diversity from a university-level perspective. However, there are few data on specific programs and how they address inclusive practices when teaching and working with marginalized populations. This research study aimed to discover baseline knowledge and attitudes of health sciences faculty, instructional staff, and students related to inclusive teaching/learning and interactions. Quantitative data were collected via an anonymous online survey (one designed for students and another designed for faculty/instructional staff) using a web-based program called Qualtrics. Quantitative data were analyzed amongst the faculty/instructional staff and students, respectively, using descriptive and comparative statistics (t-tests). Additionally, some participants voluntarily engaged in a focus group discussion in which qualitative data were collected around these same variables. Collecting qualitative data to triangulate the quantitative data added trustworthiness to the overall data. The research team analyzed the collected data, compared identified categories and trends between faculty/staff and students, and reported results as well as implications for future study and professional practice.
Keywords: inclusion, higher education, pedagogy, equity, diversity
Procedia PDF Downloads 67
25774 Neural Networks Based Prediction of Long Term Rainfall: Nine Pilot Study Zones over the Mediterranean Basin
Authors: Racha El Kadiri, Mohamed Sultan, Henrique Momm, Zachary Blair, Rachel Schultz, Tamer Al-Bayoumi
Abstract:
The Mediterranean Basin is a very diverse region of nationalities and climate zones, with a strong dependence on agricultural activities. Predicting long term (with a lead of 1 to 12 months) rainfall and future droughts could contribute to the sustainable management of water resources and economic activities. In this study, an integrated approach was adopted to construct predictive tools with lead times of 0 to 12 months to forecast rainfall amounts over nine subzones of the Mediterranean Basin region. The following steps were conducted: (1) acquire, assess and intercorrelate temporal remote sensing-based rainfall products (e.g., the CPC Merged Analysis of Precipitation [CMAP]) throughout the investigation period (1979 to 2016); (2) acquire and assess monthly values for all of the climatic indices influencing the regional and global climatic patterns (e.g., Northern Atlantic Oscillation [NOI], Southern Oscillation Index [SOI], and Tropical North Atlantic Index [TNA]); (3) delineate homogeneous climatic regions and select nine pilot study zones; (4) apply data mining methods (e.g., neural networks, principal component analyses) to extract relationships between the observed rainfall and the controlling factors (i.e., climatic indices with multiple lead-time periods); and (5) use the constructed predictive tools to forecast monthly rainfall and dry and wet periods. Preliminary results indicate that rainfall and dry/wet periods were successfully predicted with lead times of 0 to 12 months using the adopted methodology, and that the approach is more accurately applicable in the southern Mediterranean region.
Keywords: rainfall, neural networks, climatic indices, Mediterranean
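Step (4) relates rainfall to climatic indices at several lead times. The study itself uses neural networks and principal component analysis; the sketch below is a much simpler stand-in, assumed here only to show what "multiple lead-time periods" means in practice: a Pearson correlation between the index at month t and rainfall at month t + lead. The function names and any input series are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equally long, non-constant series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def lagged_correlation(index, rainfall, lead):
    """Correlate index[t] with rainfall[t + lead] (lead in months, lead >= 0)."""
    if lead == 0:
        return pearson(index, rainfall)
    return pearson(index[:-lead], rainfall[lead:])


def best_lead(index, rainfall, max_lead):
    """Return the lead with the strongest absolute correlation, plus all scores."""
    scores = {
        lead: lagged_correlation(index, rainfall, lead)
        for lead in range(max_lead + 1)
    }
    strongest = max(scores, key=lambda lead: abs(scores[lead]))
    return strongest, scores
```

A screening like this could help choose which lagged indices to offer a model as inputs, but it only captures linear relationships, which is one reason a study might prefer neural networks.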
Procedia PDF Downloads 312
25773 LIZTOXD: Inclusive Lizard Toxin Database by Using MySQL Protocol
Authors: Iftikhar A. Tayubi, Tabrej Khan, Mansoor M. Alsubei, Fahad A. Alsaferi
Abstract:
LIZTOXD provides a single source of high-quality information about proteinaceous lizard toxins that will be an invaluable resource for pharmacologists, neuroscientists, toxicologists, medicinal chemists, ion channel scientists, clinicians, and structural biologists. We will provide an intuitive, well-organized and user-friendly web interface that allows users to explore detailed information on lizards and toxin proteins. It includes the common name, scientific name, entry id, entry name, protein name and length of the protein sequence. The utility of this database is that it provides a user-friendly interface for retrieving information about lizards, toxins and toxin proteins of different lizard species. The interfaces created in this database will satisfy the demands of the scientific community by providing in-depth knowledge about lizards and their toxins. In the next phase of our project, we will adopt a methodology using MySQL and Hypertext Preprocessor (PHP), with Smart Draw for design. A database is a valuable tool for storing large quantities of data efficiently. Users can navigate from one section to another, depending on their field of interest. This database contains a wealth of information on species, toxins, clinical data, etc. LIZTOXD is a resource that provides comprehensive information about protein toxins from lizards. The combination of specific classification schemes and a rich user interface allows researchers to easily locate and view information on the sequence, structure, and biological activity of these toxins. This manually curated database will be a valuable resource for basic researchers as well as those interested in potential pharmaceutical and agricultural applications of lizard toxins.
Keywords: LIZTOXD, MySQL, PHP, smart draw
Procedia PDF Downloads 162
25772 Biological Hotspots in the Galápagos Islands: Exploring Seasonal Trends of Ocean Climate Drivers to Monitor Algal Blooms
Authors: Emily Kislik, Gabriel Mantilla Saltos, Gladys Torres, Mercy Borbor-Córdova
Abstract:
The Galápagos Marine Reserve (GMR) is an internationally-recognized region of consistent upwelling events, high productivity, and rich biodiversity. Despite its high-nutrient, low-chlorophyll condition, the archipelago has experienced phytoplankton blooms, especially in the western section between Isabela and Fernandina Islands. However, little is known about how climate variability will affect future phytoplankton standing stock in the Galápagos, and no consistent protocols currently exist to quantify phytoplankton biomass, identify species, or monitor for potential harmful algal blooms (HABs) within the archipelago. This analysis investigates physical, chemical, and biological oceanic variables that contribute to algal blooms within the GMR, using 4 km Aqua MODIS satellite imagery and 0.125-degree wind stress data from January 2003 to December 2016. Furthermore, this study analyzes chlorophyll-a concentrations at varying spatial scales: within the greater archipelago, as well as within five smaller bioregions based on species biodiversity in the GMR. Seasonal and interannual trend analyses, correlations, and hotspot identification were performed. Results demonstrate that chlorophyll-a is expressed in two seasons throughout the year in the GMR, most frequently in September and March, with a notable hotspot in the Elizabeth Bay bioregion. Interannual chlorophyll-a trend analyses revealed highest peaks in 2003, 2007, 2013, and 2016, and variables that correlate highly with chlorophyll-a include surface temperature and particulate organic carbon. This study recommends future in situ sampling locations for phytoplankton monitoring, including the Elizabeth Bay bioregion. Conclusions from this study contribute to the knowledge of oceanic drivers that catalyze primary productivity and consequently affect species biodiversity within the GMR. Additionally, this research can inform policy and decision-making strategies for species conservation and management within bioregions of the Galápagos.
Keywords: bioregions, ecological monitoring, phytoplankton, remote sensing
Procedia PDF Downloads 265
25771 Mean Shift-Based Preprocessing Methodology for Improved 3D Buildings Reconstruction
Authors: Nikolaos Vassilas, Theocharis Tsenoglou, Djamchid Ghazanfarpour
Abstract:
In this work we explore the capability of the mean shift algorithm as a powerful preprocessing tool for improving the quality of spatial data, acquired from airborne scanners, from densely built urban areas. On one hand, high resolution image data corrupted by noise caused by lossy compression techniques are appropriately smoothed while at the same time preserving the optical edges and, on the other, low resolution LiDAR data in the form of a normalized Digital Surface Map (nDSM) is upsampled through the joint mean shift algorithm. Experiments on both the edge-preserving smoothing and upsampling capabilities using synthetic RGB-z data show that the mean shift algorithm is superior to bilateral filtering as well as to other classical smoothing and upsampling algorithms. Application of the proposed methodology for 3D reconstruction of buildings of a pilot region of Athens, Greece results in a significant visual improvement of the 3D building block model.
Keywords: 3D buildings reconstruction, data fusion, data upsampling, mean shift
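For readers unfamiliar with mean shift, the core iteration can be sketched in one dimension. This is not the joint spatial-range filter the abstract applies to RGB-z data; it is a deliberately reduced illustration, assuming a Gaussian kernel and a hypothetical `bandwidth` parameter, of how each sample is moved repeatedly to the kernel-weighted mean of its neighbours until it settles on a local mode. Samples on either side of a sharp step settle on different modes, which is the intuition behind edge preservation.

```python
import math


def mean_shift_1d(points, bandwidth, max_iter=100, tol=1e-6):
    """Move every point to the density mode it converges to (Gaussian kernel)."""
    modes = []
    for start in points:
        x = start
        for _ in range(max_iter):
            # Weight all samples by their distance to the current estimate.
            weights = [math.exp(-0.5 * ((p - x) / bandwidth) ** 2) for p in points]
            new_x = sum(w * p for w, p in zip(weights, points)) / sum(weights)
            converged = abs(new_x - x) < tol
            x = new_x
            if converged:
                break
        modes.append(x)
    return modes


# Hypothetical height samples from two flat surfaces (e.g. ground and a roof).
heights = [1.0, 1.1, 0.9, 5.0, 5.1, 4.9]
smoothed = mean_shift_1d(heights, bandwidth=0.5)
```

With these values, the first three samples collapse towards one mode and the last three towards another, rather than being averaged across the step as a plain moving average would do.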
Procedia PDF Downloads 315
25770 GIS Data Governance: GIS Data Submission Process for Build-in Project, Replacement Project at Oman Electricity Transmission Company
Authors: Rahma Al Balushi
Abstract:
Oman Electricity Transmission Company's (OETC) vision is to be a renowned world-class transmission grid by 2025, and one of the indications of achieving the vision is obtaining Asset Management ISO55001 certification, which requires setting out documented Standard Operating Procedures (SOPs). Hence, a documented SOP for the geographical information system (GIS) data process has been established. Also, to effectively manage and improve OETC power transmission, asset data and information need to be governed by the Asset Information & GIS department. This paper describes in detail the GIS data submission process and the journey to develop the current process. The methodology used to develop the process is based on three main pillars: system and end-user requirements, risk evaluation, and data availability and accuracy. The output of this paper shows the dramatic change in the process used, which subsequently results in more efficient, accurate, and up-to-date data. Furthermore, due to this process, GIS is ready to be integrated with other systems and to serve as the source of data for all OETC users. Some decisions related to issuing no objection certificates (NOC) and scheduling asset maintenance plans in the Computerized Maintenance Management System (CMMS) have consequently been made based on GIS data availability. On the other hand, defining agreed and documented procedures for data collection, data systems update, data release/reporting, and data alterations has also helped to reduce the missing attributes of GIS transmission data. A considerable difference in Geodatabase (GDB) completeness percentage was observed between 2017 and 2021. Overall, we conclude that through governance, the Asset Information & GIS department can control the GIS data process and collect, properly record, and manage asset data and information within the OETC network. This control extends to other applications and systems integrated with or related to GIS systems.
Keywords: asset management ISO55001, standard procedures process, governance, geodatabase, NOC, CMMS
Procedia PDF Downloads 207
25769 Importance of Ethics in Cloud Security
Authors: Pallavi Malhotra
Abstract:
This paper examines the importance of ethics in cloud computing. In modern society, cloud computing offers individuals and businesses virtually unlimited space for storing and processing data or information. Most of the data and information stored in the cloud by various users, such as banks, doctors, architects, engineers, lawyers, consulting firms, and financial institutions, among others, requires a high level of confidentiality and safeguarding. Cloud computing offers centralized storage and processing of data, and this has immensely contributed to the growth of businesses and improved sharing of information over the internet. However, the accessibility and management of data and servers by a third party raise concerns regarding the privacy of clients’ information and the possible manipulation of the data by third parties. This document suggests the approaches various stakeholders should take to address various ethical issues involving cloud-computing services. Ethical education and training is key for all stakeholders involved in the handling of data and information stored or being processed in the cloud.
Keywords: IT ethics, cloud computing technology, cloud privacy and security, ethical education
Procedia PDF Downloads 325
25768 The Feminism of Data Privacy and Protection in Africa
Authors: Olayinka Adeniyi, Melissa Omino
Abstract:
The field of data privacy and data protection in Africa is still an evolving area, with many African countries yet to enact legislation on the subject. While African governments are bringing their legislation up to speed in this field, how patriarchy pervades every sector of African thought and manifests in society needs to be considered. Moreover, the laws enacted ought to be inclusive, especially towards women. This, in a nutshell, is the essence of data feminism. Data feminism is a new way of thinking about data science and data ethics that is informed by the ideas of intersectional feminism. Feminising data privacy and protection will involve thinking about women and considering women in the issues of data privacy and protection, particularly in legislation, as is the case in this paper. The line of thought of women's inclusion is not uncommon, given that even international and regional human rights instruments specific to women came long after the general human rights instruments. The consideration is that these should have been included in the original general instruments in the first instance. Since legislation on data privacy is coming in this century, with the strengths and shortcomings of earlier instruments already visible, the cue should be taken to ensure inclusive, holistic legislation for data privacy and protection from the outset. Data feminism is arguably an area that has been scantily researched, albeit a needful one. With violence against women spiralling in the cyber world, compounded by COVID-19 and the necessary response of governments, and the effect of these on women and their rights, research on the feminism of data privacy and protection in Africa becomes inevitable.
This paper seeks to answer the following questions: what is data feminism in the African context, and why is it important in data privacy and protection legislation; what laws, if any, exist on data privacy and protection in Africa, are they inclusive of women, and if not, why; and what measures are in place for the privacy and protection of women in Africa, and how can this be made possible. The paper aims to investigate the issue of data privacy and protection in Africa, the legal framework, and the protection or provision that it has for women, if any. It further aims to research the importance and necessity of feminizing data privacy and protection, the effect of the lack of it, the challenges or bottlenecks in attaining this feat, and the possibilities of accessing data privacy and protection for African women. The paper also researches the emerging practices of data privacy and protection of women in other jurisdictions. It approaches the research through a review of papers and an analysis of laws and reports. It seeks to contribute to the existing literature in the field and is explorative in its suggestions. It suggests a draft of some clauses to make any data privacy and protection legislation inclusive of women. It would be useful for policymaking, academia, and public enlightenment.
Keywords: feminism, women, law, data, Africa
Procedia PDF Downloads 206
25767 Evaluation of Practicality of On-Demand Bus Using Actual Taxi-Use Data through Exhaustive Simulations
Authors: Jun-ichi Ochiai, Itsuki Noda, Ryo Kanamori, Keiji Hirata, Hitoshi Matsubara, Hideyuki Nakashima
Abstract:
We conducted exhaustive simulations for data assimilation and evaluation of service quality for various settings in a new shared transportation system called SAVS. Computational social simulation is a key technology for designing recent social services like SAVS as a new transportation service. One open issue in SAVS was to determine the service scale through social simulation. Using our exhaustive simulation framework, OACIS, we performed data assimilation and evaluated the effects of SAVS based on actual taxi-use data from Tajimi city, Japan. Finally, we obtained the conditions under which the new service can be realized with reasonable service quality.
Keywords: on-demand bus system, social simulation, data assimilation, exhaustive simulation
Procedia PDF Downloads 321
25766 Cloning and Characterization of Uridine-5’-Diphosphate-Glucose Pyrophosphorylases from Lactobacillus Kefiranofaciens and Rhodococcus Wratislaviensis
Authors: Mesfin Angaw Tesfay
Abstract:
Uridine-5’-diphosphate (UDP)-glucose is one of the most versatile building blocks within the metabolism of prokaryotes and eukaryotes, serving as an activated sugar donor during the glycosylation of natural products. It is formed by the enzyme UDP-glucose pyrophosphorylase (UGPase) using uridine-5′-triphosphate (UTP) and α-d-glucose 1-phosphate as substrates. Herein, two UGPase genes from Lactobacillus kefiranofaciens ZW3 (LkUGPase) and Rhodococcus wratislaviensis IFP 2016 (RwUGPase) were identified through genome mining approaches. LkUGPase and RwUGPase have 299 and 306 amino acids, respectively. Both UGPases have the conserved UTP binding site (G-X-G-T-R-X-L-P) and the glucose-1-phosphate binding site (V-E-K-P). LkUGPase and RwUGPase were cloned in E. coli, and SDS-PAGE analysis showed the expression of both enzymes, forming a protein band of about 36 kDa after induction. LkUGPase and RwUGPase have activities of 1549.95 and 671.53 U/mg, respectively. Currently, their kinetic properties are under investigation.
Keywords: UGPase, LkUGPase, RwUGPase, UDP-glucose, Glycosylation
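The two conserved sites quoted in the abstract are simple sequence patterns, so a motif scan can be sketched with regular expressions. This is an illustrative check only, not the genome mining workflow used in the study, and the `toy` sequence is made up for the example; it is not LkUGPase or RwUGPase. Here X is read as "any amino acid" in one-letter code.

```python
import re

# G-X-G-T-R-X-L-P: UTP binding site, X = any amino acid (one-letter code).
UTP_MOTIF = re.compile(r"G[A-Z]GTR[A-Z]LP")
# V-E-K-P: glucose-1-phosphate binding site.
G1P_MOTIF = re.compile(r"VEKP")


def find_motifs(sequence):
    """Return 0-based start positions of each motif in a protein sequence."""
    seq = sequence.upper()
    return {
        "utp_binding": [m.start() for m in UTP_MOTIF.finditer(seq)],
        "g1p_binding": [m.start() for m in G1P_MOTIF.finditer(seq)],
    }


# Made-up toy sequence for illustration only.
toy = "MKAVIPAAGLGTRFLPATKAQPKEMLPVVDKPVEKPSG"
hits = find_motifs(toy)
```

A candidate UGPase would be expected to return at least one position for each key; an empty list for either motif would suggest the sequence lacks that conserved site.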
Procedia PDF Downloads 21
25765 Optimal Pricing Based on Real Estate Demand Data
Authors: Vanessa Kummer, Maik Meusel
Abstract:
Real estate demand estimates are typically derived from transaction data. However, in regions with excess demand, transactions are driven by supply and therefore do not indicate what people are actually looking for. To estimate the demand for housing in Switzerland, search subscriptions from all important Swiss real estate platforms are used. These data do, however, suffer from missing information—for example, many users do not specify how many rooms they would like or what price they would be willing to pay. In economic analyses, it is often the case that only complete data is used. Usually, however, the proportion of complete data is rather small which leads to most information being neglected. Also, the data might have a strong distortion if it is complete. In addition, the reason that data is missing might itself also contain information, which is however ignored with that approach. An interesting issue is, therefore, if for economic analyses such as the one at hand, there is an added value by using the whole data set with the imputed missing values compared to using the usually small percentage of complete data (baseline). Also, it is interesting to see how different algorithms affect that result. The imputation of the missing data is done using unsupervised learning. Out of the numerous unsupervised learning approaches, the most common ones, such as clustering, principal component analysis, or neural networks techniques are applied. By training the model iteratively on the imputed data and, thereby, including the information of all data into the model, the distortion of the first training set—the complete data—vanishes. In a next step, the performances of the algorithms are measured. This is done by randomly creating missing values in subsets of the data, estimating those values with the relevant algorithms and several parameter combinations, and comparing the estimates to the actual data. 
After the optimal parameter set has been found for each algorithm, the missing values are imputed. Using the resulting data sets, the next step is to estimate the willingness to pay for real estate. This is done by fitting price distributions for real estate properties with certain characteristics, such as the region or the number of rooms. Based on these distributions, survival functions are computed to obtain the functional relationship between characteristics and selling probabilities. Comparing the survival functions shows that estimates based on imputed data sets do not differ significantly from each other; however, the demand estimate derived from the baseline data does. This indicates that the baseline data set does not include all available information and is therefore not representative of the entire sample. Also, demand estimates derived from the whole data set are much more accurate than the baseline estimation. Thus, in order to obtain optimal results, it is important to make use of all available data, even though this involves additional procedures such as data imputation.
Keywords: demand estimate, missing-data imputation, real estate, unsupervised learning
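The evaluation step described in this abstract (hide known values at random, impute them, compare the estimates with the truth) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the mean and median imputers stand in for the clustering, PCA and neural network models named in the abstract, and the `rooms` values, the function names and the 25% masking fraction are all made up.

```python
import random
import statistics


def mask_and_score(values, impute, frac=0.25, seed=0):
    """Hide a random fraction of known values, impute them, return the mean absolute error."""
    rng = random.Random(seed)
    n_hidden = max(1, int(frac * len(values)))
    hidden = set(rng.sample(range(len(values)), n_hidden))
    # Replace the hidden entries with None to simulate missing data.
    masked = [None if i in hidden else v for i, v in enumerate(values)]
    filled = impute(masked)
    errors = [abs(filled[i] - values[i]) for i in hidden]
    return statistics.fmean(errors)


def mean_impute(masked):
    """Stand-in imputer: fill every gap with the mean of the observed values."""
    fill = statistics.fmean(v for v in masked if v is not None)
    return [fill if v is None else v for v in masked]


def median_impute(masked):
    """Stand-in imputer: fill every gap with the median of the observed values."""
    fill = statistics.median(v for v in masked if v is not None)
    return [fill if v is None else v for v in masked]


# Hypothetical toy data: desired number of rooms in fully specified subscriptions.
rooms = [2, 3, 3, 4, 2, 5, 3, 4, 3, 2, 6, 3]
for name, imputer in [("mean", mean_impute), ("median", median_impute)]:
    print(name, round(mask_and_score(rooms, imputer), 3))
```

Because the same seed hides the same entries for every imputer, the scores are directly comparable; repeating the run over several seeds and parameter settings would mirror the parameter search mentioned in the abstract.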
Procedia PDF Downloads 285
25764 Unlocking the Puzzle of Borrowing Adult Data for Designing Hybrid Pediatric Clinical Trials
Authors: Rajesh Kumar G
Abstract:
A challenging aspect of any clinical trial is to plan the study design carefully so that it meets the study objective in an optimal way, and to validate the assumptions made during protocol design. A pediatric study brings the added challenges of stringent guidelines and difficulty in recruiting the necessary subjects. Unlike for adult trials, little historical data is available for pediatrics, yet such data is required to validate the assumptions used in planning pediatric trials. Typically, pediatric studies are initiated as soon as approval is obtained for a drug to be marketed for adults, so the pediatric study can be well planned using the historical adult study information together with the available pediatric pilot study data or simulated pediatric data. Generalizing a historical adult study to a new pediatric study is a tedious task; however, it is possible by integrating various statistical techniques and exploiting the advantages of a hybrid study design, which helps to achieve the study objective more smoothly even in the presence of many constraints. This research paper explains how a hybrid study design can be planned along with an integrated technique (SEV) to plan the pediatric study. In brief, the SEV technique (Simulation, Estimation using borrowed adult data and Bayesian methods, and Validation) simulates the planned study data and obtains the desired estimates in order to validate the assumptions. This method of validation can be used to improve the accuracy of data analysis, ensuring that results are as valid and reliable as possible, which allows informed decisions to be made well ahead of study initiation. The technique, based on the collected data, gives insight into best practices when using data from historical studies and simulated data alike.
Keywords: adaptive design, simulation, borrowing data, bayesian model
Procedia PDF Downloads 77
25763 Spectroscopic (IR, Raman, UV-Vis) and Biological Study of Copper and Zinc Complexes and Sodium Salt with Cichoric Acid
Authors: Renata Swislocka, Grzegorz Swiderski, Agata Jablonska-Trypuc, Wlodzimierz Lewandowski
Abstract:
Forming a complex of a phenolic compound with a metal alters not only the physicochemical properties of the ligand (including an increase in stability or changes in lipophilicity) but also its biological activity, including antioxidant, antimicrobial and many other properties. As part of our previous projects, we examined the physicochemical and antimicrobial properties of phenolic acids and their complexes with metals naturally occurring in foods. Previously we studied the complexes of manganese(II), copper(II), cadmium(II) and alkali metals with ferulic, caffeic and p-coumaric acids. In this study, the physicochemical and biological properties of cichoric acid, its sodium salt, and its complexes with copper and zinc were investigated. Cichoric acid is a derivative of both caffeic acid and tartaric acid. It was first isolated from Cichorium intybus (chicory), but it also occurs in significant amounts in Echinacea, particularly E. purpurea, dandelion leaves, basil, lemon balm and aquatic plants, including algae and sea grasses. A variety of methods were used to study the spectroscopic and biological properties of cichoric acid, its sodium salt, and its complexes with zinc and copper. Antioxidant properties were studied in relation to selected stable radicals (the DPPH reduction method and the FRAP method). As a result, the structure and spectroscopic properties of cichoric acid and its complexes with selected metals were defined in the solid state and in solution. The IR and Raman spectra of cichoric acid displayed a number of bands derived from vibrations of the caffeic and tartaric acid moieties. The bands assigned to the vibrations of the carbonyl group of tartaric acid occurred at 1746 and 1716 cm⁻¹. In the spectra of the metal complexes with cichoric acid these bands disappeared, which indicated that the metal ion was coordinated by the carboxylic groups of tartaric acid.
In the spectra of the sodium salt, characteristic wide bands from the vibrations of the carboxylate anion occurred. In the spectra of cichoric acid, its salt and its complexes, a number of bands derived from the vibrations of the aromatic ring (caffeic acid) were assigned. Upon metal-ligand attachment, the wavenumbers of these bands changed. The impact of metals on the antioxidant properties of cichoric acid was also examined. Cichoric acid has a high antioxidant potential. Complexation by metals (zinc, copper) did not significantly affect its antioxidant capacity. The work was supported by the National Science Centre, Poland (grant no. 2015/17/B/NZ9/03581).
Keywords: chicoric acid, metal complexes, natural antioxidant, phenolic acids
Procedia PDF Downloads 338
25762 Analyzing Test Data Generation Techniques Using Evolutionary Algorithms
Authors: Arslan Ellahi, Syed Amjad Hussain
Abstract:
Software testing is a vital process in the software development life cycle. Software quality can be assured only by passing the software through a testing phase. Automatic test data generation is a key research area of software testing: it enables test automation, which can eventually decrease testing time. In this paper, we review some of the approaches presented in the literature that use evolutionary search-based algorithms, such as the Genetic Algorithm and Particle Swarm Optimization (PSO), to validate the test data generation process. We also examine the quality of the generated test data, which increases or decreases the efficiency of testing. We propose test data generation techniques for model-based testing and have worked on the tuning and the fitness function of the PSO algorithm.
Keywords: search based, evolutionary algorithm, particle swarm optimization, genetic algorithm, test data generation
Procedia PDF Downloads 190
25761 Comparative Analysis of the Third Generation of Research Data for Evaluation of Solar Energy Potential
Authors: Claudineia Brazil, Elison Eduardo Jardim Bierhals, Luciane Teresa Salvi, Rafael Haag
Abstract:
Renewable energy sources depend on climatic variability, so adequate energy planning requires observations of meteorological variables, preferably as long-period series. Despite the scientific and technological advances that meteorological measurement systems have undergone in recent decades, there is still a considerable lack of meteorological observations that form long-period series. Reanalysis is a data assimilation system built on general atmospheric circulation models that combines data collected at surface stations, ocean buoys, satellites and radiosondes, allowing the production of long-period data for a wide range. The third generation of reanalysis data emerged in 2010; among them is the Climate Forecast System Reanalysis (CFSR), developed by the National Centers for Environmental Prediction (NCEP), with a spatial resolution of 0.50° x 0.50°. To overcome the lack of observations, this study evaluates the performance of solar radiation estimates from alternative databases, such as reanalysis data and meteorological satellite data, which can satisfactorily compensate for the absence of solar radiation observations at the global and/or regional level. The analysis of the solar radiation data indicated that the CFSR reanalysis data performed well in relation to the observed data, with a coefficient of determination of around 0.90. It is therefore concluded that these data have the potential to be used as an alternative source in locations with no stations or no long series of solar radiation, which are important for the evaluation of solar energy potential.
Keywords: climate, reanalysis, renewable energy, solar radiation
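The coefficient of determination quoted in this abstract can be computed from paired observed and reanalysis values as sketched below. The sketch assumes the squared Pearson correlation definition, which the abstract does not confirm, and the radiation values are invented purely for illustration.

```python
def r_squared(observed, estimated):
    """Squared Pearson correlation between observed and estimated series."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_est = sum(estimated) / n
    cov = sum((o - mean_obs) * (e - mean_est) for o, e in zip(observed, estimated))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_est = sum((e - mean_est) ** 2 for e in estimated)
    return cov * cov / (var_obs * var_est)


# Hypothetical daily solar radiation values (MJ/m^2): station vs. reanalysis grid cell.
station = [18.2, 21.5, 12.4, 25.1, 9.8, 16.7]
reanalysis = [17.5, 22.9, 14.0, 24.2, 11.1, 15.9]
print(round(r_squared(station, reanalysis), 3))
```

A value near 1 means the reanalysis series tracks the station series closely; it does not by itself reveal a constant bias, so a bias or error measure is usually reported alongside it.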
Procedia PDF Downloads 209
25760 Ribotaxa: Combined Approaches for Taxonomic Resolution Down to the Species Level from Metagenomics Data Revealing Novelties
Authors: Oshma Chakoory, Sophie Comtet-Marre, Pierre Peyret
Abstract:
Metagenomic classifiers are widely used for the taxonomic profiling of metagenomic data and estimation of taxa relative abundance. Small subunit rRNA genes are nowadays a gold standard for the phylogenetic resolution of complex microbial communities, although the power of this marker comes down to its use as full-length. We benchmarked the performance and accuracy of rRNA-specialized versus general-purpose read mappers, reference-targeted assemblers and taxonomic classifiers. We then built a pipeline called RiboTaxa to generate a highly sensitive and specific metataxonomic approach. Using metagenomics data, RiboTaxa gave the best results compared to other tools (Kraken2, Centrifuge (1), METAXA2 (2), PhyloFlash (3)) with precise taxonomic identification and relative abundance description, giving no false positive detection. Using real datasets from various environments (ocean, soil, human gut) and from different approaches (metagenomics and gene capture by hybridization), RiboTaxa revealed microbial novelties not seen by current bioinformatics analysis opening new biological perspectives in human and environmental health. In a study focused on corals’ health involving 20 metagenomic samples (4), an affiliation of prokaryotes was limited to the family level with Endozoicomonadaceae characterising healthy octocoral tissue. RiboTaxa highlighted 2 species of uncultured Endozoicomonas which were dominant in the healthy tissue. Both species belonged to a genus not yet described, opening new research perspectives on corals’ health. Applied to metagenomics data from a study on human gut and extreme longevity (5), RiboTaxa detected the presence of an uncultured archaeon in semi-supercentenarians (aged 105 to 109 years) highlighting an archaeal genus, not yet described, and 3 uncultured species belonging to the Enorma genus that could be species of interest participating in the longevity process. 
RiboTaxa is user-friendly and rapid, allows microbiota structure description from any environment, and produces results that can be easily interpreted. This software is freely available at https://github.com/oschakoory/RiboTaxa under the GNU Affero General Public License 3.0.
Keywords: metagenomics profiling, microbial diversity, SSU rRNA genes, full-length phylogenetic marker
Procedia PDF Downloads 121
25759 Viscoelastic Response of the Human Corneal Stroma Induced by Riboflavin/UVA Cross-Linking
Authors: C. Labate, M. P. De Santo, G. Lombardo, R. Barberi, M. Lombardo, N. M. Ziebarth
Abstract:
In the past decades, the importance of corneal biomechanics in the normal and pathological functions of the eye has gained recognition. In fact, the mechanical properties of biological tissues are essential to their physiological function. We are convinced that an improved understanding of the nanomechanics of corneal tissue is important for understanding the basic molecular interactions between collagen fibrils. Ultimately, this information will help in the development of new techniques to cure ocular diseases and in the development of biomimetic materials. Nanotechnology techniques are therefore powerful tools and, in particular, Atomic Force Microscopy has demonstrated its ability to reliably characterize the biomechanics of biological tissues at either the micro- or the nano-level. In recent years, we have investigated the mechanical anisotropy of the human corneal stroma at both the tissue and molecular levels. In particular, we have focused on corneal cross-linking, an established procedure aimed at slowing down or halting the progression of the disease known as keratoconus. We have obtained the first evidence that riboflavin/UV-A corneal cross-linking induces both an increase in the elastic response and a decrease in the viscous response of the most anterior stroma at the scale of stromal molecular interactions.
Keywords: atomic force spectroscopy, corneal stroma, cross-linking, viscoelasticity
Procedia PDF Downloads 312
25758 Rangeland Monitoring by Computerized Technologies
Abstract:
Every piece of rangeland has a different set of physical and biological characteristics. This requires the manager to synthesize various kinds of information through regular monitoring in order to identify trends of change and make the right decisions for sustainable management. Range managers therefore need to use computerized technologies to monitor rangeland and select the best management practices. Four examples of computerized technologies that can benefit sustainable management are: (1) Photographic method for cover measurement: The method was tested in different vegetation communities in semi-humid and arid regions. Pictures of quadrats were interpreted using ArcView software. Data were analysed in SPSS using a paired t-test. Based on the results, the photographic method can generally be used to measure ground cover in most vegetation communities. (2) GPS application for matching ground samples to satellite pixels: In the two provinces of Tehran and Markazi, six reference points were selected, and eight GPS models were tested at each point. After a suitable method was selected, the coordinates of plots along four transects in each of six rangeland sites in Markazi province were recorded. The best time for GPS application was in the morning hours, the Etrex Vista had less error than the other models, and a significant relation was found between GPS model, time and location and the accuracy of the estimated coordinates. (3) Application of satellite data for rangeland monitoring: Focusing on the long-term variation of vegetation parameters such as vegetation cover and production is essential. Our study in grasslands and shrublands showed significant correlations between quantitative vegetation characteristics and satellite data. It is therefore possible to monitor rangeland vegetation using digital data for sustainable utilization.
(4) Rangeland suitability classification with GIS: Range suitability assessment can facilitate sustainable management planning. The outputs of three sub-models (sensitivity to erosion, water suitability and forage production) were entered into the final range suitability classification model. GIS facilitated the classification of range suitability and produced suitability maps for sheep grazing. In general, digital computers help range managers to interpret, modify, calibrate or integrate information for correct management.
Keywords: computer, GPS, GIS, remote sensing, photographic method, monitoring, rangeland ecosystem, management, suitability, sheep grazing
Procedia PDF Downloads 367
25757 Analysis and Prediction of Netflix Viewing History Using Netflixlatte as an Enriched Real Data Pool
Authors: Amir Mabhout, Toktam Ghafarian, Amirhossein Farzin, Zahra Makki, Sajjad Alizadeh, Amirhossein Ghavi
Abstract:
The high number of Netflix subscribers makes it attractive for data scientists to extract valuable knowledge from analyses of viewers' behaviour. This paper presents a set of statistical insights into viewers' viewing history. A deep learning model is then used to predict the future watching behaviour of the users based on their previous watching history within the Netflixlatte data pool. Netflixlatte is an aggregated and anonymized data pool of 320 Netflix viewers comprising 250,000 data points recorded between 2008 and 2022. We observe insightful correlations between the distribution of viewing time and the COVID-19 pandemic outbreak. The presented deep learning model predicts future movie and TV series viewing habits with an average loss of 0.175.
Keywords: data analysis, deep learning, LSTM neural network, netflix
Procedia PDF Downloads 251
25756 Analysis of User Data Usage Trends on Cellular and Wi-Fi Networks
Authors: Jayesh M. Patel, Bharat P. Modi
Abstract:
The availability of on mobile devices that can invoke the demonstrated that the total data demand from users is far higher than previously articulated by measurements based solely on a cellular-centric view of smartphone usage. The ratio of Wi-Fi to cellular traffic varies significantly between countries. This paper presents a comparison between users' cellular data usage and Wi-Fi data usage. This comparison helps operators to understand the growing importance and application of yield management strategies designed to squeeze maximum returns from their investments in the networks and devices that enable the mobile data ecosystem. The transition from unlimited data plans towards tiered pricing and, in the future, towards more value-centric pricing offers significant revenue upside potential for mobile operators. However, without complete insight into all aspects of smartphone customer behavior, operators are unlikely to capture the maximum return from this billion-dollar market opportunity.
Keywords: cellular, Wi-Fi, mobile, smart phone
Procedia PDF Downloads 365
25755 Local Image Features Emerging from Brain Inspired Multi-Layer Neural Network
Authors: Hui Wei, Zheng Dong
Abstract:
Object recognition has long been a challenging task in computer vision. Yet the human brain, with the ability to rapidly and accurately recognize visual stimuli, manages this task effortlessly. In the past decades, advances in neuroscience have revealed some neural mechanisms underlying visual processing. In this paper, we present a novel model inspired by the visual pathway in primate brains. This multi-layer neural network model imitates the hierarchical convergent processing mechanism in the visual pathway. We show that local image features generated by this model exhibit robust discrimination and even better generalization ability compared with some existing image descriptors. We also demonstrate the application of this model in an object recognition task on image data sets. The result provides strong support for the potential of this model.
Keywords: biological model, feature extraction, multi-layer neural network, object recognition
Procedia PDF Downloads 542
25754 Data Driven Infrastructure Planning for Offshore Wind Farms
Authors: Isha Saxena, Behzad Kazemtabrizi, Matthias C. M. Troffaes, Christopher Crabtree
Abstract:
The calculations done at the beginning of the life of a wind farm are rarely reliable, which makes it important to study the failure and repair rates of wind turbines under various conditions. This miscalculation happens because current models make the simplifying assumption that the failure/repair rate remains constant over time, which means that the reliability function is exponential in nature. This research aims to create a more accurate model using sensor data and a data-driven approach. The data are cleaned and processed by comparing the power curve data of the wind turbines with SCADA data. The result is then converted to time series of times to repair and times to failure. Several different mathematical functions are fitted to the times to failure and times to repair of the wind turbine components using Maximum Likelihood Estimation and the posterior expectation method for Bayesian parameter estimation. Initial results indicate that the two-parameter Weibull function and the exponential function produce almost identical results. Further analysis is being done using complex system analysis, considering the failures of each electrical and mechanical component of the wind turbine. The aim of this project is to perform a more accurate reliability analysis that helps engineers to schedule maintenance and repairs so as to decrease the downtime of the turbine.
Keywords: reliability, bayesian parameter inference, maximum likelihood estimation, weibull function, SCADA data
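The maximum likelihood fits named in this abstract can be sketched with the standard library alone. This is an illustrative sketch, not the authors' pipeline: the times to failure are simulated rather than derived from SCADA data, the function names are hypothetical, and the Weibull shape is found by simple bisection on the likelihood equation. A fitted shape close to 1 means the Weibull model collapses to the exponential one, which is one way the two fits can come out almost identical.

```python
import math
import random


def fit_exponential(times):
    """MLE of a constant failure rate: number of failures divided by total time."""
    return len(times) / sum(times)


def fit_weibull(times, lo=0.01, hi=100.0, iters=100):
    """MLE of (shape, scale) for a two-parameter Weibull distribution."""
    logs = [math.log(t) for t in times]
    n = len(logs)
    mean_log = sum(logs) / n
    max_log = max(logs)  # used to rescale t**k so it cannot overflow

    def score(k):
        # Likelihood equation for the shape k; increasing in k, zero at the MLE.
        weights = [math.exp(k * (l - max_log)) for l in logs]
        weighted_log = sum(w * l for w, l in zip(weights, logs)) / sum(weights)
        return weighted_log - 1.0 / k - mean_log

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
    shape = 0.5 * (lo + hi)
    mean_pow = sum(math.exp(shape * (l - max_log)) for l in logs) / n
    scale = math.exp(max_log + math.log(mean_pow) / shape)
    return shape, scale


# Simulated times to failure (hours) with a constant failure rate; made-up numbers.
rng = random.Random(42)
times_to_failure = [rng.expovariate(1 / 120.0) for _ in range(2000)]
shape, scale = fit_weibull(times_to_failure)
print("exponential mean time to failure:", round(1 / fit_exponential(times_to_failure), 1))
print("weibull shape and scale:", round(shape, 3), round(scale, 1))
```

A shape clearly above 1 would indicate wear-out (a failure rate that rises with age) and a shape below 1 early-life failures, which is the kind of departure from the constant-rate assumption that the abstract sets out to detect.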
Procedia PDF Downloads 86