Search results for: missing data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25446

Search results for: missing data

24756 Humanising Digital Healthcare to Build Capacity by Harnessing the Power of Patient Data

Authors: Durhane Wong-Rieger, Kawaldip Sehmi, Nicola Bedlington, Nicole Boice, Tamás Bereczky

Abstract:

Patient-generated health data should be seen as the expression of the experience of patients, including the outcomes reflecting the impact a treatment or service had on their physical health and wellness. We discuss how the healthcare system can reach a place where digital is a determinant of health - where data is generated by patients and is respected and which acknowledges their contribution to science. We explore the biggest barriers facing this. The International Experience Exchange with Patient Organisation’s Position Paper is based on a global patient survey conducted in Q3 2021 that received 304 responses. Results were discussed and validated by the 15 patient experts and supplemented with literature research. Results are a subset of this. Our research showed patient communities want to influence how their data is generated, shared, and used. Our study concludes that a reasonable framework is needed to protect the integrity of patient data and minimise abuse, and build trust. Results also demonstrated a need for patient communities to have more influence and control over how health data is generated, shared, and used. The results clearly highlight that the community feels there is a lack of clear policies on sharing data.

Keywords: digital health, equitable access, humanise healthcare, patient data

Procedia PDF Downloads 83
24755 Use of Machine Learning in Data Quality Assessment

Authors: Bruno Pinto Vieira, Marco Antonio Calijorne Soares, Armando Sérgio de Aguiar Filho

Abstract:

Nowadays, a massive amount of information has been produced by different data sources, including mobile devices and transactional systems. In this scenario, concerns arise on how to maintain or establish data quality, which is now treated as a product to be defined, measured, analyzed, and improved to meet consumers' needs, which is the one who uses these data in decision making and companies strategies. Information that reaches low levels of quality can lead to issues that can consume time and money, such as missed business opportunities, inadequate decisions, and bad risk management actions. The step of selecting, identifying, evaluating, and selecting data sources with significant quality according to the need has become a costly task for users since the sources do not provide information about their quality. Traditional data quality control methods are based on user experience or business rules limiting performance and slowing down the process with less than desirable accuracy. Using advanced machine learning algorithms, it is possible to take advantage of computational resources to overcome challenges and add value to companies and users. In this study, machine learning is applied to data quality analysis on different datasets, seeking to compare the performance of the techniques according to the dimensions of quality assessment. As a result, we could create a ranking of approaches used, besides a system that is able to carry out automatically, data quality assessment.

Keywords: machine learning, data quality, quality dimension, quality assessment

Procedia PDF Downloads 149
24754 Exploring Data Leakage in EEG Based Brain-Computer Interfaces: Overfitting Challenges

Authors: Khalida Douibi, Rodrigo Balp, Solène Le Bars

Abstract:

In the medical field, applications related to human experiments are frequently linked to reduced samples size, which makes the training of machine learning models quite sensitive and therefore not very robust nor generalizable. This is notably the case in Brain-Computer Interface (BCI) studies, where the sample size rarely exceeds 20 subjects or a few number of trials. To address this problem, several resampling approaches are often used during the data preparation phase, which is an overly critical step in a data science analysis process. One of the naive approaches that is usually applied by data scientists consists in the transformation of the entire database before the resampling phase. However, this can cause model’ s performance to be incorrectly estimated when making predictions on unseen data. In this paper, we explored the effect of data leakage observed during our BCI experiments for device control through the real-time classification of SSVEPs (Steady State Visually Evoked Potentials). We also studied potential ways to ensure optimal validation of the classifiers during the calibration phase to avoid overfitting. The results show that the scaling step is crucial for some algorithms, and it should be applied after the resampling phase to avoid data leackage and improve results.

Keywords: data leackage, data science, machine learning, SSVEP, BCI, overfitting

Procedia PDF Downloads 153
24753 Physical Tests on Localized Fluidization in Offshore Suction Bucket Foundations

Authors: Li-Hua Luu, Alexis Doghmane, Abbas Farhat, Mohammad Sanayei, Pierre Philippe, Pablo Cuellar

Abstract:

Suction buckets are promising innovative foundations for offshore wind turbines. They generally feature the shape of an inverted bucket and rely on a suction system as a driving agent for their installation into the seabed. Water is pumped out of the buckets that are initially placed to rest on the seabed, creating a net pressure difference across the lid that generates a seepage flow, lowers the soil resistance below the foundation skirt, and drives them effectively into the seabed. The stability of the suction mechanism as well as the possibility of a piping failure (i.e., localized fluidization within the internal soil plug) during their installation are some of the key questions that remain open. The present work deals with an experimental study of localized fluidization by suction within a fixed bucket partially embedded into a submerged artificial soil made of spherical beads. The transient process, from the onset of granular motion until reaching a stationary regime for the fluidization at the embedded bucket wall, is recorded using the combined optical techniques of planar laser-induced fluorescence and refractive index matching. To conduct a systematic study of the piping threshold for the seepage flow, we vary the beads size, the suction pressure, and the initial depth for the bucket. This experimental modelling, by dealing with erosion-related phenomena from a micromechanical perspective, shall provide qualitative scenarios for the local processes at work which are missing in the offshore practice so far.

Keywords: fluidization, micromechanical approach, offshore foundations, suction bucket

Procedia PDF Downloads 183
24752 Nuclear Decay Data Evaluation for 217Po

Authors: S. S. Nafee, A. M. Al-Ramady, S. A. Shaheen

Abstract:

Evaluated nuclear decay data for the 217Po nuclide ispresented in the present work. These data include recommended values for the half-life T1/2, α-, β--, and γ-ray emission energies and probabilities. Decay data from 221Rn α and 217Bi β—decays are presented. Q(α) has been updated based on the recent published work of the Atomic Mass Evaluation AME2012. In addition, the logft values were calculated using the Logft program from the ENSDF evaluation package. Moreover, the total internal conversion electrons has been calculated using Bricc program. Meanwhile, recommendation values or the multi-polarities have been assigned based on recently measurement yield a better intensity balance at the 254 keV and 264 keV gamma transitions.

Keywords: nuclear decay data evaluation, mass evaluation, total converison coefficients, atomic mass evaluation

Procedia PDF Downloads 433
24751 Geographic Information System Using Google Fusion Table Technology for the Delivery of Disease Data Information

Authors: I. Nyoman Mahayasa Adiputra

Abstract:

Data in the field of health can be useful for the purposes of data analysis, one example of health data is disease data. Disease data is usually in a geographical plot in accordance with the area. Where the data was collected, in the city of Denpasar, Bali. Disease data report is still published in tabular form, disease information has not been mapped in GIS form. In this research, disease information in Denpasar city will be digitized in the form of a geographic information system with the smallest administrative area in the form of district. Denpasar City consists of 4 districts of North Denpasar, East Denpasar, West Denpasar and South Denpasar. In this research, we use Google fusion table technology for map digitization process, where this technology can facilitate from the administrator and from the recipient information. From the administrator side of the input disease, data can be done easily and quickly. From the receiving end of the information, the resulting GIS application can be published in a website-based application so that it can be accessed anywhere and anytime. In general, the results obtained in this study, divided into two, namely: (1) Geolocation of Denpasar and all of Denpasar districts, the process of digitizing the map of Denpasar city produces a polygon geolocation of each - district of Denpasar city. These results can be utilized in subsequent GIS studies if you want to use the same administrative area. (2) Dengue fever mapping in 2014 and 2015. Disease data used in this study is dengue fever case data taken in 2014 and 2015. Data taken from the profile report Denpasar Health Department 2015 and 2016. This mapping can be useful for the analysis of the spread of dengue hemorrhagic fever in the city of Denpasar.

Keywords: geographic information system, Google fusion table technology, delivery of disease data information, Denpasar city

Procedia PDF Downloads 131
24750 Inclusive Practices in Health Sciences: Equity Proofing Higher Education Programs

Authors: Mitzi S. Brammer

Abstract:

Given that the cultural make-up of programs of study in institutions of higher learning is becoming increasingly diverse, much has been written about cultural diversity from a university-level perspective. However, there are little data in the way of specific programs and how they address inclusive practices when teaching and working with marginalized populations. This research study aimed to discover baseline knowledge and attitudes of health sciences faculty, instructional staff, and students related to inclusive teaching/learning and interactions. Quantitative data were collected via an anonymous online survey (one designed for students and another designed for faculty/instructional staff) using a web-based program called Qualtrics. Quantitative data were analyzed amongst the faculty/instructional staff and students, respectively, using descriptive and comparative statistics (t-tests). Additionally, some participants voluntarily engaged in a focus group discussion in which qualitative data were collected around these same variables. Collecting qualitative data to triangulate the quantitative data added trustworthiness to the overall data. The research team analyzed collected data and compared identified categories and trends, comparing those data between faculty/staff and students, and reported results as well as implications for future study and professional practice.

Keywords: inclusion, higher education, pedagogy, equity, diversity

Procedia PDF Downloads 67
24749 An Analysis of Sequential Pattern Mining on Databases Using Approximate Sequential Patterns

Authors: J. Suneetha, Vijayalaxmi

Abstract:

Sequential Pattern Mining involves applying data mining methods to large data repositories to extract usage patterns. Sequential pattern mining methodologies used to analyze the data and identify patterns. The patterns have been used to implement efficient systems can recommend on previously observed patterns, in making predictions, improve usability of systems, detecting events, and in general help in making strategic product decisions. In this paper, identified performance of approximate sequential pattern mining defines as identifying patterns approximately shared with many sequences. Approximate sequential patterns can effectively summarize and represent the databases by identifying the underlying trends in the data. Conducting an extensive and systematic performance over synthetic and real data. The results demonstrate that ApproxMAP effective and scalable in mining large sequences databases with long patterns.

Keywords: multiple data, performance analysis, sequential pattern, sequence database scalability

Procedia PDF Downloads 346
24748 Medical Knowledge Management since the Integration of Heterogeneous Data until the Knowledge Exploitation in a Decision-Making System

Authors: Nadjat Zerf Boudjettou, Fahima Nader, Rachid Chalal

Abstract:

Knowledge management is to acquire and represent knowledge relevant to a domain, a task or a specific organization in order to facilitate access, reuse and evolution. This usually means building, maintaining and evolving an explicit representation of knowledge. The next step is to provide access to that knowledge, that is to say, the spread in order to enable effective use. Knowledge management in the medical field aims to improve the performance of the medical organization by allowing individuals in the care facility (doctors, nurses, paramedics, etc.) to capture, share and apply collective knowledge in order to make optimal decisions in real time. In this paper, we propose a knowledge management approach based on integration technique of heterogeneous data in the medical field by creating a data warehouse, a technique of extracting knowledge from medical data by choosing a technique of data mining, and finally an exploitation technique of that knowledge in a case-based reasoning system.

Keywords: data warehouse, data mining, knowledge discovery in database, KDD, medical knowledge management, Bayesian networks

Procedia PDF Downloads 396
24747 High-Rises and Urban Design: The Reasons for Unsuccessful Placemaking with Residential High-Rises in England

Authors: E. Kalcheva, A. Taki, Y. Hadi

Abstract:

High-rises and placemaking is an understudied combination which receives more and more interest with the proliferation of this typology in many British cities. The reason for studying three major cities in England: London, Birmingham and Manchester, is to learn from the latest advances in urban design in well-developed and prominent urban environment. The analysis of several high-rise sites reveals the weaknesses in urban design of contemporary British cities and presents an opportunity to study from the implemented examples. Therefore, the purpose of this research is to analyze design approaches towards creating a sustainable and varied urban environment when high-rises are involved. The research questions raised by the study are: what is the quality of high-rises and their surroundings; what facilities and features are deployed in the research area; what is the role of the high-rise buildings in the placemaking process; what urban design principles are applicable in this context. The methodology utilizes observation of the researched area by structured questions, developed by the author to evaluate the outdoor qualities of the high-rise surroundings. In this context, the paper argues that the quality of the public realm around the high-rises is quite low, missing basic but vital elements such as plazas, public art, and seating, along with landscaping and pocket parks. There is lack of coherence, the rhythm of the streets is often disrupted, and even though the high-rises are very aesthetically appealing, they fail to create a sense of place on their own. The implications of the study are that future planning can take into consideration the critique in this article and provide more opportunities for urban design interventions around high-rise buildings in the British cities.

Keywords: high-rises, placemaking, urban design, townscape

Procedia PDF Downloads 323
24746 Mean Shift-Based Preprocessing Methodology for Improved 3D Buildings Reconstruction

Authors: Nikolaos Vassilas, Theocharis Tsenoglou, Djamchid Ghazanfarpour

Abstract:

In this work we explore the capability of the mean shift algorithm as a powerful preprocessing tool for improving the quality of spatial data, acquired from airborne scanners, from densely built urban areas. On one hand, high resolution image data corrupted by noise caused by lossy compression techniques are appropriately smoothed while at the same time preserving the optical edges and, on the other, low resolution LiDAR data in the form of normalized Digital Surface Map (nDSM) is upsampled through the joint mean shift algorithm. Experiments on both the edge-preserving smoothing and upsampling capabilities using synthetic RGB-z data show that the mean shift algorithm is superior to bilateral filtering as well as to other classical smoothing and upsampling algorithms. Application of the proposed methodology for 3D reconstruction of buildings of a pilot region of Athens, Greece results in a significant visual improvement of the 3D building block model.

Keywords: 3D buildings reconstruction, data fusion, data upsampling, mean shift

Procedia PDF Downloads 316
24745 Importance of Ethics in Cloud Security

Authors: Pallavi Malhotra

Abstract:

This paper examines the importance of ethics in cloud computing. In the modern society, cloud computing is offering individuals and businesses an unlimited space for storing and processing data or information. Most of the data and information stored in the cloud by various users such as banks, doctors, architects, engineers, lawyers, consulting firms, and financial institutions among others require a high level of confidentiality and safeguard. Cloud computing offers centralized storage and processing of data, and this has immensely contributed to the growth of businesses and improved sharing of information over the internet. However, the accessibility and management of data and servers by a third party raise concerns regarding the privacy of clients’ information and the possible manipulations of the data by third parties. This document suggests the approaches various stakeholders should take to address various ethical issues involving cloud-computing services. Ethical education and training is key to all stakeholders involved in the handling of data and information stored or being processed in the cloud.

Keywords: IT ethics, cloud computing technology, cloud privacy and security, ethical education

Procedia PDF Downloads 326
24744 The Feminism of Data Privacy and Protection in Africa

Authors: Olayinka Adeniyi, Melissa Omino

Abstract:

The field of data privacy and data protection in Africa is still an evolving area, with many African countries yet to enact legislation on the subject. While African Governments are bringing their legislation to speed in this field, how patriarchy pervades every sector of African thought and manifests in society needs to be considered. Moreover, the laws enacted ought to be inclusive, especially towards women. This, in a nutshell, is the essence of data feminism. Data feminism is a new way of thinking about data science and data ethics that is informed by the ideas of intersectional feminism. Feminising data privacy and protection will involve thinking women, considering women in the issues of data privacy and protection, particularly in legislation, as is the case in this paper. The line of thought of women inclusion is not uncommon when even international and regional human rights specific for women only came long after the general human rights. The consideration is that these should have been inserted or rather included in the original general instruments in the first instance. Since legislation on data privacy is coming in this century, having seen the rights and shortcomings of earlier instruments, then the cue should be taken to ensure inclusive wholistic legislation for data privacy and protection in the first instance. Data feminism is arguably an area that has been scantily researched, albeit a needful one. With the spate of increase in the violence against women spiraling in the cyber world, compounding the issue of COVID-19 and the needful response of governments, and the effect of these on women and their rights, fast forward, the research on the feminism of data privacy and protection in Africa becomes inevitable. This paper seeks to answer the questions, what is data feminism in the African context, why is it important in the issue of data privacy and protection legislation; what are the laws, if any, existing on data privacy and protection in Africa, are they women inclusive, if not, why; what are the measures put in place for the privacy and protection of women in Africa, and how can this be made possible. The paper aims to investigate the issue of data privacy and protection in Africa, the legal framework, and the protection or provision that it has for women if any. It further aims to research the importance and necessity of feminizing data privacy and protection, the effect of lack of it, the challenges or bottlenecks in attaining this feat and the possibilities of accessing data privacy and protection for African women. The paper also researches the emerging practices of data privacy and protection of women in other jurisprudences. It approaches the research through the methodology of review of papers, analysis of laws, and reports. It seeks to contribute to the existing literature in the field and is explorative in its suggestion. It suggests a draft of some clauses to make any data privacy and protection legislation women inclusive. It would be useful for policymaking, academic, and public enlightenment.

Keywords: feminism, women, law, data, Africa

Procedia PDF Downloads 207
24743 Evaluation of Practicality of On-Demand Bus Using Actual Taxi-Use Data through Exhaustive Simulations

Authors: Jun-ichi Ochiai, Itsuki Noda, Ryo Kanamori, Keiji Hirata, Hitoshi Matsubara, Hideyuki Nakashima

Abstract:

We conducted exhaustive simulations for data assimilation and evaluation of service quality for various setting in a new shared transportation system, called SAVS. Computational social simulation is a key technology to design recent social services like SAVS as new transportation service. One open issue in SAVS was to determine the service scale through the social simulation. Using our exhaustive simulation framework, OACIS, we did data-assimilation and evaluation of effects of SAVS based on actual tax-use data at Tajimi city, Japan. Finally, we get the conditions to realize the new service in a reasonable service quality.

Keywords: on-demand bus sytem, social simulation, data assimilation, exhaustive simulation

Procedia PDF Downloads 321
24742 Unlocking the Puzzle of Borrowing Adult Data for Designing Hybrid Pediatric Clinical Trials

Authors: Rajesh Kumar G

Abstract:

A challenging aspect of any clinical trial is to carefully plan the study design to meet the study objective in optimum way and to validate the assumptions made during protocol designing. And when it is a pediatric study, there is the added challenge of stringent guidelines and difficulty in recruiting the necessary subjects. Unlike adult trials, there is not much historical data available for pediatrics, which is required to validate assumptions for planning pediatric trials. Typically, pediatric studies are initiated as soon as approval is obtained for a drug to be marketed for adults, so with the adult study historical information and with the available pediatric pilot study data or simulated pediatric data, the pediatric study can be well planned. Generalizing the historical adult study for new pediatric study is a tedious task; however, it is possible by integrating various statistical techniques and utilizing the advantage of hybrid study design, which will help to achieve the study objective in a smoother way even with the presence of many constraints. This research paper will explain how well the hybrid study design can be planned along with integrated technique (SEV) to plan the pediatric study; In brief the SEV technique (Simulation, Estimation (using borrowed adult data and applying Bayesian methods)) incorporates the use of simulating the planned study data and getting the desired estimates to Validate the assumptions.This method of validation can be used to improve the accuracy of data analysis, ensuring that results are as valid and reliable as possible, which allow us to make informed decisions well ahead of study initiation. With professional precision, this technique based on the collected data allows to gain insight into best practices when using data from historical study and simulated data alike.

Keywords: adaptive design, simulation, borrowing data, bayesian model

Procedia PDF Downloads 77
24741 Analyzing Test Data Generation Techniques Using Evolutionary Algorithms

Authors: Arslan Ellahi, Syed Amjad Hussain

Abstract:

Software Testing is a vital process in software development life cycle. We can attain the quality of software after passing it through software testing phase. We have tried to find out automatic test data generation techniques that are a key research area of software testing to achieve test automation that can eventually decrease testing time. In this paper, we review some of the approaches presented in the literature which use evolutionary search based algorithms like Genetic Algorithm, Particle Swarm Optimization (PSO), etc. to validate the test data generation process. We also look into the quality of test data generation which increases or decreases the efficiency of testing. We have proposed test data generation techniques for model-based testing. We have worked on tuning and fitness function of PSO algorithm.

Keywords: search based, evolutionary algorithm, particle swarm optimization, genetic algorithm, test data generation

Procedia PDF Downloads 191
24740 Comparative Analysis of the Third Generation of Research Data for Evaluation of Solar Energy Potential

Authors: Claudineia Brazil, Elison Eduardo Jardim Bierhals, Luciane Teresa Salvi, Rafael Haag

Abstract:

Renewable energy sources are dependent on climatic variability, so for adequate energy planning, observations of the meteorological variables are required, preferably representing long-period series. Despite the scientific and technological advances that meteorological measurement systems have undergone in the last decades, there is still a considerable lack of meteorological observations that form series of long periods. The reanalysis is a system of assimilation of data prepared using general atmospheric circulation models, based on the combination of data collected at surface stations, ocean buoys, satellites and radiosondes, allowing the production of long period data, for a wide gamma. The third generation of reanalysis data emerged in 2010, among them is the Climate Forecast System Reanalysis (CFSR) developed by the National Centers for Environmental Prediction (NCEP), these data have a spatial resolution of 0.50 x 0.50. In order to overcome these difficulties, it aims to evaluate the performance of solar radiation estimation through alternative data bases, such as data from Reanalysis and from meteorological satellites that satisfactorily meet the absence of observations of solar radiation at global and/or regional level. The results of the analysis of the solar radiation data indicated that the reanalysis data of the CFSR model presented a good performance in relation to the observed data, with determination coefficient around 0.90. Therefore, it is concluded that these data have the potential to be used as an alternative source in locations with no seasons or long series of solar radiation, important for the evaluation of solar energy potential.

Keywords: climate, reanalysis, renewable energy, solar radiation

Procedia PDF Downloads 209
24739 Data Mining Spatial: Unsupervised Classification of Geographic Data

Authors: Chahrazed Zouaoui

Abstract:

In recent years, the volume of geospatial information is increasing due to the evolution of communication technologies and information, this information is presented often by geographic information systems (GIS) and stored on of spatial databases (BDS). The classical data mining revealed a weakness in knowledge extraction at these enormous amounts of data due to the particularity of these spatial entities, which are characterized by the interdependence between them (1st law of geography). This gave rise to spatial data mining. Spatial data mining is a process of analyzing geographic data, which allows the extraction of knowledge and spatial relationships from geospatial data, including methods of this process we distinguish the monothematic and thematic, geo- Clustering is one of the main tasks of spatial data mining, which is registered in the part of the monothematic method. It includes geo-spatial entities similar in the same class and it affects more dissimilar to the different classes. In other words, maximize intra-class similarity and minimize inter similarity classes. Taking account of the particularity of geo-spatial data. Two approaches to geo-clustering exist, the dynamic processing of data involves applying algorithms designed for the direct treatment of spatial data, and the approach based on the spatial data pre-processing, which consists of applying clustering algorithms classic pre-processed data (by integration of spatial relationships). This approach (based on pre-treatment) is quite complex in different cases, so the search for approximate solutions involves the use of approximation algorithms, including the algorithms we are interested in dedicated approaches (clustering methods for partitioning and methods for density) and approaching bees (biomimetic approach), our study is proposed to design very significant to this problem, using different algorithms for automatically detecting geo-spatial neighborhood in order to implement the method of geo- clustering by pre-treatment, and the application of the bees algorithm to this problem for the first time in the field of geo-spatial.

Keywords: mining, GIS, geo-clustering, neighborhood

Procedia PDF Downloads 375
24738 Bekaadendang: A Principles-Focused Evaluation

Authors: Erin Brands-Saliba

Abstract:

In this evaluation study, we explore the efficacy and implementation of the five guiding principles of Bekaadendang “Being Peaceful,” a suite of services facilitated by our Anti-Human Trafficking Team, and a pivotal component of the Holistic Prevention Services department at NCFST. The guiding principles—trauma-informed care, cultural safety, 4-quadrant medicine wheel approach, harm reduction, and after-care peer support—are the foundation of Bekaadendang's mission to support at-risk individuals and survivors of human trafficking. This evaluation is of paramount importance given the profound impact of human trafficking on these communities and aims to ensure that Bekaadendang's principles are not only understood by staff but experienced by community members in a purposeful and meaningful manner. The issues at the heart of this evaluation are deeply entrenched in the historical and contemporary challenges faced by Indigenous communities, with a particular emphasis on Indigenous women and 2SLGBTQQIA+ individuals. Well-documented reports like the National Inquiry into Missing and Murdered Indigenous Women and Girls (MMIWG) have cast a glaring light on the disproportionately high rates of violence, exploitation, and trafficking experienced by these communities. The MMIWG report underlines the pressing need for holistic, culturally informed interventions like Bekaadendang. Furthermore, the research efforts of scholars, both Indigenous and non-Indigenous, shed light on the persistent systemic issues that make Indigenous individuals more vulnerable to trafficking and exploitation. Recognizing this broader context is crucial to truly grasp the importance of evaluating the guiding principles that underpin Bekaadendang's service model.

Keywords: human trafficking, indigenous healing, MMIWG, program evaluation

Procedia PDF Downloads 54
24737 Analysis and Prediction of Netflix Viewing History Using Netflixlatte as an Enriched Real Data Pool

Authors: Amir Mabhout, Toktam Ghafarian, Amirhossein Farzin, Zahra Makki, Sajjad Alizadeh, Amirhossein Ghavi

Abstract:

The high number of Netflix subscribers makes it attractive for data scientists to extract valuable knowledge from the viewers' behavioural analyses. This paper presents a set of statistical insights into viewers' viewing history. After that, a deep learning model is used to predict the future watching behaviour of the users based on previous watching history within the Netflixlatte data pool. Netflixlatte in an aggregated and anonymized data pool of 320 Netflix viewers with a length 250 000 data points recorded between 2008-2022. We observe insightful correlations between the distribution of viewing time and the COVID-19 pandemic outbreak. The presented deep learning model predicts future movie and TV series viewing habits with an average loss of 0.175.

Keywords: data analysis, deep learning, LSTM neural network, netflix

Procedia PDF Downloads 253
24736 Analysis of User Data Usage Trends on Cellular and Wi-Fi Networks

Authors: Jayesh M. Patel, Bharat P. Modi

Abstract:

The availability of on mobile devices that can invoke the demonstrated that the total data demand from users is far higher than previously articulated by measurements based solely on a cellular-centric view of smart-phone usage. The ratio of Wi-Fi to cellular traffic varies significantly between countries, This paper is shown the compression between the cellular data usage and Wi-Fi data usage by the user. This strategy helps operators to understand the growing importance and application of yield management strategies designed to squeeze maximum returns from their investments into the networks and devices that enable the mobile data ecosystem. The transition from unlimited data plans towards tiered pricing and, in the future, towards more value-centric pricing offers significant revenue upside potential for mobile operators, but, without a complete insight into all aspects of smartphone customer behavior, operators will unlikely be able to capture the maximum return from this billion-dollar market opportunity.

Keywords: cellular, Wi-Fi, mobile, smart phone

Procedia PDF Downloads 366
24735 Data Driven Infrastructure Planning for Offshore Wind farms

Authors: Isha Saxena, Behzad Kazemtabrizi, Matthias C. M. Troffaes, Christopher Crabtree

Abstract:

The calculations done at the beginning of the life of a wind farm are rarely reliable, which makes it important to conduct research and study the failure and repair rates of the wind turbines under various conditions. This miscalculation happens because the current models make a simplifying assumption that the failure/repair rate remains constant over time. This means that the reliability function is exponential in nature. This research aims to create a more accurate model using sensory data and a data-driven approach. The data cleaning and data processing is done by comparing the Power Curve data of the wind turbines with SCADA data. This is then converted to times to repair and times to failure timeseries data. Several different mathematical functions are fitted to the times to failure and times to repair data of the wind turbine components using Maximum Likelihood Estimation and the Posterior expectation method for Bayesian Parameter Estimation. Initial results indicate that two parameter Weibull function and exponential function produce almost identical results. Further analysis is being done using the complex system analysis considering the failures of each electrical and mechanical component of the wind turbine. The aim of this project is to perform a more accurate reliability analysis that can be helpful for the engineers to schedule maintenance and repairs to decrease the downtime of the turbine.

Keywords: reliability, bayesian parameter inference, maximum likelihood estimation, weibull function, SCADA data

Procedia PDF Downloads 87
24734 Empirical Acceleration Functions and Fuzzy Information

Authors: Muhammad Shafiq

Abstract:

In accelerated life testing approaches life time data is obtained under various conditions which are considered more severe than usual condition. Classical techniques are based on obtained precise measurements, and used to model variation among the observations. In fact, there are two types of uncertainty in data: variation among the observations and the fuzziness. Analysis techniques, which do not consider fuzziness and are only based on precise life time observations, lead to pseudo results. This study was aimed to examine the behavior of empirical acceleration functions using fuzzy lifetimes data. The results showed an increased fuzziness in the transformed life times as compare to the input data.

Keywords: acceleration function, accelerated life testing, fuzzy number, non-precise data

Procedia PDF Downloads 300
24733 Neurotoxic Effects Assessment of Metformin in Danio rerio

Authors: Gustavo Axel Elizalde-Velázquez

Abstract:

Metformin is the first line of oral therapy to treat type II diabetes and is also employed as a treatment for other indications, such as polycystic ovary syndrome, cancer, and COVID-19. Recent data suggest it is the aspirin of the 21st century due to its antioxidant and anti-aging effects. However, increasingly current articles indicate its long-term consumption generates mitochondrial impairment. Up to date, it is known metformin increases the biogenesis of Alzheimer's amyloid peptides via up-regulating BACE1 transcription, but further information related to brain damage after its consumption is missing. Bearing in mind the above, this work aimed to establish whether or not chronic exposure to metformin may alter swimming behavior and induce neurotoxicity in Danio rerio adults. For this purpose, 250 Danio rerio grown-ups were assigned to six tanks of 50 L of capacity. Four of the six systems contained 50 fish, while the remaining two had 25 fish (≈1 male:1 female ratio). Every system with 50 fish was allocated one of the three metformin treatment concentrations (1, 20, and 40 μg/L), with one system as the control treatment. Systems with 25 fish, on the other hand, were used as positive controls for acetylcholinesterase (10 μg/L of Atrazine) and oxidative stress (3 μg/L of Atrazine). After four months of exposure, a mean of 32 fish (S.D. ± 2) per group of MET treatment survived, which were used for the evaluation of behavior with the Novel Tank test. Moreover, after the behavioral assessment, we aimed to collect the blood and brains of all fish from all treatment groups. For blood collection, fish were anesthetized with an MS-222 solution (150 mg/L), while for brain gathering, fish were euthanized using the hypothermic shock method (2–4 °C). Blood was employed to determine CASP3 activity and the percentage of apoptotic cells with the TUNEL assay, and brains were used to evaluate acetylcholinesterase activity, oxidative damage, and gene expression. After chronic exposure, MET-exposed fish exhibited less swimming activity when compared to control fish. Moreover, compared with the control group, MET significantly inhibited the activity of AChE and induced oxidative damage in the brain of fish. Concerning gene expression, MET significantly upregulated the expression of Nrf1, Nrf2, BAX, p53, BACE1, APP, PSEN1, and downregulated CASP3 and CASP9. Although MET did not overexpress the CASP3 gene, we saw a meaningful rise in the activity of this enzyme in the blood of fish exposed to MET compared to the control group, which we then confirmed by a high number of apoptotic cells in the TUNEL assay. To the best of our understanding, this is the first study that delivers evidence of oxidative impairment, apoptosis, AChE alteration, and overexpression of B- amyloid-related genes in the brain of fish exposed to metformin.

Keywords: AChE inhibition, CASP3 activity, NovelTank test, oxidative damage, TUNEL assay

Procedia PDF Downloads 86
24732 Evaluating Alternative Structures for Prefix Trees

Authors: Feras Hanandeh, Izzat Alsmadi, Muhammad M. Kwafha

Abstract:

Prefix trees or tries are data structures that are used to store data or index of data. The goal is to be able to store and retrieve data by executing queries in quick and reliable manners. In principle, the structure of the trie depends on having letters in nodes at the different levels to point to the actual words in the leafs. However, the exact structure of the trie may vary based on several aspects. In this paper, we evaluated different structures for building tries. Using datasets of words of different sizes, we evaluated the different forms of trie structures. Results showed that some characteristics may impact significantly, positively or negatively, the size and the performance of the trie. We investigated different forms and structures for the trie. Results showed that using an array of pointers in each level to represent the different alphabet letters is the best choice.

Keywords: data structures, indexing, tree structure, trie, information retrieval

Procedia PDF Downloads 452
24731 Data Management System for Environmental Remediation

Authors: Elizaveta Petelina, Anton Sizo

Abstract:

Environmental remediation projects deal with a wide spectrum of data, including data collected during site assessment, execution of remediation activities, and environmental monitoring. Therefore, an appropriate data management is required as a key factor for well-grounded decision making. The Environmental Data Management System (EDMS) was developed to address all necessary data management aspects, including efficient data handling and data interoperability, access to historical and current data, spatial and temporal analysis, 2D and 3D data visualization, mapping, and data sharing. The system focuses on support of well-grounded decision making in relation to required mitigation measures and assessment of remediation success. The EDMS is a combination of enterprise and desktop level data management and Geographic Information System (GIS) tools assembled to assist to environmental remediation, project planning, and evaluation, and environmental monitoring of mine sites. EDMS consists of seven main components: a Geodatabase that contains spatial database to store and query spatially distributed data; a GIS and Web GIS component that combines desktop and server-based GIS solutions; a Field Data Collection component that contains tools for field work; a Quality Assurance (QA)/Quality Control (QC) component that combines operational procedures for QA and measures for QC; Data Import and Export component that includes tools and templates to support project data flow; a Lab Data component that provides connection between EDMS and laboratory information management systems; and a Reporting component that includes server-based services for real-time report generation. The EDMS has been successfully implemented for the Project CLEANS (Clean-up of Abandoned Northern Mines). Project CLEANS is a multi-year, multimillion-dollar project aimed at assessing and reclaiming 37 uranium mine sites in northern Saskatchewan, Canada. The EDMS has effectively facilitated integrated decision-making for CLEANS project managers and transparency amongst stakeholders.

Keywords: data management, environmental remediation, geographic information system, GIS, decision making

Procedia PDF Downloads 162
24730 An Efficient Approach for Speed up Non-Negative Matrix Factorization for High Dimensional Data

Authors: Bharat Singh Om Prakash Vyas

Abstract:

Now a day’s applications deal with High Dimensional Data have tremendously used in the popular areas. To tackle with such kind of data various approached has been developed by researchers in the last few decades. To tackle with such kind of data various approached has been developed by researchers in the last few decades. One of the problems with the NMF approaches, its randomized valued could not provide absolute optimization in limited iteration, but having local optimization. Due to this, we have proposed a new approach that considers the initial values of the decomposition to tackle the issues of computationally expensive. We have devised an algorithm for initializing the values of the decomposed matrix based on the PSO (Particle Swarm Optimization). Through the experimental result, we will show the proposed method converse very fast in comparison to other row rank approximation like simple NMF multiplicative, and ACLS techniques.

Keywords: ALS, NMF, high dimensional data, RMSE

Procedia PDF Downloads 343
24729 Integrating Time-Series and High-Spatial Remote Sensing Data Based on Multilevel Decision Fusion

Authors: Xudong Guan, Ainong Li, Gaohuan Liu, Chong Huang, Wei Zhao

Abstract:

Due to the low spatial resolution of MODIS data, the accuracy of small-area plaque extraction with a high degree of landscape fragmentation is greatly limited. To this end, the study combines Landsat data with higher spatial resolution and MODIS data with higher temporal resolution for decision-level fusion. Considering the importance of the land heterogeneity factor in the fusion process, it is superimposed with the weighting factor, which is to linearly weight the Landsat classification result and the MOIDS classification result. Three levels were used to complete the process of data fusion, that is the pixel of MODIS data, the pixel of Landsat data, and objects level that connect between these two levels. The multilevel decision fusion scheme was tested in two sites of the lower Mekong basin. We put forth a comparison test, and it was proved that the classification accuracy was improved compared with the single data source classification results in terms of the overall accuracy. The method was also compared with the two-level combination results and a weighted sum decision rule-based approach. The decision fusion scheme is extensible to other multi-resolution data decision fusion applications.

Keywords: image classification, decision fusion, multi-temporal, remote sensing

Procedia PDF Downloads 124
24728 Analysis of Cooperative Learning Behavior Based on the Data of Students' Movement

Authors: Wang Lin, Li Zhiqiang

Abstract:

The purpose of this paper is to analyze the cooperative learning behavior pattern based on the data of students' movement. The study firstly reviewed the cooperative learning theory and its research status, and briefly introduced the k-means clustering algorithm. Then, it used clustering algorithm and mathematical statistics theory to analyze the activity rhythm of individual student and groups in different functional areas, according to the movement data provided by 10 first-year graduate students. It also focused on the analysis of students' behavior in the learning area and explored the law of cooperative learning behavior. The research result showed that the cooperative learning behavior analysis method based on movement data proposed in this paper is feasible. From the results of data analysis, the characteristics of behavior of students and their cooperative learning behavior patterns could be found.

Keywords: behavior pattern, cooperative learning, data analyze, k-means clustering algorithm

Procedia PDF Downloads 188
24727 Combining Diffusion Maps and Diffusion Models for Enhanced Data Analysis

Authors: Meng Su

Abstract:

High-dimensional data analysis often presents challenges in capturing the complex, nonlinear relationships and manifold structures inherent to the data. This article presents a novel approach that leverages the strengths of two powerful techniques, Diffusion Maps and Diffusion Probabilistic Models (DPMs), to address these challenges. By integrating the dimensionality reduction capability of Diffusion Maps with the data modeling ability of DPMs, the proposed method aims to provide a comprehensive solution for analyzing and generating high-dimensional data. The Diffusion Map technique preserves the nonlinear relationships and manifold structure of the data by mapping it to a lower-dimensional space using the eigenvectors of the graph Laplacian matrix. Meanwhile, DPMs capture the dependencies within the data, enabling effective modeling and generation of new data points in the low-dimensional space. The generated data points can then be mapped back to the original high-dimensional space, ensuring consistency with the underlying manifold structure. Through a detailed example implementation, the article demonstrates the potential of the proposed hybrid approach to achieve more accurate and effective modeling and generation of complex, high-dimensional data. Furthermore, it discusses possible applications in various domains, such as image synthesis, time-series forecasting, and anomaly detection, and outlines future research directions for enhancing the scalability, performance, and integration with other machine learning techniques. By combining the strengths of Diffusion Maps and DPMs, this work paves the way for more advanced and robust data analysis methods.

Keywords: diffusion maps, diffusion probabilistic models (DPMs), manifold learning, high-dimensional data analysis

Procedia PDF Downloads 111