Search results for: data mining analytics
24226 Survey on Arabic Sentiment Analysis in Twitter
Authors: Sarah O. Alhumoud, Mawaheb I. Altuwaijri, Tarfa M. Albuhairi, Wejdan M. Alohaideb
Abstract:
Large-scale data stream analysis has become one of the important business and research priorities lately. Social networks like Twitter and other micro-blogging platforms hold an enormous amount of data that is large in volume, velocity and variety. Extracting valuable information and trends out of these data would aid in a better understanding and decision-making. Multiple analysis techniques are deployed for English content. Moreover, one of the languages that produce a large amount of data over social networks and is least analyzed is the Arabic language. The proposed paper is a survey on the research efforts to analyze the Arabic content in Twitter focusing on the tools and methods used to extract the sentiments for the Arabic content on Twitter.Keywords: big data, social networks, sentiment analysis, twitter
Procedia PDF Downloads 57624225 Estimating Current Suicide Rates Using Google Trends
Authors: Ladislav Kristoufek, Helen Susannah Moat, Tobias Preis
Abstract:
Data on the number of people who have committed suicide tends to be reported with a substantial time lag of around two years. We examine whether online activity measured by Google searches can help us improve estimates of the number of suicide occurrences in England before official figures are released. Specifically, we analyse how data on the number of Google searches for the terms “depression” and “suicide” relate to the number of suicides between 2004 and 2013. We find that estimates drawing on Google data are significantly better than estimates using previous suicide data alone. We show that a greater number of searches for the term “depression” is related to fewer suicides, whereas a greater number of searches for the term “suicide” is related to more suicides. Data on suicide related search behaviour can be used to improve current estimates of the number of suicide occurrences.Keywords: nowcasting, search data, Google Trends, official statistics
Procedia PDF Downloads 35724224 Functionalized Nano porous Ceramic Membranes for Electrodialysis Treatment of Harsh Wastewater
Authors: Emily Rabe, Stephanie Candelaria, Rachel Malone, Olivia Lenz, Greg Newbloom
Abstract:
Electrodialysis (ED) is a well-developed technology for ion removal in a variety of applications. However, many industries generate harsh wastewater streams that are incompatible with traditional ion exchange membranes. Membrion® has developed novel ceramic-based ion exchange membranes (IEMs) offering several advantages over traditional polymer membranes: high performance in low pH, chemical resistance to oxidizers, and a rigid structure that minimizes swelling. These membranes are synthesized with our patented silane-based sol-gel techniques. The pore size, shape, and network structure are engineered through a molecular self-assembly process where thermodynamic driving forces are used to direct where and how pores form. Either cationic or anionic groups can be added within the membrane nanopore structure to create cation- and anion-exchange membranes. The ceramic IEMs are produced on a roll-to-roll manufacturing line with low-temperature processing. Membrane performance testing is conducted using in-house permselectivity, area-specific resistance, and ED stack testing setups. Ceramic-based IEMs show comparable performance to traditional IEMs and offer some unique advantages. Long exposure to highly acidic solutions has a negligible impact on ED performance. Additionally, we have observed stable performance in the presence of strong oxidizing agents such as hydrogen peroxide. This stability is expected, as the ceramic backbone of these materials is already in a fully oxidized state. This data suggests ceramic membranes, made using sol-gel chemistry, could be an ideal solution for acidic and/or oxidizing wastewater streams from processes such as semiconductor manufacturing and mining.Keywords: ion exchange, membrane, silane chemistry, nanostructure, wastewater
Procedia PDF Downloads 8624223 On the Network Packet Loss Tolerance of SVM Based Activity Recognition
Authors: Gamze Uslu, Sebnem Baydere, Alper K. Demir
Abstract:
In this study, data loss tolerance of Support Vector Machines (SVM) based activity recognition model and multi activity classification performance when data are received over a lossy wireless sensor network is examined. Initially, the classification algorithm we use is evaluated in terms of resilience to random data loss with 3D acceleration sensor data for sitting, lying, walking and standing actions. The results show that the proposed classification method can recognize these activities successfully despite high data loss. Secondly, the effect of differentiated quality of service performance on activity recognition success is measured with activity data acquired from a multi hop wireless sensor network, which introduces high data loss. The effect of number of nodes on the reliability and multi activity classification success is demonstrated in simulation environment. To the best of our knowledge, the effect of data loss in a wireless sensor network on activity detection success rate of an SVM based classification algorithm has not been studied before.Keywords: activity recognition, support vector machines, acceleration sensor, wireless sensor networks, packet loss
Procedia PDF Downloads 47524222 GIS Data Governance: GIS Data Submission Process for Build-in Project, Replacement Project at Oman electricity Transmission Company
Authors: Rahma Saleh Hussein Al Balushi
Abstract:
Oman Electricity Transmission Company's (OETC) vision is to be a renowned world-class transmission grid by 2025, and one of the indications of achieving the vision is obtaining Asset Management ISO55001 certification, which required setting out a documented Standard Operating Procedures (SOP). Hence, documented SOP for the Geographical information system data process has been established. Also, to effectively manage and improve OETC power transmission, asset data and information need to be governed as such by Asset Information & GIS department. This paper will describe in detail the current GIS data submission process and the journey for developing it. The methodology used to develop the process is based on three main pillars, which are system and end-user requirements, Risk evaluation, data availability, and accuracy. The output of this paper shows the dramatic change in the used process, which results subsequently in more efficient, accurate, and updated data. Furthermore, due to this process, GIS has been and is ready to be integrated with other systems as well as the source of data for all OETC users. Some decisions related to issuing No objection certificates (NOC) for excavation permits and scheduling asset maintenance plans in Computerized Maintenance Management System (CMMS) have been made consequently upon GIS data availability. On the Other hand, defining agreed and documented procedures for data collection, data systems update, data release/reporting and data alterations has also contributed to reducing the missing attributes and enhance data quality index of GIS transmission data. A considerable difference in Geodatabase (GDB) completeness percentage was observed between the years 2017 and year 2022. Overall, concluding that by governance, asset information & GIS department can control the GIS data process; collect, properly record, and manage asset data and information within the OETC network. This control extends to other applications and systems integrated with/related to GIS systems.Keywords: asset management ISO55001, standard procedures process, governance, CMMS
Procedia PDF Downloads 12524221 The Impact of the Enron Scandal on the Reputation of Corporate Social Responsibility Rating Agencies
Authors: Jaballah Jamil
Abstract:
KLD (Peter Kinder, Steve Lydenberg and Amy Domini) research & analytics is an independent intermediary of social performance information that adopts an investor-pay model. KLD rating agency does not have an explicit monitoring on the rated firm which suggests that KLD ratings may not include private informations. Moreover, the incapacity of KLD to predict accurately the extra-financial rating of Enron casts doubt on the reliability of KLD ratings. Therefore, we first investigate whether KLD ratings affect investors' perception by studying the effect of KLD rating changes on firms' financial performances. Second, we study the impact of the Enron scandal on investors' perception of KLD rating changes by comparing the effect of KLD rating changes on firms' financial performances before and after the failure of Enron. We propose an empirical study that relates a number of equally-weighted portfolios returns, excess stock returns and book-to-market ratio to different dimensions of KLD social responsibility ratings. We first find that over the last two decades KLD rating changes influence significantly and negatively stock returns and book-to-market ratio of rated firms. This finding suggests that a raise in corporate social responsibility rating lowers the firm's risk. Second, to assess the Enron scandal's effect on the perception of KLD ratings, we compare the effect of KLD rating changes before and after the Enron scandal. We find that after the Enron scandal this significant effect disappears. This finding supports the view that the Enron scandal annihilates the KLD's effect on Socially Responsible Investors. Therefore, our findings may question results of recent studies that use KLD ratings as a proxy for Corporate Social Responsibility behavior.Keywords: KLD social rating agency, investors' perception, investment decision, financial performance
Procedia PDF Downloads 43924220 Efects of Data Corelation in a Sparse-View Compresive Sensing Based Image Reconstruction
Authors: Sajid Abas, Jon Pyo Hong, Jung-Ryun Le, Seungryong Cho
Abstract:
Computed tomography and laminography are heavily investigated in a compressive sensing based image reconstruction framework to reduce the dose to the patients as well as to the radiosensitive devices such as multilayer microelectronic circuit boards. Nowadays researchers are actively working on optimizing the compressive sensing based iterative image reconstruction algorithm to obtain better quality images. However, the effects of the sampled data’s properties on reconstructed the image’s quality, particularly in an insufficient sampled data conditions have not been explored in computed laminography. In this paper, we investigated the effects of two data properties i.e. sampling density and data incoherence on the reconstructed image obtained by conventional computed laminography and a recently proposed method called spherical sinusoidal scanning scheme. We have found that in a compressive sensing based image reconstruction framework, the image quality mainly depends upon the data incoherence when the data is uniformly sampled.Keywords: computed tomography, computed laminography, compressive sending, low-dose
Procedia PDF Downloads 46424219 Fuzzy Wavelet Model to Forecast the Exchange Rate of IDR/USD
Authors: Tri Wijayanti Septiarini, Agus Maman Abadi, Muhammad Rifki Taufik
Abstract:
The exchange rate of IDR/USD can be the indicator to analysis Indonesian economy. The exchange rate as a important factor because it has big effect in Indonesian economy overall. So, it needs the analysis data of exchange rate. There is decomposition data of exchange rate of IDR/USD to be frequency and time. It can help the government to monitor the Indonesian economy. This method is very effective to identify the case, have high accurate result and have simple structure. In this paper, data of exchange rate that used is weekly data from December 17, 2010 until November 11, 2014.Keywords: the exchange rate, fuzzy mamdani, discrete wavelet transforms, fuzzy wavelet
Procedia PDF Downloads 57124218 Humanising Digital Healthcare to Build Capacity by Harnessing the Power of Patient Data
Authors: Durhane Wong-Rieger, Kawaldip Sehmi, Nicola Bedlington, Nicole Boice, Tamás Bereczky
Abstract:
Patient-generated health data should be seen as the expression of the experience of patients, including the outcomes reflecting the impact a treatment or service had on their physical health and wellness. We discuss how the healthcare system can reach a place where digital is a determinant of health - where data is generated by patients and is respected and which acknowledges their contribution to science. We explore the biggest barriers facing this. The International Experience Exchange with Patient Organisation’s Position Paper is based on a global patient survey conducted in Q3 2021 that received 304 responses. Results were discussed and validated by the 15 patient experts and supplemented with literature research. Results are a subset of this. Our research showed patient communities want to influence how their data is generated, shared, and used. Our study concludes that a reasonable framework is needed to protect the integrity of patient data and minimise abuse, and build trust. Results also demonstrated a need for patient communities to have more influence and control over how health data is generated, shared, and used. The results clearly highlight that the community feels there is a lack of clear policies on sharing data.Keywords: digital health, equitable access, humanise healthcare, patient data
Procedia PDF Downloads 8224217 Use of Machine Learning in Data Quality Assessment
Authors: Bruno Pinto Vieira, Marco Antonio Calijorne Soares, Armando Sérgio de Aguiar Filho
Abstract:
Nowadays, a massive amount of information has been produced by different data sources, including mobile devices and transactional systems. In this scenario, concerns arise on how to maintain or establish data quality, which is now treated as a product to be defined, measured, analyzed, and improved to meet consumers' needs, which is the one who uses these data in decision making and companies strategies. Information that reaches low levels of quality can lead to issues that can consume time and money, such as missed business opportunities, inadequate decisions, and bad risk management actions. The step of selecting, identifying, evaluating, and selecting data sources with significant quality according to the need has become a costly task for users since the sources do not provide information about their quality. Traditional data quality control methods are based on user experience or business rules limiting performance and slowing down the process with less than desirable accuracy. Using advanced machine learning algorithms, it is possible to take advantage of computational resources to overcome challenges and add value to companies and users. In this study, machine learning is applied to data quality analysis on different datasets, seeking to compare the performance of the techniques according to the dimensions of quality assessment. As a result, we could create a ranking of approaches used, besides a system that is able to carry out automatically, data quality assessment.Keywords: machine learning, data quality, quality dimension, quality assessment
Procedia PDF Downloads 14824216 Exploring Data Leakage in EEG Based Brain-Computer Interfaces: Overfitting Challenges
Authors: Khalida Douibi, Rodrigo Balp, Solène Le Bars
Abstract:
In the medical field, applications related to human experiments are frequently linked to reduced samples size, which makes the training of machine learning models quite sensitive and therefore not very robust nor generalizable. This is notably the case in Brain-Computer Interface (BCI) studies, where the sample size rarely exceeds 20 subjects or a few number of trials. To address this problem, several resampling approaches are often used during the data preparation phase, which is an overly critical step in a data science analysis process. One of the naive approaches that is usually applied by data scientists consists in the transformation of the entire database before the resampling phase. However, this can cause model’ s performance to be incorrectly estimated when making predictions on unseen data. In this paper, we explored the effect of data leakage observed during our BCI experiments for device control through the real-time classification of SSVEPs (Steady State Visually Evoked Potentials). We also studied potential ways to ensure optimal validation of the classifiers during the calibration phase to avoid overfitting. The results show that the scaling step is crucial for some algorithms, and it should be applied after the resampling phase to avoid data leackage and improve results.Keywords: data leackage, data science, machine learning, SSVEP, BCI, overfitting
Procedia PDF Downloads 15324215 Nuclear Decay Data Evaluation for 217Po
Authors: S. S. Nafee, A. M. Al-Ramady, S. A. Shaheen
Abstract:
Evaluated nuclear decay data for the 217Po nuclide ispresented in the present work. These data include recommended values for the half-life T1/2, α-, β--, and γ-ray emission energies and probabilities. Decay data from 221Rn α and 217Bi β—decays are presented. Q(α) has been updated based on the recent published work of the Atomic Mass Evaluation AME2012. In addition, the logft values were calculated using the Logft program from the ENSDF evaluation package. Moreover, the total internal conversion electrons has been calculated using Bricc program. Meanwhile, recommendation values or the multi-polarities have been assigned based on recently measurement yield a better intensity balance at the 254 keV and 264 keV gamma transitions.Keywords: nuclear decay data evaluation, mass evaluation, total converison coefficients, atomic mass evaluation
Procedia PDF Downloads 43324214 Geographic Information System Using Google Fusion Table Technology for the Delivery of Disease Data Information
Authors: I. Nyoman Mahayasa Adiputra
Abstract:
Data in the field of health can be useful for the purposes of data analysis, one example of health data is disease data. Disease data is usually in a geographical plot in accordance with the area. Where the data was collected, in the city of Denpasar, Bali. Disease data report is still published in tabular form, disease information has not been mapped in GIS form. In this research, disease information in Denpasar city will be digitized in the form of a geographic information system with the smallest administrative area in the form of district. Denpasar City consists of 4 districts of North Denpasar, East Denpasar, West Denpasar and South Denpasar. In this research, we use Google fusion table technology for map digitization process, where this technology can facilitate from the administrator and from the recipient information. From the administrator side of the input disease, data can be done easily and quickly. From the receiving end of the information, the resulting GIS application can be published in a website-based application so that it can be accessed anywhere and anytime. In general, the results obtained in this study, divided into two, namely: (1) Geolocation of Denpasar and all of Denpasar districts, the process of digitizing the map of Denpasar city produces a polygon geolocation of each - district of Denpasar city. These results can be utilized in subsequent GIS studies if you want to use the same administrative area. (2) Dengue fever mapping in 2014 and 2015. Disease data used in this study is dengue fever case data taken in 2014 and 2015. Data taken from the profile report Denpasar Health Department 2015 and 2016. This mapping can be useful for the analysis of the spread of dengue hemorrhagic fever in the city of Denpasar.Keywords: geographic information system, Google fusion table technology, delivery of disease data information, Denpasar city
Procedia PDF Downloads 12924213 Inclusive Practices in Health Sciences: Equity Proofing Higher Education Programs
Authors: Mitzi S. Brammer
Abstract:
Given that the cultural make-up of programs of study in institutions of higher learning is becoming increasingly diverse, much has been written about cultural diversity from a university-level perspective. However, there are little data in the way of specific programs and how they address inclusive practices when teaching and working with marginalized populations. This research study aimed to discover baseline knowledge and attitudes of health sciences faculty, instructional staff, and students related to inclusive teaching/learning and interactions. Quantitative data were collected via an anonymous online survey (one designed for students and another designed for faculty/instructional staff) using a web-based program called Qualtrics. Quantitative data were analyzed amongst the faculty/instructional staff and students, respectively, using descriptive and comparative statistics (t-tests). Additionally, some participants voluntarily engaged in a focus group discussion in which qualitative data were collected around these same variables. Collecting qualitative data to triangulate the quantitative data added trustworthiness to the overall data. The research team analyzed collected data and compared identified categories and trends, comparing those data between faculty/staff and students, and reported results as well as implications for future study and professional practice.Keywords: inclusion, higher education, pedagogy, equity, diversity
Procedia PDF Downloads 6724212 Neural Networks Based Prediction of Long Term Rainfall: Nine Pilot Study Zones over the Mediterranean Basin
Authors: Racha El Kadiri, Mohamed Sultan, Henrique Momm, Zachary Blair, Rachel Schultz, Tamer Al-Bayoumi
Abstract:
The Mediterranean Basin is a very diverse region of nationalities and climate zones, with a strong dependence on agricultural activities. Predicting long term (with a lead of 1 to 12 months) rainfall, and future droughts could contribute in a sustainable management of water resources and economical activities. In this study, an integrated approach was adopted to construct predictive tools with lead times of 0 to 12 months to forecast rainfall amounts over nine subzones of the Mediterranean Basin region. The following steps were conducted: (1) acquire, assess and intercorrelate temporal remote sensing-based rainfall products (e.g. The CPC Merged Analysis of Precipitation [CMAP]) throughout the investigation period (1979 to 2016), (2) acquire and assess monthly values for all of the climatic indices influencing the regional and global climatic patterns (e.g., Northern Atlantic Oscillation [NOI], Southern Oscillation Index [SOI], and Tropical North Atlantic Index [TNA]); (3) delineate homogenous climatic regions and select nine pilot study zones, (4) apply data mining methods (e.g. neural networks, principal component analyses) to extract relationships between the observed rainfall and the controlling factors (i.e. climatic indices with multiple lead-time periods) and (5) use the constructed predictive tools to forecast monthly rainfall and dry and wet periods. Preliminary results indicate that rainfall and dry/wet periods were successfully predicted with lead zones of 0 to 12 months using the adopted methodology, and that the approach is more accurately applicable in the southern Mediterranean region.Keywords: rainfall, neural networks, climatic indices, Mediterranean
Procedia PDF Downloads 31224211 Mean Shift-Based Preprocessing Methodology for Improved 3D Buildings Reconstruction
Authors: Nikolaos Vassilas, Theocharis Tsenoglou, Djamchid Ghazanfarpour
Abstract:
In this work we explore the capability of the mean shift algorithm as a powerful preprocessing tool for improving the quality of spatial data, acquired from airborne scanners, from densely built urban areas. On one hand, high resolution image data corrupted by noise caused by lossy compression techniques are appropriately smoothed while at the same time preserving the optical edges and, on the other, low resolution LiDAR data in the form of normalized Digital Surface Map (nDSM) is upsampled through the joint mean shift algorithm. Experiments on both the edge-preserving smoothing and upsampling capabilities using synthetic RGB-z data show that the mean shift algorithm is superior to bilateral filtering as well as to other classical smoothing and upsampling algorithms. Application of the proposed methodology for 3D reconstruction of buildings of a pilot region of Athens, Greece results in a significant visual improvement of the 3D building block model.Keywords: 3D buildings reconstruction, data fusion, data upsampling, mean shift
Procedia PDF Downloads 31524210 GIS Data Governance: GIS Data Submission Process for Build-in Project, Replacement Project at Oman Electricity Transmission Company
Authors: Rahma Al Balushi
Abstract:
Oman Electricity Transmission Company's (OETC) vision is to be a renowned world-class transmission grid by 2025, and one of the indications of achieving the vision is obtaining Asset Management ISO55001 certification, which required setting out a documented Standard Operating Procedures (SOP). Hence, documented SOP for the Geographical information system data process has been established. Also, to effectively manage and improve OETC power transmission, asset data and information need to be governed as such by Asset Information & GIS dept. This paper will describe in detail the GIS data submission process and the journey to develop the current process. The methodology used to develop the process is based on three main pillars, which are system and end-user requirements, Risk evaluation, data availability, and accuracy. The output of this paper shows the dramatic change in the used process, which results subsequently in more efficient, accurate, updated data. Furthermore, due to this process, GIS has been and is ready to be integrated with other systems as well as the source of data for all OETC users. Some decisions related to issuing No objection certificates (NOC) and scheduling asset maintenance plans in Computerized Maintenance Management System (CMMS) have been made consequently upon GIS data availability. On the Other hand, defining agreed and documented procedures for data collection, data systems update, data release/reporting, and data alterations salso aided to reduce the missing attributes of GIS transmission data. A considerable difference in Geodatabase (GDB) completeness percentage was observed between the year 2017 and the year 2021. Overall, concluding that by governance, asset information & GIS department can control GIS data process; collect, properly record, and manage asset data and information within OETC network. This control extends to other applications and systems integrated with/related to GIS systems.Keywords: asset management ISO55001, standard procedures process, governance, geodatabase, NOC, CMMS
Procedia PDF Downloads 20724209 Importance of Ethics in Cloud Security
Authors: Pallavi Malhotra
Abstract:
This paper examines the importance of ethics in cloud computing. In the modern society, cloud computing is offering individuals and businesses an unlimited space for storing and processing data or information. Most of the data and information stored in the cloud by various users such as banks, doctors, architects, engineers, lawyers, consulting firms, and financial institutions among others require a high level of confidentiality and safeguard. Cloud computing offers centralized storage and processing of data, and this has immensely contributed to the growth of businesses and improved sharing of information over the internet. However, the accessibility and management of data and servers by a third party raise concerns regarding the privacy of clients’ information and the possible manipulations of the data by third parties. This document suggests the approaches various stakeholders should take to address various ethical issues involving cloud-computing services. Ethical education and training is key to all stakeholders involved in the handling of data and information stored or being processed in the cloud.Keywords: IT ethics, cloud computing technology, cloud privacy and security, ethical education
Procedia PDF Downloads 32524208 Smart Laboratory for Clean Rivers in India - An Indo-Danish Collaboration
Authors: Nikhilesh Singh, Shishir Gaur, Anitha K. Sharma
Abstract:
Climate change and anthropogenic stress have severely affected ecosystems all over the globe. Indian rivers are under immense pressure, facing challenges like pollution, encroachment, extreme fluctuation in the flow regime, local ignorance and lack of coordination between stakeholders. To counter all these issues a holistic river rejuvenation plan is needed that tests, innovates and implements sustainable solutions in the river space for sustainable river management. Smart Laboratory for Clean Rivers (SLCR) an Indo-Danish collaboration project, provides a living lab setup that brings all the stakeholders (government agencies, academic and industrial partners and locals) together to engage, learn, co-creating and experiment for a clean and sustainable river that last for ages. Just like every mega project requires piloting, SLCR has opted for a small catchment of the Varuna River, located in the Middle Ganga Basin in India. Considering the integrated approach of river rejuvenation, SLCR embraces various techniques and upgrades for rejuvenation. Likely, maintaining flow in the channel in the lean period, Managed Aquifer Recharge (MAR) is a proven technology. In SLCR, Floa-TEM high-resolution lithological data is used in MAR models to have better decision-making for MAR structures nearby of the river to enhance the river aquifer exchanges. Furthermore, the concerns of quality in the river are a big issue. A city like Varanasi which is located in the last stretch of the river, generates almost 260 MLD of domestic waste in the catchment. The existing STP system is working at full efficiency. Instead of installing a new STP for the future, SLCR is upgrading those STPs with an IoT-based system that optimizes according to the nutrient load and energy consumption. SLCR also advocate nature-based solutions like a reed bed for the drains having less flow. In search of micropollutants, SLCR uses fingerprint analysis involves employing advanced techniques like chromatography and mass spectrometry to create unique chemical profiles. However, rejuvenation attempts cannot be possible without involving the entire catchment. A holistic water management plan that includes storm management, water harvesting structure to efficiently manage the flow of water in the catchment and installation of several buffer zones to restrict pollutants entering into the river. Similarly, carbon (emission and sequestration) is also an important parameter for the catchment. By adopting eco-friendly practices, a ripple effect positively influences the catchment's water dynamics and aids in the revival of river systems. SLCR has adopted 4 villages to make them carbon-neutral and water-positive. Moreover, for the 24×7 monitoring of the river and the catchment, robust IoT devices are going to be installed to observe, river and groundwater quality, groundwater level, river discharge and carbon emission in the catchment and ultimately provide fuel for the data analytics. In its completion, SLCR will provide a river restoration manual, which will strategise the detailed plan and way of implementation for stakeholders. Lastly, the entire process is planned in such a way that will be managed by local administrations and stakeholders equipped with capacity-building activity. This holistic approach makes SLCR unique in the field of river rejuvenation.Keywords: sustainable management, holistic approach, living lab, integrated river management
Procedia PDF Downloads 6024207 The Feminism of Data Privacy and Protection in Africa
Authors: Olayinka Adeniyi, Melissa Omino
Abstract:
The field of data privacy and data protection in Africa is still an evolving area, with many African countries yet to enact legislation on the subject. While African Governments are bringing their legislation to speed in this field, how patriarchy pervades every sector of African thought and manifests in society needs to be considered. Moreover, the laws enacted ought to be inclusive, especially towards women. This, in a nutshell, is the essence of data feminism. Data feminism is a new way of thinking about data science and data ethics that is informed by the ideas of intersectional feminism. Feminising data privacy and protection will involve thinking women, considering women in the issues of data privacy and protection, particularly in legislation, as is the case in this paper. The line of thought of women inclusion is not uncommon when even international and regional human rights specific for women only came long after the general human rights. The consideration is that these should have been inserted or rather included in the original general instruments in the first instance. Since legislation on data privacy is coming in this century, having seen the rights and shortcomings of earlier instruments, then the cue should be taken to ensure inclusive wholistic legislation for data privacy and protection in the first instance. Data feminism is arguably an area that has been scantily researched, albeit a needful one. With the spate of increase in the violence against women spiraling in the cyber world, compounding the issue of COVID-19 and the needful response of governments, and the effect of these on women and their rights, fast forward, the research on the feminism of data privacy and protection in Africa becomes inevitable. This paper seeks to answer the questions, what is data feminism in the African context, why is it important in the issue of data privacy and protection legislation; what are the laws, if any, existing on data privacy and protection in Africa, are they women inclusive, if not, why; what are the measures put in place for the privacy and protection of women in Africa, and how can this be made possible. The paper aims to investigate the issue of data privacy and protection in Africa, the legal framework, and the protection or provision that it has for women if any. It further aims to research the importance and necessity of feminizing data privacy and protection, the effect of lack of it, the challenges or bottlenecks in attaining this feat and the possibilities of accessing data privacy and protection for African women. The paper also researches the emerging practices of data privacy and protection of women in other jurisprudences. It approaches the research through the methodology of review of papers, analysis of laws, and reports. It seeks to contribute to the existing literature in the field and is explorative in its suggestion. It suggests a draft of some clauses to make any data privacy and protection legislation women inclusive. It would be useful for policymaking, academic, and public enlightenment.Keywords: feminism, women, law, data, Africa
Procedia PDF Downloads 20624206 Evaluation of Practicality of On-Demand Bus Using Actual Taxi-Use Data through Exhaustive Simulations
Authors: Jun-ichi Ochiai, Itsuki Noda, Ryo Kanamori, Keiji Hirata, Hitoshi Matsubara, Hideyuki Nakashima
Abstract:
We conducted exhaustive simulations for data assimilation and evaluation of service quality for various setting in a new shared transportation system, called SAVS. Computational social simulation is a key technology to design recent social services like SAVS as new transportation service. One open issue in SAVS was to determine the service scale through the social simulation. Using our exhaustive simulation framework, OACIS, we did data-assimilation and evaluation of effects of SAVS based on actual tax-use data at Tajimi city, Japan. Finally, we get the conditions to realize the new service in a reasonable service quality.Keywords: on-demand bus sytem, social simulation, data assimilation, exhaustive simulation
Procedia PDF Downloads 32124205 Cloning and Characterization of Uridine-5’-Diphosphate -Glucose Pyrophosphorylases from Lactobacillus Kefiranofaciens and Rhodococcus Wratislaviensis
Authors: Mesfin Angaw Tesfay
Abstract:
Uridine-5’-diphosphate (UDP)-glucose is one of the most versatile building blocks within the metabolism of prokaryotes and eukaryotes serving as an activated sugar donor during the glycosylation of natural products. It is formed by the enzyme UDP-glucose pyrophosphorylase (UGPase) using uridine-5′-triphosphate (UTP) and α-d-glucose 1-phosphate as a substrate. Herein two UGPase genes from Lactobacillus kefiranofaciens ZW3 (LkUGPase) and Rhodococcus wratislaviensis IFP 2016 (RwUGPase) were identified through genome mining approaches. The LkUGPase and RwUGPase have 299 and 306 amino acids, respectively. Both UGPase has the conserved UTP binding site (G-X-G-T-R-X-L-P) and the glucose -1-phosphate binding site (V-E-K-P). The LkUGPase and RwUGPase were cloned in E. coli and SDS-PAGE analysis showed the expression of both enzymes forming about 36 KDa of protein band after induction. LkUGPase and RwUGPase have an activity of 1549.95 and 671.53 U/mg respectively. Currently, their kinetic properties are under investigation.Keywords: UGPase, LkUGPase, RwUGPase, UDP-glucose, Glycosylation
Procedia PDF Downloads 2124204 Optimal Pricing Based on Real Estate Demand Data
Authors: Vanessa Kummer, Maik Meusel
Abstract:
Real estate demand estimates are typically derived from transaction data. However, in regions with excess demand, transactions are driven by supply and therefore do not indicate what people are actually looking for. To estimate the demand for housing in Switzerland, search subscriptions from all important Swiss real estate platforms are used. These data do, however, suffer from missing information—for example, many users do not specify how many rooms they would like or what price they would be willing to pay. In economic analyses, it is often the case that only complete data is used. Usually, however, the proportion of complete data is rather small which leads to most information being neglected. Also, the data might have a strong distortion if it is complete. In addition, the reason that data is missing might itself also contain information, which is however ignored with that approach. An interesting issue is, therefore, if for economic analyses such as the one at hand, there is an added value by using the whole data set with the imputed missing values compared to using the usually small percentage of complete data (baseline). Also, it is interesting to see how different algorithms affect that result. The imputation of the missing data is done using unsupervised learning. Out of the numerous unsupervised learning approaches, the most common ones, such as clustering, principal component analysis, or neural networks techniques are applied. By training the model iteratively on the imputed data and, thereby, including the information of all data into the model, the distortion of the first training set—the complete data—vanishes. In a next step, the performances of the algorithms are measured. This is done by randomly creating missing values in subsets of the data, estimating those values with the relevant algorithms and several parameter combinations, and comparing the estimates to the actual data. After having found the optimal parameter set for each algorithm, the missing values are being imputed. Using the resulting data sets, the next step is to estimate the willingness to pay for real estate. This is done by fitting price distributions for real estate properties with certain characteristics, such as the region or the number of rooms. Based on these distributions, survival functions are computed to obtain the functional relationship between characteristics and selling probabilities. Comparing the survival functions shows that estimates which are based on imputed data sets do not differ significantly from each other; however, the demand estimate that is derived from the baseline data does. This indicates that the baseline data set does not include all available information and is therefore not representative for the entire sample. Also, demand estimates derived from the whole data set are much more accurate than the baseline estimation. Thus, in order to obtain optimal results, it is important to make use of all available data, even though it involves additional procedures such as data imputation.Keywords: demand estimate, missing-data imputation, real estate, unsupervised learning
Procedia PDF Downloads 28524203 Unlocking the Puzzle of Borrowing Adult Data for Designing Hybrid Pediatric Clinical Trials
Authors: Rajesh Kumar G
Abstract:
A challenging aspect of any clinical trial is to carefully plan the study design to meet the study objective in optimum way and to validate the assumptions made during protocol designing. And when it is a pediatric study, there is the added challenge of stringent guidelines and difficulty in recruiting the necessary subjects. Unlike adult trials, there is not much historical data available for pediatrics, which is required to validate assumptions for planning pediatric trials. Typically, pediatric studies are initiated as soon as approval is obtained for a drug to be marketed for adults, so with the adult study historical information and with the available pediatric pilot study data or simulated pediatric data, the pediatric study can be well planned. Generalizing the historical adult study for new pediatric study is a tedious task; however, it is possible by integrating various statistical techniques and utilizing the advantage of hybrid study design, which will help to achieve the study objective in a smoother way even with the presence of many constraints. This research paper will explain how well the hybrid study design can be planned along with integrated technique (SEV) to plan the pediatric study; In brief the SEV technique (Simulation, Estimation (using borrowed adult data and applying Bayesian methods)) incorporates the use of simulating the planned study data and getting the desired estimates to Validate the assumptions.This method of validation can be used to improve the accuracy of data analysis, ensuring that results are as valid and reliable as possible, which allow us to make informed decisions well ahead of study initiation. With professional precision, this technique based on the collected data allows to gain insight into best practices when using data from historical study and simulated data alike.Keywords: adaptive design, simulation, borrowing data, bayesian model
Procedia PDF Downloads 7724202 Analyzing Test Data Generation Techniques Using Evolutionary Algorithms
Authors: Arslan Ellahi, Syed Amjad Hussain
Abstract:
Software Testing is a vital process in software development life cycle. We can attain the quality of software after passing it through software testing phase. We have tried to find out automatic test data generation techniques that are a key research area of software testing to achieve test automation that can eventually decrease testing time. In this paper, we review some of the approaches presented in the literature which use evolutionary search based algorithms like Genetic Algorithm, Particle Swarm Optimization (PSO), etc. to validate the test data generation process. We also look into the quality of test data generation which increases or decreases the efficiency of testing. We have proposed test data generation techniques for model-based testing. We have worked on tuning and fitness function of PSO algorithm.Keywords: search based, evolutionary algorithm, particle swarm optimization, genetic algorithm, test data generation
Procedia PDF Downloads 19024201 Comparative Analysis of the Third Generation of Research Data for Evaluation of Solar Energy Potential
Authors: Claudineia Brazil, Elison Eduardo Jardim Bierhals, Luciane Teresa Salvi, Rafael Haag
Abstract:
Renewable energy sources are dependent on climatic variability, so for adequate energy planning, observations of the meteorological variables are required, preferably representing long-period series. Despite the scientific and technological advances that meteorological measurement systems have undergone in the last decades, there is still a considerable lack of meteorological observations that form series of long periods. The reanalysis is a system of assimilation of data prepared using general atmospheric circulation models, based on the combination of data collected at surface stations, ocean buoys, satellites and radiosondes, allowing the production of long period data, for a wide gamma. The third generation of reanalysis data emerged in 2010, among them is the Climate Forecast System Reanalysis (CFSR) developed by the National Centers for Environmental Prediction (NCEP), these data have a spatial resolution of 0.50 x 0.50. In order to overcome these difficulties, it aims to evaluate the performance of solar radiation estimation through alternative data bases, such as data from Reanalysis and from meteorological satellites that satisfactorily meet the absence of observations of solar radiation at global and/or regional level. The results of the analysis of the solar radiation data indicated that the reanalysis data of the CFSR model presented a good performance in relation to the observed data, with determination coefficient around 0.90. Therefore, it is concluded that these data have the potential to be used as an alternative source in locations with no seasons or long series of solar radiation, important for the evaluation of solar energy potential.Keywords: climate, reanalysis, renewable energy, solar radiation
Procedia PDF Downloads 20924200 Analysis and Prediction of Netflix Viewing History Using Netflixlatte as an Enriched Real Data Pool
Authors: Amir Mabhout, Toktam Ghafarian, Amirhossein Farzin, Zahra Makki, Sajjad Alizadeh, Amirhossein Ghavi
Abstract:
The high number of Netflix subscribers makes it attractive for data scientists to extract valuable knowledge from the viewers' behavioural analyses. This paper presents a set of statistical insights into viewers' viewing history. After that, a deep learning model is used to predict the future watching behaviour of the users based on previous watching history within the Netflixlatte data pool. Netflixlatte in an aggregated and anonymized data pool of 320 Netflix viewers with a length 250 000 data points recorded between 2008-2022. We observe insightful correlations between the distribution of viewing time and the COVID-19 pandemic outbreak. The presented deep learning model predicts future movie and TV series viewing habits with an average loss of 0.175.Keywords: data analysis, deep learning, LSTM neural network, netflix
Procedia PDF Downloads 25124199 Analysis of User Data Usage Trends on Cellular and Wi-Fi Networks
Authors: Jayesh M. Patel, Bharat P. Modi
Abstract:
The availability of on mobile devices that can invoke the demonstrated that the total data demand from users is far higher than previously articulated by measurements based solely on a cellular-centric view of smart-phone usage. The ratio of Wi-Fi to cellular traffic varies significantly between countries, This paper is shown the compression between the cellular data usage and Wi-Fi data usage by the user. This strategy helps operators to understand the growing importance and application of yield management strategies designed to squeeze maximum returns from their investments into the networks and devices that enable the mobile data ecosystem. The transition from unlimited data plans towards tiered pricing and, in the future, towards more value-centric pricing offers significant revenue upside potential for mobile operators, but, without a complete insight into all aspects of smartphone customer behavior, operators will unlikely be able to capture the maximum return from this billion-dollar market opportunity.Keywords: cellular, Wi-Fi, mobile, smart phone
Procedia PDF Downloads 36524198 Data Driven Infrastructure Planning for Offshore Wind farms
Authors: Isha Saxena, Behzad Kazemtabrizi, Matthias C. M. Troffaes, Christopher Crabtree
Abstract:
The calculations done at the beginning of the life of a wind farm are rarely reliable, which makes it important to conduct research and study the failure and repair rates of the wind turbines under various conditions. This miscalculation happens because the current models make a simplifying assumption that the failure/repair rate remains constant over time. This means that the reliability function is exponential in nature. This research aims to create a more accurate model using sensory data and a data-driven approach. The data cleaning and data processing is done by comparing the Power Curve data of the wind turbines with SCADA data. This is then converted to times to repair and times to failure timeseries data. Several different mathematical functions are fitted to the times to failure and times to repair data of the wind turbine components using Maximum Likelihood Estimation and the Posterior expectation method for Bayesian Parameter Estimation. Initial results indicate that two parameter Weibull function and exponential function produce almost identical results. Further analysis is being done using the complex system analysis considering the failures of each electrical and mechanical component of the wind turbine. The aim of this project is to perform a more accurate reliability analysis that can be helpful for the engineers to schedule maintenance and repairs to decrease the downtime of the turbine.Keywords: reliability, bayesian parameter inference, maximum likelihood estimation, weibull function, SCADA data
Procedia PDF Downloads 8624197 Empirical Acceleration Functions and Fuzzy Information
Authors: Muhammad Shafiq
Abstract:
In accelerated life testing approaches life time data is obtained under various conditions which are considered more severe than usual condition. Classical techniques are based on obtained precise measurements, and used to model variation among the observations. In fact, there are two types of uncertainty in data: variation among the observations and the fuzziness. Analysis techniques, which do not consider fuzziness and are only based on precise life time observations, lead to pseudo results. This study was aimed to examine the behavior of empirical acceleration functions using fuzzy lifetimes data. The results showed an increased fuzziness in the transformed life times as compare to the input data.Keywords: acceleration function, accelerated life testing, fuzzy number, non-precise data
Procedia PDF Downloads 298