Search results for: missing data imputation
24783 Syndromic Surveillance Framework Using Tweets Data Analytics
Authors: David Ming Liu, Benjamin Hirsch, Bashir Aden
Abstract:
Syndromic surveillance is to detect or predict disease outbreaks through the analysis of medical sources of data. Using social media data like tweets to do syndromic surveillance becomes more and more popular with the aid of open platform to collect data and the advantage of microblogging text and mobile geographic location features. In this paper, a Syndromic Surveillance Framework is presented with machine learning kernel using tweets data analytics. Influenza and the three cities Abu Dhabi, Al Ain and Dubai of United Arabic Emirates are used as the test disease and trial areas. Hospital cases data provided by the Health Authority of Abu Dhabi (HAAD) are used for the correlation purpose. In our model, Latent Dirichlet allocation (LDA) engine is adapted to do supervised learning classification and N-Fold cross validation confusion matrix are given as the simulation results with overall system recall 85.595% performance achieved.Keywords: Syndromic surveillance, Tweets, Machine Learning, data mining, Latent Dirichlet allocation (LDA), Influenza
Procedia PDF Downloads 11624782 Analysis of Urban Population Using Twitter Distribution Data: Case Study of Makassar City, Indonesia
Authors: Yuyun Wabula, B. J. Dewancker
Abstract:
In the past decade, the social networking app has been growing very rapidly. Geolocation data is one of the important features of social media that can attach the user's location coordinate in the real world. This paper proposes the use of geolocation data from the Twitter social media application to gain knowledge about urban dynamics, especially on human mobility behavior. This paper aims to explore the relation between geolocation Twitter with the existence of people in the urban area. Firstly, the study will analyze the spread of people in the particular area, within the city using Twitter social media data. Secondly, we then match and categorize the existing place based on the same individuals visiting. Then, we combine the Twitter data from the tracking result and the questionnaire data to catch the Twitter user profile. To do that, we used the distribution frequency analysis to learn the visitors’ percentage. To validate the hypothesis, we compare it with the local population statistic data and land use mapping released by the city planning department of Makassar local government. The results show that there is the correlation between Twitter geolocation and questionnaire data. Thus, integration the Twitter data and survey data can reveal the profile of the social media users.Keywords: geolocation, Twitter, distribution analysis, human mobility
Procedia PDF Downloads 31424781 Analysis and Rule Extraction of Coronary Artery Disease Data Using Data Mining
Authors: Rezaei Hachesu Peyman, Oliyaee Azadeh, Salahzadeh Zahra, Alizadeh Somayyeh, Safaei Naser
Abstract:
Coronary Artery Disease (CAD) is one major cause of disability in adults and one main cause of death in developed. In this study, data mining techniques including Decision Trees, Artificial neural networks (ANNs), and Support Vector Machine (SVM) analyze CAD data. Data of 4948 patients who had suffered from heart diseases were included in the analysis. CAD is the target variable, and 24 inputs or predictor variables are used for the classification. The performance of these techniques is compared in terms of sensitivity, specificity, and accuracy. The most significant factor influencing CAD is chest pain. Elderly males (age > 53) have a high probability to be diagnosed with CAD. SVM algorithm is the most useful way for evaluation and prediction of CAD patients as compared to non-CAD ones. Application of data mining techniques in analyzing coronary artery diseases is a good method for investigating the existing relationships between variables.Keywords: classification, coronary artery disease, data-mining, knowledge discovery, extract
Procedia PDF Downloads 65724780 Defective Autophagy Disturbs Neural Migration and Network Activity in hiPSC-Derived Cockayne Syndrome B Disease Models
Authors: Julia Kapr, Andrea Rossi, Haribaskar Ramachandran, Marius Pollet, Ilka Egger, Selina Dangeleit, Katharina Koch, Jean Krutmann, Ellen Fritsche
Abstract:
It is widely acknowledged that animal models do not always represent human disease. Especially human brain development is difficult to model in animals due to a variety of structural and functional species-specificities. This causes significant discrepancies between predicted and apparent drug efficacies in clinical trials and their subsequent failure. Emerging alternatives based on 3D in vitro approaches, such as human brain spheres or organoids, may in the future reduce and ultimately replace animal models. Here, we present a human induced pluripotent stem cell (hiPSC)-based 3D neural in a vitro disease model for the Cockayne Syndrome B (CSB). CSB is a rare hereditary disease and is accompanied by severe neurologic defects, such as microcephaly, ataxia and intellectual disability, with currently no treatment options. Therefore, the aim of this study is to investigate the molecular and cellular defects found in neural hiPSC-derived CSB models. Understanding the underlying pathology of CSB enables the development of treatment options. The two CSB models used in this study comprise a patient-derived hiPSC line and its isogenic control as well as a CSB-deficient cell line based on a healthy hiPSC line (IMR90-4) background thereby excluding genetic background-related effects. Neurally induced and differentiated brain sphere cultures were characterized via RNA Sequencing, western blot (WB), immunocytochemistry (ICC) and multielectrode arrays (MEAs). CSB-deficiency leads to an altered gene expression of markers for autophagy, focal adhesion and neural network formation. Cell migration was significantly reduced and electrical activity was significantly increased in the disease cell lines. These data hint that the cellular pathologies is possibly underlying CSB. By induction of autophagy, the migration phenotype could be partially rescued, suggesting a crucial role of disturbed autophagy in defective neural migration of the disease lines. Altered autophagy may also lead to inefficient mitophagy. Accordingly, disease cell lines were shown to have a lower mitochondrial base activity and a higher susceptibility to mitochondrial stress induced by rotenone. Since mitochondria play an important role in neurotransmitter cycling, we suggest that defective mitochondria may lead to altered electrical activity in the disease cell lines. Failure to clear the defective mitochondria by mitophagy and thus missing initiation cues for new mitochondrial production could potentiate this problem. With our data, we aim at establishing a disease adverse outcome pathway (AOP), thereby adding to the in-depth understanding of this multi-faced disorder and subsequently contributing to alternative drug development.Keywords: autophagy, disease modeling, in vitro, pluripotent stem cells
Procedia PDF Downloads 12024779 Sensor Data Analysis for a Large Mining Major
Authors: Sudipto Shanker Dasgupta
Abstract:
One of the largest mining companies wanted to look at health analytics for their driverless trucks. These trucks were the key to their supply chain logistics. The automated trucks had multi-level sub-assemblies which would send out sensor information. The use case that was worked on was to capture the sensor signal from the truck subcomponents and analyze the health of the trucks from repair and replacement purview. Open source software was used to stream the data into a clustered Hadoop setup in Amazon Web Services cloud and Apache Spark SQL was used to analyze the data. All of this was achieved through a 10 node amazon 32 core, 64 GB RAM setup real-time analytics was achieved on ‘300 million records’. To check the scalability of the system, the cluster was increased to 100 node setup. This talk will highlight how Open Source software was used to achieve the above use case and the insights on the high data throughput on a cloud set up.Keywords: streaming analytics, data science, big data, Hadoop, high throughput, sensor data
Procedia PDF Downloads 40424778 Data-Centric Anomaly Detection with Diffusion Models
Authors: Sheldon Liu, Gordon Wang, Lei Liu, Xuefeng Liu
Abstract:
Anomaly detection, also referred to as one-class classification, plays a crucial role in identifying product images that deviate from the expected distribution. This study introduces Data-centric Anomaly Detection with Diffusion Models (DCADDM), presenting a systematic strategy for data collection and further diversifying the data with image generation via diffusion models. The algorithm addresses data collection challenges in real-world scenarios and points toward data augmentation with the integration of generative AI capabilities. The paper explores the generation of normal images using diffusion models. The experiments demonstrate that with 30% of the original normal image size, modeling in an unsupervised setting with state-of-the-art approaches can achieve equivalent performances. With the addition of generated images via diffusion models (10% equivalence of the original dataset size), the proposed algorithm achieves better or equivalent anomaly localization performance.Keywords: diffusion models, anomaly detection, data-centric, generative AI
Procedia PDF Downloads 8224777 Regulation on the Protection of Personal Data Versus Quality Data Assurance in the Healthcare System Case Report
Authors: Elizabeta Krstić Vukelja
Abstract:
Digitization of personal data is a consequence of the development of information and communication technologies that create a new work environment with many advantages and challenges, but also potential threats to privacy and personal data protection. Regulation (EU) 2016/679 of the European Parliament and of the Council is becoming a law and obligation that should address the issues of personal data protection and information security. The existence of the Regulation leads to the conclusion that national legislation in the field of virtual environment, protection of the rights of EU citizens and processing of their personal data is insufficiently effective. In the health system, special emphasis is placed on the processing of special categories of personal data, such as health data. The healthcare industry is recognized as a particularly sensitive area in which a large amount of medical data is processed, the digitization of which enables quick access and quick identification of the health insured. The protection of the individual requires quality IT solutions that guarantee the technical protection of personal categories. However, the real problems are the technical and human nature and the spatial limitations of the application of the Regulation. Some conclusions will be drawn by analyzing the implementation of the basic principles of the Regulation on the example of the Croatian health care system and comparing it with similar activities in other EU member states.Keywords: regulation, healthcare system, personal dana protection, quality data assurance
Procedia PDF Downloads 3824776 Parallel Vector Processing Using Multi Level Orbital DATA
Authors: Nagi Mekhiel
Abstract:
Many applications use vector operations by applying single instruction to multiple data that map to different locations in conventional memory. Transferring data from memory is limited by access latency and bandwidth affecting the performance gain of vector processing. We present a memory system that makes all of its content available to processors in time so that processors need not to access the memory, we force each location to be available to all processors at a specific time. The data move in different orbits to become available to other processors in higher orbits at different time. We use this memory to apply parallel vector operations to data streams at first orbit level. Data processed in the first level move to upper orbit one data element at a time, allowing a processor in that orbit to apply another vector operation to deal with serial code limitations inherited in all parallel applications and interleaved it with lower level vector operations.Keywords: Memory Organization, Parallel Processors, Serial Code, Vector Processing
Procedia PDF Downloads 27024775 Context-Aware Alert Method in Hajj Pilgrim Location-Based Tracking System
Authors: Syarif Hidayat
Abstract:
As millions of people with different backgrounds perform hajj every year in Saudi Arabia, it brings out several problems. Missing people is among many crucial problems need to be encountered. Some people might have had insufficient knowledge of using tracking system equipment. Other might become a victim of an accident, lose consciousness, or even died, prohibiting them to perform certain activity. For those reasons, people could not send proper SOS message. The major contribution of this paper is the application of the diverse alert method in pilgrims tracking system. It offers a simple yet robust solution to send SOS message by pilgrims during Hajj. Knowledge of context aware computing is assumed herein. This study presents four methods that could be utilized by pilgrims to send SOS. The first method is simple mobile application contains only a button. The second method is based on behavior analysis based off GPS location movement anomaly. The third method is by introducing pressing pattern to smartwatch physical button as a panic button. The fourth method is by identifying certain accelerometer pattern recognition as a sign of emergency situations. Presented method in this paper would be an important part of pilgrims tracking system. The discussion provided here includes easy to use design whilst maintaining tracking accuracy, privacy, and security of its users.Keywords: context aware computing, emergency alert system, GPS, hajj pilgrim tracking, location-based services
Procedia PDF Downloads 21624774 Reconstructability Analysis for Landslide Prediction
Authors: David Percy
Abstract:
Landslides are a geologic phenomenon that affects a large number of inhabited places and are constantly being monitored and studied for the prediction of future occurrences. Reconstructability analysis (RA) is a methodology for extracting informative models from large volumes of data that work exclusively with discrete data. While RA has been used in medical applications and social science extensively, we are introducing it to the spatial sciences through applications like landslide prediction. Since RA works exclusively with discrete data, such as soil classification or bedrock type, working with continuous data, such as porosity, requires that these data are binned for inclusion in the model. RA constructs models of the data which pick out the most informative elements, independent variables (IVs), from each layer that predict the dependent variable (DV), landslide occurrence. Each layer included in the model retains its classification data as a primary encoding of the data. Unlike other machine learning algorithms that force the data into one-hot encoding type of schemes, RA works directly with the data as it is encoded, with the exception of continuous data, which must be binned. The usual physical and derived layers are included in the model, and testing our results against other published methodologies, such as neural networks, yields accuracy that is similar but with the advantage of a completely transparent model. The results of an RA session with a data set are a report on every combination of variables and their probability of landslide events occurring. In this way, every combination of informative state combinations can be examined.Keywords: reconstructability analysis, machine learning, landslides, raster analysis
Procedia PDF Downloads 6524773 Data Analytics in Hospitality Industry
Authors: Tammy Wee, Detlev Remy, Arif Perdana
Abstract:
In the recent years, data analytics has become the buzzword in the hospitality industry. The hospitality industry is another example of a data-rich industry that has yet fully benefited from the insights of data analytics. Effective use of data analytics can change how hotels operate, market and position themselves competitively in the hospitality industry. However, at the moment, the data obtained by individual hotels remain under-utilized. This research is a preliminary research on data analytics in the hospitality industry, using an in-depth face-to-face interview on one hotel as a start to a multi-level research. The main case study of this research, hotel A, is a chain brand of international hotel that has been systematically gathering and collecting data on its own customer for the past five years. The data collection points begin from the moment a guest book a room until the guest leave the hotel premises, which includes room reservation, spa booking, and catering. Although hotel A has been gathering data intelligence on its customer for some time, they have yet utilized the data to its fullest potential, and they are aware of their limitation as well as the potential of data analytics. Currently, the utilization of data analytics in hotel A is limited in the area of customer service improvement, namely to enhance the personalization of service for each individual customer. Hotel A is able to utilize the data to improve and enhance their service which in turn, encourage repeated customers. According to hotel A, 50% of their guests returned to their hotel, and 70% extended nights because of the personalized service. Apart from using the data analytics for enhancing customer service, hotel A also uses the data in marketing. Hotel A uses the data analytics to predict or forecast the change in consumer behavior and demand, by tracking their guest’s booking preference, payment preference and demand shift between properties. However, hotel A admitted that the data they have been collecting was not fully utilized due to two challenges. The first challenge of using data analytics in hotel A is the data is not clean. At the moment, the data collection of one guest profile is meaningful only for one department in the hotel but meaningless for another department. Cleaning up the data and getting standards correctly for usage by different departments are some of the main concerns of hotel A. The second challenge of using data analytics in hotel A is the non-integral internal system. At the moment, the internal system used by hotel A do not integrate with each other well, limiting the ability to collect data systematically. Hotel A is considering another system to replace the current one for more comprehensive data collection. Hotel proprietors recognized the potential of data analytics as reported in this research, however, the current challenges of implementing a system to collect data come with a cost. This research has identified the current utilization of data analytics and the challenges faced when it comes to implementing data analytics.Keywords: data analytics, hospitality industry, customer relationship management, hotel marketing
Procedia PDF Downloads 17924772 Realization of a (GIS) for Drilling (DWS) through the Adrar Region
Authors: Djelloul Benatiallah, Ali Benatiallah, Abdelkader Harouz
Abstract:
Geographic Information Systems (GIS) include various methods and computer techniques to model, capture digitally, store, manage, view and analyze. Geographic information systems have the characteristic to appeal to many scientific and technical field, and many methods. In this article we will present a complete and operational geographic information system, following the theoretical principles of data management and adapting to spatial data, especially data concerning the monitoring of drinking water supply wells (DWS) Adrar region. The expected results of this system are firstly an offer consulting standard features, updating and editing beneficiaries and geographical data, on the other hand, provides specific functionality contractors entered data, calculations parameterized and statistics.Keywords: GIS, DWS, drilling, Adrar
Procedia PDF Downloads 30924771 Oral Health of Tobacco Chewers: A Cross-Sectional Study in Karachi, Pakistan
Authors: Warsi A. Ibrahim, Qureshi A. Ambrina, Younus M. Anjum
Abstract:
Introduction: Oral lesions related to commercially available Smokeless Tobacco (ST), such as, Pan, Gutka, Mahwa, Naswar is considered a serious challenge for dental health care providers in Pakistan. Majority of labored Pakistani population consume ST, where public transporters and drivers are no exception. It was necessary to identify individuals of this particular population group and screen their oral health and early signs of pre-cancerous lesions so that appropriate preventive measures could be taken to reduce the burden on health providers. Aim of Study: To estimate Prevalence of ST consumption and perception of use, and to evaluate Oral Health status among public drivers of Karachi. Material & methods: A cross-sectional study survey was conducted over duration of 2 months, through convenient sampling. Sample size (n=615) of public drivers (age > 18 years) all over Karachi was gathered. A structured proforma was used to record socio-demographics, addiction profile, perception of use and oral health status (oral lesions, oral sub-mucosal fibrosis and dental caries) of study participants. Data was entered and analyzed using SPSS version 16.0 using descriptive statistics only. Results: Prevalence of ST consumption among the study participants was figured to 92.5%. Out of these almost 70% suffered from one or the other form of oral lesion(s). Four major types of ST consumption were observed out of which 60 % of oral lesion were related to Gutka chewers showing early signs of oral cancer. In addition, occurrence of Oral sub-mucosal fibrosis (OSF) was found to be significantly high around 54.8%. Overall dental caries status was also high, showing on an average 5 teeth of an individual were decayed, missing or filled deviating from WHO normal criteria (mean < 3). It was thus proven from the study that public drivers relied on oral tobacco consumption because it helps them ‘Improve consciousness’ (p-value: < 0.01; using chi-square test). Multivariate analysis showed that there were higher prevalence of smokeless tobacco among highway drivers versus local drivers (A.O.R: 2.82 [0.83-9.61], p-value: < 0.01) Conclusion: Smokeless tobacco (ST) consumption has a direct effect on oral health. However, the type of ST, the duration of consumption are factors which are directly related to the severity. Moreover, Gutka may be considered as having most lethal effects on oral health which may lead to oral cancer and affect individual’s quality of life. Specific preventive programs must be undertaken to reduce the consumption of Gutka among public transporters and drivers.Keywords: smokeless tobacco, oral lesions, drivers, public transporters
Procedia PDF Downloads 30824770 Generic Data Warehousing for Consumer Electronics Retail Industry
Authors: S. Habte, K. Ouazzane, P. Patel, S. Patel
Abstract:
The dynamic and highly competitive nature of the consumer electronics retail industry means that businesses in this industry are experiencing different decision making challenges in relation to pricing, inventory control, consumer satisfaction and product offerings. To overcome the challenges facing retailers and create opportunities, we propose a generic data warehousing solution which can be applied to a wide range of consumer electronics retailers with a minimum configuration. The solution includes a dimensional data model, a template SQL script, a high level architectural descriptions, ETL tool developed using C#, a set of APIs, and data access tools. It has been successfully applied by ASK Outlets Ltd UK resulting in improved productivity and enhanced sales growth.Keywords: consumer electronics, data warehousing, dimensional data model, generic, retail industry
Procedia PDF Downloads 41124769 Sequential Data Assimilation with High-Frequency (HF) Radar Surface Current
Authors: Lei Ren, Michael Hartnett, Stephen Nash
Abstract:
The abundant measured surface current from HF radar system in coastal area is assimilated into model to improve the modeling forecasting ability. A simple sequential data assimilation scheme, Direct Insertion (DI), is applied to update model forecast states. The influence of Direct Insertion data assimilation over time is analyzed at one reference point. Vector maps of surface current from models are compared with HF radar measurements. Root-Mean-Squared-Error (RMSE) between modeling results and HF radar measurements is calculated during the last four days with no data assimilation.Keywords: data assimilation, CODAR, HF radar, surface current, direct insertion
Procedia PDF Downloads 57324768 Measured versus Default Interstate Traffic Data in New Mexico, USA
Authors: M. A. Hasan, M. R. Islam, R. A. Tarefder
Abstract:
This study investigates how the site specific traffic data differs from the Mechanistic Empirical Pavement Design Software default values. Two Weigh-in-Motion (WIM) stations were installed in Interstate-40 (I-40) and Interstate-25 (I-25) to developed site specific data. A computer program named WIM Data Analysis Software (WIMDAS) was developed using Microsoft C-Sharp (.Net) for quality checking and processing of raw WIM data. A complete year data from November 2013 to October 2014 was analyzed using the developed WIM Data Analysis Program. After that, the vehicle class distribution, directional distribution, lane distribution, monthly adjustment factor, hourly distribution, axle load spectra, average number of axle per vehicle, axle spacing, lateral wander distribution, and wheelbase distribution were calculated. Then a comparative study was done between measured data and AASHTOWare default values. It was found that the measured general traffic inputs for I-40 and I-25 significantly differ from the default values.Keywords: AASHTOWare, traffic, weigh-in-motion, axle load distribution
Procedia PDF Downloads 34324767 Design of Knowledge Management System with Geographic Information System
Authors: Angga Hidayah Ramadhan, Luciana Andrawina, M. Azani Hasibuan
Abstract:
Data will be as a core of the decision if it has a good treatment or process, which is process that data into information, and information into knowledge to make a wisdom or decision. Today, many companies have not realize it include XYZ University Admission Directorate as executor of National Admission called Seleksi Masuk Bersama (SMB) that during the time, the workers only uses their feeling to make a decision. Whereas if it done, then that company can analyze the data to make a right decision to get a pin sales from student candidate or registrant that follow SMB as many as possible. Therefore, needs Knowledge Management System (KMS) with Geographic Information System (GIS) use 5C4C that can process that company data becomes more useful and can help make decisions. This information system can process data into information based on the pin sold data with 5C (Contextualized, Categorize, Calculation, Correction, Condensed) and convert information into knowledge with 4C (Comparing, Consequence, Connection, Conversation) that has been several steps until these data can be useful to make easier to take a decision or wisdom, resolve problems, communicate, and quicker to learn to the employees have not experience and also for ease of viewing/visualization based on spatial data that equipped with GIS functionality that can be used to indicate events in each province with indicator that facilitate in this system. The system also have a function to save the tacit on the system then to be proceed into explicit in expert system based on the problems that will be found from the consequences of information. With the system each team can make a decision with same ways, structured, and the important is based on the actual event/data.Keywords: 5C4C, data, information, knowledge
Procedia PDF Downloads 46124766 Clinicians' and Nurses' Documentation Practices in Palliative and Hospice Care: A Mixed Methods Study Providing Evidence for Quality Improvement at Mobile Hospice Mbarara, Uganda
Authors: G. Natuhwera, M. Rabwoni, P. Ellis, A. Merriman
Abstract:
Aims: Health workers are likely to document patients’ care inaccurately, especially when using new and revised case tools, and this could negatively impact patient care. This study set out to; (1) assess nurses’ and clinicians’ documentation practices when using a new patients’ continuation case sheet (PCCS) and (2) explore nurses’ and clinicians’ experiences regarding documentation of patients’ information in the new PCCS. The purpose of introducing the PCCS was to improve continuity of care for patients attending clinics at which they were unlikely to see the same clinician or nurse consistently. Methods: This was a mixed methods study. The cross-sectional inquiry retrospectively reviewed 100 case notes of active patients on hospice and palliative care program. Data was collected using a structured questionnaire with constructs formulated from the new PCCS under study. The qualitative element was face-to-face audio-recorded, open-ended interviews with a purposive sample of one palliative care clinician, and four palliative care nurse specialists. Thematic analysis was used. Results: Missing patients’ biogeographic information was prevalent at 5-10%. Spiritual and psychosocial issues were not documented in 42.6%, and vital signs in 49.2%. Poorest documentation practices were observed in past medical history part of the PCCS at 40-63%. Four themes emerged from interviews with clinicians and nurses-; (1) what remains unclear and challenges, (2) comparing the past with the present, (3) experiential thoughts, and (4) transition and adapting to change. Conclusions: The PCCS seems to be a comprehensive and simple tool to be used to document patients’ information at subsequent visits. The comprehensiveness and utility of the PCCS does paper to be limited by the failure to train staff in its use prior to introducing. The authors find the PCCS comprehensive and suitable to capture patients’ information and recommend it can be adopted and used in other palliative and hospice care settings, if suitable introductory training accompanies its introduction. Otherwise, the reliability and validity of patients’ information collected by this PCCS can be significantly reduced if some sections therein are unclear to the clinicians/nurses. The study identified clinicians- and nurses-related pitfalls in documentation of patients’ care. Clinicians and nurses need to prioritize accurate and complete documentation of patient care in the PCCS for quality care provision. This study should be extended to other sites using similar tools to ensure representative and generalizable findings.Keywords: documentation, information case sheet, palliative care, quality improvement
Procedia PDF Downloads 15124765 A Policy Strategy for Building Energy Data Management in India
Authors: Shravani Itkelwar, Deepak Tewari, Bhaskar Natarajan
Abstract:
The energy consumption data plays a vital role in energy efficiency policy design, implementation, and impact assessment. Any demand-side energy management intervention's success relies on the availability of accurate, comprehensive, granular, and up-to-date data on energy consumption. The Building sector, including residential and commercial, is one of the largest consumers of energy in India after the Industrial sector. With economic growth and increasing urbanization, the building sector is projected to grow at an unprecedented rate, resulting in a 5.6 times escalation in energy consumption till 2047 compared to 2017. Therefore, energy efficiency interventions will play a vital role in decoupling the floor area growth and associated energy demand, thereby increasing the need for robust data. In India, multiple institutions are involved in the collection and dissemination of data. This paper focuses on energy consumption data management in the building sector in India for both residential and commercial segments. It evaluates the robustness of data available through administrative and survey routes to estimate the key performance indicators and identify critical data gaps for making informed decisions. The paper explores several issues in the data, such as lack of comprehensiveness, non-availability of disaggregated data, the discrepancy in different data sources, inconsistent building categorization, and others. The identified data gaps are justified with appropriate examples. Moreover, the paper prioritizes required data in order of relevance to policymaking and groups it into "available," "easy to get," and "hard to get" categories. The paper concludes with recommendations to address the data gaps by leveraging digital initiatives, strengthening institutional capacity, institutionalizing exclusive building energy surveys, and standardization of building categorization, among others, to strengthen the management of building sector energy consumption data.Keywords: energy data, energy policy, energy efficiency, buildings
Procedia PDF Downloads 18524764 A Survey on Data-Centric and Data-Aware Techniques for Large Scale Infrastructures
Authors: Silvina Caíno-Lores, Jesús Carretero
Abstract:
Large scale computing infrastructures have been widely developed with the core objective of providing a suitable platform for high-performance and high-throughput computing. These systems are designed to support resource-intensive and complex applications, which can be found in many scientific and industrial areas. Currently, large scale data-intensive applications are hindered by the high latencies that result from the access to vastly distributed data. Recent works have suggested that improving data locality is key to move towards exascale infrastructures efficiently, as solutions to this problem aim to reduce the bandwidth consumed in data transfers, and the overheads that arise from them. There are several techniques that attempt to move computations closer to the data. In this survey we analyse the different mechanisms that have been proposed to provide data locality for large scale high-performance and high-throughput systems. This survey intends to assist scientific computing community in understanding the various technical aspects and strategies that have been reported in recent literature regarding data locality. As a result, we present an overview of locality-oriented techniques, which are grouped in four main categories: application development, task scheduling, in-memory computing and storage platforms. Finally, the authors include a discussion on future research lines and synergies among the former techniques.Keywords: data locality, data-centric computing, large scale infrastructures, cloud computing
Procedia PDF Downloads 25924763 Assessing the Impact of Human Behaviour on Water Resource Systems Performance: A Conceptual Framework
Authors: N. J. Shanono, J. G. Ndiritu
Abstract:
The poor performance of water resource systems (WRS) has been reportedly linked to not only climate variability and the water demand dynamics but also human behaviour-driven unlawful activities. Some of these unlawful activities that have been adversely affecting water sector include unauthorized water abstractions, water wastage behaviour, refusal of water re‐use measures, excessive operational losses, discharging untreated or improperly treated wastewater, over‐application of chemicals by agricultural users and fraudulent WRS operation. Despite advances in WRS planning, operation, and analysis incorporating such undesirable human activities to quantitatively assess their impact on WRS performance remain elusive. This study was then inspired by the need to develop a methodological framework for WRS performance assessment that integrates the impact of human behaviour with WRS performance assessment analysis. We, therefore, proposed a conceptual framework for assessing the impact of human behaviour on WRS performance using the concept of socio-hydrology. The framework identifies and couples four major sources of WRS-related values (water values, water systems, water managers, and water users) using three missing links between human and water in the management of WRS (interactions, outcomes, and feedbacks). The framework is to serve as a database for choosing relevant social and hydrological variables and to understand the intrinsic relations between the selected variables to study a specific human-water problem in the context of WRS management.Keywords: conceptual framework, human behaviour; socio-hydrology; water resource systems
Procedia PDF Downloads 13524762 Industrial Process Mining Based on Data Pattern Modeling and Nonlinear Analysis
Authors: Hyun-Woo Cho
Abstract:
Unexpected events may occur with serious impacts on industrial process. This work utilizes a data representation technique to model and to analyze process data pattern for the purpose of diagnosis. In this work, the use of triangular representation of process data is evaluated using simulation process. Furthermore, the effect of using different pre-treatment techniques based on such as linear or nonlinear reduced spaces was compared. This work extracted the fault pattern in the reduced space, not in the original data space. The results have shown that the non-linear technique based diagnosis method produced more reliable results and outperforms linear method.Keywords: process monitoring, data analysis, pattern modeling, fault, nonlinear techniques
Procedia PDF Downloads 38724761 Environmental Effect of Empty Nest Households in Germany: An Empirical Approach
Authors: Dominik Kowitzke
Abstract:
Housing constructions have direct and indirect environmental impacts especially caused by soil sealing and gray energy consumption related to the use of construction materials. Accordingly, the German government introduced regulations limiting additional annual soil sealing. At the same time, in many regions like metropolitan areas the demand for further housing is high and of current concern in the media and politics. It is argued that meeting this demand by making better use of the existing housing supply is more sustainable than the construction of new housing units. In this context, targeting the phenomenon of so-called over the housing of empty nest households seems worthwhile to investigate for its potential to free living space and thus, reduce the need for new housing constructions and related environmental harm. Over housing occurs if no space adjustment takes place in household lifecycle stages when children move out from home and the space formerly created for the offspring is from then on under-utilized. Although in some cases the housing space consumption might actually meet households’ equilibrium preferences, frequently space-wise adjustments to the living situation doesn’t take place due to transaction or information costs, habit formation, or government intervention leading to increasing costs of relocations like real estate transfer taxes or tenant protection laws keeping tenure rents below the market price. Moreover, many detached houses are not long-term designed in a way that freed up space could be rent out. Findings of this research based on socio-economic survey data, indeed, show a significant difference between the living space of empty nest and a comparison group of households which never had children. The approach used to estimate the average difference in living space is a linear regression model regressing the response variable living space on a two-dimensional categorical variable distinguishing the two groups of household types and further controls. This difference is assumed to be the under-utilized space and is extrapolated to the total amount of empty nests in the population. Supporting this result, it is found that households that move, despite market frictions impairing the relocation, after children left their home tend to decrease the living space. In the next step, only for areas with tight housing markets in Germany and high construction activity, the total under-utilized space in empty nests is estimated. Under the assumption of full substitutability of housing space in empty nests and space in new dwellings in these locations, it is argued that in a perfect market with empty nest households consuming their equilibrium demand quantity of housing space, dwelling constructions in the amount of the excess consumption of living space could be saved. This, on the other hand, would prevent environmental harm quantified in carbon dioxide equivalence units related to average constructions of detached or multi-family houses. This study would thus provide information on the amount of under-utilized space inside dwellings which is missing in public data and further estimates the external effect of over housing in environmental terms.Keywords: empty nests, environment, Germany, households, over housing
Procedia PDF Downloads 17124760 Studies on Phylogeny of Helicoverpa armigera Populations from North Western Himalaya Region with Help of Cytochromeoxidase I Sequence
Authors: R. M. Srivastava, Subbanna A.R.N.S, Md Abbas Ahmad, S. P.More, Shivashankar, B. Kalyanbabu
Abstract:
The similar morphology associated with high genetic variability poses problems in phylogenetic studies of Helicoverpa armigera (Hubner). To identify genetic variation of North Western Himalayan population’s, partial (Mid to terminal region) cytochrome c oxidase subunit I (COX-1) gene was amplified and sequenced for three populations collected from Pantnagar, Almora, and Chinyalisaur. The alignment of sequences with other two populations, Nagpur representing central India population and Anhui, China representing complete COX-1 sequence revealed unanimity in middle region with eleven single nucleotide polymorphisms (SNPs) in Nagpur populations. However, the consensus is missing when approaching towards terminal region, which is associated with 15 each SNPs and pair base substitutions in Chinyalisaur populations. In minimum evolution tree, all the five populations were majorly separated into two clades, one comprising of only Nagpur population and the other with rest. Amongst, North Western populations, Chinyalisaur one is promising by farming a separate clade. The pairwise genetic distance ranges from 0.025 to 0.192 with the maximum between H. armigera populations of Nagpur and Chinyalisaur. This genetic isolation of populations can be attributed to a key role of topological barriers of weather and mountain ranges and temporal barriers due to cropping patterns.Keywords: cytochrome c oxidase subunit I, northwestern Himalayan population, Helicoverpa armigera (Noctuidae: Lepidoptera), phylogenetic relationship, genetic variation
Procedia PDF Downloads 30924759 Reversibility of Photosynthetic Activity and Pigment-protein Complexes Expression During Seed Development of Soybean and Black Soybean
Authors: Tzan-Chain Lee
Abstract:
Seeds are non-leaves green tissues. Photosynthesis begins with light absorption by chlorophyll and then the energy transfer between two pigment-protein complexes (PPC). Most studies of photosynthesis and PPC expression were focused on leaves; however, during seeds’ development were rare. Developed seeds from beginning pod (stage R3) to dried seed (stage R8), and the dried seed after sowing for 1-4 day, were analyzed for their chlorophyll contents. Thornber and MARS gel systems analysis compositions of PPC. Chlorophyll fluorescence was used to detect maximal photosynthetic efficiency (Fv/Fm). During soybean and black soybean seeds development (stages R3-R6), Fv/Fm up to 0.8, and then down-regulated after full seed (stage R7). In dried seed (stage R8), the two plant seeds lost photosynthetic activity (Fv/Fm=0), but chlorophyll degradation only occurred in soybean after full seed. After seeds sowing for 4 days, chlorophyll drastically increased in soybean seeds, and Fv/Fm recovered to 0.8 in the two seeds. In PPC, the two soybean seeds contained all PPC during seeds development (stages R3-R6), including CPI, CPII, A1, AB1, AB2, and AB3. However, many proteins A1, AB1, AB2, and CPI were totally missing in the two dried seeds (stage R8). The deficiency of these proteins in dried seeds might be caused by the incomplete photosynthetic activity. After seeds germination and seedling exposed to light for 4 days, all PPC were recovered, suggesting that completed PPC took place in the two soybean seeds. This study showed the reversibility of photosynthetic activity and pigment-protein complexes during soybean and black soybean seeds development.Keywords: light-harvesting complex, pigment–protein complexes, soybean cotyledon, grana development
Procedia PDF Downloads 14824758 Recommender System Based on Mining Graph Databases for Data-Intensive Applications
Authors: Mostafa Gamal, Hoda K. Mohamed, Islam El-Maddah, Ali Hamdi
Abstract:
In recent years, many digital documents on the web have been created due to the rapid growth of ’social applications’ communities or ’Data-intensive applications’. The evolution of online-based multimedia data poses new challenges in storing and querying large amounts of data for online recommender systems. Graph data models have been shown to be more efficient than relational data models for processing complex data. This paper will explain the key differences between graph and relational databases, their strengths and weaknesses, and why using graph databases is the best technology for building a realtime recommendation system. Also, The paper will discuss several similarity metrics algorithms that can be used to compute a similarity score of pairs of nodes based on their neighbourhoods or their properties. Finally, the paper will discover how NLP strategies offer the premise to improve the accuracy and coverage of realtime recommendations by extracting the information from the stored unstructured knowledge, which makes up the bulk of the world’s data to enrich the graph database with this information. As the size and number of data items are increasing rapidly, the proposed system should meet current and future needs.Keywords: graph databases, NLP, recommendation systems, similarity metrics
Procedia PDF Downloads 10424757 Digital Revolution a Veritable Infrastructure for Technological Development
Authors: Osakwe Jude Odiakaosa
Abstract:
Today’s digital society is characterized by e-education or e-learning, e-commerce, and so on. All these have been propelled by digital revolution. Digital technology such as computer technology, Global Positioning System (GPS) and Geographic Information System (GIS) has been having a tremendous impact on the field of technology. This development has positively affected the scope, methods, speed of data acquisition, data management and the rate of delivery of the results (map and other map products) of data processing. This paper tries to address the impact of revolution brought by digital technology.Keywords: digital revolution, internet, technology, data management
Procedia PDF Downloads 44924756 Majority through the Eyes of Minority: The Role of Social Norms in the Link between Intergroup Contact and Attitudes of the Roma toward Majority Society
Authors: Roman Koky, Sylvie Graf
Abstract:
The relationship between the Roma and members of the majority is tense across Europe due to the fact that the Roma people are the most stigmatized minorities. Studies show that Roma is discriminated against on all levels of society. Improving intergroup relations between the Roma and members of the majority (i.e., non-Roma) is thus one of the most pressing issues of social psychological research. Intergroup contact theory is one of the most effective strategies for improving intergroup relations. However, current research has some limitations, such as the fact that most researchers focus primarily on the perspective of the majority, while the perspective of minorities (e.g., the Roma) is largely missing. Due to the persisting segregation of Roma, and thus the lack of opportunities for direct intergroup contact between the Roma and the majority, using direct intergroup contact as an intervention to reduce prejudice is difficult. In this research, we, therefore, focused on the effect of indirect forms of intergroup contact, particularly extended contact (i.e., experiences with outgroup members shared by fellow ingroup members such as friends or family). Extended contact functions as a descriptive social norm that informs about the actual amount of contact in one’s environment. In a group of Czech Roma (N = 226), the descriptive social norm was associated with ingroup injunctive social norm (e.g., the perceived support of intergroup contact with non-Roma by fellow ingroup members) and lower amount of prejudice toward the non-Roma. We discuss the findings with respect to possibilities to improve the relations between Roma and members of the majority across Europe.Keywords: intergroup contact, prejudice, majority, minority, social norms
Procedia PDF Downloads 11424755 BigCrypt: A Probable Approach of Big Data Encryption to Protect Personal and Business Privacy
Authors: Abdullah Al Mamun, Talal Alkharobi
Abstract:
As data size is growing up, people are became more familiar to store big amount of secret information into cloud storage. Companies are always required to need transfer massive business files from one end to another. We are going to lose privacy if we transmit it as it is and continuing same scenario repeatedly without securing the communication mechanism means proper encryption. Although asymmetric key encryption solves the main problem of symmetric key encryption but it can only encrypt limited size of data which is inapplicable for large data encryption. In this paper we propose a probable approach of pretty good privacy for encrypt big data using both symmetric and asymmetric keys. Our goal is to achieve encrypt huge collection information and transmit it through a secure communication channel for committing the business and personal privacy. To justify our method an experimental dataset from three different platform is provided. We would like to show that our approach is working for massive size of various data efficiently and reliably.Keywords: big data, cloud computing, cryptography, hadoop, public key
Procedia PDF Downloads 32024754 Implementation of Big Data Concepts Led by the Business Pressures
Authors: Snezana Savoska, Blagoj Ristevski, Violeta Manevska, Zlatko Savoski, Ilija Jolevski
Abstract:
Big data is widely accepted by the pharmaceutical companies as a result of business demands create through legal pressure. Pharmaceutical companies have many legal demands as well as standards’ demands and have to adapt their procedures to the legislation. To manage with these demands, they have to standardize the usage of the current information technology and use the latest software tools. This paper highlights some important aspects of experience with big data projects implementation in a pharmaceutical Macedonian company. These projects made improvements of their business processes by the help of new software tools selected to comply with legal and business demands. They use IT as a strategic tool to obtain competitive advantage on the market and to reengineer the processes towards new Internet economy and quality demands. The company is required to manage vast amounts of structured as well as unstructured data. For these reasons, they implement projects for emerging and appropriate software tools which have to deal with big data concepts accepted in the company.Keywords: big data, unstructured data, SAP ERP, documentum
Procedia PDF Downloads 271