Search results for: interval estimation

5 Towards Dynamic Estimation of Residential Building Energy Consumption in Germany: Leveraging Machine Learning and Public Data from England and Wales

Authors: Philipp Sommer, Amgad Agoub

Abstract:

The construction sector significantly impacts global CO₂ emissions, particularly through the energy usage of residential buildings. To address this, various governments, including Germany's, are focusing on reducing emissions via sustainable refurbishment initiatives. This study examines the application of machine learning (ML) to estimate energy demands dynamically in residential buildings and enhance the potential for large-scale sustainable refurbishment. A major challenge in Germany is the lack of extensive publicly labeled datasets for energy performance, as energy performance certificates, which provide critical data on building-specific energy requirements and consumption, are not available for all buildings or require on-site inspections. Conversely, England and other countries in the European Union (EU) have rich public datasets, providing a viable alternative for analysis. This research adapts insights from these English datasets to the German context by developing a comprehensive data schema and calibration dataset capable of predicting building energy demand effectively. The study proposes a minimal feature set, determined through feature importance analysis, to optimize the ML model. Findings indicate that ML significantly improves the scalability and accuracy of energy demand forecasts, supporting more effective emissions reduction strategies in the construction industry. Integrating energy performance certificates into municipal heat planning in Germany highlights the transformative impact of data-driven approaches on environmental sustainability. The goal is to identify and utilize key features from open data sources that significantly influence energy demand, creating an efficient forecasting model. Using Extreme Gradient Boosting (XGB) and data from energy performance certificates, effective features such as building type, year of construction, living space, insulation level, and building materials were incorporated. These were supplemented by data derived from descriptions of roofs, walls, windows, and floors, integrated into three datasets. The emphasis was on features accessible via remote sensing, which, along with other correlated characteristics, greatly improved the model's accuracy. The model was further validated using SHapley Additive exPlanations (SHAP) values and aggregated feature importance, which quantified the effects of individual features on the predictions. The refined model using remote sensing data showed a coefficient of determination (R²) of 0.64 and a mean absolute error (MAE) of 4.12, indicating predictions based on efficiency class 1-100 (G-A) may deviate by 4.12 points. This R² increased to 0.84 with the inclusion of more samples, with wall type emerging as the most predictive feature. After optimizing and incorporating related features like estimated primary energy consumption, the R² score for the training and test set reached 0.94, demonstrating good generalization. The study concludes that ML models significantly improve prediction accuracy over traditional methods, illustrating the potential of ML in enhancing energy efficiency analysis and planning. This supports better decision-making for energy optimization and highlights the benefits of developing and refining data schemas using open data to bolster sustainability in the building sector. The study underscores the importance of supporting open data initiatives to collect similar features and support the creation of comparable models in Germany, enhancing the outlook for environmental sustainability.

Keywords: machine learning, remote sensing, residential building, energy performance certificates, data-driven, heat planning

Procedia PDF Downloads 57

4 Modelling Farmer’s Perception and Intention to Join Cashew Marketing Cooperatives: An Expanded Version of the Theory of Planned Behaviour

Authors: Gospel Iyioku, Jana Mazancova, Jiri Hejkrlik

Abstract:

The “Agricultural Promotion Policy (2016–2020)” represents a strategic initiative by the Nigerian government to address domestic food shortages and the challenges in exporting products at the required quality standards. Hindered by an inefficient system for setting and enforcing food quality standards, coupled with a lack of market knowledge, the Federal Ministry of Agriculture and Rural Development (FMARD) aims to enhance support for the production and activities of key crops like cashew. By collaborating with farmers, processors, investors, and stakeholders in the cashew sector, the policy seeks to define and uphold high-quality standards across the cashew value chain. Given the challenges and opportunities faced by Nigerian cashew farmers, active participation in cashew marketing groups becomes imperative. These groups serve as essential platforms for farmers to collectively navigate market intricacies, access resources, share knowledge, improve output quality, and bolster their overall bargaining power. Through engagement in these cooperative initiatives, farmers not only boost their economic prospects but can also contribute significantly to the sustainable growth of the cashew industry, fostering resilience and community development. This study explores the perceptions and intentions of farmers regarding their involvement in cashew marketing cooperatives, utilizing an expanded version of the Theory of Planned Behaviour. Drawing insights from a diverse sample of 321 cashew farmers in Southwest Nigeria, the research sheds light on the factors influencing decision-making in cooperative participation. The demographic analysis reveals a diverse landscape, with a substantial presence of middle-aged individuals contributing significantly to the agricultural sector and cashew-related activities emerging as a primary income source for a substantial proportion (23.99%). Employing Structural Equation Modelling (SEM) with Maximum Likelihood Robust (MLR) estimation in R, the research elucidates the associations among latent variables. Despite the model’s complexity, the goodness-of-fit indices attest to the validity of the structural model, explaining approximately 40% of the variance in the intention to join cooperatives. Moral norms emerge as a pivotal construct, highlighting the profound influence of ethical considerations in decision-making processes, while perceived behavioural control presents potential challenges in active participation. Attitudes toward joining cooperatives reveal nuanced perspectives, with strong beliefs in enhanced connections with other farmers but varying perceptions on improved access to essential information. The SEM analysis establishes positive and significant effects of moral norms, perceived behavioural control, subjective norms, and attitudes on farmers’ intention to join cooperatives. The knowledge construct positively affects key factors influencing intention, emphasizing the importance of informed decision-making. A supplementary analysis using partial least squares (PLS) SEM corroborates the robustness of our findings, aligning with covariance-based SEM results. This research unveils the determinants of cooperative participation and provides valuable insights for policymakers and practitioners aiming to empower and support this vital demographic in the cashew industry.

Keywords: marketing cooperatives, theory of planned behaviour, structural equation modelling, cashew farmers

Procedia PDF Downloads 84

3 XAI Implemented Prognostic Framework: Condition Monitoring and Alert System Based on RUL and Sensory Data

Authors: Faruk Ozdemir, Roy Kalawsky, Peter Hubbard

Abstract:

Accurate estimation of RUL provides a basis for effective predictive maintenance, reducing unexpected downtime for industrial equipment. However, while models such as the Random Forest have effective predictive capabilities, they are the so-called ‘black box’ models, where interpretability is at a threshold to make critical diagnostic decisions involved in industries related to aviation. The purpose of this work is to present a prognostic framework that embeds Explainable Artificial Intelligence (XAI) techniques in order to provide essential transparency in Machine Learning methods' decision-making mechanisms based on sensor data, with the objective of procuring actionable insights for the aviation industry. Sensor readings have been gathered from critical equipment such as turbofan jet engine and landing gear, and the prediction of the RUL is done by a Random Forest model. It involves steps such as data gathering, feature engineering, model training, and evaluation. These critical components’ datasets are independently trained and evaluated by the models. While suitable predictions are served, their performance metrics are reasonably good; such complex models, however obscure reasoning for the predictions made by them and may even undermine the confidence of the decision-maker or the maintenance teams. This is followed by global explanations using SHAP and local explanations using LIME in the second phase to bridge the gap in reliability within industrial contexts. These tools analyze model decisions, highlighting feature importance and explaining how each input variable affects the output. This dual approach offers a general comprehension of the overall model behavior and detailed insight into specific predictions. The proposed framework, in its third component, incorporates the techniques of causal analysis in the form of Granger causality tests in order to move beyond correlation toward causation. This will not only allow the model to predict failures but also present reasons, from the key sensor features linked to possible failure mechanisms to relevant personnel. The causality between sensor behaviors and equipment failures creates much value for maintenance teams due to better root cause identification and effective preventive measures. This step contributes to the system being more explainable. Surrogate Several simple models, including Decision Trees and Linear Models, can be used in yet another stage to approximately represent the complex Random Forest model. These simpler models act as backups, replicating important jobs of the original model's behavior. If the feature explanations obtained from the surrogate model are cross-validated with the primary model, the insights derived would be more reliable and provide an intuitive sense of how the input variables affect the predictions. We then create an iterative explainable feedback loop, where the knowledge learned from the explainability methods feeds back into the training of the models. This feeds into a cycle of continuous improvement both in model accuracy and interpretability over time. By systematically integrating new findings, the model is expected to adapt to changed conditions and further develop its prognosis capability. These components are then presented to the decision-makers through the development of a fully transparent condition monitoring and alert system. The system provides a holistic tool for maintenance operations by leveraging RUL predictions, feature importance scores, persistent sensor threshold values, and autonomous alert mechanisms. Since the system will provide explanations for the predictions given, along with active alerts, the maintenance personnel can make informed decisions on their end regarding correct interventions to extend the life of the critical machinery.

Keywords: predictive maintenance, explainable artificial intelligence, prognostic, RUL, machine learning, turbofan engines, C-MAPSS dataset

Procedia PDF Downloads 6

2 Supply Side Readiness for Universal Health Coverage: Assessing the Availability and Depth of Essential Health Package in Rural, Remote and Conflict Prone District

Authors: Veenapani Rajeev Verma

Abstract:

Context: Assessing facility readiness is paramount as it can indicate capacity of facilities to provide essential care for resilience to health challenges. In the context of decentralization, estimation of supply side readiness indices at sub national level is imperative for effective evidence based policy but remains a colossal challenge due to lack of dependable and representative data sources. Setting: District Poonch of Jammu and Kashmir was selected for this study. It is remote, rural district with unprecedented topographical barriers and is identified as high priority by government. It is also a fragile area as is bounded by Line of Control with Pakistan bearing the brunt of cease fire violations, military skirmishes and sporadic militant attacks. Hilly geographical terrain, rudimentary/absence of road network and impoverishment are quintessential to this area. Objectives: Objective of the study is to a) Evaluate the service readiness of health facilities and create a concise index subsuming plethora of discrete indicators and b) Ascertain supply side barriers in service provisioning via stakeholder’s analysis. Study also strives to expand analytical domain unravelling context and area specific intricacies associated with service delivery. Methodology: Mixed method approach was employed to triangulate quantitative analysis with qualitative nuances. Facility survey encompassing 90 Subcentres, 44 Primary health centres, 3 Community health centres and 1 District hospital was conducted to gauge general service availability and service specific availability (depth of coverage). Compendium of checklist was designed using Indian Public Health Standards (IPHS) in form of standard core questionnaire and scorecard generated for each facility. Information was collected across dimensions of amenities, equipment, medicines, laboratory and infection control protocols as proposed in WHO’s Service Availability and Readiness Assesment (SARA). Two stage polychoric principal component analysis employed to generate a parsimonious index by coalescing an array of tracer indicators. OLS regression method used to determine factors explaining composite index generated from PCA. Stakeholder analysis was conducted to discern qualitative information. Myriad of techniques like observations, key informant interviews and focus group discussions using semi structured questionnaires on both leaders and laggards were administered for critical stakeholder’s analysis. Results: General readiness score of health facilities was found to be 0.48. Results indicated poorest readiness for subcentres and PHC’s (first point of contact) with composite score of 0.47 and 0.41 respectively. For primary care facilities; principal component was characterized by basic newborn care as well as preparedness for delivery. Results revealed availability of equipment and surgical preparedness having lowest score (0.46 and 0.47) for facilities providing secondary care. Presence of contractual staff, more than 1 hr walk to facility, facilities in zone A (most vulnerable) to cross border shelling and facilities inaccessible due to snowfall and thick jungles was negatively associated with readiness index. Nonchalant staff attitude, unavailability of staff quarters, leakages and constraint in supply chain of drugs and consumables were other impediments identified. Conclusions/Policy Implications: It is pertinent to first strengthen primary care facilities in this setting. Complex dimensions such as geographic barriers, user and provider behavior is not under precinct of this methodology.

Keywords: effective coverage, principal component analysis, readiness index, universal health coverage

Procedia PDF Downloads 121

1 Times2D: A Time-Frequency Method for Time Series Forecasting

Authors: Reza Nematirad, Anil Pahwa, Balasubramaniam Natarajan

Abstract:

Time series data consist of successive data points collected over a period of time. Accurate prediction of future values is essential for informed decision-making in several real-world applications, including electricity load demand forecasting, lifetime estimation of industrial machinery, traffic planning, weather prediction, and the stock market. Due to their critical relevance and wide application, there has been considerable interest in time series forecasting in recent years. However, the proliferation of sensors and IoT devices, real-time monitoring systems, and high-frequency trading data introduce significant intricate temporal variations, rapid changes, noise, and non-linearities, making time series forecasting more challenging. Classical methods such as Autoregressive integrated moving average (ARIMA) and Exponential Smoothing aim to extract pre-defined temporal variations, such as trends and seasonality. While these methods are effective for capturing well-defined seasonal patterns and trends, they often struggle with more complex, non-linear patterns present in real-world time series data. In recent years, deep learning has made significant contributions to time series forecasting. Recurrent Neural Networks (RNNs) and their variants, such as Long short-term memory (LSTMs) and Gated Recurrent Units (GRUs), have been widely adopted for modeling sequential data. However, they often suffer from the locality, making it difficult to capture local trends and rapid fluctuations. Convolutional Neural Networks (CNNs), particularly Temporal Convolutional Networks (TCNs), leverage convolutional layers to capture temporal dependencies by applying convolutional filters along the temporal dimension. Despite their advantages, TCNs struggle with capturing relationships between distant time points due to the locality of one-dimensional convolution kernels. Transformers have revolutionized time series forecasting with their powerful attention mechanisms, effectively capturing long-term dependencies and relationships between distant time points. However, the attention mechanism may struggle to discern dependencies directly from scattered time points due to intricate temporal patterns. Lastly, Multi-Layer Perceptrons (MLPs) have also been employed, with models like N-BEATS and LightTS demonstrating success. Despite this, MLPs often face high volatility and computational complexity challenges in long-horizon forecasting. To address intricate temporal variations in time series data, this study introduces Times2D, a novel framework that parallelly integrates 2D spectrogram and derivative heatmap techniques. The spectrogram focuses on the frequency domain, capturing periodicity, while the derivative patterns emphasize the time domain, highlighting sharp fluctuations and turning points. This 2D transformation enables the utilization of powerful computer vision techniques to capture various intricate temporal variations. To evaluate the performance of Times2D, extensive experiments were conducted on standard time series datasets and compared with various state-of-the-art algorithms, including DLinear (2023), TimesNet (2023), Non-stationary Transformer (2022), PatchTST (2023), N-HiTS (2023), Crossformer (2023), MICN (2023), LightTS (2022), FEDformer (2022), FiLM (2022), SCINet (2022a), Autoformer (2021), and Informer (2021) under the same modeling conditions. The initial results demonstrated that Times2D achieves consistent state-of-the-art performance in both short-term and long-term forecasting tasks. Furthermore, the generality of the Times2D framework allows it to be applied to various tasks such as time series imputation, clustering, classification, and anomaly detection, offering potential benefits in any domain that involves sequential data analysis.

Keywords: derivative patterns, spectrogram, time series forecasting, times2D, 2D representation

Procedia PDF Downloads 42