Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 25577

Search results for: meteorological prediction data

25487 Prediction on Housing Price Based on Deep Learning

Authors: Li Yu, Chenlu Jiao, Hongrun Xin, Yan Wang, Kaiyang Wang

Abstract:

In order to study the impact of various factors on the housing price, we propose to build different prediction models based on deep learning to determine the existing data of the real estate in order to more accurately predict the housing price or its changing trend in the future. Considering that the factors which affect the housing price vary widely, the proposed prediction models include two categories. The first one is based on multiple characteristic factors of the real estate. We built Convolution Neural Network (CNN) prediction model and Long Short-Term Memory (LSTM) neural network prediction model based on deep learning, and logical regression model was implemented to make a comparison between these three models. Another prediction model is time series model. Based on deep learning, we proposed an LSTM-1 model purely regard to time series, then implementing and comparing the LSTM model and the Auto-Regressive and Moving Average (ARMA) model. In this paper, comprehensive study of the second-hand housing price in Beijing has been conducted from three aspects: crawling and analyzing, housing price predicting, and the result comparing. Ultimately the best model program was produced, which is of great significance to evaluation and prediction of the housing price in the real estate industry.

Keywords: deep learning, convolutional neural network, LSTM, housing prediction

Procedia PDF Downloads 288

25486 Producing Outdoor Design Conditions based on the Dependency between Meteorological Elements: Copula Approach

Authors: Zhichao Jiao, Craig Farnham, Jihui Yuan, Kazuo Emura

Abstract:

It is common to use the outdoor design weather data to select the air-conditioning capacity in the building design stage. The outdoor design weather data are usually comprised of multiple meteorological elements for a 24-hour period separately, but the dependency between the elements is not well considered, which may cause an overestimation of selecting air-conditioning capacity. Considering the dependency between the air temperature and global solar radiation, we used the copula approach to model the joint distributions of those two weather elements and suggest a new method of selecting more credible outdoor design conditions based on the specific simultaneous occurrence probability of air temperature and global solar radiation. In this paper, the 10-year period hourly weather data from 2001 to 2010 in Osaka, Japan, was used to analyze the dependency structure and joint distribution, the result shows that the Joe-Frank copula fit for almost all hourly data. According to calculating the simultaneous occurrence probability and the common exceeding probability of air temperature and global solar radiation, the results have shown that the maximum difference in design air temperature and global solar radiation of the day is about 2 degrees Celsius and 30W/m2, respectively.

Keywords: energy conservation, design weather database, HVAC, copula approach

Procedia PDF Downloads 242

25485 Housing Price Prediction Using Machine Learning Algorithms: The Case of Melbourne City, Australia

Authors: The Danh Phan

Abstract:

House price forecasting is a main topic in the real estate market research. Effective house price prediction models could not only allow home buyers and real estate agents to make better data-driven decisions but may also be beneficial for the property policymaking process. This study investigates the housing market by using machine learning techniques to analyze real historical house sale transactions in Australia. It seeks useful models which could be deployed as an application for house buyers and sellers. Data analytics show a high discrepancy between the house price in the most expensive suburbs and the most affordable suburbs in the city of Melbourne. In addition, experiments demonstrate that the combination of Stepwise and Support Vector Machine (SVM), based on the Mean Squared Error (MSE) measurement, consistently outperforms other models in terms of prediction accuracy.

Keywords: house price prediction, regression trees, neural network, support vector machine, stepwise

Procedia PDF Downloads 202

25484 Nonlinear Estimation Model for Rail Track Deterioration

Authors: M. Karimpour, L. Hitihamillage, N. Elkhoury, S. Moridpour, R. Hesami

Abstract:

Rail transport authorities around the world have been facing a significant challenge when predicting rail infrastructure maintenance work for a long period of time. Generally, maintenance monitoring and prediction is conducted manually. With the restrictions in economy, the rail transport authorities are in pursuit of improved modern methods, which can provide precise prediction of rail maintenance time and location. The expectation from such a method is to develop models to minimize the human error that is strongly related to manual prediction. Such models will help them in understanding how the track degradation occurs overtime under the change in different conditions (e.g. rail load, rail type, rail profile). They need a well-structured technique to identify the precise time that rail tracks fail in order to minimize the maintenance cost/time and secure the vehicles. The rail track characteristics that have been collected over the years will be used in developing rail track degradation prediction models. Since these data have been collected in large volumes and the data collection is done both electronically and manually, it is possible to have some errors. Sometimes these errors make it impossible to use them in prediction model development. This is one of the major drawbacks in rail track degradation prediction. An accurate model can play a key role in the estimation of the long-term behavior of rail tracks. Accurate models increase the track safety and decrease the cost of maintenance in long term. In this research, a short review of rail track degradation prediction models has been discussed before estimating rail track degradation for the curve sections of Melbourne tram track system using Adaptive Network-based Fuzzy Inference System (ANFIS) model.

Keywords: ANFIS, MGT, prediction modeling, rail track degradation

Procedia PDF Downloads 303

25483 The Culex Pipiens Niche: Assessment with Climatic and Physiographic Variables via a Geographic Information System

Authors: Maria C. Proença, Maria T. Rebelo, Marília Antunes, Maria J. Alves, Hugo Osório, Sofia Cunha, João Casaca

Abstract:

Using a geographic information system (GIS), the relations between a georeferenced data set of Culex pipiens sl. mosquitoes collected in Portugal mainland during seven years (2006-2012) and meteorological and physiographic parameters such as: air relative humidity, air temperature (minima, maxima and mean daily temperatures), daily total rainfall, altitude, land use/land cover and proximity to water bodies are evaluated. Focus is on the mosquito females; the characterization of its habitat is the key for the planning of chirurgical non-aggressive prophylactic countermeasures to avoid ambient degradation. The GIS allow for the spatial determination of the zones were the mosquito mean captures has been above average; using the meteorological values at these coordinates, the limits of each parameter are identified/computed. The meteorological parameters measured at the net of weather stations all over the country are averaged by month and interpolated to produce raster maps that can be segmented according to the thresholds obtained for each parameter. The intersection of the maps obtained for each month show the evolution of the area favorable to the species through the mosquito season, which is from May to October at these latitudes. In parallel, mean and above average captures were related to the physiographic parameters. Three levels of risk could be identified for each parameter, using above average captures as an index. The results were applied to the suitability meteorological maps of each month. The Culex pipiens critical niche is delimited, reflecting the critical areas and the level of risk for transmission of the pathogens to which they are competent vectors (West Nile virus, iridoviruses, rheoviruses and parvoviruses).

Keywords: Culex pipiens, ecological niche, risk assessment, risk management

Procedia PDF Downloads 521

25482 Big Data: Appearance and Disappearance

Authors: James Moir

Abstract:

The mainstay of Big Data is prediction in that it allows practitioners, researchers, and policy analysts to predict trends based upon the analysis of large and varied sources of data. These can range from changing social and political opinions, patterns in crimes, and consumer behaviour. Big Data has therefore shifted the criterion of success in science from causal explanations to predictive modelling and simulation. The 19th-century science sought to capture phenomena and seek to show the appearance of it through causal mechanisms while 20th-century science attempted to save the appearance and relinquish causal explanations. Now 21st-century science in the form of Big Data is concerned with the prediction of appearances and nothing more. However, this pulls social science back in the direction of a more rule- or law-governed reality model of science and away from a consideration of the internal nature of rules in relation to various practices. In effect Big Data offers us no more than a world of surface appearance and in doing so it makes disappear any context-specific conceptual sensitivity.

Keywords: big data, appearance, disappearance, surface, epistemology

Procedia PDF Downloads 395

25481 Genomic Prediction Reliability Using Haplotypes Defined by Different Methods

Authors: Sohyoung Won, Heebal Kim, Dajeong Lim

Abstract:

Genomic prediction is an effective way to measure the abilities of livestock for breeding based on genomic estimated breeding values, statistically predicted values from genotype data using best linear unbiased prediction (BLUP). Using haplotypes, clusters of linked single nucleotide polymorphisms (SNPs), as markers instead of individual SNPs can improve the reliability of genomic prediction since the probability of a quantitative trait loci to be in strong linkage disequilibrium (LD) with markers is higher. To efficiently use haplotypes in genomic prediction, finding optimal ways to define haplotypes is needed. In this study, 770K SNP chip data was collected from Hanwoo (Korean cattle) population consisted of 2506 cattle. Haplotypes were first defined in three different ways using 770K SNP chip data: haplotypes were defined based on 1) length of haplotypes (bp), 2) the number of SNPs, and 3) k-medoids clustering by LD. To compare the methods in parallel, haplotypes defined by all methods were set to have comparable sizes; in each method, haplotypes defined to have an average number of 5, 10, 20 or 50 SNPs were tested respectively. A modified GBLUP method using haplotype alleles as predictor variables was implemented for testing the prediction reliability of each haplotype set. Also, conventional genomic BLUP (GBLUP) method, which uses individual SNPs were tested to evaluate the performance of the haplotype sets on genomic prediction. Carcass weight was used as the phenotype for testing. As a result, using haplotypes defined by all three methods showed increased reliability compared to conventional GBLUP. There were not many differences in the reliability between different haplotype defining methods. The reliability of genomic prediction was highest when the average number of SNPs per haplotype was 20 in all three methods, implying that haplotypes including around 20 SNPs can be optimal to use as markers for genomic prediction. When the number of alleles generated by each haplotype defining methods was compared, clustering by LD generated the least number of alleles. Using haplotype alleles for genomic prediction showed better performance, suggesting improved accuracy in genomic selection. The number of predictor variables was decreased when the LD-based method was used while all three haplotype defining methods showed similar performances. This suggests that defining haplotypes based on LD can reduce computational costs and allows efficient prediction. Finding optimal ways to define haplotypes and using the haplotype alleles as markers can provide improved performance and efficiency in genomic prediction.

Keywords: best linear unbiased predictor, genomic prediction, haplotype, linkage disequilibrium

Procedia PDF Downloads 127

25480 Prediction of Oil Recovery Factor Using Artificial Neural Network

Authors: O. P. Oladipo, O. A. Falode

Abstract:

The determination of Recovery Factor is of great importance to the reservoir engineer since it relates reserves to the initial oil in place. Reserves are the producible portion of reservoirs and give an indication of the profitability of a field Development. The core objective of this project is to develop an artificial neural network model using selected reservoir data to predict Recovery Factors (RF) of hydrocarbon reservoirs and compare the model with a couple of the existing correlations. The type of Artificial Neural Network model developed was the Single Layer Feed Forward Network. MATLAB was used as the network simulator and the network was trained using the supervised learning method, Afterwards, the network was tested with input data never seen by the network. The results of the predicted values of the recovery factors of the Artificial Neural Network Model, API Correlation for water drive reservoirs (Sands and Sandstones) and Guthrie and Greenberger Correlation Equation were obtained and compared. It was noted that the coefficient of correlation of the Artificial Neural Network Model was higher than the coefficient of correlations of the other two correlation equations, thus making it a more accurate prediction tool. The Artificial Neural Network, because of its accurate prediction ability is helpful in the correct prediction of hydrocarbon reservoir factors. Artificial Neural Network could be applied in the prediction of other Petroleum Engineering parameters because it is able to recognise complex patterns of data set and establish a relationship between them.

Keywords: recovery factor, reservoir, reserves, artificial neural network, hydrocarbon, MATLAB, API, Guthrie, Greenberger

Procedia PDF Downloads 419

25479 Reconstructability Analysis for Landslide Prediction

Authors: David Percy

Abstract:

Landslides are a geologic phenomenon that affects a large number of inhabited places and are constantly being monitored and studied for the prediction of future occurrences. Reconstructability analysis (RA) is a methodology for extracting informative models from large volumes of data that work exclusively with discrete data. While RA has been used in medical applications and social science extensively, we are introducing it to the spatial sciences through applications like landslide prediction. Since RA works exclusively with discrete data, such as soil classification or bedrock type, working with continuous data, such as porosity, requires that these data are binned for inclusion in the model. RA constructs models of the data which pick out the most informative elements, independent variables (IVs), from each layer that predict the dependent variable (DV), landslide occurrence. Each layer included in the model retains its classification data as a primary encoding of the data. Unlike other machine learning algorithms that force the data into one-hot encoding type of schemes, RA works directly with the data as it is encoded, with the exception of continuous data, which must be binned. The usual physical and derived layers are included in the model, and testing our results against other published methodologies, such as neural networks, yields accuracy that is similar but with the advantage of a completely transparent model. The results of an RA session with a data set are a report on every combination of variables and their probability of landslide events occurring. In this way, every combination of informative state combinations can be examined.

Keywords: reconstructability analysis, machine learning, landslides, raster analysis

Procedia PDF Downloads 46

25478 Uplink Throughput Prediction in Cellular Mobile Networks

Authors: Engin Eyceyurt, Josko Zec

Abstract:

The current and future cellular mobile communication networks generate enormous amounts of data. Networks have become extremely complex with extensive space of parameters, features and counters. These networks are unmanageable with legacy methods and an enhanced design and optimization approach is necessary that is increasingly reliant on machine learning. This paper proposes that machine learning as a viable approach for uplink throughput prediction. LTE radio metric, such as Reference Signal Received Power (RSRP), Reference Signal Received Quality (RSRQ), and Signal to Noise Ratio (SNR) are used to train models to estimate expected uplink throughput. The prediction accuracy with high determination coefficient of 91.2% is obtained from measurements collected with a simple smartphone application.

Keywords: drive test, LTE, machine learning, uplink throughput prediction

Procedia PDF Downloads 138

25477 Free Fatty Acid Assessment of Crude Palm Oil Using a Non-Destructive Approach

Authors: Siti Nurhidayah Naqiah Abdull Rani, Herlina Abdul Rahim, Rashidah Ghazali, Noramli Abdul Razak

Abstract:

Near infrared (NIR) spectroscopy has always been of great interest in the food and agriculture industries. The development of prediction models has facilitated the estimation process in recent years. In this study, 110 crude palm oil (CPO) samples were used to build a free fatty acid (FFA) prediction model. 60% of the collected data were used for training purposes and the remaining 40% used for testing. The visible peaks on the NIR spectrum were at 1725 nm and 1760 nm, indicating the existence of the first overtone of C-H bands. Principal component regression (PCR) was applied to the data in order to build this mathematical prediction model. The optimal number of principal components was 10. The results showed R2=0.7147 for the training set and R2=0.6404 for the testing set.

Keywords: palm oil, fatty acid, NIRS, regression

Procedia PDF Downloads 488

25476 Statistical Analysis with Prediction Models of User Satisfaction in Software Project Factors

Authors: Katawut Kaewbanjong

Abstract:

We analyzed a volume of data and found significant user satisfaction in software project factors. A statistical significance analysis (logistic regression) and collinearity analysis determined the significance factors from a group of 71 pre-defined factors from 191 software projects in ISBSG Release 12. The eight prediction models used for testing the prediction potential of these factors were Neural network, k-NN, Naïve Bayes, Random forest, Decision tree, Gradient boosted tree, linear regression and logistic regression prediction model. Fifteen pre-defined factors were truly significant in predicting user satisfaction, and they provided 82.71% prediction accuracy when used with a neural network prediction model. These factors were client-server, personnel changes, total defects delivered, project inactive time, industry sector, application type, development type, how methodology was acquired, development techniques, decision making process, intended market, size estimate approach, size estimate method, cost recording method, and effort estimate method. These findings may benefit software development managers considerably.

Keywords: prediction model, statistical analysis, software project, user satisfaction factor

Procedia PDF Downloads 106

25475 Reservoir Inflow Prediction for Pump Station Using Upstream Sewer Depth Data

Authors: Osung Im, Neha Yadav, Eui Hoon Lee, Joong Hoon Kim

Abstract:

Artificial Neural Network (ANN) approach is commonly used in lots of fields for forecasting. In water resources engineering, forecast of water level or inflow of reservoir is useful for various kind of purposes. Due to advantages of ANN, many papers were written for inflow prediction in river networks, but in this study, ANN is used in urban sewer networks. The growth of severe rain storm in Korea has increased flood damage severely, and the precipitation distribution is getting more erratic. Therefore, effective pump operation in pump station is an essential task for the reduction in urban area. If real time inflow of pump station reservoir can be predicted, it is possible to operate pump effectively for reducing the flood damage. This study used ANN model for pump station reservoir inflow prediction using upstream sewer depth data. For this study, rainfall events, sewer depth, and inflow into Banpo pump station reservoir between years of 2013-2014 were considered. Feed – Forward Back Propagation (FFBF), Cascade – Forward Back Propagation (CFBP), Elman Back Propagation (EBP) and Nonlinear Autoregressive Exogenous (NARX) were used as ANN model for prediction. A comparison of results with ANN model suggests that ANN is a powerful tool for inflow prediction using the sewer depth data.

Keywords: artificial neural network, forecasting, reservoir inflow, sewer depth

Procedia PDF Downloads 295

25474 Seismic Hazard Prediction Using Seismic Bumps: Artificial Neural Network Technique

Authors: Belkacem Selma, Boumediene Selma, Tourkia Guerzou, Abbes Labdelli

Abstract:

Natural disasters have occurred and will continue to cause human and material damage. Therefore, the idea of "preventing" natural disasters will never be possible. However, their prediction is possible with the advancement of technology. Even if natural disasters are effectively inevitable, their consequences may be partly controlled. The rapid growth and progress of artificial intelligence (AI) had a major impact on the prediction of natural disasters and risk assessment which are necessary for effective disaster reduction. The Earthquakes prediction to prevent the loss of human lives and even property damage is an important factor; that is why it is crucial to develop techniques for predicting this natural disaster. This present study aims to analyze the ability of artificial neural networks (ANNs) to predict earthquakes that occur in a given area. The used data describe the problem of high energy (higher than 10^4J) seismic bumps forecasting in a coal mine using two long walls as an example. For this purpose, seismic bumps data obtained from mines has been analyzed. The results obtained show that the ANN with high accuracy was able to predict earthquake parameters; the classification accuracy through neural networks is more than 94%, and that the models developed are efficient and robust and depend only weakly on the initial database.

Keywords: earthquake prediction, ANN, seismic bumps

Procedia PDF Downloads 108

25473 Neural Network Based Approach of Software Maintenance Prediction for Laboratory Information System

Authors: Vuk M. Popovic, Dunja D. Popovic

Abstract:

Software maintenance phase is started once a software project has been developed and delivered. After that, any modification to it corresponds to maintenance. Software maintenance involves modifications to keep a software project usable in a changed or a changing environment, to correct discovered faults, and modifications, and to improve performance or maintainability. Software maintenance and management of software maintenance are recognized as two most important and most expensive processes in a life of a software product. This research is basing the prediction of maintenance, on risks and time evaluation, and using them as data sets for working with neural networks. The aim of this paper is to provide support to project maintenance managers. They will be able to pass the issues planned for the next software-service-patch to the experts, for risk and working time evaluation, and afterward to put all data to neural networks in order to get software maintenance prediction. This process will lead to the more accurate prediction of the working hours needed for the software-service-patch, which will eventually lead to better planning of budget for the software maintenance projects.

Keywords: laboratory information system, maintenance engineering, neural networks, software maintenance, software maintenance costs

Procedia PDF Downloads 332

25472 Computation of Flood and Drought Years over the North-West Himalayan Region Using Indian Meteorological Department Rainfall Data

Authors: Sudip Kumar Kundu, Charu Singh

Abstract:

The climatic condition over Indian region is highly dependent on monsoon. India receives maximum amount of rainfall during southwest monsoon. Indian economy is highly dependent on agriculture. The presence of flood and drought years influenced the total cultivation system as well as the economy of the country as Indian agricultural systems is still highly dependent on the monsoon rainfall. The present study has been planned to investigate the flood and drought years for the north-west Himalayan region from 1951 to 2014 by using area average Indian Meteorological Department (IMD) rainfall data. For this investigation the Normalized index (NI) has been utilized to find out whether the particular year is drought or flood. The data have been extracted for the north-west Himalayan (NWH) region states namely Uttarakhand (UK), Himachal Pradesh (HP) and Jammu and Kashmir (J&K) to find out the rainy season average rainfall for each year, climatological mean and the standard deviation. After calculation it has been plotted by the diagrams (or graphs) to show the results- some of the years associated with drought years, some are flood years and rest are neutral. The flood and drought years can also relate with the large-scale phenomena El-Nino and La-Lina.

Keywords: IMD, rainfall, normalized index, flood, drought, NWH

Procedia PDF Downloads 273

25471 Enhanced Extra Trees Classifier for Epileptic Seizure Prediction

Authors: Maurice Ntahobari, Levin Kuhlmann, Mario Boley, Zhinoos Razavi Hesabi

Abstract:

For machine learning based epileptic seizure prediction, it is important for the model to be implemented in small implantable or wearable devices that can be used to monitor epilepsy patients; however, current state-of-the-art methods are complex and computationally intensive. We use Shapley Additive Explanation (SHAP) to find relevant intracranial electroencephalogram (iEEG) features and improve the computational efficiency of a state-of-the-art seizure prediction method based on the extra trees classifier while maintaining prediction performance. Results for a small contest dataset and a much larger dataset with continuous recordings of up to 3 years per patient from 15 patients yield better than chance prediction performance (p < 0.004). Moreover, while the performance of the SHAP-based model is comparable to that of the benchmark, the overall training and prediction time of the model has been reduced by a factor of 1.83. It can also be noted that the feature called zero crossing value is the best EEG feature for seizure prediction. These results suggest state-of-the-art seizure prediction performance can be achieved using efficient methods based on optimal feature selection.

Keywords: machine learning, seizure prediction, extra tree classifier, SHAP, epilepsy

Procedia PDF Downloads 93

25470 Air Dispersion Modeling for Prediction of Accidental Emission in the Atmosphere along Northern Coast of Egypt

Authors: Moustafa Osman

Abstract:

Modeling of air pollutants from the accidental release is performed for quantifying the impact of industrial facilities into the ambient air. The mathematical methods are requiring for the prediction of the accidental scenario in probability of failure-safe mode and analysis consequences to quantify the environmental damage upon human health. The initial statement of mitigation plan is supporting implementation during production and maintenance periods. In a number of mathematical methods, the flow rate at which gaseous and liquid pollutants might be accidentally released is determined from various types in term of point, line and area sources. These emissions are integrated meteorological conditions in simplified stability parameters to compare dispersion coefficients from non-continuous air pollution plumes. The differences are reflected in concentrations levels and greenhouse effect to transport the parcel load in both urban and rural areas. This research reveals that the elevation effect nearby buildings with other structure is higher 5 times more than open terrains. These results are agreed with Sutton suggestion for dispersion coefficients in different stability classes.

Keywords: air pollutants, dispersion modeling, GIS, health effect, urban planning

Procedia PDF Downloads 351

25469 The Challenge of Characterising Drought Risk in Data Scarce Regions: The Case of the South of Angola

Authors: Natalia Limones, Javier Marzo, Marcus Wijnen, Aleix Serrat-Capdevila

Abstract:

In this research we developed a structured approach for the detection of areas under the highest levels of drought risk that is suitable for data-scarce environments. The methodology is based on recent scientific outcomes and methods and can be easily adapted to different contexts in successive exercises. The research reviews the history of drought in the south of Angola and characterizes the experienced hazard in the episode from 2012, focusing on the meteorological and the hydrological drought types. Only global open data information coming from modeling or remote sensing was used for the description of the hydroclimatological variables since there is almost no ground data in this part of the country. Also, the study intends to portray the socioeconomic vulnerabilities and the exposure to the phenomenon in the region to fully understand the risk. As a result, a map of the areas under the highest risk in the south of the country is produced, which is one of the main outputs of this work. It was also possible to confirm that the set of indicators used revealed different drought vulnerability profiles in the South of Angola and, as a result, several varieties of priority areas prone to distinctive impacts were recognized. The results demonstrated that most of the region experienced a severe multi-year meteorological drought that triggered an unprecedent exhaustion of the surface water resources, and that the majority of their socioeconomic impacts started soon after the identified onset of these processes.

Keywords: drought risk, exposure, hazard, vulnerability

Procedia PDF Downloads 178

25468 Predicting Data Center Resource Usage Using Quantile Regression to Conserve Energy While Fulfilling the Service Level Agreement

Authors: Ahmed I. Alutabi, Naghmeh Dezhabad, Sudhakar Ganti

Abstract:

Data centers have been growing in size and dema nd continuously in the last two decades. Planning for the deployment of resources has been shallow and always resorted to over-provisioning. Data center operators try to maximize the availability of their services by allocating multiple of the needed resources. One resource that has been wasted, with little thought, has been energy. In recent years, programmable resource allocation has paved the way to allow for more efficient and robust data centers. In this work, we examine the predictability of resource usage in a data center environment. We use a number of models that cover a wide spectrum of machine learning categories. Then we establish a framework to guarantee the client service level agreement (SLA). Our results show that using prediction can cut energy loss by up to 55%.

Keywords: machine learning, artificial intelligence, prediction, data center, resource allocation, green computing

Procedia PDF Downloads 92

25467 Groundwater Monitoring Using a Community: Science Approach

Authors: Shobha Kumari Yadav, Yubaraj Satyal, Ajaya Dixit

Abstract:

In addressing groundwater depletion, it is important to develop evidence base so to be used in assessing the state of its degradation. Groundwater data is limited compared to meteorological data, which impedes the groundwater use and management plan. Monitoring of groundwater levels provides information base to assess the condition of aquifers, their responses to water extraction, land-use change, and climatic variability. It is important to maintain a network of spatially distributed, long-term monitoring wells to support groundwater management plan. Monitoring involving local community is a cost effective approach that generates real time data to effectively manage groundwater use. This paper presents the relationship between rainfall and spring flow, which are the main source of freshwater for drinking, household consumptions and agriculture in hills of Nepal. The supply and withdrawal of water from springs depends upon local hydrology and the meteorological characteristics- such as rainfall, evapotranspiration and interflow. The study offers evidence of the use of scientific method and community based initiative for managing groundwater and springshed. The approach presents a method to replicate similar initiative in other parts of the country for maintaining integrity of springs.

Keywords: citizen science, groundwater, water resource management, Nepal

Procedia PDF Downloads 182

25466 Aggregate Angularity on the Permanent Deformation Zones of Hot Mix Asphalt

Authors: Lee P. Leon, Raymond Charles

Abstract:

This paper presents a method of evaluating the effect of aggregate angularity on hot mix asphalt (HMA) properties and its relationship to the Permanent Deformation resistance. The research concluded that aggregate particle angularity had a significant effect on the Permanent Deformation performance, and also that with an increase in coarse aggregate angularity there was an increase in the resistance of mixes to Permanent Deformation. A comparison between the measured data and predictive data of permanent deformation predictive models showed the limits of existing prediction models. The numerical analysis described the permanent deformation zones and concluded that angularity has an effect of the onset of these zones. Prediction of permanent deformation help road agencies and by extension economists and engineers determine the best approach for maintenance, rehabilitation, and new construction works of the road infrastructure.

Keywords: aggregate angularity, asphalt concrete, permanent deformation, rutting prediction

Procedia PDF Downloads 384

25465 Clinical Feature Analysis and Prediction on Recurrence in Cervical Cancer

Authors: Ravinder Bahl, Jamini Sharma

Abstract:

The paper demonstrates analysis of the cervical cancer based on a probabilistic model. It involves technique for classification and prediction by recognizing typical and diagnostically most important test features relating to cervical cancer. The main contributions of the research include predicting the probability of recurrences in no recurrence (first time detection) cases. The combination of the conventional statistical and machine learning tools is applied for the analysis. Experimental study with real data demonstrates the feasibility and potential of the proposed approach for the said cause.

Keywords: cervical cancer, recurrence, no recurrence, probabilistic, classification, prediction, machine learning

Procedia PDF Downloads 345

25464 Satellite-Based Drought Monitoring in Korea: Methodologies and Merits

Authors: Joo-Heon Lee, Seo-Yeon Park, Chanyang Sur, Ho-Won Jang

Abstract:

Satellite-based remote sensing technique has been widely used in the area of drought and environmental monitoring to overcome the weakness of in-situ based monitoring. There are many advantages of remote sensing for drought watch in terms of data accessibility, monitoring resolution and types of available hydro-meteorological data including environmental areas. This study was focused on the applicability of drought monitoring based on satellite imageries by applying to the historical drought events, which had a huge impact on meteorological, agricultural, and hydrological drought. Satellite-based drought indices, the Standardized Precipitation Index (SPI) using Tropical Rainfall Measuring Mission (TRMM) and Global Precipitation Mission (GPM); Vegetation Health Index (VHI) using MODIS based Land Surface Temperature (LST), and Normalized Difference Vegetation Index (NDVI); and Scaled Drought Condition Index (SDCI) were evaluated to assess its capability to analyze the complex topography of the Korean peninsula. While the VHI was accurate when capturing moderate drought conditions in agricultural drought-damaged areas, the SDCI was relatively well monitored in hydrological drought-damaged areas. In addition, this study found correlations among various drought indices and applicability using Receiver Operating Characteristic (ROC) method, which will expand our understanding of the relationships between hydro-meteorological variables and drought events at global scale. The results of this research are expected to assist decision makers in taking timely and appropriate action in order to save millions of lives in drought-damaged areas.

Keywords: drought monitoring, moderate resolution imaging spectroradiometer (MODIS), remote sensing, receiver operating characteristic (ROC)

Procedia PDF Downloads 313

25463 Integration of Big Data to Predict Transportation for Smart Cities

Authors: Sun-Young Jang, Sung-Ah Kim, Dongyoun Shin

Abstract:

The Intelligent transportation system is essential to build smarter cities. Machine learning based transportation prediction could be highly promising approach by delivering invisible aspect visible. In this context, this research aims to make a prototype model that predicts transportation network by using big data and machine learning technology. In detail, among urban transportation systems this research chooses bus system. The research problem that existing headway model cannot response dynamic transportation conditions. Thus, bus delay problem is often occurred. To overcome this problem, a prediction model is presented to fine patterns of bus delay by using a machine learning implementing the following data sets; traffics, weathers, and bus statues. This research presents a flexible headway model to predict bus delay and analyze the result. The prototyping model is composed by real-time data of buses. The data are gathered through public data portals and real time Application Program Interface (API) by the government. These data are fundamental resources to organize interval pattern models of bus operations as traffic environment factors (road speeds, station conditions, weathers, and bus information of operating in real-time). The prototyping model is designed by the machine learning tool (RapidMiner Studio) and conducted tests for bus delays prediction. This research presents experiments to increase prediction accuracy for bus headway by analyzing the urban big data. The big data analysis is important to predict the future and to find correlations by processing huge amount of data. Therefore, based on the analysis method, this research represents an effective use of the machine learning and urban big data to understand urban dynamics.

Keywords: big data, machine learning, smart city, social cost, transportation network

Procedia PDF Downloads 238

25462 Evaluating the Accuracy of Biologically Relevant Variables Generated by ClimateAP

Authors: Jing Jiang, Wenhuan XU, Lei Zhang, Shiyi Zhang, Tongli Wang

Abstract:

Climate data quality significantly affects the reliability of ecological modeling. In the Asia Pacific (AP) region, low-quality climate data hinders ecological modeling. ClimateAP, a software developed in 2017, generates high-quality climate data for the AP region, benefiting researchers in forestry and agriculture. However, its adoption remains limited. This study aims to confirm the validity of biologically relevant variable data generated by ClimateAP during the normal climate period through comparison with the currently available gridded data. Climate data from 2,366 weather stations were used to evaluate the prediction accuracy of ClimateAP in comparison with the commonly used gridded data from WorldClim1.4. Univariate regressions were applied to 48 monthly biologically relevant variables, and the relationship between the observational data and the predictions made by ClimateAP and WorldClim was evaluated using Adjusted R-Squared and Root Mean Squared Error (RMSE). Locations were categorized into mountainous and flat landforms, considering elevation, slope, ruggedness, and Topographic Position Index. Univariate regressions were then applied to all biologically relevant variables for each landform category. Random Forest (RF) models were implemented for the climatic niche modeling of Cunninghamia lanceolata. A comparative analysis of the prediction accuracies of RF models constructed with distinct climate data sources was conducted to evaluate their relative effectiveness. Biologically relevant variables were obtained from three unpublished Chinese meteorological datasets. ClimateAPv3.0 and WorldClim predictions were obtained from weather station coordinates and WorldClim1.4 rasters, respectively, for the normal climate period of 1961-1990. Occurrence data for Cunninghamia lanceolata came from integrated biodiversity databases with 3,745 unique points. ClimateAP explains a minimum of 94.74%, 97.77%, 96.89%, and 94.40% of monthly maximum, minimum, average temperature, and precipitation variances, respectively. It outperforms WorldClim in 37 biologically relevant variables with lower RMSE values. ClimateAP achieves higher R-squared values for the 12 monthly minimum temperature variables and consistently higher Adjusted R-squared values across all landforms for precipitation. ClimateAP's temperature data yields lower Adjusted R-squared values than gridded data in high-elevation, rugged, and mountainous areas but achieves higher values in mid-slope drainages, plains, open slopes, and upper slopes. Using ClimateAP improves the prediction accuracy of tree occurrence from 77.90% to 82.77%. The biologically relevant climate data produced by ClimateAP is validated based on evaluations using observations from weather stations. The use of ClimateAP leads to an improvement in data quality, especially in non-mountainous regions. The results also suggest that using biologically relevant variables generated by ClimateAP can slightly enhance climatic niche modeling for tree species, offering a better understanding of tree species adaptation and resilience compared to using gridded data.

Keywords: climate data validation, data quality, Asia pacific climate, climatic niche modeling, random forest models, tree species

Procedia PDF Downloads 52

25461 Satellite Statistical Data Approach for Upwelling Identification and Prediction in South of East Java and Bali Sea

Authors: Hary Aprianto Wijaya Siahaan, Bayu Edo Pratama

Abstract:

Sea fishery's potential to become one of the nation's assets which very contributed to Indonesia's economy. This fishery potential not in spite of the availability of the chlorophyll in the territorial waters of Indonesia. The research was conducted using three methods, namely: statistics, comparative and analytical. The data used include MODIS sea temperature data imaging results in Aqua satellite with a resolution of 4 km in 2002-2015, MODIS data of chlorophyll-a imaging results in Aqua satellite with a resolution of 4 km in 2002-2015, and Imaging results data ASCAT on MetOp and NOAA satellites with 27 km resolution in 2002-2015. The results of the processing of the data show that the incidence of upwelling in the south of East Java Sea began to happen in June identified with sea surface temperature anomaly below normal, the mass of the air that moves from the East to the West, and chlorophyll-a concentrations are high. In July the region upwelling events are increasingly expanding towards the West and reached its peak in August. Chlorophyll-a concentration prediction using multiple linear regression equations demonstrate excellent results to chlorophyll-a concentrations prediction in 2002 until 2015 with the correlation of predicted chlorophyll-a concentration indicate a value of 0.8 and 0.3 with RMSE value. On the chlorophyll-a concentration prediction in 2016 indicate good results despite a decline in the value of the correlation, where the correlation of predicted chlorophyll-a concentration in the year 2016 indicate a value 0.6, but showed improvement in RMSE values with 0.2.

Keywords: satellite, sea surface temperature, upwelling, wind stress

Procedia PDF Downloads 143

25460 Project Progress Prediction in Software Devlopment Integrating Time Prediction Algorithms and Large Language Modeling

Authors: Dong Wu, Michael Grenn

Abstract:

Managing software projects effectively is crucial for meeting deadlines, ensuring quality, and managing resources well. Traditional methods often struggle with predicting project timelines accurately due to uncertain schedules and complex data. This study addresses these challenges by combining time prediction algorithms with Large Language Models (LLMs). It makes use of real-world software project data to construct and validate a model. The model takes detailed project progress data such as task completion dynamic, team Interaction and development metrics as its input and outputs predictions of project timelines. To evaluate the effectiveness of this model, a comprehensive methodology is employed, involving simulations and practical applications in a variety of real-world software project scenarios. This multifaceted evaluation strategy is designed to validate the model's significant role in enhancing forecast accuracy and elevating overall management efficiency, particularly in complex software project environments. The results indicate that the integration of time prediction algorithms with LLMs has the potential to optimize software project progress management. These quantitative results suggest the effectiveness of the method in practical applications. In conclusion, this study demonstrates that integrating time prediction algorithms with LLMs can significantly improve the predictive accuracy and efficiency of software project management. This offers an advanced project management tool for the industry, with the potential to improve operational efficiency, optimize resource allocation, and ensure timely project completion.

Keywords: software project management, time prediction algorithms, large language models (LLMS), forecast accuracy, project progress prediction

Procedia PDF Downloads 57

25459 Integration of Educational Data Mining Models to a Web-Based Support System for Predicting High School Student Performance

Authors: Sokkhey Phauk, Takeo Okazaki

Abstract:

The challenging task in educational institutions is to maximize the high performance of students and minimize the failure rate of poor-performing students. An effective method to leverage this task is to know student learning patterns with highly influencing factors and get an early prediction of student learning outcomes at the timely stage for setting up policies for improvement. Educational data mining (EDM) is an emerging disciplinary field of data mining, statistics, and machine learning concerned with extracting useful knowledge and information for the sake of improvement and development in the education environment. The study is of this work is to propose techniques in EDM and integrate it into a web-based system for predicting poor-performing students. A comparative study of prediction models is conducted. Subsequently, high performing models are developed to get higher performance. The hybrid random forest (Hybrid RF) produces the most successful classification. For the context of intervention and improving the learning outcomes, a feature selection method MICHI, which is the combination of mutual information (MI) and chi-square (CHI) algorithms based on the ranked feature scores, is introduced to select a dominant feature set that improves the performance of prediction and uses the obtained dominant set as information for intervention. By using the proposed techniques of EDM, an academic performance prediction system (APPS) is subsequently developed for educational stockholders to get an early prediction of student learning outcomes for timely intervention. Experimental outcomes and evaluation surveys report the effectiveness and usefulness of the developed system. The system is used to help educational stakeholders and related individuals for intervening and improving student performance.

Keywords: academic performance prediction system, educational data mining, dominant factors, feature selection method, prediction model, student performance

Procedia PDF Downloads 93

25458 An Improved Prediction Model of Ozone Concentration Time Series Based on Chaotic Approach

Authors: Nor Zila Abd Hamid, Mohd Salmi M. Noorani

Abstract:

This study is focused on the development of prediction models of the Ozone concentration time series. Prediction model is built based on chaotic approach. Firstly, the chaotic nature of the time series is detected by means of phase space plot and the Cao method. Then, the prediction model is built and the local linear approximation method is used for the forecasting purposes. Traditional prediction of autoregressive linear model is also built. Moreover, an improvement in local linear approximation method is also performed. Prediction models are applied to the hourly ozone time series observed at the benchmark station in Malaysia. Comparison of all models through the calculation of mean absolute error, root mean squared error and correlation coefficient shows that the one with improved prediction method is the best. Thus, chaotic approach is a good approach to be used to develop a prediction model for the Ozone concentration time series.

Keywords: chaotic approach, phase space, Cao method, local linear approximation method

Procedia PDF Downloads 311