Search results for: regression models drone
9216 Parameter Estimation via Metamodeling
Authors: Sergio Haram Sarmiento, Arcady Ponosov
Abstract:
Based on appropriate multivariate statistical methodology, we suggest a generic framework for efficient parameter estimation for ordinary differential equations and the corresponding nonlinear models. In this framework classical linear regression strategies is refined into a nonlinear regression by a locally linear modelling technique (known as metamodelling). The approach identifies those latent variables of the given model that accumulate most information about it among all approximations of the same dimension. The method is applied to several benchmark problems, in particular, to the so-called ”power-law systems”, being non-linear differential equations typically used in Biochemical System Theory.Keywords: principal component analysis, generalized law of mass action, parameter estimation, metamodels
Procedia PDF Downloads 5169215 Climate Changes in Albania and Their Effect on Cereal Yield
Authors: Lule Basha, Eralda Gjika
Abstract:
This study is focused on analyzing climate change in Albania and its potential effects on cereal yields. Initially, monthly temperature and rainfalls in Albania were studied for the period 1960-2021. Climacteric variables are important variables when trying to model cereal yield behavior, especially when significant changes in weather conditions are observed. For this purpose, in the second part of the study, linear and nonlinear models explaining cereal yield are constructed for the same period, 1960-2021. The multiple linear regression analysis and lasso regression method are applied to the data between cereal yield and each independent variable: average temperature, average rainfall, fertilizer consumption, arable land, land under cereal production, and nitrous oxide emissions. In our regression model, heteroscedasticity is not observed, data follow a normal distribution, and there is a low correlation between factors, so we do not have the problem of multicollinearity. Machine-learning methods, such as random forest, are used to predict cereal yield responses to climacteric and other variables. Random Forest showed high accuracy compared to the other statistical models in the prediction of cereal yield. We found that changes in average temperature negatively affect cereal yield. The coefficients of fertilizer consumption, arable land, and land under cereal production are positively affecting production. Our results show that the Random Forest method is an effective and versatile machine-learning method for cereal yield prediction compared to the other two methods.Keywords: cereal yield, climate change, machine learning, multiple regression model, random forest
Procedia PDF Downloads 909214 Transport Related Air Pollution Modeling Using Artificial Neural Network
Authors: K. D. Sharma, M. Parida, S. S. Jain, Anju Saini, V. K. Katiyar
Abstract:
Air quality models form one of the most important components of an urban air quality management plan. Various statistical modeling techniques (regression, multiple regression and time series analysis) have been used to predict air pollution concentrations in the urban environment. These models calculate pollution concentrations due to observed traffic, meteorological and pollution data after an appropriate relationship has been obtained empirically between these parameters. Artificial neural network (ANN) is increasingly used as an alternative tool for modeling the pollutants from vehicular traffic particularly in urban areas. In the present paper, an attempt has been made to model traffic air pollution, specifically CO concentration using neural networks. In case of CO concentration, two scenarios were considered. First, with only classified traffic volume input and the second with both classified traffic volume and meteorological variables. The results showed that CO concentration can be predicted with good accuracy using artificial neural network (ANN).Keywords: air quality management, artificial neural network, meteorological variables, statistical modeling
Procedia PDF Downloads 5239213 The Impact of Public Open Space System on Housing Price in Chicago
Authors: Si Chen, Le Zhang, Xian He
Abstract:
The research explored the influences of public open space system on housing price through hedonic models, in order to support better open space plans and economic policies. We have three initial hypotheses: 1) public open space system has an overall positive influence on surrounding housing prices. 2) Different public open space types have different levels of influence on motivating surrounding housing prices. 3) Walking and driving accessibilities from property to public open spaces have different statistical relation with housing prices. Cook County, Illinois, was chosen to be a study area since data availability, sufficient open space types, and long-term open space preservation strategies. We considered the housing attributes, driving and walking accessibility scores from houses to nearby public open spaces, and driving accessibility scores to hospitals as influential features and used real housing sales price in 2010 as a dependent variable in the built hedonic model. Through ordinary least squares (OLS) regression analysis, General Moran’s I analysis and geographically weighted regression analysis, we observed the statistical relations between public open spaces and housing sale prices in the three built hedonic models and confirmed all three hypotheses.Keywords: hedonic model, public open space, housing sale price, regression analysis, accessibility score
Procedia PDF Downloads 1329212 Combination of Unmanned Aerial Vehicle and Terrestrial Laser Scanner Data for Citrus Yield Estimation
Authors: Mohammed Hmimou, Khalid Amediaz, Imane Sebari, Nabil Bounajma
Abstract:
Annual crop production is one of the most important macroeconomic indicators for the majority of countries around the world. This information is valuable, especially for exporting countries which need a yield estimation before harvest in order to correctly plan the supply chain. When it comes to estimating agricultural yield, especially for arboriculture, conventional methods are mostly applied. In the case of the citrus industry, the sale before harvest is largely practiced, which requires an estimation of the production when the fruit is on the tree. However, conventional method based on the sampling surveys of some trees within the field is always used to perform yield estimation, and the success of this process mainly depends on the expertise of the ‘estimator agent’. The present study aims to propose a methodology based on the combination of unmanned aerial vehicle (UAV) images and terrestrial laser scanner (TLS) point cloud to estimate citrus production. During data acquisition, a fixed wing and rotatory drones, as well as a terrestrial laser scanner, were tested. After that, a pre-processing step was performed in order to generate point cloud and digital surface model. At the processing stage, a machine vision workflow was implemented to extract points corresponding to fruits from the whole tree point cloud, cluster them into fruits, and model them geometrically in a 3D space. By linking the resulting geometric properties to the fruit weight, the yield can be estimated, and the statistical distribution of fruits size can be generated. This later property, which is information required by importing countries of citrus, cannot be estimated before harvest using the conventional method. Since terrestrial laser scanner is static, data gathering using this technology can be performed over only some trees. So, integration of drone data was thought in order to estimate the yield over a whole orchard. To achieve that, features derived from drone digital surface model were linked to yield estimation by laser scanner of some trees to build a regression model that predicts the yield of a tree given its features. Several missions were carried out to collect drone and laser scanner data within citrus orchards of different varieties by testing several data acquisition parameters (fly height, images overlap, fly mission plan). The accuracy of the obtained results by the proposed methodology in comparison to the yield estimation results by the conventional method varies from 65% to 94% depending mainly on the phenological stage of the studied citrus variety during the data acquisition mission. The proposed approach demonstrates its strong potential for early estimation of citrus production and the possibility of its extension to other fruit trees.Keywords: citrus, digital surface model, point cloud, terrestrial laser scanner, UAV, yield estimation, 3D modeling
Procedia PDF Downloads 1429211 Digital Mapping of First-Order Drainages and Springs of the Guajiru River, Northeast of Brazil, Based on Satellite and Drone Images
Authors: Sebastião Milton Pinheiro da Silva, Michele Barbosa da Rocha, Ana Lúcia Fernandes Campos, Miquéias Rildo de Souza Silva
Abstract:
Water is an essential natural resource for life on Earth. Rivers, lakes, lagoons and dams are the main sources of water storage for human consumption. The costs of extracting and using these water sources are lower than those of exploiting groundwater on transition zones to semi-arid terrains. However, the volume of surface water has decreased over time, with the depletion of first-order drainage and the disappearance of springs, phenomena which are easily observed in the field. Climate change worsens water scarcity, compromising supply and hydric security for rural populations. To minimize the expected impacts, producing and storing water through watershed management planning requires detailed cartographic information on the relief and topography, and updated data on the stage and intensity of catchment basin environmental degradation problems. The cartography available of the Brazilian northeastern territory dates to the 70s, with topographic maps, printed, at a scale of 1:100,000 which does not meet the requirements to execute this project. Exceptionally, there are topographic maps at scales of 1:50,000 and 1:25,000 of some coastal regions in northeastern Brazil. Still, due to scale limitations and outdatedness, they are products of little utility for mapping low-order watersheds drainage and springs. Remote sensing data and geographic information systems can contribute to guiding the process of mapping and environmental recovery by integrating detailed relief and topographic data besides social and other environmental information in the Guajiru River Basin, located on the east coast of Rio Grande do Norte, on the Northeast region of Brazil. This study aimed to recognize and map catchment basin, springs and low-order drainage features along estimating morphometric parameters. Alos PALSAR and Copernicus DEM digital elevation models were evaluated and provided regional drainage features and the watersheds limits extracted with Terraview/Terrahidro 5.0 software. CBERS 4A satellite images with 2 m spatial resolution, processed with ESA SNAP Toolbox, allowed generating land use land cover map of Guajiru River. A Mappir Survey 3 multiespectral camera onboard of a DJI Phantom 4, a Mavic 2 Pro PPK Drone and an X91 GNSS receiver to collect the precised position of selected points were employed to detail mapping. Satellite images enabled a first knowledge approach of watershed areas on a more regional scale, yet very current, and drone images were essential in mapping details of catchment basins. The drone multispectral image mosaics, the digital elevation model, the contour lines and geomorphometric parameters were generated using OpenDroneMap/ODM and QGis softwares. The drone images generated facilitated the location, understanding and mapping of watersheds, recharge areas and first-order ephemeral watercourses on an adequate scale and will be used in the following project’s phases: watershed management planning, recovery and environmental protection of Rio's springs Guajiru. Environmental degradation is being analyzed from the perspective of the availability and quality of surface water supply.Keywords: imaging, relief, UAV, water
Procedia PDF Downloads 299210 Competitors’ Influence Analysis of a Retailer by Using Customer Value and Huff’s Gravity Model
Authors: Yepeng Cheng, Yasuhiko Morimoto
Abstract:
Customer relationship analysis is vital for retail stores, especially for supermarkets. The point of sale (POS) systems make it possible to record the daily purchasing behaviors of customers as an identification point of sale (ID-POS) database, which can be used to analyze customer behaviors of a supermarket. The customer value is an indicator based on ID-POS database for detecting the customer loyalty of a store. In general, there are many supermarkets in a city, and other nearby competitor supermarkets significantly affect the customer value of customers of a supermarket. However, it is impossible to get detailed ID-POS databases of competitor supermarkets. This study firstly focused on the customer value and distance between a customer's home and supermarkets in a city, and then constructed the models based on logistic regression analysis to analyze correlations between distance and purchasing behaviors only from a POS database of a supermarket chain. During the modeling process, there are three primary problems existed, including the incomparable problem of customer values, the multicollinearity problem among customer value and distance data, and the number of valid partial regression coefficients. The improved customer value, Huff’s gravity model, and inverse attractiveness frequency are considered to solve these problems. This paper presents three types of models based on these three methods for loyal customer classification and competitors’ influence analysis. In numerical experiments, all types of models are useful for loyal customer classification. The type of model, including all three methods, is the most superior one for evaluating the influence of the other nearby supermarkets on customers' purchasing of a supermarket chain from the viewpoint of valid partial regression coefficients and accuracy.Keywords: customer value, Huff's Gravity Model, POS, Retailer
Procedia PDF Downloads 1229209 Optimizing Nitrogen Fertilizer Application in Rice Cultivation: A Decision Model for Top and Ear Dressing Dosages
Authors: Ya-Li Tsai
Abstract:
Nitrogen is a vital element crucial for crop growth, significantly influencing crop yield. In rice cultivation, farmers often apply substantial nitrogen fertilizer to maximize yields. However, excessive nitrogen application increases the risk of lodging and pest infestation, leading to yield losses. Additionally, conventional flooded irrigation methods consume significant water resources, necessitating precise agricultural and intelligent water management systems. In this study, it leveraged physiological data and field images captured by unmanned aerial vehicles, considering fertilizer treatment and irrigation as key factors. Statistical models incorporating rice physiological data, yield, and vegetation indices from image data were developed. Missing physiological data were addressed using multiple imputation and regression methods, and regression models were established using principal component analysis and stepwise regression. Target nitrogen accumulation at key growth stages was identified to optimize fertilizer application, with the difference between actual and target nitrogen accumulation guiding recommendations for ear dressing dosage. Field experiments conducted in 2022 validated the recommended ear dressing dosage, demonstrating no significant difference in final yield compared to traditional fertilizer levels under alternate wetting and drying irrigation. These findings highlight the efficacy of applying recommended dosages based on fertilizer decision models, offering the potential for reduced fertilizer use while maintaining yield in rice cultivation.Keywords: intelligent fertilizer management, nitrogen top and ear dressing fertilizer, rice, yield optimization
Procedia PDF Downloads 799208 Fuzzy Logic Classification Approach for Exponential Data Set in Health Care System for Predication of Future Data
Authors: Manish Pandey, Gurinderjit Kaur, Meenu Talwar, Sachin Chauhan, Jagbir Gill
Abstract:
Health-care management systems are a unit of nice connection as a result of the supply a straightforward and fast management of all aspects relating to a patient, not essentially medical. What is more, there are unit additional and additional cases of pathologies during which diagnosing and treatment may be solely allotted by victimization medical imaging techniques. With associate ever-increasing prevalence, medical pictures area unit directly acquired in or regenerate into digital type, for his or her storage additionally as sequent retrieval and process. Data Mining is the process of extracting information from large data sets through using algorithms and Techniques drawn from the field of Statistics, Machine Learning and Data Base Management Systems. Forecasting may be a prediction of what's going to occur within the future, associated it's an unsure method. Owing to the uncertainty, the accuracy of a forecast is as vital because the outcome foretold by foretelling the freelance variables. A forecast management should be wont to establish if the accuracy of the forecast is within satisfactory limits. Fuzzy regression strategies have normally been wont to develop shopper preferences models that correlate the engineering characteristics with shopper preferences relating to a replacement product; the patron preference models offer a platform, wherever by product developers will decide the engineering characteristics so as to satisfy shopper preferences before developing the merchandise. Recent analysis shows that these fuzzy regression strategies area units normally will not to model client preferences. We tend to propose a Testing the strength of Exponential Regression Model over regression toward the mean Model.Keywords: health-care management systems, fuzzy regression, data mining, forecasting, fuzzy membership function
Procedia PDF Downloads 2789207 Delivery of Contraceptive and Maternal Health Commodities with Drones in the Most Remote Areas of Madagascar
Authors: Josiane Yaguibou, Ngoy Kishimba, Issiaka V. Coulibaly, Sabrina Pestilli, Falinirina Razanalison, Hantanirina Andremanisa
Abstract:
Background: Madagascar has one of the least developed road networks in the world with a majority of its national and local roads being earth roads and in poor condition. In addition, the country is affected by frequent natural disasters that further affect the road conditions limiting the accessibility to some parts of the country. In 2021 and 2022, 2.21 million people were affected by drought in the Grand Sud region, and by cyclones and floods in the coastal regions, with disruptions of the health system including last mile distribution of lifesaving maternal health commodities and reproductive health commodities in the health facilities. Program intervention: The intervention uses drone technology to deliver maternal health and family planning commodities in hard-to-reach health facilities in the Grand Sud and Sud-Est of Madagascar, the regions more affected by natural disasters. Methodology The intervention was developed in two phases. A first phase, conducted in the Grand Sud, used drones leased from a private company to deliver commodities in isolated health facilities. Based on the lesson learnt and encouraging results of the first phase, in the second phase (2023) the intervention has been extended to the Sud Est regions with the purchase of drones and the recruitment of pilots to reduce costs and ensure sustainability. Key findings: The drones ensure deliveries of lifesaving commodities in the Grand Sud of Madagascar. In 2023, 297 deliveries in commodities in forty hard-to-reach health facilities have been carried out. Drone technology reduced delivery times from the usual 3 - 7 days necessary by road or boat to only a few hours. Program Implications: The use of innovative drone technology demonstrated to be successful in the Madagascar context to reduce dramatically the distribution time of commodities in hard-to-reach health facilities and avoid stockouts of life-saving medicines. When the intervention reaches full scale with the completion of the second phase and the extension in the Sud-Est, 150 hard-to-reach facilities will receive drone deliveries, avoiding stockouts and improving the quality of maternal health and family planning services offered to 1,4 million people in targeted areas.Keywords: commodities, drones, last-mile distribution, lifesaving supplies
Procedia PDF Downloads 659206 The Theory behind Logistic Regression
Authors: Jan Henrik Wosnitza
Abstract:
The logistic regression has developed into a standard approach for estimating conditional probabilities in a wide range of applications including credit risk prediction. The article at hand contributes to the current literature on logistic regression fourfold: First, it is demonstrated that the binary logistic regression automatically meets its model assumptions under very general conditions. This result explains, at least in part, the logistic regression's popularity. Second, the requirement of homoscedasticity in the context of binary logistic regression is theoretically substantiated. The variances among the groups of defaulted and non-defaulted obligors have to be the same across the level of the aggregated default indicators in order to achieve linear logits. Third, this article sheds some light on the question why nonlinear logits might be superior to linear logits in case of a small amount of data. Fourth, an innovative methodology for estimating correlations between obligor-specific log-odds is proposed. In order to crystallize the key ideas, this paper focuses on the example of credit risk prediction. However, the results presented in this paper can easily be transferred to any other field of application.Keywords: correlation, credit risk estimation, default correlation, homoscedasticity, logistic regression, nonlinear logistic regression
Procedia PDF Downloads 4259205 Chemometric Regression Analysis of Radical Scavenging Ability of Kombucha Fermented Kefir-Like Products
Authors: Strahinja Kovacevic, Milica Karadzic Banjac, Jasmina Vitas, Stefan Vukmanovic, Radomir Malbasa, Lidija Jevric, Sanja Podunavac-Kuzmanovic
Abstract:
The present study deals with chemometric regression analysis of quality parameters and the radical scavenging ability of kombucha fermented kefir-like products obtained with winter savory (WS), peppermint (P), stinging nettle (SN) and wild thyme tea (WT) kombucha inoculums. Each analyzed sample was described by milk fat content (MF, %), total unsaturated fatty acids content (TUFA, %), monounsaturated fatty acids content (MUFA, %), polyunsaturated fatty acids content (PUFA, %), the ability of free radicals scavenging (RSA Dₚₚₕ, % and RSA.ₒₕ, %) and pH values measured after each hour from the start until the end of fermentation. The aim of the conducted regression analysis was to establish chemometric models which can predict the radical scavenging ability (RSA Dₚₚₕ, % and RSA.ₒₕ, %) of the samples by correlating it with the MF, TUFA, MUFA, PUFA and the pH value at the beginning, in the middle and at the end of fermentation process which lasted between 11 and 17 hours, until pH value of 4.5 was reached. The analysis was carried out applying univariate linear (ULR) and multiple linear regression (MLR) methods on the raw data and the data standardized by the min-max normalization method. The obtained models were characterized by very limited prediction power (poor cross-validation parameters) and weak statistical characteristics. Based on the conducted analysis it can be concluded that the resulting radical scavenging ability cannot be precisely predicted only on the basis of MF, TUFA, MUFA, PUFA content, and pH values, however, other quality parameters should be considered and included in the further modeling. This study is based upon work from project: Kombucha beverages production using alternative substrates from the territory of the Autonomous Province of Vojvodina, 142-451-2400/2019-03, supported by Provincial Secretariat for Higher Education and Scientific Research of AP Vojvodina.Keywords: chemometrics, regression analysis, kombucha, quality control
Procedia PDF Downloads 1419204 Analyzing the Influence of Hydrometeorlogical Extremes, Geological Setting, and Social Demographic on Public Health
Authors: Irfan Ahmad Afip
Abstract:
This main research objective is to accurately identify the possibility for a Leptospirosis outbreak severity of a certain area based on its input features into a multivariate regression model. The research question is the possibility of an outbreak in a specific area being influenced by this feature, such as social demographics and hydrometeorological extremes. If the occurrence of an outbreak is being subjected to these features, then the epidemic severity for an area will be different depending on its environmental setting because the features will influence the possibility and severity of an outbreak. Specifically, this research objective was three-fold, namely: (a) to identify the relevant multivariate features and visualize the patterns data, (b) to develop a multivariate regression model based from the selected features and determine the possibility for Leptospirosis outbreak in an area, and (c) to compare the predictive ability of multivariate regression model and machine learning algorithms. Several secondary data features were collected locations in the state of Negeri Sembilan, Malaysia, based on the possibility it would be relevant to determine the outbreak severity in the area. The relevant features then will become an input in a multivariate regression model; a linear regression model is a simple and quick solution for creating prognostic capabilities. A multivariate regression model has proven more precise prognostic capabilities than univariate models. The expected outcome from this research is to establish a correlation between the features of social demographic and hydrometeorological with Leptospirosis bacteria; it will also become a contributor for understanding the underlying relationship between the pathogen and the ecosystem. The relationship established can be beneficial for the health department or urban planner to inspect and prepare for future outcomes in event detection and system health monitoring.Keywords: geographical information system, hydrometeorological, leptospirosis, multivariate regression
Procedia PDF Downloads 1149203 Prediction of Compressive Strength Using Artificial Neural Network
Authors: Vijay Pal Singh, Yogesh Chandra Kotiyal
Abstract:
Structures are a combination of various load carrying members which transfer the loads to the foundation from the superstructure safely. At the design stage, the loading of the structure is defined and appropriate material choices are made based upon their properties, mainly related to strength. The strength of materials kept on reducing with time because of many factors like environmental exposure and deformation caused by unpredictable external loads. Hence, to predict the strength of materials used in structures, various techniques are used. Among these techniques, Non-Destructive Techniques (NDT) are the one that can be used to predict the strength without damaging the structure. In the present study, the compressive strength of concrete has been predicted using Artificial Neural Network (ANN). The predicted strength was compared with the experimentally obtained actual compressive strength of concrete and equations were developed for different models. A good co-relation has been obtained between the predicted strength by these models and experimental values. Further, the co-relation has been developed using two NDT techniques for prediction of strength by regression analysis. It was found that the percentage error has been reduced between the predicted strength by using combined techniques in place of single techniques.Keywords: rebound, ultra-sonic pulse, penetration, ANN, NDT, regression
Procedia PDF Downloads 4269202 Indian Premier League (IPL) Score Prediction: Comparative Analysis of Machine Learning Models
Authors: Rohini Hariharan, Yazhini R, Bhamidipati Naga Shrikarti
Abstract:
In the realm of cricket, particularly within the context of the Indian Premier League (IPL), the ability to predict team scores accurately holds significant importance for both cricket enthusiasts and stakeholders alike. This paper presents a comprehensive study on IPL score prediction utilizing various machine learning algorithms, including Support Vector Machines (SVM), XGBoost, Multiple Regression, Linear Regression, K-nearest neighbors (KNN), and Random Forest. Through meticulous data preprocessing, feature engineering, and model selection, we aimed to develop a robust predictive framework capable of forecasting team scores with high precision. Our experimentation involved the analysis of historical IPL match data encompassing diverse match and player statistics. Leveraging this data, we employed state-of-the-art machine learning techniques to train and evaluate the performance of each model. Notably, Multiple Regression emerged as the top-performing algorithm, achieving an impressive accuracy of 77.19% and a precision of 54.05% (within a threshold of +/- 10 runs). This research contributes to the advancement of sports analytics by demonstrating the efficacy of machine learning in predicting IPL team scores. The findings underscore the potential of advanced predictive modeling techniques to provide valuable insights for cricket enthusiasts, team management, and betting agencies. Additionally, this study serves as a benchmark for future research endeavors aimed at enhancing the accuracy and interpretability of IPL score prediction models.Keywords: indian premier league (IPL), cricket, score prediction, machine learning, support vector machines (SVM), xgboost, multiple regression, linear regression, k-nearest neighbors (KNN), random forest, sports analytics
Procedia PDF Downloads 529201 Regression for Doubly Inflated Multivariate Poisson Distributions
Authors: Ishapathik Das, Sumen Sen, N. Rao Chaganty, Pooja Sengupta
Abstract:
Dependent multivariate count data occur in several research studies. These data can be modeled by a multivariate Poisson or Negative binomial distribution constructed using copulas. However, when some of the counts are inflated, that is, the number of observations in some cells are much larger than other cells, then the copula based multivariate Poisson (or Negative binomial) distribution may not fit well and it is not an appropriate statistical model for the data. There is a need to modify or adjust the multivariate distribution to account for the inflated frequencies. In this article, we consider the situation where the frequencies of two cells are higher compared to the other cells, and develop a doubly inflated multivariate Poisson distribution function using multivariate Gaussian copula. We also discuss procedures for regression on covariates for the doubly inflated multivariate count data. For illustrating the proposed methodologies, we present a real data containing bivariate count observations with inflations in two cells. Several models and linear predictors with log link functions are considered, and we discuss maximum likelihood estimation to estimate unknown parameters of the models.Keywords: copula, Gaussian copula, multivariate distributions, inflated distributios
Procedia PDF Downloads 1569200 Drone On-Time Obstacle Avoidance for Static and Dynamic Obstacles
Authors: Herath M. P. C. Jayaweera, Samer Hanoun
Abstract:
Path planning for on-time obstacle avoidance is an essential and challenging task that enables drones to achieve safe operation in any application domain. The level of challenge increases significantly on the obstacle avoidance technique when the drone is following a ground mobile entity (GME). This is mainly due to the change in direction and magnitude of the GME′s velocity in dynamic and unstructured environments. Force field techniques are the most widely used obstacle avoidance methods due to their simplicity, ease of use, and potential to be adopted for three-dimensional dynamic environments. However, the existing force field obstacle avoidance techniques suffer many drawbacks, including their tendency to generate longer routes when the obstacles are sideways of the drone′s route, poor ability to find the shortest flyable path, propensity to fall into local minima, producing a non-smooth path, and high failure rate in the presence of symmetrical obstacles. To overcome these shortcomings, this paper proposes an on-time three-dimensional obstacle avoidance method for drones to effectively and efficiently avoid dynamic and static obstacles in unknown environments while pursuing a GME. This on-time obstacle avoidance technique generates velocity waypoints for its obstacle-free and efficient path based on the shape of the encountered obstacles. This method can be utilized on most types of drones that have basic distance measurement sensors and autopilot-supported flight controllers. The proposed obstacle avoidance technique is validated and evaluated against existing force field methods for different simulation scenarios in Gazebo and ROS-supported PX4-SITL. The simulation results show that the proposed obstacle avoidance technique outperforms the existing force field techniques and is better suited for real-world applications.Keywords: drones, force field methods, obstacle avoidance, path planning
Procedia PDF Downloads 939199 In Silico Modeling of Drugs Milk/Plasma Ratio in Human Breast Milk Using Structures Descriptors
Authors: Navid Kaboudi, Ali Shayanfar
Abstract:
Introduction: Feeding infants with safe milk from the beginning of their life is an important issue. Drugs which are used by mothers can affect the composition of milk in a way that is not only unsuitable, but also toxic for infants. Consuming permeable drugs during that sensitive period by mother could lead to serious side effects to the infant. Due to the ethical restrictions of drug testing on humans, especially women, during their lactation period, computational approaches based on structural parameters could be useful. The aim of this study is to develop mechanistic models to predict the M/P ratio of drugs during breastfeeding period based on their structural descriptors. Methods: Two hundred and nine different chemicals with their M/P ratio were used in this study. All drugs were categorized into two groups based on their M/P value as Malone classification: 1: Drugs with M/P>1, which are considered as high risk 2: Drugs with M/P>1, which are considered as low risk Thirty eight chemical descriptors were calculated by ACD/labs 6.00 and Data warrior software in order to assess the penetration during breastfeeding period. Later on, four specific models based on the number of hydrogen bond acceptors, polar surface area, total surface area, and number of acidic oxygen were established for the prediction. The mentioned descriptors can predict the penetration with an acceptable accuracy. For the remaining compounds (N= 147, 158, 160, and 174 for models 1 to 4, respectively) of each model binary regression with SPSS 21 was done in order to give us a model to predict the penetration ratio of compounds. Only structural descriptors with p-value<0.1 remained in the final model. Results and discussion: Four different models based on the number of hydrogen bond acceptors, polar surface area, and total surface area were obtained in order to predict the penetration of drugs into human milk during breastfeeding period About 3-4% of milk consists of lipids, and the amount of lipid after parturition increases. Lipid soluble drugs diffuse alongside with fats from plasma to mammary glands. lipophilicity plays a vital role in predicting the penetration class of drugs during lactation period. It was shown in the logistic regression models that compounds with number of hydrogen bond acceptors, PSA and TSA above 5, 90 and 25 respectively, are less permeable to milk because they are less soluble in the amount of fats in milk. The pH of milk is acidic and due to that, basic compounds tend to be concentrated in milk than plasma while acidic compounds may consist lower concentrations in milk than plasma. Conclusion: In this study, we developed four regression-based models to predict the penetration class of drugs during the lactation period. The obtained models can lead to a higher speed in drug development process, saving energy, and costs. Milk/plasma ratio assessment of drugs requires multiple steps of animal testing, which has its own ethical issues. QSAR modeling could help scientist to reduce the amount of animal testing, and our models are also eligible to do that.Keywords: logistic regression, breastfeeding, descriptors, penetration
Procedia PDF Downloads 699198 Ground Motion Modeling Using the Least Absolute Shrinkage and Selection Operator
Authors: Yildiz Stella Dak, Jale Tezcan
Abstract:
Ground motion models that relate a strong motion parameter of interest to a set of predictive seismological variables describing the earthquake source, the propagation path of the seismic wave, and the local site conditions constitute a critical component of seismic hazard analyses. When a sufficient number of strong motion records are available, ground motion relations are developed using statistical analysis of the recorded ground motion data. In regions lacking a sufficient number of recordings, a synthetic database is developed using stochastic, theoretical or hybrid approaches. Regardless of the manner the database was developed, ground motion relations are developed using regression analysis. Development of a ground motion relation is a challenging process which inevitably requires the modeler to make subjective decisions regarding the inclusion criteria of the recordings, the functional form of the model and the set of seismological variables to be included in the model. Because these decisions are critically important to the validity and the applicability of the model, there is a continuous interest on procedures that will facilitate the development of ground motion models. This paper proposes the use of the Least Absolute Shrinkage and Selection Operator (LASSO) in selecting the set predictive seismological variables to be used in developing a ground motion relation. The LASSO can be described as a penalized regression technique with a built-in capability of variable selection. Similar to the ridge regression, the LASSO is based on the idea of shrinking the regression coefficients to reduce the variance of the model. Unlike ridge regression, where the coefficients are shrunk but never set equal to zero, the LASSO sets some of the coefficients exactly to zero, effectively performing variable selection. Given a set of candidate input variables and the output variable of interest, LASSO allows ranking the input variables in terms of their relative importance, thereby facilitating the selection of the set of variables to be included in the model. Because the risk of overfitting increases as the ratio of the number of predictors to the number of recordings increases, selection of a compact set of variables is important in cases where a small number of recordings are available. In addition, identification of a small set of variables can improve the interpretability of the resulting model, especially when there is a large number of candidate predictors. A practical application of the proposed approach is presented, using more than 600 recordings from the National Geospatial-Intelligence Agency (NGA) database, where the effect of a set of seismological predictors on the 5% damped maximum direction spectral acceleration is investigated. The set of candidate predictors considered are Magnitude, Rrup, Vs30. Using LASSO, the relative importance of the candidate predictors has been ranked. Regression models with increasing levels of complexity were constructed using one, two, three, and four best predictors, and the models’ ability to explain the observed variance in the target variable have been compared. The bias-variance trade-off in the context of model selection is discussed.Keywords: ground motion modeling, least absolute shrinkage and selection operator, penalized regression, variable selection
Procedia PDF Downloads 3299197 Local Interpretable Model-agnostic Explanations (LIME) Approach to Email Spam Detection
Authors: Rohini Hariharan, Yazhini R., Blessy Maria Mathew
Abstract:
The task of detecting email spam is a very important one in the era of digital technology that needs effective ways of curbing unwanted messages. This paper presents an approach aimed at making email spam categorization algorithms transparent, reliable and more trustworthy by incorporating Local Interpretable Model-agnostic Explanations (LIME). Our technique assists in providing interpretable explanations for specific classifications of emails to help users understand the decision-making process by the model. In this study, we developed a complete pipeline that incorporates LIME into the spam classification framework and allows creating simplified, interpretable models tailored to individual emails. LIME identifies influential terms, pointing out key elements that drive classification results, thus reducing opacity inherent in conventional machine learning models. Additionally, we suggest a visualization scheme for displaying keywords that will improve understanding of categorization decisions by users. We test our method on a diverse email dataset and compare its performance with various baseline models, such as Gaussian Naive Bayes, Multinomial Naive Bayes, Bernoulli Naive Bayes, Support Vector Classifier, K-Nearest Neighbors, Decision Tree, and Logistic Regression. Our testing results show that our model surpasses all other models, achieving an accuracy of 96.59% and a precision of 99.12%.Keywords: text classification, LIME (local interpretable model-agnostic explanations), stemming, tokenization, logistic regression.
Procedia PDF Downloads 459196 Solving Dimensionality Problem and Finding Statistical Constructs on Latent Regression Models: A Novel Methodology with Real Data Application
Authors: Sergio Paez Moncaleano, Alvaro Mauricio Montenegro
Abstract:
This paper presents a novel statistical methodology for measuring and founding constructs in Latent Regression Analysis. This approach uses the qualities of Factor Analysis in binary data with interpretations on Item Response Theory (IRT). In addition, based on the fundamentals of submodel theory and with a convergence of many ideas of IRT, we propose an algorithm not just to solve the dimensionality problem (nowadays an open discussion) but a new research field that promises more fear and realistic qualifications for examiners and a revolution on IRT and educational research. In the end, the methodology is applied to a set of real data set presenting impressive results for the coherence, speed and precision. Acknowledgments: This research was financed by Colciencias through the project: 'Multidimensional Item Response Theory Models for Practical Application in Large Test Designed to Measure Multiple Constructs' and both authors belong to SICS Research Group from Universidad Nacional de Colombia.Keywords: item response theory, dimensionality, submodel theory, factorial analysis
Procedia PDF Downloads 3719195 Comparison Study of Machine Learning Classifiers for Speech Emotion Recognition
Authors: Aishwarya Ravindra Fursule, Shruti Kshirsagar
Abstract:
In the intersection of artificial intelligence and human-centered computing, this paper delves into speech emotion recognition (SER). It presents a comparative analysis of machine learning models such as K-Nearest Neighbors (KNN),logistic regression, support vector machines (SVM), decision trees, ensemble classifiers, and random forests, applied to SER. The research employs four datasets: Crema D, SAVEE, TESS, and RAVDESS. It focuses on extracting salient audio signal features like Zero Crossing Rate (ZCR), Chroma_stft, Mel Frequency Cepstral Coefficients (MFCC), root mean square (RMS) value, and MelSpectogram. These features are used to train and evaluate the models’ ability to recognize eight types of emotions from speech: happy, sad, neutral, angry, calm, disgust, fear, and surprise. Among the models, the Random Forest algorithm demonstrated superior performance, achieving approximately 79% accuracy. This suggests its suitability for SER within the parameters of this study. The research contributes to SER by showcasing the effectiveness of various machine learning algorithms and feature extraction techniques. The findings hold promise for the development of more precise emotion recognition systems in the future. This abstract provides a succinct overview of the paper’s content, methods, and results.Keywords: comparison, ML classifiers, KNN, decision tree, SVM, random forest, logistic regression, ensemble classifiers
Procedia PDF Downloads 429194 Currency Exchange Rate Forecasts Using Quantile Regression
Authors: Yuzhi Cai
Abstract:
In this paper, we discuss a Bayesian approach to quantile autoregressive (QAR) time series model estimation and forecasting. Together with a combining forecasts technique, we then predict USD to GBP currency exchange rates. Combined forecasts contain all the information captured by the fitted QAR models at different quantile levels and are therefore better than those obtained from individual models. Our results show that an unequally weighted combining method performs better than other forecasting methodology. We found that a median AR model can perform well in point forecasting when the predictive density functions are symmetric. However, in practice, using the median AR model alone may involve the loss of information about the data captured by other QAR models. We recommend that combined forecasts should be used whenever possible.Keywords: combining forecasts, MCMC, predictive density functions, quantile forecasting, quantile modelling
Procedia PDF Downloads 2559193 The Relationship between Coping Styles and Internet Addiction among High School Students
Authors: Adil Kaval, Digdem Muge Siyez
Abstract:
With the negative effects of internet use in a person's life, the use of the Internet has become an issue. This subject was mostly considered as internet addiction, and it was investigated. In literature, it is noteworthy that some theoretical models have been proposed to explain the reasons for internet addiction. In addition to these theoretical models, it may be thought that the coping style for stressing events can be a predictor of internet addiction. It was aimed to test with logistic regression the effect of high school students' coping styles on internet addiction levels. Sample of the study consisted of 770 Turkish adolescents (471 girls, 299 boys) selected from high schools in the 2017-2018 academic year in İzmir province. Internet Addiction Test, Coping Scale for Child and Adolescents and a demographic information form were used in this study. The results of the logistic regression analysis indicated that the model of coping styles predicted internet addiction provides a statistically significant prediction of internet addiction. Gender does not predict whether or not to be addicted to the internet. The active coping style is not effective on internet addiction levels, while the avoiding and negative coping style are effective on internet addiction levels. With this model, % 79.1 of internet addiction in high school is estimated. The Negelkerke pseudo R2 indicated that the model accounted for %35 of the total variance. The results of this study on Turkish adolescents are similar to the results of other studies in the literature. It can be argued that avoiding and negative coping styles are important risk factors in the development of internet addiction.Keywords: adolescents, coping, internet addiction, regression analysis
Procedia PDF Downloads 1739192 The Effects of Cross-Border Use of Drones in Nigerian National Security
Authors: H. P. Kerry
Abstract:
Drone technology has become a significant discourse in a nation’s national security, while this technology could constitute a danger to national security on the one hand, on the other hand, it is used in developed and developing countries for border security, and in some cases, for protection of security agents and migrants. In the case of Nigeria, drones are used by the military to monitor and tighten security around the borders. However, terrorist groups have devised a means to utilize the technology to their advantage. Therefore, the potential danger in the widespread proliferation of this technology has become a myriad of risks. The research on the effects of cross-border use of drones in Nigerian national security looks at the negative and positive consequences of using drone technology. The study employs the use of interviews and relevant documents to obtain data while the study applied the Just War theory to justify the reason why countries use force; it further buttresses the points with what the realist theory thinks about the use of force. In conclusion, the paper recommends that the Nigerian government through the National Assembly should pass a bill for the establishment of a law that will guide the use of armed and unarmed drones in Nigeria enforced by the Nigeria Civil Aviation Authority and the office of the National Security Adviser.Keywords: armed drones, drones, cross-border, national security
Procedia PDF Downloads 1549191 Machine Learning Approaches to Water Usage Prediction in Kocaeli: A Comparative Study
Authors: Kasim Görenekli, Ali Gülbağ
Abstract:
This study presents a comprehensive analysis of water consumption patterns in Kocaeli province, Turkey, utilizing various machine learning approaches. We analyzed data from 5,000 water subscribers across residential, commercial, and official categories over an 80-month period from January 2016 to August 2022, resulting in a total of 400,000 records. The dataset encompasses water consumption records, weather information, weekends and holidays, previous months' consumption, and the influence of the COVID-19 pandemic.We implemented and compared several machine learning models, including Linear Regression, Random Forest, Support Vector Regression (SVR), XGBoost, Artificial Neural Networks (ANN), Long Short-Term Memory (LSTM), and Gated Recurrent Units (GRU). Particle Swarm Optimization (PSO) was applied to optimize hyperparameters for all models.Our results demonstrate varying performance across subscriber types and models. For official subscribers, Random Forest achieved the highest R² of 0.699 with PSO optimization. For commercial subscribers, Linear Regression performed best with an R² of 0.730 with PSO. Residential water usage proved more challenging to predict, with XGBoost achieving the highest R² of 0.572 with PSO.The study identified key factors influencing water consumption, with previous months' consumption, meter diameter, and weather conditions being among the most significant predictors. The impact of the COVID-19 pandemic on consumption patterns was also observed, particularly in residential usage.This research provides valuable insights for effective water resource management in Kocaeli and similar regions, considering Turkey's high water loss rate and below-average per capita water supply. The comparative analysis of different machine learning approaches offers a comprehensive framework for selecting appropriate models for water consumption prediction in urban settings.Keywords: mMachine learning, water consumption prediction, particle swarm optimization, COVID-19, water resource management
Procedia PDF Downloads 149190 Machine Learning Techniques for Estimating Ground Motion Parameters
Authors: Farid Khosravikia, Patricia Clayton
Abstract:
The main objective of this study is to evaluate the advantages and disadvantages of various machine learning techniques in forecasting ground-motion intensity measures given source characteristics, source-to-site distance, and local site condition. Intensity measures such as peak ground acceleration and velocity (PGA and PGV, respectively) as well as 5% damped elastic pseudospectral accelerations at different periods (PSA), are indicators of the strength of shaking at the ground surface. Estimating these variables for future earthquake events is a key step in seismic hazard assessment and potentially subsequent risk assessment of different types of structures. Typically, linear regression-based models, with pre-defined equations and coefficients, are used in ground motion prediction. However, due to the restrictions of the linear regression methods, such models may not capture more complex nonlinear behaviors that exist in the data. Thus, this study comparatively investigates potential benefits from employing other machine learning techniques as a statistical method in ground motion prediction such as Artificial Neural Network, Random Forest, and Support Vector Machine. The algorithms are adjusted to quantify event-to-event and site-to-site variability of the ground motions by implementing them as random effects in the proposed models to reduce the aleatory uncertainty. All the algorithms are trained using a selected database of 4,528 ground-motions, including 376 seismic events with magnitude 3 to 5.8, recorded over the hypocentral distance range of 4 to 500 km in Oklahoma, Kansas, and Texas since 2005. The main reason of the considered database stems from the recent increase in the seismicity rate of these states attributed to petroleum production and wastewater disposal activities, which necessities further investigation in the ground motion models developed for these states. Accuracy of the models in predicting intensity measures, generalization capability of the models for future data, as well as usability of the models are discussed in the evaluation process. The results indicate the algorithms satisfy some physically sound characteristics such as magnitude scaling distance dependency without requiring pre-defined equations or coefficients. Moreover, it is shown that, when sufficient data is available, all the alternative algorithms tend to provide more accurate estimates compared to the conventional linear regression-based method, and particularly, Random Forest outperforms the other algorithms. However, the conventional method is a better tool when limited data is available.Keywords: artificial neural network, ground-motion models, machine learning, random forest, support vector machine
Procedia PDF Downloads 1219189 Derivation of Bathymetry from High-Resolution Satellite Images: Comparison of Empirical Methods through Geographical Error Analysis
Authors: Anusha P. Wijesundara, Dulap I. Rathnayake, Nihal D. Perera
Abstract:
Bathymetric information is fundamental importance to coastal and marine planning and management, nautical navigation, and scientific studies of marine environments. Satellite-derived bathymetry data provide detailed information in areas where conventional sounding data is lacking and conventional surveys are inaccessible. The two empirical approaches of log-linear bathymetric inversion model and non-linear bathymetric inversion model are applied for deriving bathymetry from high-resolution multispectral satellite imagery. This study compares these two approaches by means of geographical error analysis for the site Kankesanturai using WorldView-2 satellite imagery. Based on the Levenberg-Marquardt method calibrated the parameters of non-linear inversion model and the multiple-linear regression model was applied to calibrate the log-linear inversion model. In order to calibrate both models, Single Beam Echo Sounding (SBES) data in this study area were used as reference points. Residuals were calculated as the difference between the derived depth values and the validation echo sounder bathymetry data and the geographical distribution of model residuals was mapped. The spatial autocorrelation was calculated by comparing the performance of the bathymetric models and the results showing the geographic errors for both models. A spatial error model was constructed from the initial bathymetry estimates and the estimates of autocorrelation. This spatial error model is used to generate more reliable estimates of bathymetry by quantifying autocorrelation of model error and incorporating this into an improved regression model. Log-linear model (R²=0.846) performs better than the non- linear model (R²=0.692). Finally, the spatial error models improved bathymetric estimates derived from linear and non-linear models up to R²=0.854 and R²=0.704 respectively. The Root Mean Square Error (RMSE) was calculated for all reference points in various depth ranges. The magnitude of the prediction error increases with depth for both the log-linear and the non-linear inversion models. Overall RMSE for log-linear and the non-linear inversion models were ±1.532 m and ±2.089 m, respectively.Keywords: log-linear model, multi spectral, residuals, spatial error model
Procedia PDF Downloads 2979188 Comparative Study on Daily Discharge Estimation of Soolegan River
Authors: Redvan Ghasemlounia, Elham Ansari, Hikmet Kerem Cigizoglu
Abstract:
Hydrological modeling in arid and semi-arid regions is very important. Iran has many regions with these climate conditions such as Chaharmahal and Bakhtiari province that needs lots of attention with an appropriate management. Forecasting of hydrological parameters and estimation of hydrological events of catchments, provide important information that used for design, management and operation of water resources such as river systems, and dams, widely. Discharge in rivers is one of these parameters. This study presents the application and comparison of some estimation methods such as Feed-Forward Back Propagation Neural Network (FFBPNN), Multi Linear Regression (MLR), Gene Expression Programming (GEP) and Bayesian Network (BN) to predict the daily flow discharge of the Soolegan River, located at Chaharmahal and Bakhtiari province, in Iran. In this study, Soolegan, station was considered. This Station is located in Soolegan River at 51° 14՜ Latitude 31° 38՜ longitude at North Karoon basin. The Soolegan station is 2086 meters higher than sea level. The data used in this study are daily discharge and daily precipitation of Soolegan station. Feed Forward Back Propagation Neural Network(FFBPNN), Multi Linear Regression (MLR), Gene Expression Programming (GEP) and Bayesian Network (BN) models were developed using the same input parameters for Soolegan's daily discharge estimation. The results of estimation models were compared with observed discharge values to evaluate performance of the developed models. Results of all methods were compared and shown in tables and charts.Keywords: ANN, multi linear regression, Bayesian network, forecasting, discharge, gene expression programming
Procedia PDF Downloads 5599187 Modeling and Analysis Of Occupant Behavior On Heating And Air Conditioning Systems In A Higher Education And Vocational Training Building In A Mediterranean Climate
Authors: Abderrahmane Soufi
Abstract:
The building sector is the largest consumer of energy in France, accounting for 44% of French consumption. To reduce energy consumption and improve energy efficiency, France implemented an energy transition law targeting 40% energy savings by 2030 in the tertiary building sector. Building simulation tools are used to predict the energy performance of buildings but the reliability of these tools is hampered by discrepancies between the real and simulated energy performance of a building. This performance gap lies in the simplified assumptions of certain factors, such as the behavior of occupants on air conditioning and heating, which is considered deterministic when setting a fixed operating schedule and a fixed interior comfort temperature. However, the behavior of occupants on air conditioning and heating is stochastic, diverse, and complex because it can be affected by many factors. Probabilistic models are an alternative to deterministic models. These models are usually derived from statistical data and express occupant behavior by assuming a probabilistic relationship to one or more variables. In the literature, logistic regression has been used to model the behavior of occupants with regard to heating and air conditioning systems by considering univariate logistic models in residential buildings; however, few studies have developed multivariate models for higher education and vocational training buildings in a Mediterranean climate. Therefore, in this study, occupant behavior on heating and air conditioning systems was modeled using logistic regression. Occupant behavior related to the turn-on heating and air conditioning systems was studied through experimental measurements collected over a period of one year (June 2023–June 2024) in three classrooms occupied by several groups of students in engineering schools and professional training. Instrumentation was provided to collect indoor temperature and indoor relative humidity in 10-min intervals. Furthermore, the state of the heating/air conditioning system (off or on) and the set point were determined. The outdoor air temperature, relative humidity, and wind speed were collected as weather data. The number of occupants, age, and sex were also considered. Logistic regression was used for modeling an occupant turning on the heating and air conditioning systems. The results yielded a proposed model that can be used in building simulation tools to predict the energy performance of teaching buildings. Based on the first months (summer and early autumn) of the investigations, the results illustrate that the occupant behavior of the air conditioning systems is affected by the indoor relative humidity and temperature in June, July, and August and by the indoor relative humidity, temperature, and number of occupants in September and October. Occupant behavior was analyzed monthly, and univariate and multivariate models were developed.Keywords: occupant behavior, logistic regression, behavior model, mediterranean climate, air conditioning, heating
Procedia PDF Downloads 57