Search results for: boosted multivariate trees
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1262

Search results for: boosted multivariate trees

1112 Geographic Information System-Based Map for Best Suitable Place for Cultivating Permanent Trees in South-Lebanon

Authors: Allaw Kamel, Al-Chami Leila

Abstract:

It is important to reduce the human influence on natural resources by identifying an appropriate land use. Moreover, it is essential to carry out the scientific land evaluation. Such kind of analysis allows identifying the main factors of agricultural production and enables decision makers to develop crop management in order to increase the land capability. The key is to match the type and intensity of land use with its natural capability. Therefore; in order to benefit from these areas and invest them to obtain good agricultural production, they must be organized and managed in full. Lebanon suffers from the unorganized agricultural use. We take south Lebanon as a study area, it is the most fertile ground and has a variety of crops. The study aims to identify and locate the most suitable area to cultivate thirteen type of permanent trees which are: apples, avocados, stone fruits in coastal regions and stone fruits in mountain regions, bananas, citrus, loquats, figs, pistachios, mangoes, olives, pomegranates, and grapes. Several geographical factors are taken as criterion for selection of the best location to cultivate. Soil, rainfall, PH, temperature, and elevation are main inputs to create the final map. Input data of each factor is managed, visualized and analyzed using Geographic Information System (GIS). Management GIS tools are implemented to produce input maps capable of identifying suitable areas related to each index. The combination of the different indices map generates the final output map of the suitable place to get the best permanent tree productivity. The output map is reclassified into three suitability classes: low, moderate, and high suitability. Results show different locations suitable for different kinds of trees. Results also reflect the importance of GIS in helping decision makers finding a most suitable location for every tree to get more productivity and a variety in crops.

Keywords: agricultural production, crop management, geographical factors, Geographic Information System, GIS, land capability, permanent trees, suitable location

Procedia PDF Downloads 117
1111 A Data-Driven Monitoring Technique Using Combined Anomaly Detectors

Authors: Fouzi Harrou, Ying Sun, Sofiane Khadraoui

Abstract:

Anomaly detection based on Principal Component Analysis (PCA) was studied intensively and largely applied to multivariate processes with highly cross-correlated process variables. Monitoring metrics such as the Hotelling's T2 and the Q statistics are usually used in PCA-based monitoring to elucidate the pattern variations in the principal and residual subspaces, respectively. However, these metrics are ill suited to detect small faults. In this paper, the Exponentially Weighted Moving Average (EWMA) based on the Q and T statistics, T2-EWMA and Q-EWMA, were developed for detecting faults in the process mean. The performance of the proposed methods was compared with that of the conventional PCA-based fault detection method using synthetic data. The results clearly show the benefit and the effectiveness of the proposed methods over the conventional PCA method, especially for detecting small faults in highly correlated multivariate data.

Keywords: data-driven method, process control, anomaly detection, dimensionality reduction

Procedia PDF Downloads 268
1110 Impact of Land-Use and Climate Change on the Population Structure and Distribution Range of the Rare and Endangered Dracaena ombet and Dobera glabra in Northern Ethiopia

Authors: Emiru Birhane, Tesfay Gidey, Haftu Abrha, Abrha Brhan, Amanuel Zenebe, Girmay Gebresamuel, Florent Noulèkoun

Abstract:

Dracaena ombet and Dobera glabra are two of the most rare and endangered tree species in dryland areas. Unfortunately, their sustainability is being compromised by different anthropogenic and natural factors. However, the impacts of ongoing land use and climate change on the population structure and distribution of the species are less explored. This study was carried out in the grazing lands and hillside areas of the Desa'a dry Afromontane forest, northern Ethiopia, to characterize the population structure of the species and predict the impact of climate change on their potential distributions. In each land-use type, abundance, diameter at breast height, and height of the trees were collected using 70 sampling plots distributed over seven transects spaced one km apart. The geographic coordinates of each individual tree were also recorded. The results showed that the species populations were characterized by low abundance and unstable population structure. The latter was evinced by a lack of seedlings and mature trees. The study also revealed that the total abundance and dendrometric traits of the trees were significantly different between the two land uses. The hillside areas had a denser abundance of bigger and taller trees than the grazing lands. Climate change predictions using the MaxEnt model highlighted that future temperature increases coupled with reduced precipitation would lead to significant reductions in the suitable habitats of the species in northern Ethiopia. The species' suitable habitats were predicted to decline by 48–83% for D. ombet and 35–87% for D. glabra. Hence, to sustain the species populations, different strategies should be adopted, namely the introduction of alternative livelihoods (e.g., gathering NTFP) to reduce the overexploitation of the species for subsistence income and the protection of the current habitats that will remain suitable in the future using community-based exclosures. Additionally, the preservation of the species' seeds in gene banks is crucial to ensure their long-term conservation.

Keywords: grazing lands, hillside areas, land-use change, MaxEnt, range limitation, rare and endangered tree species

Procedia PDF Downloads 49
1109 Determination of Water Pollution and Water Quality with Decision Trees

Authors: Çiğdem Bakır, Mecit Yüzkat

Abstract:

With the increasing emphasis on water quality worldwide, the search for and expanding the market for new and intelligent monitoring systems has increased. The current method is the laboratory process, where samples are taken from bodies of water, and tests are carried out in laboratories. This method is time-consuming, a waste of manpower, and uneconomical. To solve this problem, we used machine learning methods to detect water pollution in our study. We created decision trees with the Orange3 software we used in our study and tried to determine all the factors that cause water pollution. An automatic prediction model based on water quality was developed by taking many model inputs such as water temperature, pH, transparency, conductivity, dissolved oxygen, and ammonia nitrogen with machine learning methods. The proposed approach consists of three stages: preprocessing of the data used, feature detection, and classification. We tried to determine the success of our study with different accuracy metrics and the results. We presented it comparatively. In addition, we achieved approximately 98% success with the decision tree.

Keywords: decision tree, water quality, water pollution, machine learning

Procedia PDF Downloads 59
1108 On-Line Data-Driven Multivariate Statistical Prediction Approach to Production Monitoring

Authors: Hyun-Woo Cho

Abstract:

Detection of incipient abnormal events in production processes is important to improve safety and reliability of manufacturing operations and reduce losses caused by failures. The construction of calibration models for predicting faulty conditions is quite essential in making decisions on when to perform preventive maintenance. This paper presents a multivariate calibration monitoring approach based on the statistical analysis of process measurement data. The calibration model is used to predict faulty conditions from historical reference data. This approach utilizes variable selection techniques, and the predictive performance of several prediction methods are evaluated using real data. The results shows that the calibration model based on supervised probabilistic model yielded best performance in this work. By adopting a proper variable selection scheme in calibration models, the prediction performance can be improved by excluding non-informative variables from their model building steps.

Keywords: calibration model, monitoring, quality improvement, feature selection

Procedia PDF Downloads 332
1107 Developing Allometric Equations for More Accurate Aboveground Biomass and Carbon Estimation in Secondary Evergreen Forests, Thailand

Authors: Titinan Pothong, Prasit Wangpakapattanawong, Stephen Elliott

Abstract:

Shifting cultivation is an indigenous agricultural practice among upland people and has long been one of the major land-use systems in Southeast Asia. As a result, fallows and secondary forests have come to cover a large part of the region. However, they are increasingly being replaced by monocultures, such as corn cultivation. This is believed to be a main driver of deforestation and forest degradation, and one of the reasons behind the recurring winter smog crisis in Thailand and around Southeast Asia. Accurate biomass estimation of trees is important to quantify valuable carbon stocks and changes to these stocks in case of land use change. However, presently, Thailand lacks proper tools and optimal equations to quantify its carbon stocks, especially for secondary evergreen forests, including fallow areas after shifting cultivation and smaller trees with a diameter at breast height (DBH) of less than 5 cm. Developing new allometric equations to estimate biomass is urgently needed to accurately estimate and manage carbon storage in tropical secondary forests. This study established new equations using a destructive method at three study sites: approximately 50-year-old secondary forest, 4-year-old fallow, and 7-year-old fallow. Tree biomass was collected by harvesting 136 individual trees (including coppiced trees) from 23 species, with a DBH ranging from 1 to 31 cm. Oven-dried samples were sent for carbon analysis. Wood density was calculated from disk samples and samples collected with an increment borer from 79 species, including 35 species currently missing from the Global Wood Densities database. Several models were developed, showing that aboveground biomass (AGB) was strongly related to DBH, height (H), and wood density (WD). Including WD in the model was found to improve the accuracy of the AGB estimation. This study provides insights for reforestation management, and can be used to prepare baseline data for Thailand’s carbon stocks for the REDD+ and other carbon trading schemes. These may provide monetary incentives to stop illegal logging and deforestation for monoculture.

Keywords: aboveground biomass, allometric equation, carbon stock, secondary forest

Procedia PDF Downloads 258
1106 Susceptibility of Different Clones of Eucalyptus Species against Gall Wasp, Leptocybe invasa Fisher and La Salle in Punjab, India

Authors: Ashwinder K. Dhaliwal, G. P. S. Dhillon

Abstract:

Eucalyptus is one of the most important forest tree species that can tolerate and grow well on degraded and unfertile soils which are not suitable for other tree species. Besides this, these trees have a short rotation and good economic value. However, the gall inducing wasp Leptocybe invasa Fisher and La Salle has been reported from many countries throughout the world. The spread of L. invasa is of huge economic concern as more than 20,000 ha of young Eucalyptus trees have already been affected in southern states of India. The host plant resistance being the first line of defense against insect pests demands the screening of different germplasm source against L. invasa. Keeping this in view, fourteen different clones of Eucalyptus spp. were evaluated for their susceptibility to L. invasa from a replicated clonal trial planted at Punjab Agricultural University, Ludhiana. The degree of gall infestation was recorded from three plants of each clone in each replication. Three branches selected from the lower, middle and upper canopy of the trees were selected for recording the total number of galls induced by L. invasa. The statistical analysis was done as per the procedure laid down for completely randomised block design (CRBD), analysis of variance (ANOVA), critical difference (CD) and variance components using Proc GLM (SAS software 9.3, SAS Institute Ltd. U.S.A). All possible treatment means were compared with Duncan’s multiple range test (DMRT) at 1 % probability level. The results showed that the clones C-9, C-45 and C-42 were completely free from the infestation of L. invasa. However, there was minor infestation of L. invasa on C-2135, C-413, C-407, C-35, C-72 and C-37 clones. The clone C-6 was severely infested by L. invasa followed by C-11, C-12, F-316 and C-25 clones. The information generated by this study will be helpful for future breeding and use in afforestation programmes.

Keywords: eucalyptus clones, gall wasp, Leptocybe invasa, screening, susceptibility

Procedia PDF Downloads 192
1105 Chemical Composition and Nutritional Value of Leaves and Pods of Leucaena Leucocephala, Prosopis Laevigata and Acacia Farnesiana in a Xerophyllous Shrubland

Authors: Miguel Mellado, Cecilia Zapata

Abstract:

Goats can be exploited in harsh environments due to their capacity to adjust to limited quantity and quality forage sources. In these environments, leguminous trees can be used as supplementary feeds as foliage and fruits of these trees can contribute to maintain or improve production efficiency in ruminants. The objective of this study was to determine the nutritional value of three leguminous trees heavily selected by goats in a xerophyllous shrubland. Chemical composition and in vitro dry matter disappearance (IVDMD) of leaves and pods from leucaena (Leucaena leucocephala), mesquite (Prosopis laevigata) and huisache (Acacia farnesiana) is presented. Crude protein (CP) ranged from 17.3% for leaves of huisache to 21.9% for leucaena. The neutral detergent fiber (NDF) content ranged from 39.0 to 40.3 with no difference among fodder threes. Across tree species, mean IVDMD was 61.6% for pods and 52.2% for leaves. IVDMD for leaves was highest (P < 0.01) for leucaena (54.9%) and lowest for huisache (47.3%). Condensed tannins in an acetonic extract were highest for leaves of huisache (45.3 mg CE/g DM) and lowest for mesquite (25.9 mg CE/g DM). Pods and leaves of huisache presented the highest number of secondary metabolites, mainly related to hydrobenzoic acid and flavonols; leucaena and mesquite presented mainly flavonols and anthocyanins. It was concluded that leaves and pods of leucaena, mesquite and huisache constitute valuable forages for ruminant livestock due to its low fiber, high CP levels, moderate in vitro fermentation characteristics and high mineral content. Keywords: Fodder tree; ruminants; secondary metabolites; minerals; tannins

Keywords: fodder tree, ruminants, secondary metabolites, minerals, tannins

Procedia PDF Downloads 113
1104 Multivariate Data Analysis for Automatic Atrial Fibrillation Detection

Authors: Zouhair Haddi, Stephane Delliaux, Jean-Francois Pons, Ismail Kechaf, Jean-Claude De Haro, Mustapha Ouladsine

Abstract:

Atrial fibrillation (AF) has been considered as the most common cardiac arrhythmia, and a major public health burden associated with significant morbidity and mortality. Nowadays, telemedical approaches targeting cardiac outpatients situate AF among the most challenged medical issues. The automatic, early, and fast AF detection is still a major concern for the healthcare professional. Several algorithms based on univariate analysis have been developed to detect atrial fibrillation. However, the published results do not show satisfactory classification accuracy. This work was aimed at resolving this shortcoming by proposing multivariate data analysis methods for automatic AF detection. Four publicly-accessible sets of clinical data (AF Termination Challenge Database, MIT-BIH AF, Normal Sinus Rhythm RR Interval Database, and MIT-BIH Normal Sinus Rhythm Databases) were used for assessment. All time series were segmented in 1 min RR intervals window and then four specific features were calculated. Two pattern recognition methods, i.e., Principal Component Analysis (PCA) and Learning Vector Quantization (LVQ) neural network were used to develop classification models. PCA, as a feature reduction method, was employed to find important features to discriminate between AF and Normal Sinus Rhythm. Despite its very simple structure, the results show that the LVQ model performs better on the analyzed databases than do existing algorithms, with high sensitivity and specificity (99.19% and 99.39%, respectively). The proposed AF detection holds several interesting properties, and can be implemented with just a few arithmetical operations which make it a suitable choice for telecare applications.

Keywords: atrial fibrillation, multivariate data analysis, automatic detection, telemedicine

Procedia PDF Downloads 238
1103 Immature Palm Tree Detection Using Morphological Filter for Palm Counting with High Resolution Satellite Image

Authors: Nur Nadhirah Rusyda Rosnan, Nursuhaili Najwa Masrol, Nurul Fatiha MD Nor, Mohammad Zafrullah Mohammad Salim, Sim Choon Cheak

Abstract:

Accurate inventories of oil palm planted areas are crucial for plantation management as this would impact the overall economy and production of oil. One of the technological advancements in the oil palm industry is semi-automated palm counting, which is replacing conventional manual palm counting via digitizing aerial imagery. Most of the semi-automated palm counting method that has been developed was limited to mature palms due to their ideal canopy size represented by satellite image. Therefore, immature palms were often left out since the size of the canopy is barely visible from satellite images. In this paper, an approach using a morphological filter and high-resolution satellite image is proposed to detect immature palm trees. This approach makes it possible to count the number of immature oil palm trees. The method begins with an erosion filter with an appropriate window size of 3m onto the high-resolution satellite image. The eroded image was further segmented using watershed segmentation to delineate immature palm tree regions. Then, local minimum detection was used because it is hypothesized that immature oil palm trees are located at the local minimum within an oil palm field setting in a grayscale image. The detection points generated from the local minimum are displaced to the center of the immature oil palm region and thinned. Only one detection point is left that represents a tree. The performance of the proposed method was evaluated on three subsets with slopes ranging from 0 to 20° and different planting designs, i.e., straight and terrace. The proposed method was able to achieve up to more than 90% accuracy when compared with the ground truth, with an overall F-measure score of up to 0.91.

Keywords: immature palm count, oil palm, precision agriculture, remote sensing

Procedia PDF Downloads 37
1102 Variation in the Traditional Knowledge of Curcuma longa L. in North-Eastern Algeria

Authors: A. Bouzabata, A. Boukhari

Abstract:

Curcuma longa L. (Zingiberaceae), commonly known as turmeric, has a long history of traditional uses for culinary purposes as a spice and a food colorant. The present study aimed to document the ethnobotanical knowledge about Curcuma longa and to assess the variation in the herbalists’ experience in Northeastern Algeria. Data were collected by semi-structured questionnaires and direct interviews with 30 herbalists. Ethnobotanical indices, including the fidelity level (FL%), the relative frequency citation (RFC) and use value (UV) were determined by quantitative methods. Diversity in the knowledge was analyzed using univariate, non-parametric and multivariate statistical methods. Three main categories of uses were recorded for C. longa: for food, for medicine and for cosmetic purposes. As a medicine, turmeric was used for the treatment of gastrointestinal, dermatological and hepatic diseases. Medicinal and food uses were correlated with both forms of use (rhizome and powder). The age group did not influence the use. Multivariate analyses showed a significant variation in traditional knowledge, associated with the use value, origin, quality and efficacy of the drug. These findings suggested that the geographical origin of C. longa affected the use in Algeria.

Keywords: curcuma, indices, knowledge, variation

Procedia PDF Downloads 516
1101 Detection of Abnormal Process Behavior in Copper Solvent Extraction by Principal Component Analysis

Authors: Kirill Filianin, Satu-Pia Reinikainen, Tuomo Sainio

Abstract:

Frequent measurements of product steam quality create a data overload that becomes more and more difficult to handle. In the current study, plant history data with multiple variables was successfully treated by principal component analysis to detect abnormal process behavior, particularly, in copper solvent extraction. The multivariate model is based on the concentration levels of main process metals recorded by the industrial on-stream x-ray fluorescence analyzer. After mean-centering and normalization of concentration data set, two-dimensional multivariate model under principal component analysis algorithm was constructed. Normal operating conditions were defined through control limits that were assigned to squared score values on x-axis and to residual values on y-axis. 80 percent of the data set were taken as the training set and the multivariate model was tested with the remaining 20 percent of data. Model testing showed successful application of control limits to detect abnormal behavior of copper solvent extraction process as early warnings. Compared to the conventional techniques of analyzing one variable at a time, the proposed model allows to detect on-line a process failure using information from all process variables simultaneously. Complex industrial equipment combined with advanced mathematical tools may be used for on-line monitoring both of process streams’ composition and final product quality. Defining normal operating conditions of the process supports reliable decision making in a process control room. Thus, industrial x-ray fluorescence analyzers equipped with integrated data processing toolbox allows more flexibility in copper plant operation. The additional multivariate process control and monitoring procedures are recommended to apply separately for the major components and for the impurities. Principal component analysis may be utilized not only in control of major elements’ content in process streams, but also for continuous monitoring of plant feed. The proposed approach has a potential in on-line instrumentation providing fast, robust and cheap application with automation abilities.

Keywords: abnormal process behavior, failure detection, principal component analysis, solvent extraction

Procedia PDF Downloads 281
1100 Running the Athena Vortex Lattice Code in JAVA through the Java Native Interface

Authors: Paul Okonkwo, Howard Smith

Abstract:

This paper describes a methodology to integrate the Athena Vortex Lattice Aerodynamic Software for automated operation in a multivariate optimisation of the Blended Wing Body Aircraft. The Athena Vortex Lattice code developed at the Massachusetts Institute of Technology allows for the aerodynamic analysis of aircraft using the vortex lattice method. Ordinarily, the Athena Vortex Lattice operation requires a text file containing the aircraft geometry to be loaded into the AVL solver in order to determine the aerodynamic forces and moments. However, automated operation will be required to enable integration into a multidisciplinary optimisation framework. Automated AVL operation within the JAVA design environment will nonetheless require a modification and recompilation of AVL source code into an executable file capable of running on windows and other platforms without the –X11 libraries. This paper describes the procedure for the integrating the FORTRAN written AVL software for automated operation within the multivariate design synthesis optimisation framework for the conceptual design of the BWB aircraft.

Keywords: aerodynamics, automation, optimisation, AVL, JNI

Procedia PDF Downloads 537
1099 Optimal Maintenance Policy for a Partially Observable Two-Unit System

Authors: Leila Jafari, Viliam Makis, G. B. Akram Khaleghei

Abstract:

In this paper, we present a maintenance model of a two-unit series system with economic dependence. Unit#1, which is considered to be more expensive and more important, is subject to condition monitoring (CM) at equidistant, discrete time epochs and unit#2, which is not subject to CM, has a general lifetime distribution. The multivariate observation vectors obtained through condition monitoring carry partial information about the hidden state of unit#1, which can be in a healthy or a warning state while operating. Only the failure state is assumed to be observable for both units. The objective is to find an optimal opportunistic maintenance policy minimizing the long-run expected average cost per unit time. The problem is formulated and solved in the partially observable semi-Markov decision process framework. An effective computational algorithm for finding the optimal policy and the minimum average cost is developed and illustrated by a numerical example.

Keywords: condition-based maintenance, semi-Markov decision process, multivariate Bayesian control chart, partially observable system, two-unit system

Procedia PDF Downloads 436
1098 In situ Real-Time Multivariate Analysis of Methanolysis Monitoring of Sunflower Oil Using FTIR

Authors: Pascal Mwenge, Tumisang Seodigeng

Abstract:

The combination of world population and the third industrial revolution led to high demand for fuels. On the other hand, the decrease of global fossil 8fuels deposits and the environmental air pollution caused by these fuels has compounded the challenges the world faces due to its need for energy. Therefore, new forms of environmentally friendly and renewable fuels such as biodiesel are needed. The primary analytical techniques for methanolysis yield monitoring have been chromatography and spectroscopy, these methods have been proven reliable but are more demanding, costly and do not provide real-time monitoring. In this work, the in situ monitoring of biodiesel from sunflower oil using FTIR (Fourier Transform Infrared) has been studied; the study was performed using EasyMax Mettler Toledo reactor equipped with a DiComp (Diamond) probe. The quantitative monitoring of methanolysis was performed by building a quantitative model with multivariate calibration using iC Quant module from iC IR 7.0 software. 15 samples of known concentrations were used for the modelling which were taken in duplicate for model calibration and cross-validation, data were pre-processed using mean centering and variance scale, spectrum math square root and solvent subtraction. These pre-processing methods improved the performance indexes from 7.98 to 0.0096, 11.2 to 3.41, 6.32 to 2.72, 0.9416 to 0.9999, RMSEC, RMSECV, RMSEP and R2Cum, respectively. The R2 value of 1 (training), 0.9918 (test), 0.9946 (cross-validation) indicated the fitness of the model built. The model was tested against univariate model; small discrepancies were observed at low concentration due to unmodelled intermediates but were quite close at concentrations above 18%. The software eliminated the complexity of the Partial Least Square (PLS) chemometrics. It was concluded that the model obtained could be used to monitor methanol of sunflower oil at industrial and lab scale.

Keywords: biodiesel, calibration, chemometrics, methanolysis, multivariate analysis, transesterification, FTIR

Procedia PDF Downloads 126
1097 Carbon Footprint Assessment Initiative and Trees: Role in Reducing Emissions

Authors: Omar Alelweet

Abstract:

Carbon emissions are quantified in terms of carbon dioxide equivalents, generated through a specific activity or accumulated throughout the life stages of a product or service. Given the growing concern about climate change and the role of carbon dioxide emissions in global warming, this initiative aims to create awareness and understanding of the impact of human activities and identify potential areas for improvement regarding the management of the carbon footprint on campus. Given that trees play a vital role in reducing carbon emissions by absorbing CO₂ during the photosynthesis process, this paper evaluated the contribution of each tree to reducing those emissions. Collecting data over an extended period of time is essential to monitoring carbon dioxide levels. This will help capture changes at different times and identify any patterns or trends in the data. By linking the data to specific activities, events, or environmental factors, it is possible to identify sources of emissions and areas where carbon dioxide levels are rising. Analyzing the collected data can provide valuable insights into ways to reduce emissions and mitigate the impact of climate change.

Keywords: sustainability, green building, environmental impact, CO₂

Procedia PDF Downloads 28
1096 Determination of Physical Properties of Crude Oil Distillates by Near-Infrared Spectroscopy and Multivariate Calibration

Authors: Ayten Ekin Meşe, Selahattin Şentürk, Melike Duvanoğlu

Abstract:

Petroleum refineries are a highly complex process industry with continuous production and high operating costs. Physical separation of crude oil starts with the crude oil distillation unit, continues with various conversion and purification units, and passes through many stages until obtaining the final product. To meet the desired product specification, process parameters are strictly followed. To be able to ensure the quality of distillates, routine analyses are performed in quality control laboratories based on appropriate international standards such as American Society for Testing and Materials (ASTM) standard methods and European Standard (EN) methods. The cut point of distillates in the crude distillation unit is very crucial for the efficiency of the upcoming processes. In order to maximize the process efficiency, the determination of the quality of distillates should be as fast as possible, reliable, and cost-effective. In this sense, an alternative study was carried out on the crude oil distillation unit that serves the entire refinery process. In this work, studies were conducted with three different crude oil distillates which are Light Straight Run Naphtha (LSRN), Heavy Straight Run Naphtha (HSRN), and Kerosene. These products are named after separation by the number of carbons it contains. LSRN consists of five to six carbon-containing hydrocarbons, HSRN consist of six to ten, and kerosene consists of sixteen to twenty-two carbon-containing hydrocarbons. Physical properties of three different crude distillation unit products (LSRN, HSRN, and Kerosene) were determined using Near-Infrared Spectroscopy with multivariate calibration. The absorbance spectra of the petroleum samples were obtained in the range from 10000 cm⁻¹ to 4000 cm⁻¹, employing a quartz transmittance flow through cell with a 2 mm light path and a resolution of 2 cm⁻¹. A total of 400 samples were collected for each petroleum sample for almost four years. Several different crude oil grades were processed during sample collection times. Extended Multiplicative Signal Correction (EMSC) and Savitzky-Golay (SG) preprocessing techniques were applied to FT-NIR spectra of samples to eliminate baseline shifts and suppress unwanted variation. Two different multivariate calibration approaches (Partial Least Squares Regression, PLS and Genetic Inverse Least Squares, GILS) and an ensemble model were applied to preprocessed FT-NIR spectra. Predictive performance of each multivariate calibration technique and preprocessing techniques were compared, and the best models were chosen according to the reproducibility of ASTM reference methods. This work demonstrates the developed models can be used for routine analysis instead of conventional analytical methods with over 90% accuracy.

Keywords: crude distillation unit, multivariate calibration, near infrared spectroscopy, data preprocessing, refinery

Procedia PDF Downloads 88
1095 Predictive Analysis for Big Data: Extension of Classification and Regression Trees Algorithm

Authors: Ameur Abdelkader, Abed Bouarfa Hafida

Abstract:

Since its inception, predictive analysis has revolutionized the IT industry through its robustness and decision-making facilities. It involves the application of a set of data processing techniques and algorithms in order to create predictive models. Its principle is based on finding relationships between explanatory variables and the predicted variables. Past occurrences are exploited to predict and to derive the unknown outcome. With the advent of big data, many studies have suggested the use of predictive analytics in order to process and analyze big data. Nevertheless, they have been curbed by the limits of classical methods of predictive analysis in case of a large amount of data. In fact, because of their volumes, their nature (semi or unstructured) and their variety, it is impossible to analyze efficiently big data via classical methods of predictive analysis. The authors attribute this weakness to the fact that predictive analysis algorithms do not allow the parallelization and distribution of calculation. In this paper, we propose to extend the predictive analysis algorithm, Classification And Regression Trees (CART), in order to adapt it for big data analysis. The major changes of this algorithm are presented and then a version of the extended algorithm is defined in order to make it applicable for a huge quantity of data.

Keywords: predictive analysis, big data, predictive analysis algorithms, CART algorithm

Procedia PDF Downloads 116
1094 The Use of Multivariate Statistical and GIS for Characterization Groundwater Quality in Laghouat Region, Algeria

Authors: Rouighi Mustapha, Bouzid Laghaa Souad, Rouighi Tahar

Abstract:

Due to rain Shortage and the increase of population in the last years, wells excavation and groundwater use for different purposes had been increased without any planning. This is a great challenge for our country. Moreover, this scarcity of water resources in this region is unfortunately combined with rapid fresh water resources quality deterioration, due to salinity and contamination processes. Therefore, it is necessary to conduct the studies about groundwater quality in Algeria. In this work consists in the identification of the factors which influence the water quality parameters in Laghouat region by using statistical analysis Principal Component Analysis (PCA), Hierarchical Cluster Analysis (HCA) and geographic information system (GIS) in an attempt to discriminate the sources of the variation of water quality variations. The results of PCA technique indicate that variables responsible for water quality composition are mainly related to soluble salts variables; natural processes and the nature of the rock which modifies significantly the water chemistry. Inferred from the positive correlation between K+ and NO3-, NO3- is believed to be human induced rather than naturally originated. In this study, the multivariate statistical analysis and GIS allows the hydrogeologist to have supplementary tools in the characterization and evaluating of aquifers.

Keywords: cluster, analysis, GIS, groundwater, laghouat, quality

Procedia PDF Downloads 292
1093 Modelling Operational Risk Using Extreme Value Theory and Skew t-Copulas via Bayesian Inference

Authors: Betty Johanna Garzon Rozo, Jonathan Crook, Fernando Moreira

Abstract:

Operational risk losses are heavy tailed and are likely to be asymmetric and extremely dependent among business lines/event types. We propose a new methodology to assess, in a multivariate way, the asymmetry and extreme dependence between severity distributions, and to calculate the capital for Operational Risk. This methodology simultaneously uses (i) several parametric distributions and an alternative mix distribution (the Lognormal for the body of losses and the Generalized Pareto Distribution for the tail) via extreme value theory using SAS®, (ii) the multivariate skew t-copula applied for the first time for operational losses and (iii) Bayesian theory to estimate new n-dimensional skew t-copula models via Markov chain Monte Carlo (MCMC) simulation. This paper analyses a newly operational loss data set, SAS Global Operational Risk Data [SAS OpRisk], to model operational risk at international financial institutions. All the severity models are constructed in SAS® 9.2. We implement the procedure PROC SEVERITY and PROC NLMIXED. This paper focuses in describing this implementation.

Keywords: operational risk, loss distribution approach, extreme value theory, copulas

Procedia PDF Downloads 561
1092 Process Optimization of Mechanochemical Synthesis for the Production of 4,4 Bipyridine Based MOFS using Twin Screw Extrusion and Multivariate Analysis

Authors: Ahmed Metawea, Rodrigo Soto, Majeida Kharejesh, Gavin Walker, Ahmad B. Albadarin

Abstract:

In this study, towards a green approach, we have investigated the effect of operating conditions of solvent assessed twin-screw extruder (TSE) for the production of 4, 4-bipyridine (1-dimensional coordinated polymer (1D)) based coordinated polymer using cobalt nitrate as a metal precursor with molar ratio 1:1. Different operating parameters such as solvent percentage, screw speed and feeding rate are considered. The resultant product is characterized using offline characterization methods, namely Powder X-ray diffraction (PXRD), Raman spectroscopy and scanning electron microscope (SEM) in order to investigate the product purity and surface morphology. A lower feeding rate increased the product’s quality as more resident time was provided for the reaction to take place. The most important influencing factor was the amount of liquid added. The addition of water helped in facilitating the reaction inside the TSE by increasing the surface area of the reaction for particles

Keywords: MOFS, multivariate analysis, process optimization, chemometric

Procedia PDF Downloads 127
1091 Enhanced Extra Trees Classifier for Epileptic Seizure Prediction

Authors: Maurice Ntahobari, Levin Kuhlmann, Mario Boley, Zhinoos Razavi Hesabi

Abstract:

For machine learning based epileptic seizure prediction, it is important for the model to be implemented in small implantable or wearable devices that can be used to monitor epilepsy patients; however, current state-of-the-art methods are complex and computationally intensive. We use Shapley Additive Explanation (SHAP) to find relevant intracranial electroencephalogram (iEEG) features and improve the computational efficiency of a state-of-the-art seizure prediction method based on the extra trees classifier while maintaining prediction performance. Results for a small contest dataset and a much larger dataset with continuous recordings of up to 3 years per patient from 15 patients yield better than chance prediction performance (p < 0.004). Moreover, while the performance of the SHAP-based model is comparable to that of the benchmark, the overall training and prediction time of the model has been reduced by a factor of 1.83. It can also be noted that the feature called zero crossing value is the best EEG feature for seizure prediction. These results suggest state-of-the-art seizure prediction performance can be achieved using efficient methods based on optimal feature selection.

Keywords: machine learning, seizure prediction, extra tree classifier, SHAP, epilepsy

Procedia PDF Downloads 79
1090 Evaluation of Dry Matter Yield of Panicum maximum Intercropped with Pigeonpea and Sesbania Sesban

Authors: Misheck Musokwa, Paramu Mafongoya, Simon Lorentz

Abstract:

Seasonal shortages of fodder during the dry season is a major constraint to smallholder livestock farmers in South Africa. To mitigate the shortage of fodder, legume trees can be intercropped with pastures which can diversify the sources of feed and increase the amount of protein for grazing animals. The objective was to evaluate dry matter yield of Panicum maximum and land productivity under different fodder production systems during 2016/17-2017/18 seasons at Empangeni (28.6391° S and 31.9400° E). A randomized complete block design, replicated three times was used, the treatments were sole Panicum maximum, Panicum maximum + Sesbania sesban, Panicum maximum + pigeonpea, sole Sesbania sesban, Sole pigeonpea. Three months S.sesbania seedlings were transplanted whilst pigeonpea was direct seeded at spacing of 1m x 1m. P. maximum seeds were drilled at a respective rate of 7.5 kg/ha having an inter-row spacing of 0.25 m apart. In between rows of trees P. maximum seeds were drilled. The dry matter yield harvesting times were separated by six months’ timeframe. A 0.25 m² quadrant randomly placed on 3 points on the plot was used as sampling area during harvesting P. maximum. There was significant difference P < 0.05 across 3 harvests and total dry matter. P. maximum had higher dry matter yield as compared to both intercrops at first harvest and total. The second and third harvest had no significant difference with pigeonpea intercrop. The results was in this order for all 3 harvest: P. maximum (541.2c, 1209.3b and 1557b) kg ha¹ ≥ P. maximum + pigeonpea (157.2b, 926.7b and 1129b) kg ha¹ > P. maximum + S. sesban (36.3a, 282a and 555a) kg ha¹. Total accumulation of dry matter yield of P. maximum (3307c kg ha¹) > P. maximum + pigeonpea (2212 kg ha¹) ≥ P. maximum + S. sesban (874 kg ha¹). There was a significant difference (P< 0.05) on seed yield for trees. Pigeonpea (1240.3 kg ha¹) ≥ Pigeonpea + P. maximum (862.7 kg ha¹) > S.sesbania (391.9 kg ha¹) ≥ S.sesbania + P. maximum. The Land Equivalent Ratio (LER) was in the following order P. maximum + pigeonpea (1.37) > P. maximum + S. sesban (0.84) > Pigeonpea (0.59) ≥ S. Sesbania (0.57) > P. maximum (0.26). Results indicates that it is beneficial to have P. maximum intercropped with pigeonpea because of higher land productivity. Planting grass with pigeonpea was more beneficial than S. sesban with grass or sole cropping in terms of saving the shortage of arable land. P. maximum + pigeonpea saves a substantial (37%) land which can be subsequently be used for other crop production. Pigeonpea is recommended as an intercrop with P. maximum due to its higher LER and combined production of livestock feed, human food, and firewood. Panicum grass is low in crude protein though high in carbohydrates, there is a need for intercropping it with legume trees. A farmer who buys concentrates can reduce costs by combining P. maximum with pigeonpea this will provide a balanced diet at low cost.

Keywords: fodder, livestock, productivity, smallholder farmers

Procedia PDF Downloads 123
1089 Using Discriminant Analysis to Forecast Crime Rate in Nigeria

Authors: O. P. Popoola, O. A. Alawode, M. O. Olayiwola, A. M. Oladele

Abstract:

This research work is based on using discriminant analysis to forecast crime rate in Nigeria between 1996 and 2008. The work is interested in how gender (male and female) relates to offences committed against the government, against other properties, disturbance in public places, murder/robbery offences and other offences. The data used was collected from the National Bureau of Statistics (NBS). SPSS, the statistical package was used to analyse the data. Time plot was plotted on all the 29 offences gotten from the raw data. Eigenvalues and Multivariate tests, Wilks’ Lambda, standardized canonical discriminant function coefficients and the predicted classifications were estimated. The research shows that the distribution of the scores from each function is standardized to have a mean O and a standard deviation of 1. The magnitudes of the coefficients indicate how strongly the discriminating variable affects the score. In the predicted group membership, 172 cases that were predicted to commit crime against Government group, 66 were correctly predicted and 106 were incorrectly predicted. After going through the predicted classifications, we found out that most groups numbers that were correctly predicted were less than those that were incorrectly predicted.

Keywords: discriminant analysis, DA, multivariate analysis of variance, MANOVA, canonical correlation, and Wilks’ Lambda

Procedia PDF Downloads 440
1088 Feature Extraction and Impact Analysis for Solid Mechanics Using Supervised Finite Element Analysis

Authors: Edward Schwalb, Matthias Dehmer, Michael Schlenkrich, Farzaneh Taslimi, Ketron Mitchell-Wynne, Horen Kuecuekyan

Abstract:

We present a generalized feature extraction approach for supporting Machine Learning (ML) algorithms which perform tasks similar to Finite-Element Analysis (FEA). We report results for estimating the Head Injury Categorization (HIC) of vehicle engine compartments across various impact scenarios. Our experiments demonstrate that models learned using features derived with a simple discretization approach provide a reasonable approximation of a full simulation. We observe that Decision Trees could be as effective as Neural Networks for the HIC task. The simplicity and performance of the learned Decision Trees could offer a trade-off of a multiple order of magnitude increase in speed and cost improvement over full simulation for a reasonable approximation. When used as a complement to full simulation, the approach enables rapid approximate feedback to engineering teams before submission for full analysis. The approach produces mesh independent features and is further agnostic of the assembly structure.

Keywords: mechanical design validation, FEA, supervised decision tree, convolutional neural network.

Procedia PDF Downloads 106
1087 Reconstruction of Age-Related Generations of Siberian Larch to Quantify the Climatogenic Dynamics of Woody Vegetation Close the Upper Limit of Its Growth

Authors: A. P. Mikhailovich, V. V. Fomin, E. M. Agapitov, V. E. Rogachev, E. A. Kostousova, E. S. Perekhodova

Abstract:

Woody vegetation among the upper limit of its habitat is a sensitive indicator of biota reaction to regional climate changes. Quantitative assessment of temporal and spatial changes in the distribution of trees and plant biocenoses calls for the development of new modeling approaches based upon selected data from measurements on the ground level and ultra-resolution aerial photography. Statistical models were developed for the study area located in the Polar Urals. These models allow obtaining probabilistic estimates for placing Siberian Larch trees into one of the three age intervals, namely 1-10, 11-40 and over 40 years, based on the Weilbull distribution of the maximum horizontal crown projection. Authors developed the distribution map for larch trees with crown diameters exceeding twenty centimeters by deciphering aerial photographs made by a UAV from an altitude equal to fifty meters. The total number of larches was equal to 88608, forming the following distribution row across the abovementioned intervals: 16980, 51740, and 19889 trees. The results demonstrate that two processes can be observed in the course of recent decades: first is the intensive forestation of previously barren or lightly wooded fragments of the study area located within the patches of wood, woodlands, and sparse stand, and second, expansion into mountain tundra. The current expansion of the Siberian Larch in the region replaced the depopulation process that occurred in the course of the Little Ice Age from the late 13ᵗʰ to the end of the 20ᵗʰ century. Using data from field measurements of Siberian larch specimen biometric parameters (including height, diameter at root collar and at 1.3 meters, and maximum projection of the crown in two orthogonal directions) and data on tree ages obtained at nine circular test sites, authors developed a model for artificial neural network including two layers with three and two neurons, respectively. The model allows quantitative assessment of a specimen's age based on height and maximum crone projection values. Tree height and crown diameters can be quantitatively assessed using data from aerial photographs and lidar scans. The resulting model can be used to assess the age of all Siberian larch trees. The proposed approach, after validation, can be applied to assessing the age of other tree species growing near the upper tree boundaries in other mountainous regions. This research was collaboratively funded by the Russian Ministry for Science and Education (project No. FEUG-2023-0002) and Russian Science Foundation (project No. 24-24-00235) in the field of data modeling on the basis of artificial neural network.

Keywords: treeline, dynamic, climate, modeling

Procedia PDF Downloads 29
1086 Dynamic of an Invasive Insect Gut Microbiome When Facing to Abiotic Stress

Authors: Judith Mogouong, Philippe Constant, Robert Lavallee, Claude Guertin

Abstract:

The emerald ash borer (EAB) is an exotic wood borer insect native from China, which is associated with important environmental and economic damages in North America. Beetles are known to be vectors of microbial communities related to their adaptive capacities. It is now established that environmental stress factors may induce physiological events on the host trees, such as phytochemical changes. Consequently, that may affect the establishment comportment of herbivorous insect. Considering the number of insects collected on ash trees (insects’ density) as an abiotic factor related to stress damage, the aim of our study was to explore the dynamic of EAB gut microbial community genome (microbiome) when facing that factor and to monitor its diversity. Insects were trapped using specific green Lindgren© traps. A gradient of the captured insect population along the St. Lawrence River was used to create three levels of insects’ density (low, intermediate, and high). After dissection, total DNA extracted from insect guts of each level has been sent for amplicon sequencing of bacterial 16S rRNA gene and fungal ITS2 region. The composition of microbial communities among sample appeared largely diversified with the Simpson index significantly different across the three levels of density for bacteria. Add to that; bacteria were represented by seven phyla and twelve classes, whereas fungi were represented by two phyla and seven known classes. Using principal coordinate analysis (PCoA) based on Bray Curtis distances of 16S rRNA sequences, we observed a significant variation between the structure of the bacterial communities depending on insects’ density. Moreover, the analysis showed significant correlations between some bacterial taxa and the three classes of insects’ density. This study is the first to present a complete overview of the bacterial and fungal communities associated with the gut of EAB base on culture-independent methods, and to correlate those communities with a potential stress factor of the host trees.

Keywords: gut microbiome, DNA, 16S rRNA sequences, emerald ash borer

Procedia PDF Downloads 370
1085 Walmart Sales Forecasting using Machine Learning in Python

Authors: Niyati Sharma, Om Anand, Sanjeev Kumar Prasad

Abstract:

Assuming future sale value for any of the organizations is one of the major essential characteristics of tactical development. Walmart Sales Forecasting is the finest illustration to work with as a beginner; subsequently, it has the major retail data set. Walmart uses this sales estimate problem for hiring purposes also. We would like to analyzing how the internal and external effects of one of the largest companies in the US can walk out their Weekly Sales in the future. Demand forecasting is the planned prerequisite of products or services in the imminent on the basis of present and previous data and different stages of the market. Since all associations is facing the anonymous future and we do not distinguish in the future good demand. Hence, through exploring former statistics and recent market statistics, we envisage the forthcoming claim and building of individual goods, which are extra challenging in the near future. As a result of this, we are producing the required products in pursuance of the petition of the souk in advance. We will be using several machine learning models to test the exactness and then lastly, train the whole data by Using linear regression and fitting the training data into it. Accuracy is 8.88%. The extra trees regression model gives the best accuracy of 97.15%.

Keywords: random forest algorithm, linear regression algorithm, extra trees classifier, mean absolute error

Procedia PDF Downloads 120
1084 Comparison of Multivariate Adaptive Regression Splines and Random Forest Regression in Predicting Forced Expiratory Volume in One Second

Authors: P. V. Pramila , V. Mahesh

Abstract:

Pulmonary Function Tests are important non-invasive diagnostic tests to assess respiratory impairments and provides quantifiable measures of lung function. Spirometry is the most frequently used measure of lung function and plays an essential role in the diagnosis and management of pulmonary diseases. However, the test requires considerable patient effort and cooperation, markedly related to the age of patients esulting in incomplete data sets. This paper presents, a nonlinear model built using Multivariate adaptive regression splines and Random forest regression model to predict the missing spirometric features. Random forest based feature selection is used to enhance both the generalization capability and the model interpretability. In the present study, flow-volume data are recorded for N= 198 subjects. The ranked order of feature importance index calculated by the random forests model shows that the spirometric features FVC, FEF 25, PEF,FEF 25-75, FEF50, and the demographic parameter height are the important descriptors. A comparison of performance assessment of both models prove that, the prediction ability of MARS with the `top two ranked features namely the FVC and FEF 25 is higher, yielding a model fit of R2= 0.96 and R2= 0.99 for normal and abnormal subjects. The Root Mean Square Error analysis of the RF model and the MARS model also shows that the latter is capable of predicting the missing values of FEV1 with a notably lower error value of 0.0191 (normal subjects) and 0.0106 (abnormal subjects). It is concluded that combining feature selection with a prediction model provides a minimum subset of predominant features to train the model, yielding better prediction performance. This analysis can assist clinicians with a intelligence support system in the medical diagnosis and improvement of clinical care.

Keywords: FEV, multivariate adaptive regression splines pulmonary function test, random forest

Procedia PDF Downloads 276
1083 Improving Predictions of Coastal Benthic Invertebrate Occurrence and Density Using a Multi-Scalar Approach

Authors: Stephanie Watson, Fabrice Stephenson, Conrad Pilditch, Carolyn Lundquist

Abstract:

Spatial data detailing both the distribution and density of functionally important marine species are needed to inform management decisions. Species distribution models (SDMs) have proven helpful in this regard; however, models often focus only on species occurrences derived from spatially expansive datasets and lack the resolution and detail required to inform regional management decisions. Boosted regression trees (BRT) were used to produce high-resolution SDMs (250 m) at two spatial scales predicting probability of occurrence, abundance (count per sample unit), density (count per km2) and uncertainty for seven coastal seafloor taxa that vary in habitat usage and distribution to examine prediction differences and implications for coastal management. We investigated if small scale regionally focussed models (82,000 km2) can provide improved predictions compared to data-rich national scale models (4.2 million km2). We explored the variability in predictions across model type (occurrence vs abundance) and model scale to determine if specific taxa models or model types are more robust to geographical variability. National scale occurrence models correlated well with broad-scale environmental predictors, resulting in higher AUC (Area under the receiver operating curve) and deviance explained scores; however, they tended to overpredict in the coastal environment and lacked spatially differentiated detail for some taxa. Regional models had lower overall performance, but for some taxa, spatial predictions were more differentiated at a localised ecological scale. National density models were often spatially refined and highlighted areas of ecological relevance producing more useful outputs than regional-scale models. The utility of a two-scale approach aids the selection of the most optimal combination of models to create a spatially informative density model, as results contrasted for specific taxa between model type and scale. However, it is vital that robust predictions of occurrence and abundance are generated as inputs for the combined density model as areas that do not spatially align between models can be discarded. This study demonstrates the variability in SDM outputs created over different geographical scales and highlights implications and opportunities for managers utilising these tools for regional conservation, particularly in data-limited environments.

Keywords: Benthic ecology, spatial modelling, multi-scalar modelling, marine conservation.

Procedia PDF Downloads 36