Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 16908

Search results for: model tree

16668 Modelling the Impact of Installation of Heat Cost Allocators in District Heating Systems Using Machine Learning

Authors: Danica Maljkovic, Igor Balen, Bojana Dalbelo Basic

Abstract:

Following the EU Directive on Energy Efficiency, specifically Article 9, individual metering in district heating systems has to be introduced by the end of 2016. These provisions have been implemented in the legal frameworks of the member states, Croatia being one of them. The directive allows installation of both heat metering devices and heat cost allocators. Mainly due to poor communication and PR, the general public has formed the false impression that heat cost allocators are devices that save energy. Although this notion is wrong, the aim of this work is to develop a model that precisely expresses the influence of installing heat cost allocators on potential energy savings in each unit within multifamily buildings. At the same time, in recent years, machine learning has gained wider application in various fields, as it has been shown to give good results where large amounts of data must be processed to recognize patterns and correlations among the relevant parameters, as well as where the problem is too complex for a human to solve. One machine learning method, the decision tree, has shown an accuracy of over 92% in predicting general building consumption. In this paper, machine learning algorithms will be used to isolate the sole impact of installing heat cost allocators on a single building among multifamily houses connected to district heating systems. Special emphasis will be given to regression analysis, logistic regression, support vector machines, decision trees and the random forest method.

Keywords: district heating, heat cost allocator, energy efficiency, machine learning, decision tree model, regression analysis, logistic regression, support vector machines, decision trees and random forest method

Procedia PDF Downloads 211

16667 Ecotourism Development in Ikogosi Warmspring, Nigeria: Implications on Its Floristic Composition and Structure

Authors: Oluwatobi Emmanuel Olaniyi, Babafemi George Ogunjemite

Abstract:

The high rate of infrastructural development at Ikogosi warm spring, aimed at harnessing its great ecotourism potential, calls for serious concern, as more forest areas are being opened up for public access and the landscape is modified. We therefore investigated the implications of ecotourism development for the floristic composition and forest structure in Ikogosi. The study aimed at identifying the past and present status of infrastructural development, and at assessing and comparing the floristic composition and structure of the built-up/recreational areas and the undisturbed forested areas, to infer the impact of ecotourism development on the study site. We conducted stakeholder interviews and field observation to identify the past and present status of infrastructural development, respectively. A total of ten quadrats were employed in the vegetation assessment to characterize the woody tree species composition, diameter at breast height and height, and to obtain mean indices characterizing each part of the site. These indices were compared using t-test analysis. A total of 49 woody tree species distributed in 21 families were identified in the built-up/recreational areas, while 67 woody tree species belonging to 25 families were recorded in the undeveloped forested areas. Although the latter had a higher mean diameter at breast height of woody trees, it was not significantly different from the former (t = -0.74, p = 0.46). Conversely, the built-up area had a higher mean tree height than the undeveloped areas, but the difference was not statistically significant (t = 1.04, p = 0.30). Nevertheless, the slight reduction in richness and diversity of the woody tree species in the built-up/recreational areas implies a need to mitigate the negative effects of infrastructural development on the warm spring's vegetation.

Keywords: ecosystem services, forest structure, vegetation assessment, warm-spring

Procedia PDF Downloads 472

16666 Hybrid Approach for Software Defect Prediction Using Machine Learning with Optimization Technique

Authors: C. Manjula, Lilly Florence

Abstract:

Software technology is developing rapidly, which leads to the growth of various industries. Nowadays, software-based applications are widely adopted for business purposes. For any software company, developing reliable software is becoming a challenging task because a faulty software module may harm the growth of industry and business. Hence, there is a need to develop techniques that can be used for early prediction of software defects. Due to the complexity of manual prediction, automated software defect prediction techniques have been introduced. These techniques learn patterns from previous software versions and find defects in the current version. They have attracted researchers because of their significant impact on industrial growth through identifying bugs in software. Several studies have been carried out on this basis, but achieving desirable defect prediction performance is still a challenging task. To address this issue, we present a machine learning based hybrid technique for software defect prediction. First, a Genetic Algorithm (GA) is presented in which an improved fitness function is used for better optimization of features in data sets. These features are then processed through a Decision Tree (DT) classification model. Finally, an experimental study is presented in which results from the proposed GA-DT hybrid approach are compared with those from the DT classification technique. The results show that the proposed hybrid approach achieves better classification accuracy.

Keywords: decision tree, genetic algorithm, machine learning, software defect prediction

Procedia PDF Downloads 302

16665 Thermochemical and Biological Pretreatment Study for Efficient Sugar Release from Lignocellulosic Biomass (Deodar and Sal Wood Residues)

Authors: Neelu Raina, Parvez Singh Slathia, Deepali Bhagat, Preeti Sharma

Abstract:

Pretreatment of lignocellulosic biomass to generate suitable substrates (starch/sugars) for conversion to bioethanol is the most crucial step. In the present study, waste from the furniture industry, i.e., sawdust from softwood Cedrus deodara (deodar) and hardwood Shorea robusta (sal), was used as lignocellulosic biomass. Thermochemical pretreatment was given by autoclaving at 121°C and 15 psi pressure. Acids (H2SO4, HCl, HNO3, H3PO4), alkalis (NaOH, NH4OH, KOH, Ca(OH)2) and organic acids (C6H8O7, C2H2O4, C4H4O4) were used at 0.1%, 0.5% and 1% concentration without giving any residence time. 1% HCl gave the maximum sugar yield of 3.6587 g/L in deodar and 6.1539 g/L in sal. For biological pretreatment, a fungus isolated from decaying wood was used; sawdust from deodar was used as the lignocellulosic substrate, and before thermochemical pretreatment the sawdust was treated with fungal culture at 37°C under submerged conditions with a residence time of one week. The combined biological and thermochemical pretreatment gave higher sugar yields in sal than in deodar: 8.3605 g/L in sal and 6.0334 g/L in deodar. Use of acids along with biological pretreatment is a favourable factor for breaking the lignin seal and thus increasing the sugar yield. Sugar estimation was done using the dinitrosalicylic acid assay method. Result validation is being done by statistical analysis.

Keywords: lignocellulosic biomass, bioethanol, pretreatment, sawdust

Procedia PDF Downloads 378

16664 BodeACD: Buffer Overflow Vulnerabilities Detecting Based on Abstract Syntax Tree, Control Flow Graph, and Data Dependency Graph

Authors: Xinghang Lv, Tao Peng, Jia Chen, Junping Liu, Xinrong Hu, Ruhan He, Minghua Jiang, Wenli Cao

Abstract:

Buffer overflow is one of the most dangerous classes of vulnerability, so detecting it effectively is essential. Traditional detection methods are not accurate enough and consume too many resources to cope with today's large and complex code bases. To resolve these problems, we propose a method for Buffer overflow detection based on Abstract syntax tree, Control flow graph, and Data dependency graph (BodeACD) in C/C++ programs with source code. First, BodeACD constructs function samples of buffer overflow from code available on GitHub, then represents them as code representation sequences, which fuse the control flow, data dependency, and syntax structure of the source code to reduce information loss during code representation. Finally, BodeACD learns vulnerability patterns for vulnerability detection through deep learning. The experimental results show that BodeACD increases precision and recall by 6.3% and 8.5%, respectively, compared with the latest methods, which effectively improves vulnerability detection and reduces the false-positive and false-negative rates.

Keywords: vulnerability detection, abstract syntax tree, control flow graph, data dependency graph, code representation, deep learning

Procedia PDF Downloads 140

16663 Comparative Analysis of Predictive Models for Customer Churn Prediction in the Telecommunication Industry

Authors: Deepika Christopher, Garima Anand

Abstract:

To determine the best model for churn prediction in the telecom industry, this paper compares 11 machine learning algorithms, namely Logistic Regression, Support Vector Machine, Random Forest, Decision Tree, XGBoost, LightGBM, CatBoost, AdaBoost, Extra Trees, Deep Neural Network, and Hybrid Model (MLPClassifier). It also aims to pinpoint the top three factors that lead to customer churn and conducts customer segmentation to identify vulnerable groups. According to the data, the Logistic Regression model performs the best, with an F1 score of 0.6215, 81.76% accuracy, 68.95% precision, and 56.57% recall. The top three attributes that cause churn are found to be tenure, Internet Service Fiber optic, and Internet Service DSL; the three best-performing models in this article are Logistic Regression, Deep Neural Network, and AdaBoost. The k-means algorithm is applied to establish and analyze four different customer clusters. This study has effectively identified customers that are at risk of churn and may be used to develop and execute strategies that lower customer attrition.
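The reported F1 score can be cross-checked against the reported precision and recall, since F1 is their harmonic mean. The following is a minimal sketch in Python that uses only the figures quoted in the abstract; the function name `f1_score` is illustrative and not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # convention when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for Logistic Regression in the abstract.
reported_precision = 0.6895
reported_recall = 0.5657

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 4))  # 0.6215, matching the reported F1 score
```

The agreement suggests that the three reported figures are internally consistent.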

Keywords: attrition, retention, predictive modeling, customer segmentation, telecommunications

Procedia PDF Downloads 23

16662 Effect of Temperature on Germination and Seedlings Development of Moringa Oleifera Lam

Authors: Khater N., Rahmine S., Bougoffa C., Bouguenna T., Ouanes H.

Abstract:

Moringa oleifera L. is considered one of the most useful trees in the world, possessing many interesting properties that make it of great scientific interest. It has been described as the miracle tree, the tree of a thousand virtues, the tree of life and God's gift to man. The present study aims to introduce, produce, and develop Moringa oleifera as a species with high ecological potential (resistance to biotic and abiotic stresses and productivity), high added value, and multiple virtues. The aim of this work is to study the germination potential of this species under different temperature conditions. In this study, germination was tested in two different temperature ranges: internal (laboratory ambient temperature between 22°C and 25°C) and external (seasonal temperature between 4°C and 8°C). Morphological and physiological analyses were carried out by measuring shoot length (SL), root length (RL), diameter at the crown (DC), fresh weight of shoots (FWS), fresh weight of roots (FWR), dry weight of shoots (DWS) and dry weight of roots (DWR). For all these variables, the results reveal a significant difference between the two temperature intervals: at laboratory temperature the germination rate was high (81.81%) and plant growth was rapid (7 cm in 24 h), whereas at the external temperatures a germination rate of around 27% was recorded, germination took place 20 days after sowing, and plant growth was slower. The results show that a temperature greater than or equal to 25°C is ideal for the germination and growth of moringa seeds and has a positive influence on the speed and percentage of germination.

Keywords: moringa oleifera, temperature, germination rate, growth, biomass

Procedia PDF Downloads 29

16661 Performance Analysis with the Combination of Visualization and Classification Technique for Medical Chatbot

Authors: Shajida M., Sakthiyadharshini N. P., Kamalesh S., Aswitha B.

Abstract:

Natural Language Processing (NLP) continues to play a strategic part in complaint discovery and medicine discovery during the current epidemic. This abstract provides an overview of performance analysis with a combination of visualization and classification techniques of NLP for a medical chatbot. Sentiment analysis is an important aspect of NLP that is used to determine the emotional tone behind a piece of text. This technique has been applied to various domains, including medical chatbots. In this, we have compared the combination of the decision tree with heatmap and Naïve Bayes with Word Cloud. The performance of the chatbot was evaluated using accuracy, and the results indicate that the combination of visualization and classification techniques significantly improves the chatbot's performance.

Keywords: sentiment analysis, NLP, medical chatbot, decision tree, heatmap, naïve Bayes, word cloud

Procedia PDF Downloads 47

16660 Enhanced Extra Trees Classifier for Epileptic Seizure Prediction

Authors: Maurice Ntahobari, Levin Kuhlmann, Mario Boley, Zhinoos Razavi Hesabi

Abstract:

For machine learning based epileptic seizure prediction, it is important for the model to be implemented in small implantable or wearable devices that can be used to monitor epilepsy patients; however, current state-of-the-art methods are complex and computationally intensive. We use Shapley Additive Explanation (SHAP) to find relevant intracranial electroencephalogram (iEEG) features and improve the computational efficiency of a state-of-the-art seizure prediction method based on the extra trees classifier while maintaining prediction performance. Results for a small contest dataset and a much larger dataset with continuous recordings of up to 3 years per patient from 15 patients yield better than chance prediction performance (p < 0.004). Moreover, while the performance of the SHAP-based model is comparable to that of the benchmark, the overall training and prediction time of the model has been reduced by a factor of 1.83. It can also be noted that the feature called zero crossing value is the best EEG feature for seizure prediction. These results suggest state-of-the-art seizure prediction performance can be achieved using efficient methods based on optimal feature selection.

Keywords: machine learning, seizure prediction, extra tree classifier, SHAP, epilepsy

Procedia PDF Downloads 79

16659 Spatial Relationship of Drug Smuggling Based on Geographic Information System Knowledge Discovery Using Decision Tree Algorithm

Authors: S. Niamkaeo, O. Robert, O. Chaowalit

Abstract:

In this investigation, we focus on discovering the spatial relationships of drug smuggling along the northern border of Thailand. Thailand is no longer a drug production site, but it is still one of the major drug trafficking hubs because its topography facilitates drug smuggling from neighboring countries. Our study areas cover three districts (Mae-jan, Mae-fahluang, and Mae-sai) in Chiangrai city and four districts (Chiangdao, Mae-eye, Chaiprakarn, and Wienghang) in Chiangmai city, where most smuggling of crystal methamphetamine and amphetamine occurs. Data on drug smuggling incidents from 2011 to 2017 were collected from several national and local published news sources, and a geo-spatial drug smuggling database was prepared. A decision tree algorithm was applied to discover the spatial relationships of factors related to drug smuggling, which were converted into rules using a rule-based system. Land use type, smuggling route, season and distance within 500 meters of checkpoints were found to be related to drug smuggling in terms of rule-based relationships. Drug smuggling occurred mostly in forest areas in winter, and mainly along topographic roads that checkpoints could not reach. These spatial relationships could support the Thai Office of Narcotics Control Board in the surveillance of drug smuggling.

Keywords: decision tree, drug smuggling, Geographic Information System, GIS knowledge discovery, rule-based system

Procedia PDF Downloads 143

16658 Empirical and Indian Automotive Equity Portfolio Decision Support

Authors: P. Sankar, P. James Daniel Paul, Siddhant Sahu

Abstract:

A brief review of empirical studies on the methodology of stock market decision support indicates that they are at the threshold of validating the accuracy of traditional, fuzzy, artificial neural network and decision tree models. Many researchers have been attempting to compare these models using various data sets worldwide. However, the research community has yet to reach conclusive confidence in the models that have emerged. This paper uses automotive sector stock prices from the National Stock Exchange (NSE), India, and analyzes them for intra-sectoral support for stock market decisions. The study identifies the significant variables and their lags that affect stock prices using OLS analysis and decision tree classifiers.

Keywords: Indian automotive sector, stock market decisions, equity portfolio analysis, decision tree classifiers, statistical data analysis

Procedia PDF Downloads 453

16657 Characteristics of Old-Growth and Secondary Forests in Relation to Age and Typhoon Disturbance

Authors: Teng-Chiu Lin, Pei-Jen Lee Shaner, Shin-Yu Lin

Abstract:

Both forest age and physical damages due to weather events such as tropical cyclones can influence forest characteristics and subsequently its capacity to sequester carbon. Detangling these influences is therefore a pressing issue under climate change. In this study, we compared the compositional and structural characteristics of three forests in Taiwan differing in age and severity of typhoon disturbances. We found that the two forests (one old-growth forest and one secondary forest) experiencing more severe typhoon disturbances had shorter stature, higher wood density, higher tree species diversity, and lower typhoon-induced tree mortality than the other secondary forest experiencing less severe typhoon disturbances. On the other hand, the old-growth forest had a larger amount of woody debris than the two secondary forests, suggesting a dominant role of forest age on woody debris accumulation. Of the three forests, only the two experiencing more severe typhoon disturbances formed new gaps following two 2015 typhoons, and between these two forests, the secondary forest gained more gaps than the old-growth forest. Considering that older forests generally have more gaps due to a higher background tree mortality, our findings suggest that the age effects on gap dynamics may be reversed by typhoon disturbances. This study demonstrated the effects of typhoons on forest characteristics, some of which could negate the age effects and rejuvenate older forests. If cyclone disturbances were to intensify under climate change, the capacity of older forests to sequester carbon may be reduced.

Keywords: typhoon, canopy gap, coarse woody debris, forest stature, forest age

Procedia PDF Downloads 238

16656 Comparative Evaluation of Accuracy of Selected Machine Learning Classification Techniques for Diagnosis of Cancer: A Data Mining Approach

Authors: Rajvir Kaur, Jeewani Anupama Ginige

Abstract:

With recent trends in Big Data and advancements in Information and Communication Technologies, the healthcare industry is at the stage of its transition from clinician oriented to technology oriented. Many people around the world die of cancer because the diagnosis of disease was not done at an early stage. Nowadays, the computational methods in the form of Machine Learning (ML) are used to develop automated decision support systems that can diagnose cancer with high confidence in a timely manner. This paper aims to carry out the comparative evaluation of a selected set of ML classifiers on two existing datasets: breast cancer and cervical cancer. The ML classifiers compared in this study are Decision Tree (DT), Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), Logistic Regression, Ensemble (Bagged Tree) and Artificial Neural Networks (ANN). The evaluation is carried out based on standard evaluation metrics Precision (P), Recall (R), F1-score and Accuracy. The experimental results based on the evaluation metrics show that ANN showed the highest-level accuracy (99.4%) when tested with breast cancer dataset. On the other hand, when these ML classifiers are tested with the cervical cancer dataset, Ensemble (Bagged Tree) technique gave better accuracy (93.1%) in comparison to other classifiers.

Keywords: artificial neural networks, breast cancer, classifiers, cervical cancer, f-score, machine learning, precision, recall

Procedia PDF Downloads 248

16655 A Green Method for Selective Spectrophotometric Determination of Hafnium(IV) with Aqueous Extract of Ficus carica Tree Leaves

Authors: A. Boveiri Monji, H. Yousefnia, M. Haji Hosseini, S. Zolghadri

Abstract:

A clean spectrophotometric method for the determination of hafnium using a green reagent, an acidic extract of Ficus carica tree leaves, is developed. In 6 M hydrochloric acid, hafnium reacts with this reagent to form a yellow product. The product shows maximum absorbance at 421 nm with a molar absorptivity of 0.28 × 10⁴ L mol⁻¹ cm⁻¹, and the method was linear in the 2-11 µg ml⁻¹ concentration range. The detection limit was found to be 0.312 µg ml⁻¹. Except for zirconium and iron, the selectivity was good, and most ions did not show any significant spectral interference at concentrations up to several hundred times that of hafnium. The proposed method was green, simple, low cost, and selective.

Keywords: spectrophotometric determination, Ficus carica tree leaves, synthetic reagents, hafnium

Procedia PDF Downloads 175

16654 Impact of Land-Use and Climate Change on the Population Structure and Distribution Range of the Rare and Endangered Dracaena ombet and Dobera glabra in Northern Ethiopia

Authors: Emiru Birhane, Tesfay Gidey, Haftu Abrha, Abrha Brhan, Amanuel Zenebe, Girmay Gebresamuel, Florent Noulèkoun

Abstract:

Dracaena ombet and Dobera glabra are two of the most rare and endangered tree species in dryland areas. Unfortunately, their sustainability is being compromised by different anthropogenic and natural factors. However, the impacts of ongoing land use and climate change on the population structure and distribution of the species are less explored. This study was carried out in the grazing lands and hillside areas of the Desa'a dry Afromontane forest, northern Ethiopia, to characterize the population structure of the species and predict the impact of climate change on their potential distributions. In each land-use type, abundance, diameter at breast height, and height of the trees were collected using 70 sampling plots distributed over seven transects spaced one km apart. The geographic coordinates of each individual tree were also recorded. The results showed that the species populations were characterized by low abundance and unstable population structure. The latter was evinced by a lack of seedlings and mature trees. The study also revealed that the total abundance and dendrometric traits of the trees were significantly different between the two land uses. The hillside areas had a denser abundance of bigger and taller trees than the grazing lands. Climate change predictions using the MaxEnt model highlighted that future temperature increases coupled with reduced precipitation would lead to significant reductions in the suitable habitats of the species in northern Ethiopia. The species' suitable habitats were predicted to decline by 48–83% for D. ombet and 35–87% for D. glabra. Hence, to sustain the species populations, different strategies should be adopted, namely the introduction of alternative livelihoods (e.g., gathering NTFP) to reduce the overexploitation of the species for subsistence income and the protection of the current habitats that will remain suitable in the future using community-based exclosures. 
Additionally, the preservation of the species' seeds in gene banks is crucial to ensure their long-term conservation.

Keywords: grazing lands, hillside areas, land-use change, MaxEnt, range limitation, rare and endangered tree species

Procedia PDF Downloads 46

16653 Predicting Football Player Performance: Integrating Data Visualization and Machine Learning

Authors: Saahith M. S., Sivakami R.

Abstract:

In the realm of football analytics, particularly focusing on predicting football player performance, the ability to forecast player success accurately is of paramount importance for teams, managers, and fans. This study introduces an elaborate examination of predicting football player performance through the integration of data visualization methods and machine learning algorithms. The research entails the compilation of an extensive dataset comprising player attributes, conducting data preprocessing, feature selection, model selection, and model training to construct predictive models. The analysis within this study will involve delving into feature significance using methodologies like Select Best and Recursive Feature Elimination (RFE) to pinpoint pertinent attributes for predicting player performance. Various machine learning algorithms, including Random Forest, Decision Tree, Linear Regression, Support Vector Regression (SVR), and Artificial Neural Networks (ANN), will be explored to develop predictive models. The evaluation of each model's performance utilizing metrics such as Mean Squared Error (MSE) and R-squared will be executed to gauge their efficacy in predicting player performance. Furthermore, this investigation will encompass a top player analysis to recognize the top-performing players based on the anticipated overall performance scores. Nationality analysis will entail scrutinizing the player distribution based on nationality and investigating potential correlations between nationality and player performance. Positional analysis will concentrate on examining the player distribution across various positions and assessing the average performance of players in each position. Age analysis will evaluate the influence of age on player performance and identify any discernible trends or patterns associated with player age groups. 
The primary objective is to predict a football player's overall performance accurately based on their individual attributes, leveraging data-driven insights to enrich the comprehension of player success on the field. By amalgamating data visualization and machine learning methodologies, the aim is to furnish valuable tools for teams, managers, and fans to effectively analyze and forecast player performance. This research contributes to the progression of sports analytics by showcasing the potential of machine learning in predicting football player performance and offering actionable insights for diverse stakeholders in the football industry.
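The two evaluation metrics named in the abstract, Mean Squared Error (MSE) and R-squared, can be sketched in plain Python as follows. The abstract does not give its formulas or data, so this sketch assumes the standard textbook definitions; the function names and the four sample scores are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of the squared differences between true and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when y_true is constant")
    return 1 - ss_res / ss_tot


# Hypothetical overall-performance scores (not from the paper).
actual_scores = [80.0, 75.0, 90.0, 85.0]
predicted_scores = [78.0, 77.0, 88.0, 86.0]

mse = mean_squared_error(actual_scores, predicted_scores)
r2 = r_squared(actual_scores, predicted_scores)
print(mse)           # 3.25
print(round(r2, 3))  # 0.896
```

A lower MSE and an R-squared closer to 1 both indicate a closer fit, which is how such metrics are typically used to rank competing regressors.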

Keywords: football analytics, player performance prediction, data visualization, machine learning algorithms, random forest, decision tree, linear regression, support vector regression, artificial neural networks, model evaluation, top player analysis, nationality analysis, positional analysis

Procedia PDF Downloads 10

16652 Urban Park Characteristics Defining Avian Community Structure

Authors: Deepti Kumari, Upamanyu Hore

Abstract:

Cities are human-modified environments with few fragments of urban green space, which are widely regarded as important for urban biodiversity. This study addresses avifaunal diversity in urban parks in relation to park size and urbanization intensity. It also seeks to understand the key factors affecting species composition and structure, since birds are good indicators of a healthy ecosystem and are sensitive to changes in the environment. A 50 m line-transect method was used to survey birds in 39 urban parks in Delhi, India. Habitat variables, including vegetation (percentage of non-native trees, percentage of native trees, top canopy cover, sub-canopy cover, diameter at breast height, ground vegetation cover, shrub height), were measured using the quadrat method along the transect, and disturbance variables (distance from water, distance from road, distance from settlement, park area, visitor rate, and urbanization intensity) were measured using ArcGIS and Google Earth. We analyzed species data for diversity and richness, and explored the relation of species diversity and richness to habitat variables using the multi-model inference approach. Diversity and richness differed significantly among park sizes and urbanization intensities. Medium-sized parks supported more diversity, whereas large parks had more richness. However, both diversity and richness declined with increasing urbanization intensity. The CCA results revealed that species composition in urban parks was positively associated with tree diameter at breast height and distance from the settlement. In the model selection approach, disturbance variables, especially distance from road, urbanization intensity, and visitors, were the best predictors of bird species richness in urban parks.
In comparison, multiple regression analysis between habitat variables and bird diversity suggested that native tree species in the park may explain the diversity pattern of birds in urban parks. Feeding guilds such as insectivores, omnivores, carnivores, granivores, and frugivores showed a significant relation with vegetation variables, while carnivores and scavenger bird species mainly responded with disturbance variables. The study highlights the importance of park size in urban areas and their urbanization intensity. It also indicates that distance from the settlement, distance from the road, urbanization intensity, visitors, diameter at breast height, and native tree species can be important determining factors for bird richness and diversity in urban parks. The study also concludes that the response of feeding guilds to vegetation and disturbance in urban parks varies. Therefore, we recommend that park size and surrounding urban matrix should be considered in order to increase bird diversity and richness in urban areas for designing and planning.

Keywords: diversity, feeding guild, urban park, urbanization intensity

Procedia PDF Downloads 70

16651 Logistic Regression Model versus Additive Model for Recurrent Event Data

Authors: Entisar A. Elgmati

Abstract:

Recurrent infant diarrhea is studied using daily data collected in Salvador, Brazil over one year and three months. A logistic regression model is fitted instead of Aalen's additive model, using the same covariates that were used in the analysis with the additive model. The logistic model gives reasonably similar results to those of the additive regression model. In addition, it solves the problem that the estimated conditional probabilities in the additive model are not constrained to lie between zero and one. Martingale residuals, which have been used to judge the goodness of fit of the additive model, are also shown to be useful for judging the goodness of fit of the logistic model.
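The point about constrained probabilities can be illustrated with a small Python sketch. It is not an implementation of Aalen's additive model or of the paper's fitted model: it only contrasts a generic additive (linear) predictor, whose value can leave the interval from zero to one, with the logistic transform, which always stays inside it. All coefficients and covariate values below are hypothetical.

```python
import math
from typing import Sequence


def additive_probability(baseline: float,
                         coefficients: Sequence[float],
                         covariates: Sequence[float]) -> float:
    """Linear predictor used directly as a probability (no constraint applied)."""
    return baseline + sum(b * x for b, x in zip(coefficients, covariates))


def logistic_probability(intercept: float,
                         coefficients: Sequence[float],
                         covariates: Sequence[float]) -> float:
    """Linear predictor passed through the logistic function, so 0 < p < 1."""
    eta = intercept + sum(b * x for b, x in zip(coefficients, covariates))
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical values chosen so that the additive form exceeds one.
covariates = [1.0, 1.0]
p_additive = additive_probability(0.05, [0.4, 0.7], covariates)
p_logistic = logistic_probability(-3.0, [2.0, 3.0], covariates)

print(p_additive > 1.0)        # True: not a valid probability
print(0.0 < p_logistic < 1.0)  # True: always a valid probability
```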

Keywords: additive model, cumulative probabilities, infant diarrhoea, recurrent event

Procedia PDF Downloads 604

16650 Crude Palm Oil Antioxidant Extraction and the Antioxidation Activity

Authors: Supriyono Supriyono, Sumardiyono Sumardiyono, Peni Pujiastuti, Dian Indriana Hapsari

Abstract:

Crude palm oil (CPO) is a vegetable oil that comes from the bunches of the palm tree. The productivity of the oil is 12 ton/hectare/year. Thus, the oil palm tree is known as the highest-yielding vegetable oil crop. It is grown across equatorial countries, especially Malaysia and Indonesia. The greenish-red color of CPO comes from carotenoids. Carotenoids are antioxidants that can be extracted and used as functional food and for other purposes. Another antioxidant also found in CPO is tocopherol. The aim of this research is to determine the antioxidant activity of CPO compared with the synthetic antioxidants available on the market. In this work, the antioxidants were extracted with a mixture of acetone and n-hexane, and the activity of the antioxidant extract was determined by the DPPH method. The antioxidant activity of the extracted compound was about 46% relative to pure tocopherol. A solvent mixture composed of 90% acetone and 10% n-hexane gave the best antioxidant activity.

Keywords: antioxidant, beta carotene, crude palm oil, DPPH, tocopherol

Procedia PDF Downloads 173

16649 Conservation Status of a Lowland Tropical Forest in South-West, Nigeria

Authors: Lucky Dartsa Wakawa, Friday Nwabueze Ogana, Temitope Elizabeth Adeniyi

Abstract:

Timely and reliable information on the status of a forest is essential for assessing the extent of regeneration and degradation. When such information is lacking, effective forest management becomes impossible. Therefore, this study assessed the tree species composition, richness, diversity and structure of Oluwa forest reserve with a view to ascertaining its conservation status. A systematic line transect was used to lay eight (8) temporary sample plots (TSPs) of size 50 m x 50 m. Trees with Dbh ≥ 10 cm in the selected plots were enumerated, identified and measured. The results indicate that 535 individual trees were enumerated, cutting across 26 families and 58 species. The family Sterculiaceae recorded the highest number of species (10) and occurrences (112), representing 17.2% and 20.93% respectively. Celtis zenkeri is the species with the highest number of trees per hectare (59) and the highest importance value index (IVI) (53.81). The reserve has a Margalef's index of species richness, Shannon-Wiener diversity index (H') and Pielou's species evenness index (EH) of 9.07, 3.43 and 0.84 respectively. The forest has a mean Dbh (cm), mean height (m), total basal area/ha (m2) and total volume/ha (m3) of 24.7, 16.9, 36.63 and 602.09 respectively. The important tropical tree species identified include Diospyros crassiflora, Milicia excelsa, Mansonia altissima and Triplochiton scleroxylon. Despite the level of exploitation in the forest, the forest seems to be resilient. Given the right attention, it could regenerate and replenish to save some of the original species composition of the reserve.
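The three indices reported above follow standard textbook definitions, sketched below. The abstract does not state which logarithm base was used; natural logarithms are assumed here, and under that assumption the reported Margalef and Pielou values are reproduced from the stated counts (535 trees, 58 species, H' = 3.43). The function names are illustrative.

```python
import math

def shannon_wiener(counts):
    """H' = -sum(p_i * ln p_i) over species with non-zero abundance."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def margalef(num_species, num_individuals):
    """Margalef richness index: (S - 1) / ln N."""
    return (num_species - 1) / math.log(num_individuals)

def pielou(h_prime, num_species):
    """Pielou evenness index: H' / ln S."""
    return h_prime / math.log(num_species)

# Values taken from the abstract: N = 535 trees, S = 58 species, H' = 3.43.
print(round(margalef(58, 535), 2))  # 9.07
print(round(pielou(3.43, 58), 2))   # 0.84
```

Per-species abundances are not given in the abstract, so H' itself cannot be recomputed here; `shannon_wiener` is included only to show how it would be obtained from a list of counts.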

Keywords: forest conservation, forest structure, Lowland tropical forest, South-west Nigeria

Procedia PDF Downloads 313

16648 The Effects of Stand Density, Standards and Species Composition on Biomass Production in Traditional Coppices

Authors: Marek Mejstřík, Radim Matula, Martin Šrámek

Abstract:

Traditional coppices and coppice-with-standards were widely used throughout Europe and Asia for centuries but were largely abandoned in the second half of the 19th century, especially in central and northwestern Europe. In the last decades, there has been a renewed interest in traditional coppicing for nature conservation and most often, for rapid woody biomass production. However, there is little information on biomass productivity of traditional coppices and what affects it. Here, we focused on the effects of stand density, standards and tree species composition on sprout biomass production in newly restored coppices in the Czech Republic. We measured sprouts and calculated sprout biomass 7 years after the harvest from 2013 resprouting stumps in two 4 ha experimental plots. Each plot was divided into 64 subplots with different densities of standards and sprouting stumps. Total sprout biomass declined with increasing density of standards, but the effect of standards differed significantly among studied species. Whereas increasing density of standards decreased sprout biomass in Quercus petraea and Carpinus betulus, it did not affect sprout biomass productivity in Acer campestre and Tilia cordata. Sprout biomass on stand-level increased linearly with an increasing number of sprouting stumps and we observed no leveling of this relationship even in the highest densities of stumps. We also found a significant shift in tree species composition with the steeply declining relative abundance of Quercus in favor of other studied tree species.

Keywords: traditional coppice, coppice with standards, sprout biomass, forest management

Procedia PDF Downloads 127

16647 Local Interpretable Model-agnostic Explanations (LIME) Approach to Email Spam Detection

Authors: Rohini Hariharan, Yazhini R., Blessy Maria Mathew

Abstract:

The task of detecting email spam is a very important one in the era of digital technology that needs effective ways of curbing unwanted messages. This paper presents an approach aimed at making email spam categorization algorithms transparent, reliable and more trustworthy by incorporating Local Interpretable Model-agnostic Explanations (LIME). Our technique assists in providing interpretable explanations for specific classifications of emails to help users understand the decision-making process by the model. In this study, we developed a complete pipeline that incorporates LIME into the spam classification framework and allows creating simplified, interpretable models tailored to individual emails. LIME identifies influential terms, pointing out key elements that drive classification results, thus reducing opacity inherent in conventional machine learning models. Additionally, we suggest a visualization scheme for displaying keywords that will improve understanding of categorization decisions by users. We test our method on a diverse email dataset and compare its performance with various baseline models, such as Gaussian Naive Bayes, Multinomial Naive Bayes, Bernoulli Naive Bayes, Support Vector Classifier, K-Nearest Neighbors, Decision Tree, and Logistic Regression. Our testing results show that our model surpasses all other models, achieving an accuracy of 96.59% and a precision of 99.12%.
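The idea of pointing out influential terms can be sketched with a much simpler relative of LIME: leave-one-word-out occlusion. This is not the LIME algorithm (which fits a weighted local surrogate model on many random perturbations) and not the authors' pipeline. The toy scorer, its word weights and all names below are hypothetical.

```python
import math

# Hypothetical word weights for a toy spam scorer; not taken from the paper.
WORD_WEIGHTS = {"free": 2.0, "win": 1.5, "prize": 1.0, "meeting": -1.0}
BIAS = -1.0

def spam_probability(tokens):
    """Toy black-box classifier: logistic function of summed word weights."""
    z = BIAS + sum(WORD_WEIGHTS.get(tok, 0.0) for tok in tokens)
    return 1.0 / (1.0 + math.exp(-z))

def occlusion_importance(tokens, predict=spam_probability):
    """Score each token by how much the prediction drops when it is removed."""
    base = predict(tokens)
    scores = []
    for i, tok in enumerate(tokens):
        reduced = tokens[:i] + tokens[i + 1:]
        scores.append((tok, base - predict(reduced)))
    # Most influential tokens first, regardless of sign.
    return sorted(scores, key=lambda pair: abs(pair[1]), reverse=True)

email = "win a free prize before the meeting".split()
ranking = occlusion_importance(email)
for token, delta in ranking:
    print(f"{token:8s} {delta:+.3f}")
```

A positive score means the word pushes the email toward the spam class and a negative score pushes it away. LIME proper would replace the single-word removals with many random perturbations and a fitted linear surrogate.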

Keywords: text classification, LIME (local interpretable model-agnostic explanations), stemming, tokenization, logistic regression.

Procedia PDF Downloads 16

16646 Machine Learning for Aiding Meningitis Diagnosis in Pediatric Patients

Authors: Karina Zaccari, Ernesto Cordeiro Marujo

Abstract:

This paper presents a Machine Learning (ML) approach to support Meningitis diagnosis in patients at a children’s hospital in Sao Paulo, Brazil. The aim is to use ML techniques to reduce the use of invasive procedures, such as cerebrospinal fluid (CSF) collection, as much as possible. In this study, we focus on predicting the probability of Meningitis given the results of blood and urine laboratory tests, together with the analysis of pain or other complaints from the patient. We tested a number of different ML algorithms, including Adaptive Boosting (AdaBoost), Decision Tree, Gradient Boosting, K-Nearest Neighbors (KNN), Logistic Regression, Random Forest and Support Vector Machines (SVM). The Decision Tree algorithm performed best, with 94.56% and 96.18% accuracy on training and testing data, respectively. These results represent a significant aid to doctors in diagnosing Meningitis as early as possible and in preventing expensive and painful procedures on some children.

Keywords: machine learning, medical diagnosis, meningitis detection, pediatric research

Procedia PDF Downloads 121

16645 A Combinatorial Representation for the Invariant Measure of Diffusion Processes on Metric Graphs

Authors: Michele Aleandri, Matteo Colangeli, Davide Gabrielli

Abstract:

We study a generalization to a continuous setting of the classical Markov chain tree theorem. In particular, we consider an irreducible diffusion process on a metric graph. The unique invariant measure has an atomic component on the vertices and an absolutely continuous part on the edges. We show that the corresponding density at x can be represented by a normalized superposition of the weights associated to metric arborescences oriented toward the point x. A metric arborescence is a metric tree oriented towards its root. The weight of each oriented metric arborescence is obtained as the product of the exponentials of integrals of the form ∫ b/σ², where b is the drift and σ² is the diffusion coefficient, along the oriented edges, times a weight for each node determined by the local orientation of the arborescence around the node, times the inverse of the diffusion coefficient at x. The metric arborescences are obtained by cutting the original metric graph along some edges.
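For orientation, the classical discrete Markov chain tree theorem that the abstract generalizes can be sketched as follows: the stationary probability of a state is proportional to the total weight of spanning arborescences oriented toward that state. The sketch covers only the finite-state case, not the metric-graph diffusion setting of the paper; the brute-force enumeration and the three-state example chain are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

def _reaches(start, root, succ):
    """Follow successor pointers from start; True if the root is reached."""
    seen = set()
    node = start
    while node != root:
        if node in seen:
            return False  # ran into a cycle that avoids the root
        seen.add(node)
        node = succ[node]
    return True

def arborescence_weights(P):
    """For each root, sum the weights of spanning trees oriented toward it."""
    n = len(P)
    weights = []
    for root in range(n):
        others = [i for i in range(n) if i != root]
        total = Fraction(0)
        # Each non-root state picks exactly one outgoing edge.
        for targets in product(range(n), repeat=len(others)):
            succ = dict(zip(others, targets))
            if any(i == j for i, j in succ.items()):
                continue  # self-loops cannot be part of a tree
            if not all(_reaches(i, root, succ) for i in others):
                continue
            weight = Fraction(1)
            for i, j in succ.items():
                weight *= P[i][j]
            total += weight
        weights.append(total)
    return weights

def stationary(P):
    """Normalize the arborescence weights into a probability vector."""
    w = arborescence_weights(P)
    return [x / sum(w) for x in w]

# Hypothetical three-state birth-death chain used only as an example.
half, quarter = Fraction(1, 2), Fraction(1, 4)
P = [[half, half, 0], [quarter, half, quarter], [0, half, half]]
pi = stationary(P)
print(pi)  # [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]
```

The enumeration is exponential in the number of states and is meant only to make the statement of the theorem concrete; exact `Fraction` arithmetic makes it easy to check that the result satisfies πP = π.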

Keywords: diffusion processes, metric graphs, invariant measure, reversibility

Procedia PDF Downloads 134

16644 Risk Analysis of Leaks from a Subsea Oil Facility Based on Fuzzy Logic Techniques

Authors: Belén Vinaixa Kinnear, Arturo Hidalgo López, Bernardo Elembo Wilasi, Pablo Fernández Pérez, Cecilia Hernández Fuentealba

Abstract:

The expanded use of risk assessment in legislative and corporate decision-making has increased the role of expert judgement in providing data for security-related decision-making. Expert judgements are required in most steps of risk assessment: hazard identification, risk estimation, risk evaluation, and analysis of options. This paper presents a fault tree analysis (FTA), a probabilistic failure analysis, applied to the leakage of oil in a subsea production system. In standard FTA, the failure probabilities of the components of a system are treated as exact values when evaluating the failure probability of the top event. In the drilling industry, however, there is a persistent insufficiency of data for estimating the failure probabilities of components. Fuzzy theory can therefore be used to address the issue. The aim of this paper is to examine the leaks from the Zafiro West subsea oil facility by using fuzzy fault tree analysis (FFTA). As a result, the research has made theoretical and practical contributions to maritime safety and environmental protection. Fault tree analysis has also traditionally been an effective strategy for identifying hazards in nuclear installations and power industries.
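The contrast between exact and fuzzy failure probabilities can be sketched as follows. The gate formulas are the standard ones for independent basic events. Representing each fuzzy probability as a triangular number (low, mode, high) and pushing the three points through the gates component-wise is a common approximation, assumed here rather than taken from the paper. The tree structure, event names and numbers are hypothetical.

```python
from math import prod

def and_gate(probs):
    """All inputs must fail: product of independent probabilities."""
    return prod(probs)

def or_gate(probs):
    """Any input failing suffices: 1 - product of survival probabilities."""
    return 1.0 - prod(1.0 - p for p in probs)

def fuzzy_gate(gate, fuzzy_probs):
    """Apply a gate to triangular fuzzy numbers (low, mode, high).

    Both gates are non-decreasing in every input, so evaluating the three
    points separately keeps low <= mode <= high in the result.
    """
    return tuple(gate([f[k] for f in fuzzy_probs]) for k in range(3))

# Hypothetical basic events expressed as triangular fuzzy probabilities.
valve_leak = (0.01, 0.02, 0.04)
seal_failure = (0.005, 0.01, 0.02)
sensor_failure = (0.05, 0.10, 0.20)

# Hypothetical tree: leak if the valve leaks, OR the seal fails AND the
# sensor also fails to detect it.
undetected_seal = fuzzy_gate(and_gate, [seal_failure, sensor_failure])
top_event = fuzzy_gate(or_gate, [valve_leak, undetected_seal])
print(top_event)
```

The top event comes out as a range with a most likely value instead of a single number, which is the practical benefit when component failure data are scarce and probabilities come from expert judgement.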

Keywords: expert judgment, probability assessment, fault tree analysis, risk analysis, oil pipelines, subsea production system, drilling, quantitative risk analysis, leakage failure, top event, off-shore industry

Procedia PDF Downloads 158

16643 A Comprehensive Method of Fault Detection and Isolation based on Testability Modeling Data

Authors: Junyou Shi, Weiwei Cui

Abstract:

Testability modeling is a commonly used method in the testability design and analysis of systems. A dependency matrix is obtained from testability modeling, from which a quantitative evaluation of fault detection and isolation can be given. Based on the dependency matrix, we can obtain the diagnosis tree, which provides the procedures for fault detection and isolation. In practice, however, the dependency matrix usually includes both built-in tests (BIT) and manual tests. BIT runs tests automatically and is not limited by the procedures. The method above therefore cannot give the most efficient diagnosis or exploit the advantages of BIT. A comprehensive method of fault detection and isolation is proposed. This method combines the advantages of BIT and manual tests by splitting the matrix. The result of the case study shows that the method is effective.
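How a dependency matrix supports fault detection and isolation can be sketched as follows. This is a generic single-fault illustration, not the authors' matrix-splitting method. The matrix, fault names and function names are hypothetical.

```python
# Rows are faults, columns are tests; 1 means the test detects the fault.
# Hypothetical dependency matrix for illustration only.
D = {
    "F1": [1, 0, 0],
    "F2": [1, 1, 0],
    "F3": [0, 0, 0],
    "F4": [1, 1, 0],
}

def detectable(matrix):
    """A fault is detectable if at least one test responds to it."""
    return [fault for fault, row in matrix.items() if any(row)]

def ambiguity_groups(matrix):
    """Group detectable faults that share an identical test signature."""
    groups = {}
    for fault, row in matrix.items():
        if any(row):
            groups.setdefault(tuple(row), []).append(fault)
    return list(groups.values())

def isolate(matrix, outcomes):
    """Return faults whose signature matches the observed test outcomes
    (single-fault assumption; 1 = test failed, 0 = test passed)."""
    observed = list(outcomes)
    return [f for f, row in matrix.items() if any(row) and row == observed]

print(detectable(D))          # ['F1', 'F2', 'F4']
print(ambiguity_groups(D))    # [['F1'], ['F2', 'F4']]
print(isolate(D, [1, 1, 0]))  # ['F2', 'F4']
```

In this made-up matrix F3 is undetectable and F2 and F4 cannot be told apart, which is the kind of quantitative evaluation (detection rate, isolation to ambiguity groups) a dependency matrix makes possible.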

Keywords: fault detection, fault isolation, testability modeling, BIT

Procedia PDF Downloads 298

16642 PDDA: Priority-Based, Dynamic Data Aggregation Approach for Sensor-Based Big Data Framework

Authors: Lutful Karim, Mohammed S. Al-kahtani

Abstract:

Sensors are being used in various applications such as agriculture, health monitoring, air and water pollution monitoring, and traffic monitoring and control, and hence play a vital role in the growth of big data. However, sensors collect redundant data. Thus, aggregating and filtering sensor data is important for designing an efficient big data framework. Current research does not focus on aggregating and filtering data at multiple layers of a sensor-based big data framework. Thus, this paper introduces (i) a three-layer data aggregation framework for big data and (ii) a priority-based, dynamic data aggregation scheme (PDDA) for the lowest layer, at the sensors. Simulation results show that PDDA outperforms existing tree-based and cluster-based data aggregation schemes in terms of overall network energy consumption and end-to-end data transmission delay.

Keywords: big data, clustering, tree topology, data aggregation, sensor networks

Procedia PDF Downloads 302

16641 Regression Model Evaluation on Depth Camera Data for Gaze Estimation

Authors: James Purnama, Riri Fitri Sari

Abstract:

We investigate the machine learning algorithm selection problem in the context of depth-image-based eye gaze estimation, with respect to its essential difficulty in reducing the number of required training samples and the duration of training. Statistics-based measures of prediction accuracy are increasingly used to assess and evaluate prediction or estimation in gaze estimation. This article uses Root Mean Squared Error (RMSE) and R-Squared statistics to assess machine learning methods on depth camera data for gaze estimation. Four machine learning methods were evaluated: Random Forest Regression, Regression Tree, Support Vector Machine (SVM), and Linear Regression. The experimental results show that Random Forest Regression has the lowest RMSE and the highest R-Squared, which means that it is the best of the methods evaluated.
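The two evaluation statistics named above have standard definitions, sketched below in plain Python. The sample values are made up for illustration and are not data from the study.

```python
import math

def rmse(y_true, y_pred):
    """Root Mean Squared Error: sqrt(mean((y - y_hat)^2))."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical gaze angles (degrees) and two competing sets of predictions.
actual = [10.0, 12.0, 15.0, 20.0]
model_a = [10.5, 11.5, 15.5, 19.0]
model_b = [12.0, 14.0, 13.0, 17.0]

for name, pred in (("A", model_a), ("B", model_b)):
    print(name, round(rmse(actual, pred), 3), round(r_squared(actual, pred), 3))
```

A lower RMSE and a higher R-Squared both indicate predictions closer to the observed values, which is the criterion the abstract uses to rank the four methods.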

Keywords: gaze estimation, gaze tracking, eye tracking, kinect, regression model, orange python

Procedia PDF Downloads 508

16640 Study on Relevance Between Electrical Tree Growth and Partial Discharges in Epoxy Resin Materials

Authors: Chien-Kuo Chang, You-Syuan Wu, Min-Chiu Wu, Chun-Wei Wang

Abstract:

Epoxy resin is widely used in the insulation of high-voltage equipment such as transformers and insulating bushings due to its good electrical insulation properties. However, manufacturing defects can cause unpredictable accidents. It is therefore important to determine the insulation state of equipment by measuring partial discharges. In this study, a needle-plane electrode structure was used to test the electrical treeing insulation deterioration phenomenon in epoxy resin. During the test, we measured the partial discharge signal and then used it as the input data of an insulation status assessment system developed in past research. The experimental samples were made of transparent epoxy resin to facilitate the observation of changes, and five sets were made at distances of 1 cm and 1.5 cm. During the experiment, a magnifying glass with a total magnification of 2 times was set up to enlarge the picture, and a time-lapse camera was used to record the changes in the experimental samples. We found that the electrical treeing phenomenon in the epoxy resin insulation deterioration process can be divided into several stages: initial dark tree, filamentary tree, reverse tree, and insulation breakdown, and we observed each stage of electrical treeing. After the partial discharge signal was fed into the insulation status assessment system, most experimental samples were assessed as being in the attention period in the middle of the test and in the risky period in the middle and late stages of the test. Comparing the attention period signal with the recorded film showed no obvious correlation so far, but comparing the risky period signal with the film showed that the experimental sample deformed due to the temperature rise caused by larger and more frequent discharges.
In addition, we also tried to collect data on different types of PD by mixing in high dielectric constant materials and changing the interior constitution of the sample. Recorded data such as PDIV, PDEV and RPDIV can improve the performance of various algorithm models.

Keywords: partial discharge, insulation deterioration, epoxy resin, electrical treeing

Procedia PDF Downloads 20

16639 Customer Churn Prediction by Using Four Machine Learning Algorithms Integrating Features Selection and Normalization in the Telecom Sector

Authors: Alanoud Moraya Aldalan, Abdulaziz Almaleh

Abstract:

A crucial component of maintaining a customer-oriented business, as in the telecom industry, is understanding the reasons and factors that lead to customer churn. Competition between telecom companies has greatly increased in recent years. It has become more important to understand customers’ needs in this strong market of telecom industries, especially for those who are looking to switch service providers. So, churn prediction is now a mandatory requirement for retaining those customers. Machine learning can be utilized to accomplish this. Churn prediction has become a very important topic in terms of machine learning classification in the telecommunications industry. Understanding the factors of customer churn and how they behave is very important to building an effective churn prediction model. This paper aims to predict churn and identify factors of customers’ churn based on their past service usage history. To this end, the study makes use of feature selection, normalization, and feature engineering. The study then compared the performance of four different machine learning algorithms on the Orange dataset: Logistic Regression, Random Forest, Decision Tree, and Gradient Boosting. Performance was evaluated using the F1 score and ROC-AUC. Compared with existing models, this study produced better results. The results showed that Gradient Boosting with the feature selection technique performed best in this study, achieving a 99% F1-score and 99% AUC, and all other experiments achieved good results as well.
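The two evaluation metrics mentioned above can be sketched in plain Python from their standard definitions. The labels and scores are made-up toy values, not results from the Orange dataset, and the pairwise AUC computation is chosen for clarity, not efficiency.

```python
def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random
    negative, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical churn labels (1 = churned), hard predictions and scores.
labels = [1, 1, 0, 0]
predicted = [1, 0, 0, 0]
churn_scores = [0.9, 0.4, 0.5, 0.1]

print(round(f1_score(labels, predicted), 4))  # 0.6667
print(roc_auc(labels, churn_scores))          # 0.75
```

F1 depends on the chosen decision threshold, while ROC-AUC summarizes ranking quality across all thresholds, which is why the two are often reported together for imbalanced problems such as churn.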

Keywords: machine learning, gradient boosting, logistic regression, churn, random forest, decision tree, ROC, AUC, F1-score

Procedia PDF Downloads 106