Search results for: tree algorithms
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2743

2263 The Role of Named Entity Recognition for Information Extraction

Authors: Girma Yohannis Bade, Olga Kolesnikova, Grigori Sidorov

Abstract:

Named entity recognition (NER) is a building block for information extraction (IE). Although the information extraction process has been automated using a variety of techniques to find and extract relevant information from unstructured documents, the discovery of targeted knowledge still poses a number of research difficulties because of the variability and lack of structure in Web data. NER, a subtask of IE, emerged to ease this difficulty. It deals with finding proper names (named entities), such as the names of persons, countries, locations, organizations, dates, and events in a document, and categorizing them under predetermined labels, which is an initial step in IE tasks. This survey paper presents the roles and importance of NER to IE from the perspective of different algorithms and application domains. It summarizes how researchers have implemented NER in particular application areas such as finance, medicine, defense, business, food science, and archeology. It also outlines three types of sequence labeling algorithms for NER: feature-based, neural network-based, and rule-based. Finally, the state of the art and the evaluation metrics of NER are presented.
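As an illustration of the kind of evaluation metric such a survey covers, the sketch below computes entity-level precision, recall and F1 under exact matching. The abstract does not say which metrics the paper uses, so this is an assumption; the function name `entity_prf` and the `(start, end, label)` span encoding are hypothetical.

```python
def entity_prf(gold, predicted):
    """Entity-level precision, recall and F1 under exact span-and-label matching.

    Each entity is a (start, end, label) tuple; this encoding is an
    illustrative assumption, not something taken from the paper.
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_pos = len(gold_set & pred_set)
    precision = true_pos / len(pred_set) if pred_set else 0.0
    recall = true_pos / len(gold_set) if gold_set else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example: the second predicted entity has the right span but the wrong label.
gold = [(0, 2, "PER"), (5, 6, "LOC"), (9, 11, "ORG")]
pred = [(0, 2, "PER"), (5, 6, "ORG"), (9, 11, "ORG")]
precision, recall, f1 = entity_prf(gold, pred)
```

Under exact matching, a prediction with the correct span but the wrong label counts as both a false positive and a false negative, which is why it lowers precision and recall together.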

Keywords: the role of NER, named entity recognition, information extraction, sequence labeling algorithms, named entity application area

Procedia PDF Downloads 69
2262 The Impact of Mining Activities on the Surface Water Quality: A Case Study of the Kaap River in Barberton, Mpumalanga

Authors: M. F. Mamabolo

Abstract:

Mining activities are identified as the most significant source of heavy metal contamination in river basins, because inadequate disposal of mining waste results in acid mine drainage. Waste materials generated from gold mining and processing have severe and widespread impacts on water resources. Therefore, a total of 30 water samples were collected from Fig Tree Creek, the Kaap River, the Sheba mine stream and the Suid Kaap River to investigate the impact of gold mines on the Kaap River system. Physicochemical parameters (pH, EC and TDS) were measured using a BANTE 900P portable water quality meter. The concentrations of Fe, Cu, Co, and SO₄²⁻ in the water samples were analysed using inductively coupled plasma mass spectrometry (ICP-MS) at 0.01 mg/L. The results were compared to the regulatory guidelines of the World Health Organization (WHO) and the South African National Standards (SANS). It was found that Fe, Cu and Co were below the guideline values, while SO₄²⁻ detected in the Sheba mine stream exceeded the 250 mg/L limit in both seasons, which is attributed to mine wastewater. SO₄²⁻ was higher in the wet season due to high evaporation rates and greater interaction between rocks and water. The pH of all the streams was within the limit (≥5 to ≤9.7); however, the EC of the Sheba mine stream, the Suid Kaap River and the point where the tributary connects with Fig Tree Creek exceeded 1700 µS/m, due to dissolved material. The TDS of the Sheba mine stream exceeded 1000 mg/L, which is attributed to the high SO₄²⁻ concentration, while the tributary connecting to Fig Tree Creek exceeded the value due to pollution from household waste, agricultural runoff, and similar sources. In conclusion, the water from all sampled streams was safe for consumption due to low concentrations of physicochemical parameters. However, the elevated concentration of SO₄²⁻ should be monitored and managed to avoid water quality deterioration in the Kaap River system.

Keywords: Kaap river system, mines, heavy metals, sulphate

Procedia PDF Downloads 69
2261 Study and Improvement of the Quality of a Production Line

Authors: S. Bouchami, M.N. Lakhoua

Abstract:

The automotive market is a dynamic market that continues to grow, which is why several companies in this sector adopt a quality improvement approach. To be competitive and successful in the environment in which they operate, these companies establish a quality management system to ensure that quality objectives are achieved, that products and processes improve, and that customers are satisfied. In this paper, the management of quality and the improvement of a production line in an industrial company are presented. The project is divided into two essential parts: the creation of the technical line documentation and the quality assurance documentation, and the resolution of defects found at the line as well as those claimed by the customer. Creating the documents required a deep understanding of the manufacturing process. The analysis and problem solving were done through the implementation of PDCA (Plan Do Check Act) and FTA (Fault Tree Analysis). As a perspective, in order to better optimize production and improve the efficiency of the production line, a study of the problems associated with the supply of raw materials should be carried out to resolve the stock-outs that cause costly delays for the industrial company.

Keywords: quality management, documentary system, Plan Do Check Act (PDCA), fault tree analysis (FTA) method

Procedia PDF Downloads 131
2260 Design and Experiment of Orchard Gas Explosion Subsoiling and Fertilizer Injection Machine

Authors: Xiaobo Xi, Ruihong Zhang

Abstract:

At present, orchard ditching and fertilizing technology has a series of problems, such as easily damaged tree roots, high energy consumption and uneven fertilizing. In this paper, a gas explosion subsoiling and fertilizer injection machine was designed, which uses high-pressure gas to shock the soil body and then injects fertilizer. The drill pipe mechanism with pneumatic chipping hammer excitation and hydraulic assistance was designed to drill the soil. The operation of the gas and liquid fertilizer supply was controlled by a PLC system. The 3D model of the whole machine was established using SolidWorks software. A machine prototype was produced, and field experiments were carried out. The results showed that soil fractures were created and diffused by gas explosion, and the subsoiling effect radius reached 40 cm under the condition of 0.8 MPa gas pressure and 30 cm drilling depth. Moreover, the work efficiency was at least 0.048 hm²/h. This machine could meet the agronomic requirements of orchard, garden and city greening fertilization; the tree roots were not easily damaged and the fertilizer was evenly distributed, which was conducive to nutrient absorption and root growth.

Keywords: gas explosion subsoiling, fertigation, pneumatic chipping hammer exciting, soil compaction

Procedia PDF Downloads 195
2259 Physicochemical-Mechanical, Thermal and Rheological Properties Analysis of Pili Tree (Canarium Ovatum) Resin as Aircraft Integral Fuel Tank Sealant

Authors: Mark Kennedy, E. Bantugon, Noruane A. Daileg

Abstract:

Leaks arising from aircraft fuel tanks are a protracted problem for aircraft manufacturers, operators, and maintenance crews. They principally arise from stress, structural defects, or degraded sealants as the aircraft ages. Leaked fuel can be ignited by different sources, which can result in catastrophic in-flight consequences and represents a major drain on both time and budget. To mitigate this kind of problem, the researcher produced an experimental sealant with a base material of natural tree resin, the Pili Tree Resin. Aside from producing an experimental sealant, the main objective of this research is to analyze its physical, chemical, mechanical, thermal, and rheological properties, which bear on its suitability for specific aircraft parts, particularly the integral fuel tank. The experimental method of research was used in this study since it is a product invention. The study comprises two parts: the Optimization Process and the Characterization Process. In the Optimization Process, the experimental sealant was subjected to the Flammability Test, an important test and consideration according to 14 Code of Federal Regulations Part 25, Appendix N (Fuel Tank Flammability Exposure and Reliability Analysis), to obtain the most suitable formulation.
This was followed by the Characterization Process, in which the formulated experimental sealant underwent thirty-eight (38) different standard tests, including Organoleptic, Instrumental Color Measurement Test, Smoothness of Appearance Test, Miscibility Test, Boiling Point Test, Flash Point Test, Curing Time, Adhesive Test, Toxicity Test, Shore A Hardness Test, Compressive Strength, Shear Strength, Static Bending Strength, Tensile Strength, Peel Strength Test, Knife Test, Adhesion by Tape Test, Leakage Test, Drip Test, Thermogravimetry-Differential Thermal Analysis (TG-DTA), Differential Scanning Calorimetry, Calorific Value, Viscosity Test, Creep Test, and Anti-Sag Resistance Test, to determine and analyze the five (5) material properties of the sealant. The numerical values of these tests were determined using product application, testing, and calculation. These values were then used to calculate the efficiency of the experimental sealant, which serves as the means of comparison between the experimental and commercial sealants. Based on the results of the standard tests conducted, the experimental sealant outperformed the commercial sealant in all test results. This shows that the physicochemical-mechanical, thermal, and rheological properties of the experimental sealant make it far more effective as an aircraft integral fuel tank sealant alternative than the commercial sealant. Therefore, the Pili Tree possesses a new role and function: a source of ingredients in sealant production.

Keywords: Aircraft Integral Fuel Tank, Physicochemical-mechanical, Pili Tree Resin, Properties, Rheological, Sealant, Thermal

Procedia PDF Downloads 270
2258 Modeling of Power Network by ATP-Draw for Lightning Stroke Studies

Authors: John Morales, Armando Guzman

Abstract:

Protection relay algorithms play a crucial role in Electric Power System (EPS) stability, and it is clear that lightning strokes produce the major percentage of faults and outages of Transmission Lines (TLs) and Distribution Feeders (DFs). In this context, it is imperative to develop novel protection relay algorithms. To achieve this aim, however, EPS networks have to be simulated as realistically as possible, especially the lightning phenomena and the EPS elements that affect their behavior, such as direct and indirect lightning, insulator strings, overhead lines, soil ionization and others. Yet researchers have proposed new protection relay algorithms considering only common faults, which are not produced by lightning strokes, omitting phenomena that are essential to the behavior of transmission line protection relays. On this basis, this paper presents the possibilities of using the Alternative Transients Program ATP-Draw for the modeling and simulation of lightning stroke studies, especially for protection relays, which are developed through Transient Analysis of Control Systems (TACS) and the MODELS language of ATP-Draw.

Keywords: back-flashover, faults, flashover, lightning stroke, modeling of lightning, outages, protection relays

Procedia PDF Downloads 303
2257 Descent Algorithms for Optimization Algorithms Using q-Derivative

Authors: Geetanjali Panda, Suvrakanti Chakraborty

Abstract:

In this paper, Newton-like descent methods are proposed for unconstrained optimization problems, which use q-derivatives of the gradient of an objective function. First, a local scheme is developed with an alternative sufficient optimality condition, and then the method is extended to a global scheme. Moreover, a variant of the practical Newton scheme is developed by introducing a real sequence. Global convergence of these schemes is proved under some mild conditions. Numerical experiments and graphical illustrations are provided. Finally, the performance profiles on a test set show that the proposed schemes are competitive with the existing first-order schemes for optimization problems.
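To make the idea of a q-derivative concrete, the sketch below uses the standard Jackson q-derivative, D_q f(x) = (f(qx) - f(x)) / ((q - 1)x) for x ≠ 0 and q ≠ 1, and plugs it into a one-dimensional Newton-like iteration in place of the second derivative. This is only an illustration of the concept, not the authors' local or global scheme; the function names, the test gradient and the value q = 0.9 are assumptions.

```python
def q_derivative(f, x, q):
    """Jackson q-derivative of f at x (requires x != 0 and q != 1)."""
    if x == 0 or q == 1:
        raise ValueError("q-derivative needs x != 0 and q != 1")
    return (f(q * x) - f(x)) / ((q - 1) * x)


def q_newton_1d(grad, x0, q, steps=50):
    """1-D Newton-like iteration that replaces the second derivative
    with the q-derivative of the gradient (illustrative sketch only)."""
    x = x0
    for _ in range(steps):
        curvature = q_derivative(grad, x, q)
        x = x - grad(x) / curvature
    return x


# Hypothetical objective f(x) = x**4 / 4 - x with gradient x**3 - 1 (minimum at x = 1).
def grad(x):
    return x ** 3 - 1.0


x_min = q_newton_1d(grad, x0=2.0, q=0.9)
```

As q approaches 1 the q-derivative tends to the ordinary derivative, so the iteration tends to the classical Newton step; for q away from 1 the step uses a secant-like curvature estimate between x and qx.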

Keywords: Descent algorithm, line search method, q calculus, Quasi Newton method

Procedia PDF Downloads 388
2256 Comparison Study of Machine Learning Classifiers for Speech Emotion Recognition

Authors: Aishwarya Ravindra Fursule, Shruti Kshirsagar

Abstract:

At the intersection of artificial intelligence and human-centered computing, this paper delves into speech emotion recognition (SER). It presents a comparative analysis of machine learning models such as K-Nearest Neighbors (KNN), logistic regression, support vector machines (SVM), decision trees, ensemble classifiers, and random forests, applied to SER. The research employs four datasets: Crema D, SAVEE, TESS, and RAVDESS. It focuses on extracting salient audio signal features like Zero Crossing Rate (ZCR), Chroma_stft, Mel Frequency Cepstral Coefficients (MFCC), root mean square (RMS) value, and MelSpectrogram. These features are used to train and evaluate the models’ ability to recognize eight types of emotions from speech: happy, sad, neutral, angry, calm, disgust, fear, and surprise. Among the models, the Random Forest algorithm demonstrated superior performance, achieving approximately 79% accuracy. This suggests its suitability for SER within the parameters of this study. The research contributes to SER by showcasing the effectiveness of various machine learning algorithms and feature extraction techniques. The findings hold promise for the development of more precise emotion recognition systems in the future. This abstract provides a succinct overview of the paper’s content, methods, and results.

Keywords: comparison, ML classifiers, KNN, decision tree, SVM, random forest, logistic regression, ensemble classifiers

Procedia PDF Downloads 35
2255 Algorithms Utilizing Wavelet to Solve Various Partial Differential Equations

Authors: K. P. Mredula, D. C. Vakaskar

Abstract:

The article traces the development and evolution of various algorithms for solving partial differential equations that combine wavelets with a few already explored solution procedures. The study covers over a decade of work and remarks on the modifications made in implementing wavelet multi-resolution together with the finite difference approach, the finite element method and the finite volume method for a variety of partial differential equations in areas such as plasma physics, astrophysics, shallow water models, modified Burgers equations used in optical fibers, biology, fluid dynamics, and chemical kinetics.

Keywords: multi-resolution, Haar Wavelet, partial differential equation, numerical methods

Procedia PDF Downloads 292
2254 Fuzzy Population-Based Meta-Heuristic Approaches for Attribute Reduction in Rough Set Theory

Authors: Mafarja Majdi, Salwani Abdullah, Najmeh S. Jaddi

Abstract:

One of the global combinatorial optimization problems in machine learning is feature selection. It is concerned with removing irrelevant, noisy, and redundant data while keeping the original meaning of the data. Attribute reduction in rough set theory is an important feature selection method. Since attribute reduction is an NP-hard problem, it is necessary to investigate fast and effective approximate algorithms. In this paper, we propose two feature selection mechanisms based on memetic algorithms (MAs), which combine the genetic algorithm with a fuzzy record-to-record travel algorithm and a fuzzy controlled great deluge algorithm to find a good balance between local search and genetic search. In order to verify the proposed approaches, numerical experiments are carried out on thirteen datasets. The results show that the MA approaches are efficient in solving attribute reduction problems when compared with other meta-heuristic approaches.

Keywords: rough set theory, attribute reduction, fuzzy logic, memetic algorithms, record to record algorithm, great deluge algorithm

Procedia PDF Downloads 444
2253 Reducing Pressure Drop in Microscale Channel Using Constructal Theory

Authors: K. X. Cheng, A. L. Goh, K. T. Ooi

Abstract:

The effectiveness of microchannels in enhancing heat transfer has been demonstrated in the semiconductor industry. In order to tap the microscale heat transfer effects into macro geometries, overcoming the cost and technological constraints, microscale passages were created in macro geometries machined using conventional fabrication methods. A cylindrical insert was placed within a pipe, and geometrical profiles were created on the outer surface of the insert to enhance heat transfer under steady-state single-phase liquid flow conditions. However, while heat transfer coefficient values of above 10 kW/m2·K were achieved, the heat transfer enhancement was accompanied by undesirable pressure drop increment. Therefore, this study aims to address the high pressure drop issue using Constructal theory, a universal design law for both animate and inanimate systems. Two designs based on Constructal theory were developed to study the effectiveness of Constructal features in reducing the pressure drop increment as compared to parallel channels, which are commonly found in microchannel fabrication. The hydrodynamic and heat transfer performance for the Tree insert and Constructal fin (Cfin) insert were studied using experimental methods, and the underlying mechanisms were substantiated by numerical results. In technical terms, the objective is to achieve at least comparable increment in both heat transfer coefficient and pressure drop, if not higher increment in the former parameter. Results show that the Tree insert improved the heat transfer performance by more than 16 percent at low flow rates, as compared to the Tree-parallel insert. However, the heat transfer enhancement reduced to less than 5 percent at high Reynolds numbers. On the other hand, the pressure drop increment stayed almost constant at 20 percent. This suggests that the Tree insert has better heat transfer performance in the low Reynolds number region. 
More importantly, the Cfin insert displayed improved heat transfer performance along with favourable hydrodynamic performance, as compared to the Cfin-parallel insert, at all flow rates in this study. At 2 L/min, the enhancement of heat transfer was more than 30 percent, with a 20 percent pressure drop increment, as compared to the Cfin-parallel insert. Furthermore, a comparable increment in both heat transfer coefficient and pressure drop was observed at 8 L/min. In other words, the Cfin insert successfully achieved the objective of this study. Analysis of the results suggests that bifurcation of flows is effective in reducing the increment in pressure drop relative to heat transfer enhancement. Optimising the geometries of the Constructal fins is therefore a potential future study towards achieving a bigger stride in energy efficiency at much lower costs.

Keywords: constructal theory, enhanced heat transfer, microchannel, pressure drop

Procedia PDF Downloads 327
2252 Real-Time Network Anomaly Detection Systems Based on Machine-Learning Algorithms

Authors: Zahra Ramezanpanah, Joachim Carvallo, Aurelien Rodriguez

Abstract:

This paper aims to detect anomalies in streaming data using machine learning algorithms. To this end, we designed two separate pipelines and evaluated the effectiveness of each. The first pipeline, based on supervised machine learning methods, consists of two phases. In the first phase, we trained several supervised models using the UNSW-NB15 dataset. We measured the efficiency of each using different performance metrics and selected the best model for the second phase. In the second phase, we first sniffed a local area network using Argus Server. Several types of attacks were simulated, and the sniffed data were sent to a running algorithm at short intervals. Using the trained model, this algorithm can display the results for each packet of received data in real time. The second pipeline presented in this paper is based on unsupervised algorithms, in which a Temporal Graph Network (TGN) is used to monitor a local network. The TGN is trained to predict the probability of future states of the network based on its past behavior. Our contribution in this section is the introduction of an indicator to identify anomalies from these predicted probabilities.

Keywords: temporal graph network, anomaly detection, cyber security, IDS

Procedia PDF Downloads 92
2251 A Comparative Study of Twin Delayed Deep Deterministic Policy Gradient and Soft Actor-Critic Algorithms for Robot Exploration and Navigation in Unseen Environments

Authors: Romisaa Ali

Abstract:

This paper presents a comparison between twin-delayed Deep Deterministic Policy Gradient (TD3) and Soft Actor-Critic (SAC) reinforcement learning algorithms in the context of training robust navigation policies for Jackal robots. By leveraging an open-source framework and custom motion control environments, the study evaluates the performance, robustness, and transferability of the trained policies across a range of scenarios. The primary focus of the experiments is to assess the training process, the adaptability of the algorithms, and the robot’s ability to navigate in previously unseen environments. Moreover, the paper examines the influence of varying environmental complexities on the learning process and the generalization capabilities of the resulting policies. The results of this study aim to inform and guide the development of more efficient and practical reinforcement learning-based navigation policies for Jackal robots in real-world scenarios.

Keywords: Jackal robot environments, reinforcement learning, TD3, SAC, robust navigation, transferability, custom environment

Procedia PDF Downloads 90
2250 Implementation of the Recursive Formula for Evaluation of the Strength of Daniels' Bundle

Authors: Vaclav Sadilek, Miroslav Vorechovsky

Abstract:

The paper deals with the classical fiber bundle model of equal load sharing, sometimes referred to as the Daniels' bundle or the democratic bundle. Daniels formulated a multidimensional integral and also a recursive formula for evaluation of the strength cumulative distribution function. This paper describes three algorithms for evaluation of the recursive formula and also their implementations, with source code, in the high-level programming language Python. A comparison of the algorithms is provided with respect to execution time. An analysis of the orders of magnitude of the addends in the recursion is also provided.
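For orientation, the recursion for an equal-load-sharing bundle is commonly written as G_n(x) = Σ_{k=1..n} (-1)^(k+1) C(n,k) F(x)^k G_{n-k}(n·x/(n-k)), with G_0 = 1, where F is the fiber strength CDF and x is the bundle load per fiber. The sketch below is a direct memoized evaluation of that form in plain floating point. It is not one of the three algorithms from the paper; the function name and the uniform fiber CDF are assumptions chosen for illustration.

```python
from functools import lru_cache
from math import comb


def bundle_strength_cdf(n, x, fiber_cdf):
    """CDF of the per-fiber strength of an n-fiber equal-load-sharing bundle,
    evaluated with the recursive formula (naive sketch, double precision)."""

    @lru_cache(maxsize=None)
    def G(m, load):
        if m == 0:
            return 1.0
        total = 0.0
        for k in range(1, m + 1):
            sign = 1.0 if k % 2 == 1 else -1.0
            # When all m fibers have failed (k == m) the remaining factor is G_0 = 1.
            rest = 1.0 if k == m else G(m - k, m * load / (m - k))
            total += sign * comb(m, k) * fiber_cdf(load) ** k * rest
        return total

    return G(n, x)


def uniform_cdf(s):
    """Hypothetical fiber strength distribution: uniform on [0, 1]."""
    return min(max(s, 0.0), 1.0)


p_two_fibers = bundle_strength_cdf(2, 0.3, uniform_cdf)
```

For two fibers the recursion reduces to 2·F(x)·F(2x) - F(x)², which matches a direct order-statistics argument. Because the addends alternate in sign, plain floats may lose accuracy as n grows; the mpmath keyword suggests the paper addresses this with arbitrary-precision arithmetic.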

Keywords: equal load sharing, mpmath, python, strength of Daniels' bundle

Procedia PDF Downloads 398
2249 Volume Estimation of Trees: An Exploratory Study on Rosewood Logging Within Forest Transition and Savannah Ecological Zones of Ghana

Authors: Albert Kwabena Osei Konadu

Abstract:

One of the endemic forest species of the savannah transition zones listed in Appendix II of the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES) is the Rosewood, also known as Pterocarpus erinaceus or Krayie. Its economic viability has made it increasingly popular and in high demand. Ghana’s forest resource management regime for these ecozones focuses mainly on conservation and very little on resource utilization. Consequently, commercial logging management standards are at a teething stage and not fully developed, leading to deficiencies in the monitoring of logging operations and the quantification of harvested tree volumes. The tree information form (TIF), a volume estimation and tracking regime, has proven to be an effective sustainable management tool for regulating timber resource extraction in the high forest zones of the country. This work aims to generate a TIF that can track and capture the requisite parameters to accurately estimate the volume of harvested rosewood within forest savannah transition zones. Tree information forms were created for three scenarios: individual billets, stacked billets and conveying vessels. The study was limited by the use of the regulator's assigned volumes as the benchmark and was also subject to potential volume measurement error in the stacked billet scenario due to the spaces between packed billets. These TIFs were field-tested to deduce the most viable option for the tracking and estimation of harvested volumes of rosewood using the Smalian and cubic volume estimation formulas. Overall, four districts were covered, with the individual billet, stacked billet and conveying vessel scenarios registering mean volumes of 25.83 m³, 45.08 m³ and 32.6 m³, respectively. These volumes were validated by benchmarking against the assigned volumes of the Forestry Commission of Ghana and the known standard volumes of conveying vessels.
The results indicated an underestimation of extracted volumes under the quota regime, a situation that could lead to unintended overexploitation of the species. The research revealed that the conveying vessel route is the most viable volume estimation and tracking regime for the sustainable management of Pterocarpus erinaceus, as it provided a more practical volume estimate and data extraction protocol.
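Smalian's formula, mentioned above, estimates the volume of a log section as the mean of its two end cross-sectional areas multiplied by its length. The sketch below is a minimal version assuming circular cross-sections and diameters measured in metres; the function name and the sample measurements are hypothetical and are not data from the study.

```python
import math


def smalian_volume(d_large_m, d_small_m, length_m):
    """Smalian's formula: mean of the two end areas times the length.

    Diameters and length are in metres; the result is in cubic metres.
    Circular end sections are an assumption of this sketch.
    """
    area_large = math.pi * d_large_m ** 2 / 4.0
    area_small = math.pi * d_small_m ** 2 / 4.0
    return (area_large + area_small) / 2.0 * length_m


# Hypothetical billet: 0.40 m and 0.30 m end diameters, 2.0 m long.
billet_volume = smalian_volume(0.40, 0.30, 2.0)
```

Summing this quantity over the individual billets recorded on a form would give a per-load volume estimate; the stacked-billet scenario would additionally need a correction for air gaps, which the abstract flags as a source of error.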

Keywords: cubic volume formula, Smalian volume formula, Pterocarpus erinaceus, tree information form, forest transition and savannah zones, harvested tree volume

Procedia PDF Downloads 32
2248 Investigating Sub-daily Responses of Water Flow of Trees in Tropical Successional Forests in Thailand

Authors: Pantana Tor-Ngern

Abstract:

In the global water cycle, tree water use (Tr) contributes largely to evapotranspiration, which is the total amount of water evaporated from terrestrial ecosystems to the atmosphere and which regulates climate. Tree water use responds to environmental factors, including atmospheric humidity and sunlight (represented by vapor pressure deficit, or VPD, and photosynthetically active radiation, or PAR, respectively) and soil moisture. In forests, Tr responses to such factors depend on species and their spatial and temporal variations. Tropical forests in Southeast Asia (SEA) have experienced land-use conversion from abandoned agricultural practices, resulting in patches of forest at different stages, including old-growth and secondary forests. Because inherent structures, such as canopy height and tree density, vary significantly among forests at different stages and can strongly affect their respective microclimates, Tr and its responses to changing environmental conditions in successional forests may differ. Daily and seasonal variations in the environmental factors may exert significant impacts on the respective Tr patterns. Extrapolating Tr data from short periods of days to longer periods of seasons or years can be complex, yet it is important for estimating long-term ecosystem water use, which often includes normal and abnormal climatic conditions. Thus, this study aims to investigate the diurnal variation of Tr, using measured sap flux density (JS) data, with changes in VPD in eight evergreen tree species in an old-growth forest (hereafter OF; >200 years old) and a young forest (hereafter YF; <10 years old) in Khao Yai National Park, Thailand. The studied species included Syzygium syzygoides, Aquilaria crassna, Cinnamomum subavenium, Nephelium melliferum and Altingia excelsa in OF, and Syzygium nervosum and Adinandra integerrima in YF. Only Syzygium antisepticum was found in both forest stages.
Specifically, hysteresis, which indicates the asymmetrical changes of JS in response to changing VPD across the daily timescale, was examined in these species. Results showed no hysteresis in any species in OF except Altingia excelsa, which exhibited a 3-hour delayed JS response to VPD. In contrast, JS of all species in YF displayed one-hour delayed responses to VPD. The absence of hysteresis in the other OF species indicated that their canopies were well coupled with the atmosphere, facilitating the gas exchange that is essential for tree growth. The delayed responses in Altingia excelsa in OF and in all species in YF were associated with higher JS in the morning than in the afternoon. This implies that these species were sensitive to drying air, closing their stomata relatively rapidly as atmospheric humidity decreased (that is, as VPD rose). Such behavior is often observed in trees growing in dry environments. This study suggests that detailed investigation of JS at sub-daily timescales is imperative for a better understanding of the mechanistic responses of trees to the changing climate, which will benefit the improvement of earth system models.

Keywords: sap flow, tropical forest, forest succession, thermal dissipation probe

Procedia PDF Downloads 51
2247 An Overview of Adaptive Channel Equalization Techniques and Algorithms

Authors: Navdeep Singh Randhawa

Abstract:

Wireless communication systems have proved highly effective for communication. However, a wireless channel has some undesirable effects on the information transmitted through it, such as attenuation, distortion, delays and phase shifts of the signals arriving at the receiver, which are caused by its band-limited and dispersive nature. One of these effects is ISI (Inter Symbol Interference), which has been found to be a great obstacle in high-speed communication. Thus, there is a need for an accurate technique to remove this effect and achieve error-free communication, and different equalization techniques have been proposed in the literature. This paper presents the equalization techniques, followed by the concept of the adaptive filter equalizer, its algorithms (LMS and RLS) and applications of the adaptive equalization technique.
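The LMS algorithm named above adapts the equalizer taps with the update w ← w + μ·e·x, where e is the difference between a known training symbol and the equalizer output. The sketch below applies it to a toy two-tap channel that introduces ISI. It is a minimal illustration under stated assumptions, not a design from the paper: the channel coefficients, tap count, step size μ and the noise-free BPSK training sequence are all hypothetical choices.

```python
import random


def lms_equalize(received, desired, num_taps=5, mu=0.02):
    """Train a real-valued FIR equalizer with the LMS update w += mu * e * x."""
    weights = [0.0] * num_taps
    window = [0.0] * num_taps  # most recent received sample first
    errors = []
    for sample, target in zip(received, desired):
        window = [sample] + window[:-1]
        output = sum(w * s for w, s in zip(weights, window))
        error = target - output
        weights = [w + mu * error * s for w, s in zip(weights, window)]
        errors.append(error)
    return weights, errors


# Hypothetical training setup: BPSK symbols through the channel h = [1.0, 0.5].
rng = random.Random(0)
symbols = [rng.choice((-1.0, 1.0)) for _ in range(2000)]
delayed = [0.0] + symbols[:-1]
received = [s + 0.5 * prev for s, prev in zip(symbols, delayed)]

weights, errors = lms_equalize(received, symbols)
early_mse = sum(e * e for e in errors[:200]) / 200
late_mse = sum(e * e for e in errors[-200:]) / 200
```

Comparing `early_mse` with `late_mse` shows the squared error shrinking as the taps approach the channel inverse. RLS typically converges in fewer samples at a higher computational cost per update, which is the usual trade-off between the two algorithms.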

Keywords: channel equalization, adaptive equalizer, least mean square, recursive least square

Procedia PDF Downloads 440
2246 Classification of Potential Biomarkers in Breast Cancer Using Artificial Intelligence Algorithms and Anthropometric Datasets

Authors: Aref Aasi, Sahar Ebrahimi Bajgani, Erfan Aasi

Abstract:

Breast cancer (BC) continues to be the most frequent cancer in females and causes the highest number of cancer-related deaths in women worldwide. Inspired by recent advances in studying the relationship between different patient attributes and features and the disease, in this paper we investigate different classification methods for better diagnosis of BC in the early stages. In this regard, datasets from the University Hospital Centre of Coimbra were chosen, and different machine learning (ML)-based and neural network (NN) classifiers were studied. For this purpose, we selected favorable features among the nine attributes provided in the clinical dataset by using a random forest algorithm. The dataset consists of both healthy controls and BC patients, and glucose, BMI, resistin, and age were found to be the most important features, in that order. Moreover, we analyzed these features with various ML-based classifier methods, including Decision Tree (DT), K-Nearest Neighbors (KNN), eXtreme Gradient Boosting (XGBoost), Logistic Regression (LR), Naive Bayes (NB), and Support Vector Machine (SVM), along with an NN-based Multi-Layer Perceptron (MLP) classifier. The results revealed that, among the different techniques, the SVM and MLP classifiers achieved the highest accuracy, at 96% and 92%, respectively. These results indicate that the adopted procedure could be used effectively for the classification of cancer cells, and they encourage further experimental investigations with more collected data for other types of cancers.

Keywords: breast cancer, diagnosis, machine learning, biomarker classification, neural network

Procedia PDF Downloads 124
2245 A Comparative Study of GTC and PSP Algorithms for Mining Sequential Patterns Embedded in Database with Time Constraints

Authors: Safa Adi

Abstract:

This paper considers the problem of mining sequential patterns embedded in a database while handling the time constraints as defined in the GSP algorithm (a level-wise algorithm). We compare two previous approaches, GTC and PSP, which follow the general principles of GSP. Furthermore, this paper discusses the PG-hybrid algorithm, which combines PSP and GTC. The results show that PSP and GTC are more efficient than GSP, and that the GTC algorithm performs better than PSP. The PG-hybrid algorithm uses the PSP algorithm for the first two passes over the database and the GTC approach for the following scans. Experiments show that the hybrid approach is very efficient for short, frequent sequences.

Keywords: database, GTC algorithm, PSP algorithm, sequential patterns, time constraints

Procedia PDF Downloads 377
2244 An Investigation on Hot-Spot Temperature Calculation Methods of Power Transformers

Authors: Ahmet Y. Arabul, Ibrahim Senol, Fatma Keskin Arabul, Mustafa G. Aydeniz, Yasemin Oner, Gokhan Kalkan

Abstract:

Three different hot-spot temperature estimation methods are suggested in the IEC 60076-2 and IEC 60076-7 standards. In this study, the algorithms used in hot-spot temperature calculations are analyzed by comparing them with the results of an experimental set-up built around a Transformer Monitoring System (TMS) in use. In the tested system, the TMS uses only the top-oil temperature and the load ratio for the hot-spot temperature calculation, together with some constants taken from the tables in the standards. The tests showed that this hot-spot temperature calculation method performs only a simple calculation and does not use all the other significant variables that could affect the hot-spot temperature.
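To make the kind of calculation described above concrete, here is a minimal sketch of a hot-spot estimate that depends only on top-oil temperature and load ratio. It assumes the steady-state relation usually associated with IEC 60076-7, hot-spot = top-oil + (rated hot-spot-to-top-oil gradient) × K^y. The function name, parameter names, and the numbers in the usage lines are illustrative assumptions; they are not taken from the study or quoted from the tables in the standards.

```python
def hot_spot_temperature(top_oil_temp_c, load_ratio, rated_gradient_k, winding_exponent):
    """Steady-state hot-spot estimate (assumed form): theta_h = theta_o + gradient * K**y.

    top_oil_temp_c   -- measured top-oil temperature in degrees Celsius
    load_ratio       -- K, load current divided by rated current (dimensionless)
    rated_gradient_k -- hot-spot-to-top-oil gradient at rated load, in kelvin
    winding_exponent -- y, exponent applied to the load ratio
    """
    if load_ratio < 0:
        raise ValueError("load_ratio must be non-negative")
    return top_oil_temp_c + rated_gradient_k * load_ratio ** winding_exponent


# Illustrative values only (hypothetical, not from the paper or the standards).
for k in (0.5, 1.0, 1.2):
    estimate = hot_spot_temperature(60.0, k, rated_gradient_k=26.0, winding_exponent=1.3)
    print(f"K = {k:.1f}: estimated hot-spot = {estimate:.1f} degC")
```

The sketch shows why such a method is simple: ambient temperature, oil time constants, and cooling mode never enter the calculation.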

Keywords: Hot-spot temperature, monitoring system, power transformer, smart grid

Procedia PDF Downloads 566
2243 Segmentation of Arabic Handwritten Numeral Strings Based on Watershed Approach

Authors: Nidal F. Shilbayeh, Remah W. Al-Khatib, Sameer A. Nooh

Abstract:

Arabic offline handwriting recognition is considered one of the most challenging topics. Arabic handwritten numeral strings are used in automated systems that deal with numbers such as postal codes, bank account numbers, and numbers on car plates. Segmentation of connected numerals is the main bottleneck in a handwritten numeral recognition system; improving it can in turn increase the speed and efficiency of the recognition system. In this paper, we propose algorithms for automatic segmentation and feature extraction of Arabic handwritten numeral strings based on the Watershed approach. The algorithms have been designed and implemented to achieve the main goal of segmenting and extracting the string of handwritten numeral digits, especially in the courtesy amount of bank checks. The segmentation algorithm partitions the string into multiple regions that can be associated with the properties of one or more criteria. The numeral extraction algorithm extracts the digits of the numeral string into separate individual digits. Both algorithms for segmentation and feature extraction have been tested successfully and efficiently for all types of numerals.

Keywords: handwritten numerals, segmentation, courtesy amount, feature extraction, numeral recognition

Procedia PDF Downloads 374
2242 Predicting Wealth Status of Households Using Ensemble Machine Learning Algorithms

Authors: Habtamu Ayenew Asegie

Abstract:

Wealth, as opposed to income or consumption, implies a more stable and permanent status. Natural and human-made difficulties diminish households' economies and put their well-being at risk. Hence, governments and humanitarian agencies devote considerable resources to poverty and malnutrition reduction efforts. One key factor in the effectiveness of such efforts is the accuracy with which low-income or poor populations can be identified. This study therefore aims to predict a household's wealth status using ensemble machine learning (ML) algorithms. The design science research methodology (DSRM) is employed, and four ML algorithms, Random Forest (RF), Adaptive Boosting (AdaBoost), Light Gradient Boosted Machine (LightGBM), and Extreme Gradient Boosting (XGBoost), are used to train models. The Ethiopian Demographic and Health Survey (EDHS) dataset was accessed for this purpose from the database of the Central Statistical Agency (CSA). Various data pre-processing techniques were employed, and the models were trained using functions from the scikit-learn Python library. Model evaluation was carried out using metrics such as accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC-ROC), as well as subjective evaluations by domain experts. An optimal set of hyper-parameters for the algorithms was selected through the grid search function for the best prediction. The RF model performed better than the rest of the algorithms, achieving an accuracy of 96.06%, and is better suited as a solution model for our purpose. The LightGBM, XGBoost, and AdaBoost algorithms follow with accuracies of 91.53%, 88.44%, and 58.55%, respectively.
The findings suggest that features such as 'Age of household head', 'Total children ever born' in a family, 'Main roof material' of the house, 'Region' of residence, whether a household uses 'Electricity' or not, and 'Type of toilet facility' of a household are determinant factors that should be a focal point for economic policymakers. The determinant risk factors, extracted rules, and designed artifact achieved 82.28% in the domain experts' evaluation. Overall, the study shows that ML techniques are effective in predicting the wealth status of households.
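The evaluation metrics named above can be computed directly from prediction counts. The following is a minimal sketch for the binary case only; treating wealth status as two classes, the function name, and the toy labels are assumptions made for illustration and do not come from the study.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels (hypothetical): 1 = poor household, 0 = not poor.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(binary_metrics(y_true, y_pred))
```

For a multi-class wealth index, the same counts would be taken per class and then averaged.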

Keywords: ensemble machine learning, households wealth status, predictive model, wealth status prediction

Procedia PDF Downloads 31
2241 Evaluation of Gesture-Based Password: User Behavioral Features Using Machine Learning Algorithms

Authors: Lakshmidevi Sreeramareddy, Komalpreet Kaur, Nane Pothier

Abstract:

Graphical passwords have existed for decades. Their major advantage is that they are easier to remember than an alphanumeric password. However, their disadvantage (especially for recognition-based passwords) is the smaller password space, which makes them more vulnerable to brute-force attacks. Graphical passwords are also highly susceptible to shoulder-surfing. The gesture-based password method that we developed is a grid-free, template-free method. In this study, we evaluated gesture-based passwords for usability and vulnerability. The results of the study are significant. We developed a gesture-based password application for data collection. Two modes of data collection were used: creation mode and replication mode. In creation mode (Session 1), users were asked to create six different passwords and reenter each password five times. In replication mode, users saw a password image created by another user for a fixed duration of time. Three timer durations, 5 seconds (Session 2), 10 seconds (Session 3), and 15 seconds (Session 4), were used to mimic the shoulder-surfing attack. After the timer expired, the password image was removed, and users were asked to replicate the password. A total of 74, 57, 50, and 44 users participated in Session 1, Session 2, Session 3, and Session 4, respectively. Machine learning algorithms were applied to determine whether the person is a genuine user or an imposter based on the password entered. Five different machine learning algorithms were deployed to compare their performance in user authentication: Decision Trees, Linear Discriminant Analysis, Naive Bayes Classifier, Support Vector Machines (SVMs) with a Gaussian Radial Basis Kernel function, and K-Nearest Neighbor. Gesture-based password features vary from one entry to the next, so it is difficult to distinguish between a creator and an intruder for authentication.
For each password entered by the user, four features were extracted: password score, password length, password speed, and password size. All four features were normalized before being fed to a classifier. Three different classifiers were trained using data from all four sessions. Classifiers A, B, and C were trained and tested using data from the password creation session and the password replication session with a timer of 5 seconds, 10 seconds, and 15 seconds, respectively. The classification accuracies for Classifier A using the five ML algorithms are 72.5%, 71.3%, 71.9%, 74.4%, and 72.9%, respectively. The classification accuracies for Classifier B using the five ML algorithms are 69.7%, 67.9%, 70.2%, 73.8%, and 71.2%, respectively. The classification accuracies for Classifier C using the five ML algorithms are 68.1%, 64.9%, 68.4%, 71.5%, and 69.8%, respectively. SVMs with a Gaussian Radial Basis Kernel outperform the other ML algorithms for gesture-based password authentication. The results confirm that the shorter the duration of the shoulder-surfing attack, the higher the authentication accuracy. In conclusion, behavioral features extracted from gesture-based passwords lead to less vulnerable user authentication.
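The abstract does not say which normalization was applied to the four features. As one common possibility, the sketch below rescales each feature column to the range [0, 1] (min-max scaling). The choice of min-max scaling, the function name, and the sample values are assumptions for illustration only.

```python
def min_max_normalize(rows):
    """Rescale every column of `rows` to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


# Hypothetical entries: (score, length, speed, size) for three password attempts.
entries = [
    [10, 2, 1.0, 100],
    [20, 4, 3.0, 300],
    [30, 6, 2.0, 200],
]
for row in min_max_normalize(entries):
    print(row)
```

Scaling matters most for distance-based classifiers such as K-Nearest Neighbor and RBF-kernel SVMs, where a feature with a large numeric range would otherwise dominate.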

Keywords: authentication, gesture-based passwords, machine learning algorithms, shoulder-surfing attacks, usability

Procedia PDF Downloads 93
2240 Using Machine-Learning Methods for Allergen Amino Acid Sequence's Permutations

Authors: Kuei-Ling Sun, Emily Chia-Yu Su

Abstract:

Allergy is a hypersensitive overreaction of the immune system to environmental stimuli and a major health problem. These overreactions include rashes, sneezing, fever, food allergies, anaphylaxis, asthma, shock, and other abnormal conditions. Allergies can be caused by food, insect stings, pollen, animal wool, and other allergens. The development of allergies is due to both genetic and environmental factors. Allergies involve immunoglobulin E antibodies, a part of the body's immune system. Immunoglobulin E antibodies bind to an allergen and then to a receptor on mast cells or basophils, triggering the release of inflammatory chemicals such as histamine. Given the increasingly serious problems of environmental change, lifestyle change, air pollution, and other factors, in this study we collect both allergens and non-allergens from several databases and use several machine learning methods for classification, including logistic regression (LR), stepwise regression, decision tree (DT), and neural networks (NN), to compare the models and determine the permutations of the allergen amino acid sequence.

Keywords: allergy, classification, decision tree, logistic regression, machine learning

Procedia PDF Downloads 293
2239 Supervised/Unsupervised Mahalanobis Algorithm for Improving Performance for Cyberattack Detection over Communications Networks

Authors: Radhika Ranjan Roy

Abstract:

Deployment of machine learning (ML)/deep learning (DL) algorithms for cyberattack detection in operational communications networks (wireless and/or wire-line) is being delayed because of low performance (e.g., recall, precision, and f₁-score). If datasets are imbalanced, which is the usual case for communications networks, the performance tends to become worse. The complexity of reducing the dimensionality of the feature sets to increase performance is also a major problem. Mahalanobis algorithms have been widely applied in scientific research because Mahalanobis distance metric learning is a successful framework. In this paper, we investigate the Mahalanobis binary classifier algorithm for increasing cyberattack detection performance over communications networks as a proof of concept. We have also found that high-dimensional information in intermediate features, which is not utilized as much for classification tasks in ML/DL algorithms, is the main contributor to the state-of-the-art performance of the Mahalanobis method, even for imbalanced and sparse datasets. With no feature reduction, the Mahalanobis distance (MD) method offers uniform results for precision, recall, and f₁-score on the unbalanced and sparse NSL-KDD datasets.

Keywords: Mahalanobis distance, machine learning, deep learning, NSL-KDD, local intrinsic dimensionality, chi-square, positive semi-definite, area under the curve
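For readers unfamiliar with the Mahalanobis distance, the sketch below shows the general idea behind a distance-based binary detector: fit a mean and covariance to normal traffic, then flag a sample whose distance from that distribution exceeds a threshold. It is restricted to two features so that the covariance inverse can be written in closed form. The function names, the threshold of 3.0, and the sample points are assumptions for illustration and are not the classifier evaluated in the paper.

```python
import math


def mean_and_cov_2d(points):
    """Sample mean and covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_2d(x, mean, cov):
    """sqrt((x - mean)^T * inverse(cov) * (x - mean)) for a 2x2 covariance."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # inverse of [[sxx, sxy], [sxy, syy]] is (1/det) * [[syy, -sxy], [-sxy, sxx]]
    return math.sqrt((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)


def is_attack(x, mean, cov, threshold=3.0):
    """Flag a sample as anomalous when it lies far from the normal-traffic model."""
    return mahalanobis_2d(x, mean, cov) > threshold


# Hypothetical two-feature samples of normal traffic.
normal = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
mean, cov = mean_and_cov_2d(normal)
print(is_attack((0.6, 0.4), mean, cov), is_attack((5.0, 5.0), mean, cov))
```

Because the distance is scaled by the covariance, no separate feature normalization or dimensionality reduction is needed, which is in keeping with the abstract's remark about working without feature reduction.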

Procedia PDF Downloads 69
2238 Comparative Analysis of Classification Methods in Determining Non-Active Student Characteristics in Indonesia Open University

Authors: Dewi Juliah Ratnaningsih, Imas Sukaesih Sitanggang

Abstract:

Classification is one of the data mining techniques that aims to discover a model from training data that assigns records to the appropriate category or class. Data mining classification methods can be applied in education, for example, to classify non-active students at Indonesia Open University. This paper presents a comparison of three classification methods: Naïve Bayes, Bagging, and C4.5. The criteria used to evaluate the performance of the three methods are stratified cross-validation, the confusion matrix, the area under the ROC curve (AUC), recall, precision, and F-measure. The data used for this paper are from non-active Indonesia Open University students in the registration periods 2004.1 to 2012.2. For the target analysis, non-active students were divided into 3 groups: C1, C2, and C3. Data from 4173 students were analyzed. The results of the study show that: (1) the Bagging method gave a higher degree of classification accuracy than Naïve Bayes and C4.5, (2) the Bagging classification accuracy rate is 82.99%, while those of Naïve Bayes and C4.5 are 80.04% and 82.74%, respectively, (3) the classification tree produced by the Bagging method has a large number of nodes, which makes it quite difficult to use in decision making, (4) the C4.5 algorithm is used to classify the characteristics of non-active Indonesia Open University students, (5) based on the C4.5 algorithm, there are 5 interesting rules that describe the characteristics of non-active Indonesia Open University students.

Keywords: comparative analysis, data mining, classification, Bagging, Naïve Bayes, C4.5, non-active students, Indonesia Open University

Procedia PDF Downloads 306
2237 Pattern Identification in Statistical Process Control Using Artificial Neural Networks

Authors: M. Pramila Devi, N. V. N. Indra Kiran

Abstract:

Control charts, predominantly in the form of the X-bar chart, are important tools in statistical process control (SPC). They are useful in determining whether a process is behaving as intended or whether there are unnatural causes of variation. A process is out of control if a point falls outside the control limits or a series of points exhibits an unnatural pattern. In this paper, a study is carried out on four training algorithms for control chart pattern (CCP) recognition. For each algorithm, the optimal structure is identified; the algorithms are then studied for type I and type II errors in generalization, both without and with early stopping, and the best one is proposed.
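The simplest out-of-control rule mentioned above, a point falling outside the control limits, can be sketched as follows. The sketch assumes limits set at the center line plus or minus three sample standard deviations of in-control subgroup means; the function names and data are hypothetical, and practical X-bar charts often derive the limits from subgroup ranges instead.

```python
import statistics


def control_limits(reference_means, sigma_multiplier=3.0):
    """Return (lower limit, center line, upper limit) from in-control subgroup means."""
    center = statistics.mean(reference_means)
    spread = statistics.stdev(reference_means)
    return center - sigma_multiplier * spread, center, center + sigma_multiplier * spread


def out_of_control_points(new_means, lower, upper):
    """Indices of subgroup means that fall outside the control limits."""
    return [i for i, m in enumerate(new_means) if m < lower or m > upper]


# Hypothetical subgroup means: a reference run, then new observations to check.
reference = [10.0, 10.2, 9.8, 10.1, 9.9]
lower, center, upper = control_limits(reference)
print(out_of_control_points([10.1, 10.6, 9.4, 10.0], lower, upper))
```

Unnatural patterns such as trends, shifts, or cycles inside the limits are not caught by this rule, which is the gap that pattern recognition with neural networks is meant to fill.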

Keywords: control chart pattern recognition, neural network, backpropagation, generalization, early stopping

Procedia PDF Downloads 362
2236 Aspect-Level Sentiment Analysis with Multi-Channel and Graph Convolutional Networks

Authors: Jiajun Wang, Xiaoge Li

Abstract:

The purpose of the aspect-level sentiment analysis task is to identify the sentiment polarity of aspects in a sentence. Currently, most methods focus on using neural networks and attention mechanisms to model the relationship between aspects and context, but they ignore the dependence between words at different ranges in the sentence, which results in deviations when assigning relationship weights to words other than aspect words. To solve these problems, we propose a new aspect-level sentiment analysis model that combines a multi-channel convolutional network and a graph convolutional network (GCN). Firstly, the context and the degree of association between words are characterized by Long Short-Term Memory (LSTM) and a self-attention mechanism. In addition, a multi-channel convolutional network is used to extract the features of words at different ranges. Finally, a graph convolutional network is used to associate the node information of the dependency tree structure. We conduct experiments on four benchmark datasets. The experimental results are compared with those of other models and show that our model is better and more effective.

Keywords: aspect-level sentiment analysis, attention, multi-channel convolution network, graph convolution network, dependency tree

Procedia PDF Downloads 203
2235 Particle Swarm Optimization and Quantum Particle Swarm Optimization to Multidimensional Function Approximation

Authors: Diogo Silva, Fadul Rodor, Carlos Moraes

Abstract:

This work compares the results of multidimensional function approximation using two algorithms: the classical Particle Swarm Optimization (PSO) and the Quantum Particle Swarm Optimization (QPSO). Both algorithms were tested on three functions with different characteristics (the Rosenbrock, the Rastrigin, and the sphere functions) while increasing their number of dimensions. The study shows that the larger the function dimension, the more evident the advantages of the QPSO method over the PSO method in terms of performance and the number of iterations needed to reach the stop criterion.
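The three benchmark functions named above have standard definitions, shown below together with a minimal classical PSO loop. The swarm size, iteration count, search bounds, and the coefficients w, c1, and c2 are assumed values chosen for illustration, not the settings used in the study, and the QPSO variant is not shown.

```python
import math
import random


def sphere(x):
    return sum(v * v for v in x)


def rosenbrock(x):
    return sum(100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (1.0 - x[i]) ** 2
               for i in range(len(x) - 1))


def rastrigin(x):
    return 10.0 * len(x) + sum(v * v - 10.0 * math.cos(2.0 * math.pi * v) for v in x)


def pso(f, dim, n_particles=30, iters=200, bound=5.0, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Classical global-best PSO minimizing f over [-bound, bound]^dim."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]            # best position seen by each particle
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # best position seen by the swarm

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia + pull toward personal best + pull toward global best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = max(-bound, min(bound, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


for name, func in (("sphere", sphere), ("rosenbrock", rosenbrock), ("rastrigin", rastrigin)):
    _, best = pso(func, dim=5)
    print(name, best)
```

Raising `dim` in the loop above is a quick way to see how the problem gets harder as the number of dimensions grows, which is the regime in which the abstract reports QPSO pulling ahead.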

Keywords: PSO, QPSO, function approximation, AI, optimization, multidimensional functions

Procedia PDF Downloads 570
2234 Minimizing Total Completion Time in No-Wait Flowshops with Setup Times

Authors: Ali Allahverdi

Abstract:

The m-machine no-wait flowshop scheduling problem is addressed in this paper. The objective is to minimize total completion time subject to the constraint that the makespan does not exceed a given value. Setup times are treated as separate from processing times. Several recent algorithms are adapted and proposed for the problem. An extensive computational analysis has been conducted to evaluate the proposed algorithms. It indicates that the best proposed algorithm performs significantly better than the best existing algorithm.

Keywords: scheduling, no-wait flowshop, algorithm, setup times, total completion time, makespan

Procedia PDF Downloads 333