Search results for: mixed dataset
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3734

3464 Evaluating Generative Neural Attention Weights-Based Chatbot on Customer Support Twitter Dataset

Authors: Sinarwati Mohamad Suhaili, Naomie Salim, Mohamad Nazim Jambli

Abstract:

Sequence-to-sequence (seq2seq) models augmented with attention mechanisms are playing an increasingly important role in automated customer service. These models, which are able to recognize complex relationships between input and output sequences, are crucial for optimizing chatbot responses. Central to these mechanisms are neural attention weights that determine the focus of the model during sequence generation. Despite their widespread use, there remains a gap in the comparative analysis of different attention weighting functions within seq2seq models, particularly in the domain of chatbots using the Customer Support Twitter (CST) dataset. This study addresses this gap by evaluating four distinct attention-scoring functions—dot, multiplicative/general, additive, and an extended multiplicative function with a tanh activation parameter — in neural generative seq2seq models. Utilizing the CST dataset, these models were trained and evaluated over 10 epochs with the AdamW optimizer. Evaluation criteria included validation loss and BLEU scores implemented under both greedy and beam search strategies with a beam size of k=3. Results indicate that the model with the tanh-augmented multiplicative function significantly outperforms its counterparts, achieving the lowest validation loss (1.136484) and the highest BLEU scores (0.438926 under greedy search, 0.443000 under beam search, k=3). These results emphasize the crucial influence of selecting an appropriate attention-scoring function in improving the performance of seq2seq models for chatbots. Particularly, the model that integrates tanh activation proves to be a promising approach to improve the quality of chatbots in the customer support context.
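The four scoring functions named in the abstract can be sketched in a few lines. This is an illustrative sketch, not the authors' implementation: the dot, general, and additive forms follow their usual textbook definitions, while `score_general_tanh` is only one assumed reading of "an extended multiplicative function with a tanh activation", since the abstract does not give its formula. All function and parameter names are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [dot(row, v) for row in W]


def score_dot(h_dec, h_enc):
    # dot: score = h_dec . h_enc
    return dot(h_dec, h_enc)


def score_general(h_dec, h_enc, W):
    # multiplicative/general: score = h_dec^T W h_enc
    return dot(h_dec, matvec(W, h_enc))


def score_additive(h_dec, h_enc, W_dec, W_enc, v):
    # additive: score = v^T tanh(W_dec h_dec + W_enc h_enc)
    projected = zip(matvec(W_dec, h_dec), matvec(W_enc, h_enc))
    return dot(v, [math.tanh(a + b) for a, b in projected])


def score_general_tanh(h_dec, h_enc, W):
    # ASSUMPTION: tanh applied to the general score; the abstract
    # does not specify where the tanh enters.
    return math.tanh(score_general(h_dec, h_enc, W))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(h_dec, encoder_states, score_fn):
    """Normalize the scores of one decoder state against all encoder states."""
    return softmax([score_fn(h_dec, h_enc) for h_enc in encoder_states])
```

Whatever the scoring function, the scores are normalized with a softmax, so the attention weights over the encoder states always sum to one.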

Keywords: attention weight, chatbot, encoder-decoder, neural generative attention, score function, sequence-to-sequence

Procedia PDF Downloads 51
3463 Index t-SNE: Tracking Dynamics of High-Dimensional Datasets with Coherent Embeddings

Authors: Gaelle Candel, David Naccache

Abstract:

t-SNE is an embedding method that the data science community has widely used. It helps with two main tasks: displaying results by coloring items according to the item class or feature value, and forensics, by giving a first overview of the dataset distribution. Two interesting characteristics of t-SNE are the structure preservation property and the answer to the crowding problem, where all neighbors in high-dimensional space cannot be represented correctly in low-dimensional space. t-SNE preserves the local neighborhood, and similar items are nicely spaced by adjusting to the local density. These two characteristics produce a meaningful representation, where the cluster area is proportional to its size in number, and relationships between clusters are materialized by closeness on the embedding. This algorithm is non-parametric. The transformation from a high- to a low-dimensional space is described but not learned. Two initializations of the algorithm would lead to two different embeddings. In a forensic approach, analysts would like to compare two or more datasets using their embedding. A naive approach would be to embed all datasets together. However, this process is costly as the complexity of t-SNE is quadratic and would be infeasible for too many datasets. Another approach would be to learn a parametric model over an embedding built with a subset of data. While this approach is highly scalable, points could be mapped to the exact same position, making them indistinguishable. This type of model would be unable to adapt to new outliers or concept drift. This paper presents a methodology to reuse an embedding to create a new one, where cluster positions are preserved. The optimization process minimizes two costs, one relative to the embedding shape and the second relative to the match with the support embedding. The embedding with the support process can be repeated more than once, with the newly obtained embedding. 
The successive embeddings can be used to study the impact of one variable on the dataset distribution or to monitor changes over time. This method has the same complexity as t-SNE per embedding, and memory requirements are only doubled. For a dataset of n elements sorted and split into k subsets, the total embedding complexity would be reduced from O(n²) to O(n²/k), and the memory requirement from n² to 2(n/k)², which enables computation on recent laptops. The method showed promising results on a real-world dataset, making it possible to observe the birth, evolution, and death of clusters. The proposed approach facilitates identifying significant trends and changes, which supports monitoring the dynamics of high-dimensional datasets.
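The complexity figures can be checked with simple arithmetic: k embeddings of n/k points each need k·(n/k)² = n²/k pairwise terms in total, and each embedding holds two (n/k)² tables in memory. The sketch below counts abstract "units" only. The function names are hypothetical, and it assumes k divides n evenly.

```python
def pairwise_cost(n):
    """Number of pairwise terms for n points (quadratic, as in t-SNE)."""
    return n * n


def split_costs(n, k):
    """Return (total_time_units, peak_memory_units) for k subsets of n/k points.

    Assumes k divides n evenly; this is a counting sketch, not a benchmark.
    """
    m = n // k
    time_units = k * pairwise_cost(m)    # k embeddings of size n/k -> n^2 / k
    memory_units = 2 * pairwise_cost(m)  # memory doubled per embedding -> 2 (n/k)^2
    return time_units, memory_units


n, k = 1000, 10
print(pairwise_cost(n), split_costs(n, k))
```

With n = 1000 and k = 10, the single-embedding cost is 1,000,000 units, while the split version needs 100,000 time units and 20,000 memory units.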

Keywords: concept drift, data visualization, dimension reduction, embedding, monitoring, reusability, t-SNE, unsupervised learning

Procedia PDF Downloads 114
3462 Deep Learning based Image Classifiers for Detection of CSSVD in Cacao Plants

Authors: Atuhurra Jesse, N'guessan Yves-Roland Douha, Pabitra Lenka

Abstract:

The detection of diseases within plants has attracted a lot of attention from computer vision enthusiasts. Despite the progress made to detect diseases in many plants, there remains a research gap to train image classifiers to detect the cacao swollen shoot virus disease or CSSVD for short, pertinent to cacao plants. This gap has mainly been due to the unavailability of high quality labeled training data. Moreover, institutions have been hesitant to share their data related to CSSVD. To fill these gaps, image classifiers to detect CSSVD-infected cacao plants are presented in this study. The classifiers are based on VGG16, ResNet50 and Vision Transformer (ViT). The image classifiers are evaluated on a recently released and publicly accessible KaraAgroAI Cocoa dataset. The best performing image classifier, based on ResNet50, achieves 95.39% precision, 93.75% recall, 94.34% F1-score and 94% accuracy on only 20 epochs. There is a +9.75% improvement in recall when compared to previous works. These results indicate that the image classifiers learn to identify cacao plants infected with CSSVD.

Keywords: CSSVD, image classification, ResNet50, vision transformer, KaraAgroAI cocoa dataset

Procedia PDF Downloads 66
3461 Separation Performance of CO₂ by Mixed Matrix Membrane Comprising Carbide-Derived Carbon

Authors: Musa Najimu, Isam Aljundi

Abstract:

In this study, the development of a mixed matrix membrane (MMM) containing carbide-derived carbon (CDC) for the separation of CO₂ was investigated. MMMs with four different loadings (0.1 to 2 wt%) were prepared by the dry/wet phase inversion technique. Prior to this, the formula of the control polysulfone (PSF) membrane was optimized in terms of the PSF concentration in a mixture of NMP/THF solvents and ethanol. Prepared samples were characterized and tested for CO₂ and CH₄ gas permeation. The optimization of the control PSF membrane revealed that 30 wt% PSF is the critical polymer concentration in the formulation. Characterization results revealed reinforcement of thermal stability and improved polarity imparted by CDC in the MMM, in addition to uniform dispersion of filler up to 1 wt% loading. Furthermore, the incorporation of CDC in the PSF membrane formulation enhanced both the CO₂ permeance and ideal selectivity over the control membrane. A CDC loading of 0.5 wt% resulted in the highest CO₂ permeance of 5.5 GPU, corresponding to a 120% increase in permeance, while a CDC loading of 1 wt% resulted in the highest selectivity (CO₂/CH₄) of 27, corresponding to a 29% increase in selectivity. Studies of the operating temperature effect showed that the optimum operating temperature for the M1.0 membrane is 20 °C. In addition, the feed pressure studies showed that high-pressure feeds favor high membrane performance and good CO₂/CH₄ separation.

Keywords: carbide derived carbon, mixed matrix membrane, CO₂ separation, polysulfone

Procedia PDF Downloads 173
3460 Urban Dynamics Modelling of Mixed Land Use for Sustainable Urban Development in Indian Context

Authors: Rewati Raman, Uttam K. Roy

Abstract:

One of the main adversaries of city planning in present times is the ever-expanding problem of urbanization and the antagonistic issues accompanying it. The prevalent challenges in urbanization such as population growth, urban sprawl, poverty, inequality, pollution, congestion, etc. call for reforms in the urban fabric as well as in planning theory and practice. Land use planning, one of the various paradigms of city planning, has been a major instrument for spatial planning of cities and regions in India. Zoning-regulation-based land use planning in the form of land use and development control plans (LUDCP) and development control regulations (DCR) has been considered a mainstream guiding principle in land use planning for decades. In spite of the many advantages of such zoning-based regulations, over a period of time they have been critiqued by scholars for their limitations: isolation and lack of vitality, inconvenience in business in terms of proximity to residence and low operating cost, unsuitable environment for small investments, higher travel distance for facilities and amenities and thereby higher expenditure, safety issues, etc. Mixed land use has been advocated by researchers as a tool to avoid such limitations in city planning. In addition, mixed land use can offer many advantages such as housing variety and density, the creation of an economic blend of compatible land uses, compact development, stronger neighborhood character, walkability, and generation of jobs. Conversely, mixed land use beyond a suitable balance of uses can also bring disadvantages such as traffic congestion, encroachments, very high-density housing leading to slum-like conditions, parking spill-out, non-residential uses operating on residential premises paying less tax, chaos hampering residential privacy, and pressure on existing infrastructure facilities. 
This research aims at studying and outlining the various challenges and potentials of mixed land use zoning, through modeling tools, as a competent instrument for city planning in light of the present urban scenario. The methodology of research adopted in this paper involves the study of a mixed land use neighborhood in India, identification of indicators and parameters related to its extent and spatial pattern, and the subsequent use of system dynamics as a modeling tool for simulation. The findings from this analysis helped in identifying the various advantages and challenges associated with the dynamic nature of a mixed use urban settlement. The results also confirmed the hypothesis that mixed use neighborhoods are catalysts for employment generation and socioeconomic gains while improving vibrancy, health, safety, and security. It is also seen that certain challenges related to chaos, lack of privacy, and pollution prevail in mixed use neighborhoods, which can be mitigated by varying the percentage of mixing as per need, ensuring compatibility of adjoining uses, institutional interventions in the form of policies, neighborhood micro-climatic interventions, etc. Therefore, this paper gives a consolidated and holistic framework and quantified outcome pertaining to the extent and spatial pattern of mixed land use that should be adopted to ensure sustainable urban planning.

Keywords: mixed land use, sustainable development, system dynamics analysis, urban dynamics modelling

Procedia PDF Downloads 145
3459 Automatic Identification and Classification of Contaminated Biodegradable Plastics using Machine Learning Algorithms and Hyperspectral Imaging Technology

Authors: Nutcha Taneepanichskul, Helen C. Hailes, Mark Miodownik

Abstract:

Plastic waste has emerged as a critical global environmental challenge, primarily driven by the prevalent use of conventional plastics derived from petrochemical refining and manufacturing processes in modern packaging. While these plastics serve vital functions, their persistence in the environment post-disposal poses significant threats to ecosystems. Addressing this issue requires multiple approaches, one of which involves the development of biodegradable plastics designed to degrade under controlled conditions, such as industrial composting facilities. It is imperative to note that compostable plastics are engineered for degradation within specific environments and are not suited for uncontrolled settings, including natural landscapes and aquatic ecosystems. The full benefits of compostable packaging are realized when subjected to industrial composting, preventing environmental contamination and waste stream pollution. Therefore, effective sorting technologies are essential to enhance composting rates for these materials and diminish the risk of contaminating recycling streams. This study leverages hyperspectral imaging (HSI) technology coupled with advanced machine learning algorithms to accurately identify various types of plastics, encompassing conventional variants such as polyethylene terephthalate (PET), polypropylene (PP), low-density polyethylene (LDPE), and high-density polyethylene (HDPE), and biodegradable alternatives such as polybutylene adipate terephthalate (PBAT), polylactic acid (PLA), and polyhydroxyalkanoates (PHA). The dataset is partitioned into three subsets: a training dataset comprising uncontaminated conventional and biodegradable plastics, a validation dataset encompassing contaminated plastics of both types, and a testing dataset featuring real-world packaging items in both pristine and contaminated states. 
Five distinct machine learning algorithms, namely Partial Least Squares Discriminant Analysis (PLS-DA), Support Vector Machine (SVM), Convolutional Neural Network (CNN), Logistic Regression, and Decision Tree, were developed and evaluated for their classification performance. The Logistic Regression and CNN models exhibited the most promising outcomes, achieving an accuracy of 100% on the training and validation datasets. On the testing dataset, the accuracy exceeded 80%. The successful implementation of this sorting technology within recycling and composting facilities holds the potential to significantly elevate recycling and composting rates. As a result, the envisioned circular economy for plastics can be established, thereby offering a viable solution to mitigate plastic pollution.

Keywords: biodegradable plastics, sorting technology, hyperspectral imaging technology, machine learning algorithms

Procedia PDF Downloads 41
3458 Graph Based Traffic Analysis and Delay Prediction Using a Custom Built Dataset

Authors: Gabriele Borg, Alexei Debono, Charlie Abela

Abstract:

There is a constant rise in the availability of high volumes of data gathered from multiple sources, resulting in an abundance of unprocessed information that can be used to monitor patterns and trends in user behaviour. Similarly, year after year, Malta is experiencing ongoing population growth and an increase in mobility demand. This research takes advantage of data that is continuously being sourced and converts it into useful information related to the traffic problem on Maltese roads. The scope of this paper is to provide a methodology to create a custom dataset (MalTra - Malta Traffic), compiled from multiple participants from various locations across the island, to identify the most common routes taken and expose the main areas of activity. Such use of big data appears in various technologies referred to as Intelligent Transportation Systems (ITSs), and it has been concluded that there is significant potential in utilising such sources of data on a nationwide scale. Furthermore, a series of experiments with traffic prediction graph neural network models is conducted to compare MalTra to large-scale traffic datasets.

Keywords: graph neural networks, traffic management, big data, mobile data patterns

Procedia PDF Downloads 96
3457 A Comprehensive Safety Analysis for a Pressurized Water Reactor Fueled with Mixed-Oxide Fuel as an Accident Tolerant Fuel

Authors: Mohamed Y. M. Mohsen

Abstract:

The viability of utilising mixed-oxide fuel (MOX) ((U₀.₉, rgPu₀.₁) O₂) as an accident-tolerant fuel (ATF) has been thoroughly investigated. MOX fuel provides the best example of a nuclear waste recycling process. The MCNPX 2.7 code was used to determine the main neutronic features, especially the radial power distribution, to identify the hot channel on which the thermal-hydraulic (TH) study was performed. Based on the computational fluid dynamics technique, the simulation of the rod-centered thermal-hydraulic subchannel model was implemented using COMSOL Multiphysics. TH analysis was utilised to determine the axially and radially distributed temperatures of the fuel and cladding materials, as well as the departure from nucleate boiling ratio (DNBR) along the coolant channel. COMSOL Multiphysics can couple multiple physics, such as heat transfer and solid mechanics, in a single simulation. The main solid structure parameters, such as the von Mises stress, volumetric strain, and displacement, were simulated using this coupling. When the neutronic, TH, and solid structure performances of UO₂ and ((U₀.₉, rgPu₀.₁) O₂) were compared, the results showed considerable improvement and an increase in safety margins with the use of ((U₀.₉, rgPu₀.₁) O₂).

Keywords: mixed-oxide, MCNPX, neutronic analysis, COMSOL-multiphysics, thermal-hydraulic, solid structure

Procedia PDF Downloads 69
3456 Agile Software Effort Estimation Using Regression Techniques

Authors: Mikiyas Adugna

Abstract:

Effort estimation is among the activities carried out in software development processes. An accurate estimation model contributes to project success. Agile effort estimation is a complex task because of the dynamic nature of software development. Researchers are still conducting studies on agile effort estimation to enhance prediction accuracy. For these reasons, we investigated and proposed a model based on LASSO and Elastic Net regression to enhance estimation accuracy. The proposed model has four major components: preprocessing, train-test split, training with default parameters, and cross-validation. During the preprocessing phase, the entire dataset is normalized. After normalization, a train-test split is performed on the dataset, with 80% used for training and 20% for testing. We chose two different phases for training the two regression algorithms (Elastic Net and LASSO) following the train-test split. In the first phase, the two algorithms are trained using their default parameters and evaluated on the testing data. In the second phase, grid search (used to tune and select optimum parameters) and 5-fold cross-validation are applied to obtain the final trained model. Finally, the final trained model is evaluated using the testing set. The experimental work is applied to the agile story point dataset of 21 software projects collected from six firms. The results show that both Elastic Net and LASSO regression outperformed the compared methods. Of the two proposed algorithms, LASSO regression achieved better predictive performance, with PRED (8%) and PRED (25%) results of 100.0, MMRE of 0.0491, MMER of 0.0551, MdMRE of 0.0593, MdMER of 0.063, and MSE of 0.0007. The results imply that the trained LASSO regression model is the most acceptable and offers higher estimation performance than the models in the existing literature.
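The evaluation metrics quoted in the abstract can be sketched as follows. The abstract does not define them, so the code assumes the definitions commonly used in effort-estimation work: MRE is the absolute error relative to the actual value, MER is the absolute error relative to the estimate, and PRED(x) is the percentage of estimates whose MRE is at most x. Function names and sample numbers are hypothetical, not taken from the paper's dataset.

```python
from statistics import mean, median


def mre(actual, predicted):
    """Magnitude of relative error, relative to the actual effort."""
    return abs(actual - predicted) / actual


def mer(actual, predicted):
    """Magnitude of error relative to the estimate."""
    return abs(actual - predicted) / predicted


def mmre(actuals, preds):
    """Mean MRE over all projects."""
    return mean(mre(a, p) for a, p in zip(actuals, preds))


def mdmre(actuals, preds):
    """Median MRE over all projects."""
    return median(mre(a, p) for a, p in zip(actuals, preds))


def mmer(actuals, preds):
    """Mean MER over all projects."""
    return mean(mer(a, p) for a, p in zip(actuals, preds))


def pred(actuals, preds, level):
    """Percentage of estimates whose MRE is at most `level` (e.g. 0.25)."""
    hits = sum(1 for a, p in zip(actuals, preds) if mre(a, p) <= level)
    return 100.0 * hits / len(actuals)


# Made-up story-point efforts, for illustration only.
actual_effort = [10.0, 20.0, 40.0, 80.0]
estimated_effort = [11.0, 18.0, 40.0, 100.0]
print(mmre(actual_effort, estimated_effort))
print(pred(actual_effort, estimated_effort, 0.25))
```

Lower MMRE, MMER, MdMRE, and MSE values indicate better estimates, while a higher PRED value is better; a PRED (25%) of 100.0 means every estimate fell within 25% of the actual effort.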

Keywords: agile software development, effort estimation, elastic net regression, LASSO

Procedia PDF Downloads 26
3455 Deep Learning for Qualitative and Quantitative Grain Quality Analysis Using Hyperspectral Imaging

Authors: Ole-Christian Galbo Engstrøm, Erik Schou Dreier, Birthe Møller Jespersen, Kim Steenstrup Pedersen

Abstract:

Grain quality analysis is a multi-parameterized problem that includes a variety of qualitative and quantitative parameters such as grain type classification, damage type classification, and nutrient regression. Currently, these parameters require human inspection, a multitude of instruments employing a variety of sensor technologies, and predictive model types or destructive and slow chemical analysis. This paper investigates the feasibility of applying near-infrared hyperspectral imaging (NIR-HSI) to grain quality analysis. For this study two datasets of NIR hyperspectral images in the wavelength range of 900 nm - 1700 nm have been used. Both datasets contain images of sparsely and densely packed grain kernels. The first dataset contains ~87,000 image crops of bulk wheat samples from 63 harvests where protein value has been determined by the FOSS Infratec NOVA which is the golden industry standard for protein content estimation in bulk samples of cereal grain. The second dataset consists of ~28,000 image crops of bulk grain kernels from seven different wheat varieties and a single rye variety. In the first dataset, protein regression analysis is the problem to solve while variety classification analysis is the problem to solve in the second dataset. Deep convolutional neural networks (CNNs) have the potential to utilize spatio-spectral correlations within a hyperspectral image to simultaneously estimate the qualitative and quantitative parameters. CNNs can autonomously derive meaningful representations of the input data reducing the need for advanced preprocessing techniques required for classical chemometric model types such as artificial neural networks (ANNs) and partial least-squares regression (PLS-R). A comparison between different CNN architectures utilizing 2D and 3D convolution is conducted. These results are compared to the performance of ANNs and PLS-R. Additionally, a variety of preprocessing techniques from image analysis and chemometrics are tested. 
These include centering, scaling, standard normal variate (SNV), Savitzky-Golay (SG) filtering, and detrending. The results indicate that the combination of NIR-HSI and CNNs has the potential to be the foundation for an automatic system unifying qualitative and quantitative grain quality analysis within a single sensor technology and predictive model type.
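Of the preprocessing techniques listed, standard normal variate (SNV) is the simplest to illustrate: each spectrum is centered on its own mean and divided by its own standard deviation. The sketch below is a minimal pure-Python version with a hypothetical function name. It assumes the population standard deviation; some implementations use the sample standard deviation instead, and the abstract does not say which the authors used.

```python
from statistics import mean, pstdev


def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own statistics.

    ASSUMPTION: population standard deviation (pstdev) is used for scaling.
    """
    mu = mean(spectrum)
    sigma = pstdev(spectrum)
    if sigma == 0:
        raise ValueError("SNV is undefined for a constant spectrum")
    return [(value - mu) / sigma for value in spectrum]


# Made-up reflectance values for a single pixel, for illustration only.
corrected = snv([0.42, 0.45, 0.51, 0.48])
print(corrected)
```

Because the statistics are computed per spectrum rather than per wavelength, SNV removes offset and scaling differences between individual spectra, which is a different operation from the dataset-wide centering and scaling also mentioned above.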

Keywords: deep learning, grain analysis, hyperspectral imaging, preprocessing techniques

Procedia PDF Downloads 70
3454 Mixed Integer Programing for Multi-Tier Rebate with Discontinuous Cost Function

Authors: Y. Long, L. Liu, K. V. Branin

Abstract:

One challenge faced by procurement decision-makers during the acquisition process is how to compare similar products from different suppliers and allocate orders among different products or services. This work focuses on allocating orders among multiple suppliers while considering rebates. The objective is to minimize the total acquisition cost, which includes the purchasing cost and the rebate benefit. The rebate benefit is complex and difficult to estimate at the ordering step. Rebate rules vary among suppliers and usually change over time. In this work, we developed a system to collect the rebate policies, standardized the rebate policies, and developed two-stage optimization models for order allocation. Multi-tier rebate policies are considered in the modeling. The discontinuous cost function of the rebate benefit is formulated for different scenarios. A piecewise linear function is used to approximate the discontinuous cost function of the rebate benefit, and a Mixed Integer Programing (MIP) model is built for the order allocation problem with multi-tier rebates. A case study is presented, and it shows that our optimization model can reduce the total acquisition cost by considering rebate rules.
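A small sketch shows why a multi-tier rebate makes the cost function discontinuous. The tiers below are hypothetical, and the sketch assumes the rebate rate of the highest tier reached applies to the whole spend; the abstract does not state the actual rules, which vary by supplier.

```python
# Hypothetical tiers: (minimum spend, rebate rate applied to the whole spend).
TIERS = [(0.0, 0.00), (1000.0, 0.02), (5000.0, 0.05)]


def rebate_rate(spend, tiers=TIERS):
    """Return the rate of the highest tier whose threshold the spend reaches."""
    rate = 0.0
    for threshold, tier_rate in tiers:
        if spend >= threshold:
            rate = tier_rate
    return rate


def net_cost(quantity, unit_price, tiers=TIERS):
    """Purchasing cost minus the rebate benefit for one supplier."""
    spend = quantity * unit_price
    return spend * (1.0 - rebate_rate(spend, tiers))


# Crossing a tier threshold makes the net cost jump downwards.
print(net_cost(99, 10.0), net_cost(100, 10.0))
```

Under these assumed tiers, ordering 100 units costs less in total than ordering 99, because the larger order reaches the 2% tier. Jumps of this kind are what a MIP formulation typically captures with one binary variable per tier.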

Keywords: discontinuous cost function, mixed integer programming, optimization, procurement, rebate

Procedia PDF Downloads 232
3453 Calibration of Mini TEPC and Measurement of Lineal Energy in a Mixed Radiation Field Produced by Neutrons

Authors: I. C. Cho, W. H. Wen, H. Y. Tsai, T. C. Chao, C. J. Tung

Abstract:

The tissue-equivalent proportional counter (TEPC) is a useful instrument for measuring single-event radiation energy depositions in a subcellular target volume. The measured quantity is the microdosimetric lineal energy, which determines the relative biological effectiveness, RBE, for radiation therapy or the radiation-weighting factor, WR, for radiation protection. TEPC is generally used in a mixed radiation field, where each component radiation has its own RBE or WR value. To reduce the pile-up effect during radiotherapy measurements, a miniature TEPC (mini TEPC) with a cavity size on the order of 1 mm may be required. In the present work, a homemade mini TEPC with a cylindrical cavity of 1 mm in both diameter and height was constructed to measure the lineal energy spectrum of a mixed radiation field with high- and low-LET radiations. Instead of using external radiation beams to penetrate the detector wall, mixed radiation fields were produced by the interactions of neutrons with TEPC walls that contained small plugs of different materials, i.e. Li, B, A150, Cd and N. In all measurements, the mini TEPC was placed at the beam port of the Tsing Hua Open-pool Reactor (THOR). Measurements were performed using the propane-based tissue-equivalent gas mixture, i.e. 55% C₃H₈, 39.6% CO₂ and 5.4% N₂ by partial pressures. A gas pressure of 422 torr was applied to simulate a 1 µm diameter biological site. The calibration of the mini TEPC was performed using two marking points in the lineal energy spectrum, i.e. the proton edge and the electron edge. Measured spectra revealed high lineal energy (> 100 keV/µm) peaks due to neutron-capture products, medium lineal energy (10 – 100 keV/µm) peaks from hydrogen-recoil protons, and low lineal energy (< 10 keV/µm) peaks of reactor photons. For cases of Li and B plugs, the high lineal energy peaks were quite prominent. The medium lineal energy peaks were in the decreasing order of Li, Cd, N, A150, and B. 
The low lineal energy peaks were smaller compared to other peaks. This study demonstrated that internally produced mixed radiations from the interactions of neutrons with different plugs in the TEPC wall provided a useful approach for TEPC measurements of lineal energies.

Keywords: TEPC, lineal energy, microdosimetry, radiation quality

Procedia PDF Downloads 449
3452 Early Recognition and Grading of Cataract Using a Combined Log Gabor/Discrete Wavelet Transform with ANN and SVM

Authors: Hadeer R. M. Tawfik, Rania A. K. Birry, Amani A. Saad

Abstract:

The eye is considered to be the most sensitive and important organ for human beings. Thus, any eye disorder will affect the patient in all aspects of life. Cataract is one of those eye disorders that lead to blindness if not treated correctly and quickly. This paper demonstrates a model for automatic detection, classification, and grading of cataracts based on image processing techniques and artificial intelligence. The proposed system is developed to ease the cataract diagnosis process for both ophthalmologists and patients. The wavelet transform combined with the 2D Log Gabor wavelet transform was used for feature extraction on a dataset of 120 eye images, followed by a classification process that classified the image set into three classes: normal, early, and advanced stage. A comparison between the two classifiers used, the support vector machine (SVM) and the artificial neural network (ANN), was made on the same dataset of 120 eye images. It was concluded that SVM gave better results than ANN. The SVM achieved 96.8% accuracy, whereas the ANN achieved 92.3% accuracy.

Keywords: cataract, classification, detection, feature extraction, grading, log-gabor, neural networks, support vector machines, wavelet

Procedia PDF Downloads 297
3451 Solid-Liquid-Polymer Mixed Matrix Membrane Using Liquid Additive Adsorbed on Activated Carbon Dispersed in Polymeric Membrane for CO2/CH4 Separation

Authors: P. Chultheera, T. Rirksomboon, S. Kulprathipanja, C. Liu, W. Chinsirikul, N. Kerddonfag

Abstract:

Gas separation by selective transport through polymeric membranes is one of the rapidly growing branches of membrane technology. However, the tradeoff between permeability and selectivity is one of the critical challenges encountered by pure polymer membranes, which in turn limits their large-scale application. To enhance gas separation performance, mixed matrix membranes (MMMs) have been developed. In this study, MMMs were prepared by a solution-coating method and tested for CO₂/CH₄ separation in terms of permeability and selectivity using a membrane testing unit at room temperature and a pressure of 100 psig. The fabricated MMMs were composed of silicone rubber dispersed with activated carbon on which polyethylene glycol (PEG) had been adsorbed as a liquid additive. PEG-emulsified silicone rubber MMMs on a cellulose acetate membrane showed superior gas separation, with both high permeability and selectivity, compared with the silicone rubber membrane and the support membrane alone. However, these MMMs showed limited stability resulting from undesirable PEG leakage. To stabilize the MMMs, PEG was then incorporated into activated carbon by adsorption. It was found that the incorporation of solid and liquid was effective in improving the separation performance of MMMs.

Keywords: mixed matrix membrane, membrane, CO₂/CH₄ separation, activated carbon

Procedia PDF Downloads 304
3450 Investigating the Effectiveness of Multilingual NLP Models for Sentiment Analysis

Authors: Othmane Touri, Sanaa El Filali, El Habib Benlahmar

Abstract:

Natural Language Processing (NLP) has gained significant attention lately. It has proved its ability to analyze and extract insights from unstructured text data in various languages. One of the most popular NLP applications is sentiment analysis, which aims to identify the sentiment expressed in a piece of text, such as positive, negative, or neutral, in multiple languages. While there are several multilingual NLP models available for sentiment analysis, there is a need to investigate their effectiveness in different contexts and applications. In this study, we investigate the effectiveness of different multilingual NLP models for sentiment analysis on a dataset of online product reviews in multiple languages. The performance of several NLP models, including Google Cloud Natural Language API, Microsoft Azure Cognitive Services, Amazon Comprehend, Stanford CoreNLP, spaCy, and Hugging Face Transformers, is compared. The models are evaluated on several metrics, including accuracy, precision, recall, and F1 score, and their performance is compared across different categories of product reviews. To run the study, the dataset was preprocessed by cleaning and tokenizing the text data in multiple languages. Each model was then trained and tested using a cross-validation approach in which the dataset was randomly divided into training and testing sets and the process was repeated multiple times. A grid search approach was applied to optimize the hyperparameters of each model and to select the best-performing model for each category of product reviews and language. The findings of this study provide insights into the effectiveness of different multilingual NLP models for multilingual sentiment analysis and their suitability for different languages and applications. 
The strengths and limitations of each model were identified, and recommendations for selecting the most performant model based on the specific requirements of a project were provided. This study contributes to the advancement of research methods in multilingual NLP and provides a practical guide for researchers and practitioners in the field.

Keywords: NLP, multilingual, sentiment analysis, texts

Procedia PDF Downloads 59
3449 Enhanced Extra Trees Classifier for Epileptic Seizure Prediction

Authors: Maurice Ntahobari, Levin Kuhlmann, Mario Boley, Zhinoos Razavi Hesabi

Abstract:

For machine learning based epileptic seizure prediction, it is important for the model to be implemented in small implantable or wearable devices that can be used to monitor epilepsy patients; however, current state-of-the-art methods are complex and computationally intensive. We use Shapley Additive Explanation (SHAP) to find relevant intracranial electroencephalogram (iEEG) features and improve the computational efficiency of a state-of-the-art seizure prediction method based on the extra trees classifier while maintaining prediction performance. Results for a small contest dataset and a much larger dataset with continuous recordings of up to 3 years per patient from 15 patients yield better than chance prediction performance (p < 0.004). Moreover, while the performance of the SHAP-based model is comparable to that of the benchmark, the overall training and prediction time of the model has been reduced by a factor of 1.83. It can also be noted that the feature called zero crossing value is the best EEG feature for seizure prediction. These results suggest state-of-the-art seizure prediction performance can be achieved using efficient methods based on optimal feature selection.

Keywords: machine learning, seizure prediction, extra tree classifier, SHAP, epilepsy

Procedia PDF Downloads 81
3448 Short Text Classification for Saudi Tweets

Authors: Asma A. Alsufyani, Maram A. Alharthi, Maha J. Althobaiti, Manal S. Alharthi, Huda Rizq

Abstract:

Twitter is one of the most popular microblogging sites that allows users to publish short text messages called 'tweets'. Increasing the number of accounts to follow (followings) increases the number of tweets that will be displayed from different topics in an unclassified manner in the timeline of the user. Therefore, it can be a vital solution for many Twitter users to have their tweets in a timeline classified into general categories to save the user’s time and to provide easy and quick access to tweets based on topics. In this paper, we developed a classifier for timeline tweets trained on a dataset consisting of 3600 tweets in total, which were collected from Saudi Twitter and annotated manually. We experimented with the well-known Bag-of-Words approach to text classification, and we used support vector machines (SVM) in the training process. The trained classifier performed well on a test dataset, with an average F1-measure equal to 92.3%. The classifier has been integrated into an application, which practically proved the classifier’s ability to classify timeline tweets of the user.
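
The Bag-of-Words representation mentioned in this abstract can be sketched in a few lines. The sketch below is an illustration only, not the authors' pipeline: whitespace tokenization and the function names are assumptions, and the SVM training step is deliberately left out.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a simplifying assumption)."""
    return text.lower().split()


def build_vocabulary(documents):
    """Map every distinct token to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}


def bag_of_words(text, vocabulary):
    """Return a count vector; tokens outside the vocabulary are ignored."""
    counts = Counter(tokenize(text))
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = n
    return vector
```

Each tweet becomes a fixed-length count vector over the training vocabulary; vectors of this kind are what a classifier such as an SVM would take as input.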

Keywords: corpus creation, feature extraction, machine learning, short text classification, social media, support vector machine, Twitter

Procedia PDF Downloads 125
3447 An Integrated Mixed-Integer Programming Model to Address Concurrent Project Scheduling and Material Ordering

Authors: Babak H. Tabrizi, Seyed Farid Ghaderi

Abstract:

Concurrent planning of project scheduling and material ordering can provide more flexibility to the project scheduling problem, as the project execution costs can be improved. Hence, the issue is addressed in this paper. To do so, a mixed-integer mathematical model is developed which considers the aforementioned flexibility, in addition to material quantity discounts and space availability restrictions. Moreover, the activity durations are treated as decision variables. Finally, the efficiency of the proposed model is tested on different instances, and the influence of the aforementioned parameters on the model performance is investigated.

Keywords: material ordering, project scheduling, quantity discount, space availability

Procedia PDF Downloads 340
3446 Effect of Carbon Nanotubes on Nanocomposite from Nanofibrillated Cellulose

Authors: M. Z. Shazana, R. Rosazley, M. A. Izzati, A. W. Fareezal, I. Rushdan, A. B. Suriani, S. Zakaria

Abstract:

There is increasing interest in the development of flexible energy storage based on carbon nanotubes (CNT) and nanofibrillated cellulose (NFC). In this study, the nanocomposite consists of CNT mixed with a suspension of NFC from Oil Palm Empty Fruit Bunch (OPEFB). The use of CNT as an additive improved the conductivity and mechanical properties of the NFC nanocomposite. The nanocomposites were characterized for electrical conductivity and for mechanical properties in uniaxial tension, with tensile tests used to measure the bonding of fibers in the nanocomposite. The processing route is environmentally friendly and leads to well-mixed structures and good results.

Keywords: carbon nanotube (CNT), nanofibrillated cellulose (NFC), mechanical properties, electrical conductivity

Procedia PDF Downloads 301
3445 Stochastic Modelling for Mixed Mode Fatigue Delamination Growth of Wind Turbine Composite Blades

Authors: Chi Zhang, Hua-Peng Chen

Abstract:

With the increasing demand for resources in the world, renewable and clean energy has been considered as an alternative to traditional sources. One practical example of using wind energy is the wind turbine, which has gained more attention in recent research. Like most offshore structures, the blades, which are the most critical components of the wind turbine, will be subjected to millions of loading cycles during their service life. To operate safely in marine environments, the blades are typically made from fibre reinforced composite materials to resist fatigue delamination and harsh environments. The fatigue crack development of blades is uncertain because of the indeterminate mechanical properties of composites and uncertainties in the offshore environment, such as wave loads, wind loads, and humidity. There are three main delamination failure modes for composite blades, and the most common failure type in practice occurs under mixed mode loading, typically a combination of opening (mode 1) and shear (mode 2). However, fatigue crack development under mixed mode loading cannot be predicted as deterministic values because of various uncertainties in realistic practical situations. Therefore, selecting an effective stochastic model to evaluate the mixed mode behaviour of wind turbine blades is a critical issue. In previous studies, the gamma process has been considered an appropriate stochastic approach, since it simulates a stochastic deterioration process that proceeds in one direction, as is the case for fatigue damage of wind turbine blades. On the basis of existing studies, various Paris law equations are discussed to simulate the propagation of fatigue cracks. This paper develops a Paris model with stochastic deterioration modelling based on the gamma process for predicting fatigue crack performance over the design service life. 
A numerical example of wind turbine composite materials is investigated to predict the mixed mode crack depth by the Paris law and the probability of fatigue failure by the gamma process. The probability of failure curves under different situations are obtained from the stochastic deterioration model for comparison. Compared with the results from experiments, the gamma process can take uncertain values into consideration for mixed mode crack propagation, and the stochastic deterioration process agrees well with the realistic crack process for composite blades. Finally, according to the predicted results from the gamma stochastic model, assessment strategies for composite blades are developed to reduce total lifecycle costs and increase resistance to fatigue crack growth.
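
The two ingredients named in this abstract, Paris-law crack growth and a gamma deterioration process, can be sketched as follows. This is a simplified illustration and not the paper's model. It assumes the standard single-mode Paris law, da/dN = C (ΔK)^m with ΔK = Y Δσ sqrt(π a), rather than a mixed mode formulation. It also assumes a stationary gamma process with a Monte Carlo estimate of the failure probability. All function names, parameters, and units are hypothetical.

```python
import math
import random


def paris_crack_growth(a0, cycles, step, C, m, delta_sigma, Y=1.0):
    """Integrate da/dN = C * (dK)^m with dK = Y * delta_sigma * sqrt(pi * a)."""
    a = a0
    history = [a]
    for _ in range(0, cycles, step):
        delta_k = Y * delta_sigma * math.sqrt(math.pi * a)
        a += C * delta_k ** m * step  # explicit Euler step over `step` cycles
        history.append(a)
    return history


def gamma_process_path(n_steps, shape_per_step, scale, rng):
    """Cumulative sum of independent gamma increments: a non-decreasing path."""
    level = 0.0
    path = [level]
    for _ in range(n_steps):
        level += rng.gammavariate(shape_per_step, scale)
        path.append(level)
    return path


def failure_probability(n_steps, shape_per_step, scale, threshold,
                        n_paths=2000, seed=0):
    """Monte Carlo estimate of P(deterioration after n_steps >= threshold)."""
    rng = random.Random(seed)
    failures = sum(
        1 for _ in range(n_paths)
        if gamma_process_path(n_steps, shape_per_step, scale, rng)[-1] >= threshold
    )
    return failures / n_paths
```

Because gamma increments are never negative, every simulated path moves in one direction only, which is the property the abstract cites as the reason the gamma process suits fatigue damage.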

Keywords: reinforced fibre composite, wind turbine blades, fatigue delamination, mixed failure mode, stochastic process

Procedia PDF Downloads 383
3444 Meta-Learning for Hierarchical Classification and Applications in Bioinformatics

Authors: Fabio Fabris, Alex A. Freitas

Abstract:

Hierarchical classification is a special type of classification task where the class labels are organised into a hierarchy, with more generic class labels being ancestors of more specific ones. Meta-learning for classification-algorithm recommendation consists of recommending to the user a classification algorithm, from a pool of candidate algorithms, for a dataset, based on the past performance of the candidate algorithms on other datasets. Meta-learning is normally used in conventional, non-hierarchical classification. By contrast, this paper proposes a meta-learning approach for the more challenging task of hierarchical classification, and evaluates it on a large number of bioinformatics datasets. Hierarchical classification is especially relevant for bioinformatics problems, as protein and gene functions tend to be organised into a hierarchy of class labels. This work proposes a meta-learning approach for recommending the best hierarchical classification algorithm for a hierarchical classification dataset. This work’s contributions are: 1) proposing an algorithm for splitting hierarchical datasets into new datasets to increase the number of meta-instances, 2) proposing meta-features for hierarchical classification, and 3) interpreting decision-tree meta-models for hierarchical classification algorithm recommendation.

Keywords: algorithm recommendation, meta-learning, bioinformatics, hierarchical classification

Procedia PDF Downloads 281
3443 Supplier Selection by Considering Cost and Reliability

Authors: K. -H. Yang

Abstract:

The supplier selection problem is one of the important issues among supply chain problems. Two categories of methodologies, qualitative and quantitative approaches, can be applied to supplier selection problems. However, due to the complexities of the problem and the lack of reliable quantitative data, qualitative approaches are more common than quantitative approaches. This study considers operational cost and the supplier’s reliability factor and solves the problem by using a quantitative approach. A mixed integer programming model is the primary analytic tool. Analyses of different scenarios with variable cost and reliability structures show the effectiveness of this approach to the supplier selection problem.

Keywords: mixed integer programming, quantitative approach, supplier’s reliability, supplier selection

Procedia PDF Downloads 345
3442 Structural and Magnetic Properties of Calcium Mixed Ferrites Prepared by Co-Precipitation Method

Authors: Sijo S. Thomas, S. Hridya, Manoj Mohan, Bibin Jacob, Hysen Thomas

Abstract:

Ferrites are iron-based oxides with technologically significant magnetic properties and have widespread applications in medicine, technology, and industry. There has been a growing interest in the study of magnetic, electrical and structural properties of mixed ferrites. In the present work, structural and magnetic properties of Nickel and Calcium substituted Fe₃O₄ nanoparticles were investigated. NiₓCa₁₋ₓFe₂O₄ nanoparticles (x = 0, 0.1, 0.3, 0.5, 0.7, 0.9) were synthesized by the chemical co-precipitation method and the samples were subsequently sintered at 900°C. The magnetic and structural properties of NiₓCa₁₋ₓFe₂O₄ were investigated using a Vibrating Sample Magnetometer (VSM) and X-ray diffraction (XRD). The XRD results revealed that the synthesized particles are nanometer-sized, with the size varying from 46 to 72 nm as the calcium concentration diminishes. The variation is explained based on the increase in the reaction rate with Ni concentration, which favors the formation of ultrafine particles of mixed ferrites. VSM results show that pure CaFe₂O₄ exhibits paramagnetic behavior with a low saturation value. As the concentration of Ca decreases, a transition occurs from the paramagnetic state to the ferromagnetic state. When the concentration of Ni becomes dominant, magnetic saturation, coercivity, and retentivity become high, indicating near ferromagnetic behavior of the compound.

Keywords: co-precipitation, ferrites, magnetic behavior, structure

Procedia PDF Downloads 213
3441 A Reflection of the Contemporary Life of Urban People Through Mixed Media Art

Authors: Van Huong Mai, Kanokwan Nithiratphat, Adool Booncham

Abstract:

The Movement of Contemporary Life had two purposes: to study the movement and development of modern life, and to create visual artworks, namely paintings in the form of apartment buildings made with mixed media (digital printing and acrylic painting on canvas), which convey the rapid pace of modern life and evoke diverse movements in the viewer’s feelings. The creative process drew on field data, documentary data, and influences from creative works. The data were analyzed in terms of theme, form, technique, and process to match the concept and special character of the pieces.

Keywords: movement, contemporary life, visual art, acrylic painting, digital art, urban space

Procedia PDF Downloads 68
3440 Thermal Radiation Effect on Mixed Convection Boundary Layer Flow over a Vertical Plate with Varying Density and Volumetric Expansion Coefficient

Authors: Sadia Siddiqa, Z. Khan, M. A. Hossain

Abstract:

In this article, the effect of thermal radiation on mixed convection boundary layer flow of a viscous fluid along a highly heated vertical flat plate is considered with varying density and volumetric expansion coefficient. The density of the fluid is assumed to vary exponentially with temperature, whereas the volumetric expansion coefficient depends linearly on temperature. The boundary layer equations are transformed into a convenient form by introducing primitive variable formulations. Solutions of the transformed system of equations are obtained numerically through an implicit finite difference method along with the Gaussian elimination technique. Results are discussed in terms of the effects of various parameters, such as the thermal radiation parameter, volumetric expansion parameter and density variation parameter, on the wall shear stress and heat transfer rate. It is concluded from the present investigation that an increase in the volumetric expansion parameter decreases the wall shear stress and enhances the heat transfer rate.

Keywords: thermal radiation, mixed convection, variable density, variable volumetric expansion coefficient

Procedia PDF Downloads 345
3439 Ruminal VFA of Beef Fed Different Protein

Authors: P. Paengkoum, S. C. Chen, S. Paengkoum

Abstract:

Six male growing Thai-indigenous beef cattle with body weight (BW) of 154±13.2 kg were randomly assigned in a replicated 3×3 Latin square design and fed different levels of crude protein (CP) in total mixed ration (TMR) diets. CP levels in the diets were 4.3%, 7.3% and 10.3% on a dry matter (DM) basis. Ruminal ammonia nitrogen (NH3-N) and blood urea nitrogen (BUN) concentrations increased (P<0.01) with increasing CP levels. Moreover, there was a positive relationship between BUN and ruminal NH3-N. Rumen pH, total volatile fatty acid (VFA), and molar proportions of acetate, propionate and butyrate were not affected by CP levels (P>0.05).

Keywords: Thai-indigenous beef cattle, crude protein, volatile fatty acid (VFA), total mixed ration (TMR) diets

Procedia PDF Downloads 251
3438 Seashore Debris Detection System Using Deep Learning and Histogram of Gradients-Extractor Based Instance Segmentation Model

Authors: Anshika Kankane, Dongshik Kang

Abstract:

Marine debris has a significant influence on coastal environments, damaging biodiversity and causing loss and damage to the marine and ocean sector. A functional, cost-effective and automatic approach is used to address this problem. Computer vision combined with a deep learning-based model is proposed to identify and categorize marine debris of seven kinds at different beach locations in Japan. This research compares state-of-the-art deep learning models with a suggested model architecture that is utilized as a feature extractor for debris categorization. The model is proposed to detect seven categories of litter using a manually constructed debris dataset, with the help of Mask R-CNN for instance segmentation and a shape matching network called HOGShape; the detected litter can then be cleaned up on time by clean-up organizations using the system's warning notifications. The manually constructed dataset for this system is created by annotating images taken by a fixed KaKaXi camera using the CVAT annotation tool with seven kinds of category labels. A pre-trained HOG feature extractor on LIBSVM is used along with multiple template matching on HOG maps of images and HOG maps of templates to improve the predicted masked images obtained via Mask R-CNN training. This system intends to alert the clean-up organizations in a timely manner with warning notifications based on live recorded beach debris data. The suggested network improves the misclassified masks of debris objects with different illuminations, shapes and viewpoints, and of litter with occlusions that have vague visibility.

Keywords: computer vision, debris, deep learning, fixed live camera images, histogram of gradients feature extractor, instance segmentation, manually annotated dataset, multiple template matching

Procedia PDF Downloads 70
3437 Different Motor Inhibition Processes in Action Selection Stage: A Study with Spatial Stroop Paradigm

Authors: German Galvez-Garcia, Javier Albayay, Javiera Peña, Marta Lavin, George A. Michael

Abstract:

The aim of this research was to investigate whether the selection of actions needs different inhibition processes during the response selection stage. In Experiment 1, we compared the magnitude of the Spatial Stroop effect, which occurs in the response selection stage, in two motor actions (lifting vs. reaching) when the participants performed both actions in the same block or in different blocks (mixed block vs. pure blocks). Within pure blocks, we obtained faster latencies when lifting actions were performed, but no differences in the magnitude of the Spatial Stroop effect were observed. Within the mixed block, we obtained faster latencies as well as a larger Spatial Stroop effect when reaching actions were performed. We concluded that when no action selection is required (the pure blocks condition), inhibition works as a unitary system, whereas in the mixed block condition, where action selection is required, different inhibitory processes take place within a common processing stage. In Experiment 2, we investigated this common processing stage in depth by limiting participants’ available resources, requiring them to engage in a concurrent auditory task within a mixed block condition. The Spatial Stroop effect interacted with Movement as it did in Experiment 1, but it did not significantly interact with available resources (Auditory task x Spatial Stroop effect x Movement interaction). Thus, we concluded that available resources are distributed equally to both inhibition processes; this reinforces the likelihood of there being a common processing stage in which the different inhibitory processes take place.

Keywords: inhibition process, motor processes, selective inhibition, dual task

Procedia PDF Downloads 359
3436 The Flexural Strength of Fiber-Reinforced Polymer Cement Mortars Using UM Resin

Authors: Min Ho Kwon, Woo Young Jung, Hyun Su Seo

Abstract:

Polymer Cement Mortar (PCM) has been widely used as a material for repair and restoration work on concrete structures; however, PCM usually introduces environmental pollutants. Therefore, there is a need to develop a PCM that has less impact on the environment. UM resin is generally known to be harmless to the environment. Accordingly, in this paper, the properties of PCM using UM resin were studied. General cement mortar and UM resin were mixed in the specified ratio. A certain percentage of PVA fibers, steel fibers and mixed fibers (PVA fiber and steel fiber) were added to enhance the flexural strength. Flexural tests were performed to investigate the flexural strength of each PCM. Experimental results showed that the strength of the proposed PCM using UM resin is improved compared with general cement mortar.

Keywords: polymer cement mortar, UM resin, compressive strength, PVA fiber, steel fiber

Procedia PDF Downloads 309
3435 Contact Phenomena in Medieval Business Texts

Authors: Carmela Perta

Abstract:

Among the studies that have flourished in the field of historical sociolinguistics, mainly in the strand devoted to the history of English during its medieval and early modern phases, multilingual texts have been analysed using theories and models from contact linguistics, thus applying synchronic models and approaches to the past. This is also true of contact phenomena that transcend the level of writing and involve the language systems implicated in contact processes to the point that a new variety is perceived. This is the case for medieval administrative-commercial texts in which, according to some scholars, the degree of fusion of Anglo-Norman, Latin and Middle English is so high that a mixed code emerges, with recurrent patterns of mixed forms. Of interest is a collection of multilingual business writings by John Balmayn, an Englishman overseeing a large shipment in Tuscany, namely the Cantelowe accounts. These documents display various analogies with multilingual texts written in England in the same period; in fact, the writer seems to make use of the above-mentioned patterns, with Middle English, Latin, Anglo-Norman, and the newly added Italian. Applying an atomistic yet dynamic approach to the study of contact phenomena, we will investigate these documents, trying to explore the nature of the switching forms they contain from an intra-writer variation perspective. After analysing the accounts and the type of multilingualism in them, we will take stock of the assumed mixed code nature, comparing the characteristics found in this genre with modern assumptions. The aim is to evaluate whether the switching forms can be considered core elements of a mixed code, used as a professional variety among merchant communities, or whether such texts should be analysed from a switching perspective.

Keywords: historical sociolinguistics, historical code switching, letters, medieval England

Procedia PDF Downloads 46