Search results for: pretrained model
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 16789

Search results for: pretrained model

16789 Camera Model Identification for Mi Pad 4, Oppo A37f, Samsung M20, and Oppo f9

Authors: Ulrich Wake, Eniman Syamsuddin

Abstract:

The model for camera model identificaiton is trained using pretrained model ResNet43 and ResNet50. The dataset consists of 500 photos of each phone. Dataset is divided into 1280 photos for training, 320 photos for validation and 400 photos for testing. The model is trained using One Cycle Policy Method and tested using Test-Time Augmentation. Furthermore, the model is trained for 50 epoch using regularization such as drop out and early stopping. The result is 90% accuracy for validation set and above 85% for Test-Time Augmentation using ResNet50. Every model is also trained by slightly updating the pretrained model’s weights

Keywords: ​ One Cycle Policy, ResNet34, ResNet50, Test-Time Agumentation

Procedia PDF Downloads 207
16788 JaCoText: A Pretrained Model for Java Code-Text Generation

Authors: Jessica Lopez Espejel, Mahaman Sanoussi Yahaya Alassan, Walid Dahhane, El Hassane Ettifouri

Abstract:

Pretrained transformer-based models have shown high performance in natural language generation tasks. However, a new wave of interest has surged: automatic programming language code generation. This task consists of translating natural language instructions to a source code. Despite the fact that well-known pre-trained models on language generation have achieved good performance in learning programming languages, effort is still needed in automatic code generation. In this paper, we introduce JaCoText, a model based on Transformer neural network. It aims to generate java source code from natural language text. JaCoText leverages the advantages of both natural language and code generation models. More specifically, we study some findings from state of the art and use them to (1) initialize our model from powerful pre-trained models, (2) explore additional pretraining on our java dataset, (3) lead experiments combining the unimodal and bimodal data in training, and (4) scale the input and output length during the fine-tuning of the model. Conducted experiments on CONCODE dataset show that JaCoText achieves new state-of-the-art results.

Keywords: java code generation, natural language processing, sequence-to-sequence models, transformer neural networks

Procedia PDF Downloads 283
16787 A Lightweight Pretrained Encrypted Traffic Classification Method with Squeeze-and-Excitation Block and Sharpness-Aware Optimization

Authors: Zhiyan Meng, Dan Liu, Jintao Meng

Abstract:

Dependable encrypted traffic classification is crucial for improving cybersecurity and handling the growing amount of data. Large language models have shown that learning from large datasets can be effective, making pre-trained methods for encrypted traffic classification popular. However, attention-based pre-trained methods face two main issues: their large neural parameters are not suitable for low-computation environments like mobile devices and real-time applications, and they often overfit by getting stuck in local minima. To address these issues, we developed a lightweight transformer model, which reduces the computational parameters through lightweight vocabulary construction and Squeeze-and-Excitation Block. We use sharpness-aware optimization to avoid local minima during pre-training and capture temporal features with relative positional embeddings. Our approach keeps the model's classification accuracy high for downstream tasks. We conducted experiments on four datasets -USTC-TFC2016, VPN 2016, Tor 2016, and CICIOT 2022. Even with fewer than 18 million parameters, our method achieves classification results similar to methods with ten times as many parameters.

Keywords: sharpness-aware optimization, encrypted traffic classification, squeeze-and-excitation block, pretrained model

Procedia PDF Downloads 29
16786 Evaluation of Modern Natural Language Processing Techniques via Measuring a Company's Public Perception

Authors: Burak Oksuzoglu, Savas Yildirim, Ferhat Kutlu

Abstract:

Opinion mining (OM) is one of the natural language processing (NLP) problems to determine the polarity of opinions, mostly represented on a positive-neutral-negative axis. The data for OM is usually collected from various social media platforms. In an era where social media has considerable control over companies’ futures, it’s worth understanding social media and taking actions accordingly. OM comes to the fore here as the scale of the discussion about companies increases, and it becomes unfeasible to gauge opinion on individual levels. Thus, the companies opt to automize this process by applying machine learning (ML) approaches to their data. For the last two decades, OM or sentiment analysis (SA) has been mainly performed by applying ML classification algorithms such as support vector machines (SVM) and Naïve Bayes to a bag of n-gram representations of textual data. With the advent of deep learning and its apparent success in NLP, traditional methods have become obsolete. Transfer learning paradigm that has been commonly used in computer vision (CV) problems started to shape NLP approaches and language models (LM) lately. This gave a sudden rise to the usage of the pretrained language model (PTM), which contains language representations that are obtained by training it on the large datasets using self-supervised learning objectives. The PTMs are further fine-tuned by a specialized downstream task dataset to produce efficient models for various NLP tasks such as OM, NER (Named-Entity Recognition), Question Answering (QA), and so forth. In this study, the traditional and modern NLP approaches have been evaluated for OM by using a sizable corpus belonging to a large private company containing about 76,000 comments in Turkish: SVM with a bag of n-grams, and two chosen pre-trained models, multilingual universal sentence encoder (MUSE) and bidirectional encoder representations from transformers (BERT). The MUSE model is a multilingual model that supports 16 languages, including Turkish, and it is based on convolutional neural networks. The BERT is a monolingual model in our case and transformers-based neural networks. It uses a masked language model and next sentence prediction tasks that allow the bidirectional training of the transformers. During the training phase of the architecture, pre-processing operations such as morphological parsing, stemming, and spelling correction was not used since the experiments showed that their contribution to the model performance was found insignificant even though Turkish is a highly agglutinative and inflective language. The results show that usage of deep learning methods with pre-trained models and fine-tuning achieve about 11% improvement over SVM for OM. The BERT model achieved around 94% prediction accuracy while the MUSE model achieved around 88% and SVM did around 83%. The MUSE multilingual model shows better results than SVM, but it still performs worse than the monolingual BERT model.

Keywords: BERT, MUSE, opinion mining, pretrained language model, SVM, Turkish

Procedia PDF Downloads 146
16785 A Generative Pretrained Transformer-Based Question-Answer Chatbot and Phantom-Less Quantitative Computed Tomography Bone Mineral Density Measurement System for Osteoporosis

Authors: Mian Huang, Chi Ma, Junyu Lin, William Lu

Abstract:

Introduction: Bone health attracts more attention recently and an intelligent question and answer (QA) chatbot for osteoporosis is helpful for science popularization. With Generative Pretrained Transformer (GPT) technology developing, we build an osteoporosis corpus dataset and then fine-tune LLaMA, a famous open-source GPT foundation large language model(LLM), on our self-constructed osteoporosis corpus. Evaluated by clinical orthopedic experts, our fine-tuned model outperforms vanilla LLaMA on osteoporosis QA task in Chinese. Three-dimensional quantitative computed tomography (QCT) measured bone mineral density (BMD) is considered as more accurate than DXA for BMD measurement in recent years. We develop an automatic Phantom-less QCT(PL-QCT) that is more efficient for BMD measurement since no need of an external phantom for calibration. Combined with LLM on osteoporosis, our PL-QCT provides efficient and accurate BMD measurement for our chatbot users. Material and Methods: We build an osteoporosis corpus containing about 30,000 Chinese literatures whose titles are related to osteoporosis. The whole process is done automatically, including crawling literatures in .pdf format, localizing text/figure/table region by layout segmentation algorithm and recognizing text by OCR algorithm. We train our model by continuous pre-training with Low-rank Adaptation (LoRA, rank=10) technology to adapt LLaMA-7B model to osteoporosis domain, whose basic principle is to mask the next word in the text and make the model predict that word. The loss function is defined as cross-entropy between the predicted and ground-truth word. Experiment is implemented on single NVIDIA A800 GPU for 15 days. Our automatic PL-QCT BMD measurement adopt AI-associated region-of-interest (ROI) generation algorithm for localizing vertebrae-parallel cylinder in cancellous bone. Due to no phantom for BMD calibration, we calculate ROI BMD by CT-BMD of personal muscle and fat. Results & Discussion: Clinical orthopaedic experts are invited to design 5 osteoporosis questions in Chinese, evaluating performance of vanilla LLaMA and our fine-tuned model. Our model outperforms LLaMA on over 80% of these questions, understanding ‘Expert Consensus on Osteoporosis’, ‘QCT for osteoporosis diagnosis’ and ‘Effect of age on osteoporosis’. Detailed results are shown in appendix. Future work may be done by training a larger LLM on the whole orthopaedics with more high-quality domain data, or a multi-modal GPT combining and understanding X-ray and medical text for orthopaedic computer-aided-diagnosis. However, GPT model gives unexpected outputs sometimes, such as repetitive text or seemingly normal but wrong answer (called ‘hallucination’). Even though GPT give correct answers, it cannot be considered as valid clinical diagnoses instead of clinical doctors. The PL-QCT BMD system provided by Bone’s QCT(Bone’s Technology(Shenzhen) Limited) achieves 0.1448mg/cm2(spine) and 0.0002 mg/cm2(hip) mean absolute error(MAE) and linear correlation coefficient R2=0.9970(spine) and R2=0.9991(hip)(compared to QCT-Pro(Mindways)) on 155 patients in three-center clinical trial in Guangzhou, China. Conclusion: This study builds a Chinese osteoporosis corpus and develops a fine-tuned and domain-adapted LLM as well as a PL-QCT BMD measurement system. Our fine-tuned GPT model shows better capability than LLaMA model on most testing questions on osteoporosis. Combined with our PL-QCT BMD system, we are looking forward to providing science popularization and early morning screening for potential osteoporotic patients.

Keywords: GPT, phantom-less QCT, large language model, osteoporosis

Procedia PDF Downloads 70
16784 Efficient Chiller Plant Control Using Modern Reinforcement Learning

Authors: Jingwei Du

Abstract:

The need of optimizing air conditioning systems for existing buildings calls for control methods designed with energy-efficiency as a primary goal. The majority of current control methods boil down to two categories: empirical and model-based. To be effective, the former heavily relies on engineering expertise and the latter requires extensive historical data. Reinforcement Learning (RL), on the other hand, is a model-free approach that explores the environment to obtain an optimal control strategy often referred to as “policy”. This research adopts Proximal Policy Optimization (PPO) to improve chiller plant control, and enable the RL agent to collaborate with experienced engineers. It exploits the fact that while the industry lacks historical data, abundant operational data is available and allows the agent to learn and evolve safely under human supervision. Thanks to the development of language models, renewed interest in RL has led to modern, online, policy-based RL algorithms such as the PPO. This research took inspiration from “alignment”, a process that utilizes human feedback to finetune the pretrained model in case of unsafe content. The methodology can be summarized into three steps. First, an initial policy model is generated based on minimal prior knowledge. Next, the prepared PPO agent is deployed so feedback from both critic model and human experts can be collected for future finetuning. Finally, the agent learns and adapts itself to the specific chiller plant, updates the policy model and is ready for the next iteration. Besides the proposed approach, this study also used traditional RL methods to optimize the same simulated chiller plants for comparison, and it turns out that the proposed method is safe and effective at the same time and needs less to no historical data to start up.

Keywords: chiller plant, control methods, energy efficiency, proximal policy optimization, reinforcement learning

Procedia PDF Downloads 28
16783 Single-Camera Basketball Tracker through Pose and Semantic Feature Fusion

Authors: Adrià Arbués-Sangüesa, Coloma Ballester, Gloria Haro

Abstract:

Tracking sports players is a widely challenging scenario, specially in single-feed videos recorded in tight courts, where cluttering and occlusions cannot be avoided. This paper presents an analysis of several geometric and semantic visual features to detect and track basketball players. An ablation study is carried out and then used to remark that a robust tracker can be built with Deep Learning features, without the need of extracting contextual ones, such as proximity or color similarity, nor applying camera stabilization techniques. The presented tracker consists of: (1) a detection step, which uses a pretrained deep learning model to estimate the players pose, followed by (2) a tracking step, which leverages pose and semantic information from the output of a convolutional layer in a VGG network. Its performance is analyzed in terms of MOTA over a basketball dataset with more than 10k instances.

Keywords: basketball, deep learning, feature extraction, single-camera, tracking

Procedia PDF Downloads 137
16782 Self-Supervised Pretraining on Sequences of Functional Magnetic Resonance Imaging Data for Transfer Learning to Brain Decoding Tasks

Authors: Sean Paulsen, Michael Casey

Abstract:

In this work we present a self-supervised pretraining framework for transformers on functional Magnetic Resonance Imaging (fMRI) data. First, we pretrain our architecture on two self-supervised tasks simultaneously to teach the model a general understanding of the temporal and spatial dynamics of human auditory cortex during music listening. Our pretraining results are the first to suggest a synergistic effect of multitask training on fMRI data. Second, we finetune the pretrained models and train additional fresh models on a supervised fMRI classification task. We observe significantly improved accuracy on held-out runs with the finetuned models, which demonstrates the ability of our pretraining tasks to facilitate transfer learning. This work contributes to the growing body of literature on transformer architectures for pretraining and transfer learning with fMRI data, and serves as a proof of concept for our pretraining tasks and multitask pretraining on fMRI data.

Keywords: transfer learning, fMRI, self-supervised, brain decoding, transformer, multitask training

Procedia PDF Downloads 88
16781 Application of Low-order Modeling Techniques and Neural-Network Based Models for System Identification

Authors: Venkatesh Pulletikurthi, Karthik B. Ariyur, Luciano Castillo

Abstract:

The system identification from the turbulence wakes will lead to the tactical advantage to prepare and also, to predict the trajectory of the opponents’ movements. A low-order modeling technique, POD, is used to predict the object based on the wake pattern and compared with pre-trained image recognition neural network (NN) to classify the wake patterns into objects. It is demonstrated that low-order modeling, POD, is able to predict the objects better compared to pretrained NN by ~30%.

Keywords: the bluff body wakes, low-order modeling, neural network, system identification

Procedia PDF Downloads 179
16780 A New Nonlinear State-Space Model and Its Application

Authors: Abdullah Eqal Al Mazrooei

Abstract:

In this work, a new nonlinear model will be introduced. The model is in the state-space form. The nonlinearity of this model is in the state equation where the state vector is multiplied by its self. This technique makes our model generalizes many famous models as Lotka-Volterra model and Lorenz model which have many applications in the real life. We will apply our new model to estimate the wind speed by using a new nonlinear estimator which suitable to work with our model.

Keywords: nonlinear systems, state-space model, Kronecker product, nonlinear estimator

Procedia PDF Downloads 690
16779 Logistic Regression Model versus Additive Model for Recurrent Event Data

Authors: Entisar A. Elgmati

Abstract:

Recurrent infant diarrhea is studied using daily data collected in Salvador, Brazil over one year and three months. A logistic regression model is fitted instead of Aalen's additive model using the same covariates that were used in the analysis with the additive model. The model gives reasonably similar results to that using additive regression model. In addition, the problem with the estimated conditional probabilities not being constrained between zero and one in additive model is solved here. Also martingale residuals that have been used to judge the goodness of fit for the additive model are shown to be useful for judging the goodness of fit of the logistic model.

Keywords: additive model, cumulative probabilities, infant diarrhoea, recurrent event

Procedia PDF Downloads 634
16778 The Use of Layered Neural Networks for Classifying Hierarchical Scientific Fields of Study

Authors: Colin Smith, Linsey S Passarella

Abstract:

Due to the proliferation and decentralized nature of academic publication, no widely accepted scheme exists for organizing papers by their scientific field of study (FoS) to the author’s best knowledge. While many academic journals require author provided keywords for papers, these keywords range wildly in scope and are not consistent across papers, journals, or field domains, necessitating alternative approaches to paper classification. Past attempts to perform field-of-study (FoS) classification on scientific texts have largely used a-hierarchical FoS schemas or ignored the schema’s inherently hierarchical structure, e.g. by compressing the structure into a single layer for multi-label classification. In this paper, we introduce an application of a Layered Neural Network (LNN) to the problem of performing supervised hierarchical classification of scientific fields of study (FoS) on research papers. In this approach, paper embeddings from a pretrained language model are fed into a top-down LNN. Beginning with a single neural network (NN) for the highest layer of the class hierarchy, each node uses a separate local NN to classify the subsequent subfield child node(s) for an input embedding of concatenated paper titles and abstracts. We compare our LNN-FOS method to other recent machine learning methods using the Microsoft Academic Graph (MAG) FoS hierarchy and find that the LNN-FOS offers increased classification accuracy at each FoS hierarchical level.

Keywords: hierarchical classification, layer neural network, scientific field of study, scientific taxonomy

Procedia PDF Downloads 132
16777 Mathematical Model to Quantify the Phenomenon of Democracy

Authors: Mechlouch Ridha Fethi

Abstract:

This paper presents a recent mathematical model in political sciences concerning democracy. The model is represented by a logarithmic equation linking the Relative Index of Democracy (RID) to Participation Ratio (PR). Firstly the meanings of the different parameters of the model were presented; and the variation curve of the RID according to PR with different critical areas was discussed. Secondly, the model was applied to a virtual group where we show that the model can be applied depending on the gender. Thirdly, it was observed that the model can be extended to different language models of democracy and that little use to assess the state of democracy for some International organizations like UNO.

Keywords: democracy, mathematic, modelization, quantification

Procedia PDF Downloads 367
16776 The Achievement Model of University Social Responsibility

Authors: Le Kang

Abstract:

On the research question of 'how to achieve USR', this contribution reflects the concept of university social responsibility, identify three achievement models of USR as the society - diversified model, the university-cooperation model, the government - compound model, also conduct a case study to explore characteristics of Chinese achievement model of USR. The contribution concludes with discussion of how the university, government and society balance demands and roles, make necessarily strategic adjustment and innovative approach to repair the shortcomings of each achievement model.

Keywords: modern university, USR, achievement model, compound model

Procedia PDF Downloads 755
16775 Model Averaging for Poisson Regression

Authors: Zhou Jianhong

Abstract:

Model averaging is a desirable approach to deal with model uncertainty, which, however, has rarely been explored for Poisson regression. In this paper, we propose a model averaging procedure based on an unbiased estimator of the expected Kullback-Leibler distance for the Poisson regression. Simulation study shows that the proposed model average estimator outperforms some other commonly used model selection and model average estimators in some situations. Our proposed methods are further applied to a real data example and the advantage of this method is demonstrated again.

Keywords: model averaging, poission regression, Kullback-Leibler distance, statistics

Procedia PDF Downloads 518
16774 Implementation and Validation of a Damage-Friction Constitutive Model for Concrete

Authors: L. Madouni, M. Ould Ouali, N. E. Hannachi

Abstract:

Two constitutive models for concrete are available in ABAQUS/Explicit, the Brittle Cracking Model and the Concrete Damaged Plasticity Model, and their suitability and limitations are well known. The aim of the present paper is to implement a damage-friction concrete constitutive model and to evaluate the performance of this model by comparing the predicted response with experimental data. The constitutive formulation of this material model is reviewed. In order to have consistent results, the parameter identification and calibration for the model have been performed. Several numerical simulations are presented in this paper, whose results allow for validating the capability of the proposed model for reproducing the typical nonlinear performances of concrete structures under different monotonic and cyclic load conditions. The results of the evaluation will be used for recommendations concerning the application and further improvements of the investigated model.

Keywords: Abaqus, concrete, constitutive model, numerical simulation

Procedia PDF Downloads 363
16773 Model Driven Architecture Methodologies: A Review

Authors: Arslan Murtaza

Abstract:

Model Driven Architecture (MDA) is technique presented by OMG (Object Management Group) for software development in which different models are proposed and converted them into code. The main plan is to identify task by using PIM (Platform Independent Model) and transform it into PSM (Platform Specific Model) and then converted into code. In this review paper describes some challenges and issues that are faced in MDA, type and transformation of models (e.g. CIM, PIM and PSM), and evaluation of MDA-based methodologies.

Keywords: OMG, model driven rrchitecture (MDA), computation independent model (CIM), platform independent model (PIM), platform specific model(PSM), MDA-based methodologies

Procedia PDF Downloads 457
16772 The Influence of the Concentration and Temperature on the Rheological Behavior of Carbonyl-Methylcellulose

Authors: Mohamed Rabhi, Kouider Halim Benrahou

Abstract:

The rheological properties of the carbonyl-methylcellulose (CMC), of different concentrations (25000, 50000, 60000, 80000 and 100000 ppm) and different temperatures were studied. We found that the rheological behavior of all CMC solutions presents a pseudo-plastic behavior, it follows the model of Ostwald-de Waele. The objective of this work is the modeling of flow by the CMC Cross model. The Cross model gives us the variation of the viscosity according to the shear rate. This model allowed us to adjust more clearly the rheological characteristics of CMC solutions. A comparison between the Cross model and the model of Ostwald was made. Cross the model fitting parameters were determined by a numerical simulation to make an approach between the experimental curve and those given by the two models. Our study has shown that the model of Cross, describes well the flow of "CMC" for low concentrations.

Keywords: CMC, rheological modeling, Ostwald model, cross model, viscosity

Procedia PDF Downloads 404
16771 3D Model of Rain-Wind Induced Vibration of Inclined Cable

Authors: Viet-Hung Truong, Seung-Eock Kim

Abstract:

Rain–wind induced vibration of inclined cable is a special aerodynamic phenomenon because it is easily influenced by many factors, especially the distribution of rivulet and wind velocity. This paper proposes a new 3D model of inclined cable, based on single degree-of-freedom model. Aerodynamic forces are firstly established and verified with the existing results from a 2D model. The 3D model of inclined cable is developed. The 3D model is then applied to assess the effects of wind velocity distribution and the continuity of rivulets on the cable. Finally, an inclined cable model with small sag is investigated.

Keywords: 3D model, rain - wind induced vibration, rivulet, analytical model

Procedia PDF Downloads 488
16770 Identifying Model to Predict Deterioration of Water Mains Using Robust Analysis

Authors: Go Bong Choi, Shin Je Lee, Sung Jin Yoo, Gibaek Lee, Jong Min Lee

Abstract:

In South Korea, it is difficult to obtain data for statistical pipe assessment. In this paper, to address these issues, we find that various statistical model presented before is how data mixed with noise and are whether apply in South Korea. Three major type of model is studied and if data is presented in the paper, we add noise to data, which affects how model response changes. Moreover, we generate data from model in paper and analyse effect of noise. From this we can find robustness and applicability in Korea of each model.

Keywords: proportional hazard model, survival model, water main deterioration, ecological sciences

Procedia PDF Downloads 742
16769 Self-Supervised Learning for Hate-Speech Identification

Authors: Shrabani Ghosh

Abstract:

Automatic offensive language detection in social media has become a stirring task in today's NLP. Manual Offensive language detection is tedious and laborious work where automatic methods based on machine learning are only alternatives. Previous works have done sentiment analysis over social media in different ways such as supervised, semi-supervised, and unsupervised manner. Domain adaptation in a semi-supervised way has also been explored in NLP, where the source domain and the target domain are different. In domain adaptation, the source domain usually has a large amount of labeled data, while only a limited amount of labeled data is available in the target domain. Pretrained transformers like BERT, RoBERTa models are fine-tuned to perform text classification in an unsupervised manner to perform further pre-train masked language modeling (MLM) tasks. In previous work, hate speech detection has been explored in Gab.ai, which is a free speech platform described as a platform of extremist in varying degrees in online social media. In domain adaptation process, Twitter data is used as the source domain, and Gab data is used as the target domain. The performance of domain adaptation also depends on the cross-domain similarity. Different distance measure methods such as L2 distance, cosine distance, Maximum Mean Discrepancy (MMD), Fisher Linear Discriminant (FLD), and CORAL have been used to estimate domain similarity. Certainly, in-domain distances are small, and between-domain distances are expected to be large. The previous work finding shows that pretrain masked language model (MLM) fine-tuned with a mixture of posts of source and target domain gives higher accuracy. However, in-domain performance of the hate classifier on Twitter data accuracy is 71.78%, and out-of-domain performance of the hate classifier on Gab data goes down to 56.53%. Recently self-supervised learning got a lot of attention as it is more applicable when labeled data are scarce. Few works have already been explored to apply self-supervised learning on NLP tasks such as sentiment classification. Self-supervised language representation model ALBERTA focuses on modeling inter-sentence coherence and helps downstream tasks with multi-sentence inputs. Self-supervised attention learning approach shows better performance as it exploits extracted context word in the training process. In this work, a self-supervised attention mechanism has been proposed to detect hate speech on Gab.ai. This framework initially classifies the Gab dataset in an attention-based self-supervised manner. On the next step, a semi-supervised classifier trained on the combination of labeled data from the first step and unlabeled data. The performance of the proposed framework will be compared with the results described earlier and also with optimized outcomes obtained from different optimization techniques.

Keywords: attention learning, language model, offensive language detection, self-supervised learning

Procedia PDF Downloads 103
16768 Uncertainty Estimation in Neural Networks through Transfer Learning

Authors: Ashish James, Anusha James

Abstract:

The impressive predictive performance of deep learning techniques on a wide range of tasks has led to its widespread use. Estimating the confidence of these predictions is paramount for improving the safety and reliability of such systems. However, the uncertainty estimates provided by neural networks (NNs) tend to be overconfident and unreasonable. Ensemble of NNs typically produce good predictions but uncertainty estimates tend to be inconsistent. Inspired by these, this paper presents a framework that can quantitatively estimate the uncertainties by leveraging the advances in transfer learning through slight modification to the existing training pipelines. This promising algorithm is developed with an intention of deployment in real world problems which already boast a good predictive performance by reusing those pretrained models. The idea is to capture the behavior of the trained NNs for the base task by augmenting it with the uncertainty estimates from a supplementary network. A series of experiments with known and unknown distributions show that the proposed approach produces well calibrated uncertainty estimates with high quality predictions.

Keywords: uncertainty estimation, neural networks, transfer learning, regression

Procedia PDF Downloads 134
16767 Equivalent Circuit Model for the Eddy Current Damping with Frequency-Dependence

Authors: Zhiguo Shi, Cheng Ning Loong, Jiazeng Shan, Weichao Wu

Abstract:

This study proposes an equivalent circuit model to simulate the eddy current damping force with shaking table tests and finite element modeling. The model is firstly proposed and applied to a simple eddy current damper, which is modelled in ANSYS, indicating that the proposed model can simulate the eddy current damping force under different types of excitations. Then, a non-contact and friction-free eddy current damper is designed and tested, and the proposed model can reproduce the experimental observations. The excellent agreement between the simulated results and the experimental data validates the accuracy and reliability of the equivalent circuit model. Furthermore, a more complicated model is performed in ANSYS to verify the feasibility of the equivalent circuit model in complex eddy current damper, and the higher-order fractional model and viscous model are adopted for comparison.

Keywords: equivalent circuit model, eddy current damping, finite element model, shake table test

Procedia PDF Downloads 190
16766 The Extended Skew Gaussian Process for Regression

Authors: M. T. Alodat

Abstract:

In this paper, we propose a generalization to the Gaussian process regression(GPR) model called the extended skew Gaussian process for regression(ESGPr) model. The ESGPR model works better than the GPR model when the errors are skewed. We derive the predictive distribution for the ESGPR model at a new input. Also we apply the ESGPR model to FOREX data and we find that it fits the Forex data better than the GPR model.

Keywords: extended skew normal distribution, Gaussian process for regression, predictive distribution, ESGPr model

Procedia PDF Downloads 552
16765 A Theoretical Hypothesis on Ferris Wheel Model of University Social Responsibility

Authors: Le Kang

Abstract:

According to the nature of the university, as a free and responsible academic community, USR is based on a different foundation —academic responsibility, so the Pyramid and the IC Model of CSR could not fully explain the most distinguished feature of USR. This paper sought to put forward a new model— Ferris Wheel Model, to illustrate the nature of USR and the process of achievement. The Ferris Wheel Model of USR shows the university creates a balanced, fairness and neutrality systemic structure to afford social responsibilities; that makes the organization could obtain a synergistic effect to achieve more extensive interests of stakeholders and wider social responsibilities.

Keywords: USR, achievement model, ferris wheel model, social responsibilities

Procedia PDF Downloads 723
16764 Model Predictive Control of Three Phase Inverter for PV Systems

Authors: Irtaza M. Syed, Kaamran Raahemifar

Abstract:

This paper presents a model predictive control (MPC) of a utility interactive three phase inverter (TPI) for a photovoltaic (PV) system at commercial level. The proposed model uses phase locked loop (PLL) to synchronize TPI with the power electric grid (PEG) and performs MPC control in a dq reference frame. TPI model consists of boost converter (BC), maximum power point tracking (MPPT) control, and a three leg voltage source inverter (VSI). Operational model of VSI is used to synthesize sinusoidal current and track the reference. Model is validated using a 35.7 kW PV system in Matlab/Simulink. Implementation and results show simplicity and accuracy, as well as reliability of the model.

Keywords: model predictive control, three phase voltage source inverter, PV system, Matlab/simulink

Procedia PDF Downloads 593
16763 Model Observability – A Monitoring Solution for Machine Learning Models

Authors: Amreth Chandrasehar

Abstract:

Machine Learning (ML) Models are developed and run in production to solve various use cases that help organizations to be more efficient and help drive the business. But this comes at a massive development cost and lost business opportunities. According to the Gartner report, 85% of data science projects fail, and one of the factors impacting this is not paying attention to Model Observability. Model Observability helps the developers and operators to pinpoint the model performance issues data drift and help identify root cause of issues. This paper focuses on providing insights into incorporating model observability in model development and operationalizing it in production.

Keywords: model observability, monitoring, drift detection, ML observability platform

Procedia PDF Downloads 112
16762 All-or-None Principle and Weakness of Hodgkin-Huxley Mathematical Model

Authors: S. A. Sadegh Zadeh, C. Kambhampati

Abstract:

Mathematical and computational modellings are the necessary tools for reviewing, analysing, and predicting processes and events in the wide spectrum range of scientific fields. Therefore, in a field as rapidly developing as neuroscience, the combination of these two modellings can have a significant role in helping to guide the direction the field takes. The paper combined mathematical and computational modelling to prove a weakness in a very precious model in neuroscience. This paper is intended to analyse all-or-none principle in Hodgkin-Huxley mathematical model. By implementation the computational model of Hodgkin-Huxley model and applying the concept of all-or-none principle, an investigation on this mathematical model has been performed. The results clearly showed that the mathematical model of Hodgkin-Huxley does not observe this fundamental law in neurophysiology to generating action potentials. This study shows that further mathematical studies on the Hodgkin-Huxley model are needed in order to create a model without this weakness.

Keywords: all-or-none, computational modelling, mathematical model, transmembrane voltage, action potential

Procedia PDF Downloads 616
16761 Multiscale Modelling of Citrus Black Spot Transmission Dynamics along the Pre-Harvest Supply Chain

Authors: Muleya Nqobile, Winston Garira

Abstract:

We presented a compartmental deterministic multi-scale model which encompass internal plant defensive mechanism and pathogen interaction, then we consider nesting the model into the epidemiological model. The objective was to improve our understanding of the transmission dynamics of within host and between host of Guignardia citricapa Kiely. The inflow of infected class was scaled down to individual level while the outflow was scaled up to average population level. Conceptual model and mathematical model were constructed to display a theoretical framework which can be used for predicting or identify disease pattern.

Keywords: epidemiological model, mathematical modelling, multi-scale modelling, immunological model

Procedia PDF Downloads 457
16760 Proposal for a Generic Context Meta-Model

Authors: Jaouadi Imen, Ben Djemaa Raoudha, Ben Abdallah Hanene

Abstract:

The access to relevant information that is adapted to users’ needs, preferences and environment is a challenge in many applications running. That causes an appearance of context-aware systems. To facilitate the development of this class of applications, it is necessary that these applications share a common context meta-model. In this article, we will present our context meta-model that is defined using the OMG Meta Object facility (MOF). This meta-model is based on the analysis and synthesis of context concepts proposed in literature.

Keywords: context, meta-model, MOF, awareness system

Procedia PDF Downloads 558