Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 29

Search results for: generative

29 Enhancing Hand Efficiency of Smart Glass Cleaning Robot through Generative Design Module

Authors: Pankaj Gupta, Amit Kumar Srivastava, Nitesh Pandey

Abstract:

This article explores the domain of generative design in order to enhance the development of robot designs for innovative and efficient maintenance approaches for tall buildings. This study aims to optimize the design of robotic hands by focusing on minimizing mass and volume while ensuring they can withstand the specified pressure with equal strength. The research procedure is structured and systematic. The purpose of optimization is to enhance the efficiency of the robot and reduce the manufacturing expenses. The project seeks to investigate the application of generative design in order to optimize products. Autodesk Fusion 360 offers the capability to immediately apply the generative design functionality to the solid model. The effort involved creating a solid model of the Smart Glass Cleaning Robot and optimizing one of its components, the Hand, using generative techniques. The article has thoroughly examined the designs, outcomes, and procedure. These loads serve as a benchmark for creating designs that can endure the necessary level of pressure and preserve their structural integrity. The efficacy of the generative design process is contingent upon the selection of materials, as different materials possess distinct physical attributes. The study utilizes five different materials, namely Steel, Stainless Steel, Titanium, Aluminum, and CFRP (Carbon Fiber Reinforced Polymer), in order to investigate a range of design possibilities.

Keywords: Generative design, mass and volume optimization, material strength analysis, generative design, smart glass cleaning robot.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 200

28 Vision Based Hand Gesture Recognition Using Generative and Discriminative Stochastic Models

Authors: Mahmoud Elmezain, Samar El-shinawy

Abstract:

Many approaches to pattern recognition are founded on probability theory, and can be broadly characterized as either generative or discriminative according to whether or not the distribution of the image features. Generative and discriminative models have very different characteristics, as well as complementary strengths and weaknesses. In this paper, we study these models to recognize the patterns of alphabet characters (A-Z) and numbers (0-9). To handle isolated pattern, generative model as Hidden Markov Model (HMM) and discriminative models like Conditional Random Field (CRF), Hidden Conditional Random Field (HCRF) and Latent-Dynamic Conditional Random Field (LDCRF) with different number of window size are applied on extracted pattern features. The gesture recognition rate is improved initially as the window size increase, but degrades as window size increase further. Experimental results show that the LDCRF is the best in terms of results than CRF, HCRF and HMM at window size equal 4. Additionally, our results show that; an overall recognition rates are 91.52%, 95.28%, 96.94% and 98.05% for CRF, HCRF, HMM and LDCRF respectively.

Keywords: Statistical Pattern Recognition, Generative Model, Discriminative Model, Human Computer Interaction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2935

27 Semi-Supervised Outlier Detection Using a Generative and Adversary Framework

Authors: Jindong Gu, Matthias Schubert, Volker Tresp

Abstract:

In many outlier detection tasks, only training data belonging to one class, i.e., the positive class, is available. The task is then to predict a new data point as belonging either to the positive class or to the negative class, in which case the data point is considered an outlier. For this task, we propose a novel corrupted Generative Adversarial Network (CorGAN). In the adversarial process of training CorGAN, the Generator generates outlier samples for the negative class, and the Discriminator is trained to distinguish the positive training data from the generated negative data. The proposed framework is evaluated using an image dataset and a real-world network intrusion dataset. Our outlier-detection method achieves state-of-the-art performance on both tasks.

Keywords: Outlier detection, generative adversary networks, semi-supervised learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1074

26 Time Series Simulation by Conditional Generative Adversarial Net

Authors: Rao Fu, Jie Chen, Shutian Zeng, Yiping Zhuang, Agus Sudjianto

Abstract:

Generative Adversarial Net (GAN) has proved to be a powerful machine learning tool in image data analysis and generation. In this paper, we propose to use Conditional Generative Adversarial Net (CGAN) to learn and simulate time series data. The conditions include both categorical and continuous variables with different auxiliary information. Our simulation studies show that CGAN has the capability to learn different types of normal and heavy-tailed distributions, as well as dependent structures of different time series. It also has the capability to generate conditional predictive distributions consistent with training data distributions. We also provide an in-depth discussion on the rationale behind GAN and the neural networks as hierarchical splines to establish a clear connection with existing statistical methods of distribution generation. In practice, CGAN has a wide range of applications in market risk and counterparty risk analysis: it can be applied to learn historical data and generate scenarios for the calculation of Value-at-Risk (VaR) and Expected Shortfall (ES), and it can also predict the movement of the market risk factors. We present a real data analysis including a backtesting to demonstrate that CGAN can outperform Historical Simulation (HS), a popular method in market risk analysis to calculate VaR. CGAN can also be applied in economic time series modeling and forecasting. In this regard, we have included an example of hypothetical shock analysis for economic models and the generation of potential CCAR scenarios by CGAN at the end of the paper.

Keywords: Conditional Generative Adversarial Net, market and credit risk management, neural network, time series.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1199

25 A Survey of Response Generation of Dialogue Systems

Authors: Yifan Fan, Xudong Luo, Pingping Lin

Abstract:

An essential task in the field of artificial intelligence is to allow computers to interact with people through natural language. Therefore, researches such as virtual assistants and dialogue systems have received widespread attention from industry and academia. The response generation plays a crucial role in dialogue systems, so to push forward the research on this topic, this paper surveys various methods for response generation. We sort out these methods into three categories. First one includes finite state machine methods, framework methods, and instance methods. The second contains full-text indexing methods, ontology methods, vast knowledge base method, and some other methods. The third covers retrieval methods and generative methods. We also discuss some hybrid methods based knowledge and deep learning. We compare their disadvantages and advantages and point out in which ways these studies can be improved further. Our discussion covers some studies published in leading conferences such as IJCAI and AAAI in recent years.

Keywords: Retrieval, generative, deep learning, response generation, knowledge.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1204

24 Analysis of Relation between Unlabeled and Labeled Data to Self-Taught Learning Performance

Authors: Ekachai Phaisangittisagul, Rapeepol Chongprachawat

Abstract:

Obtaining labeled data in supervised learning is often difficult and expensive, and thus the trained learning algorithm tends to be overfitting due to small number of training data. As a result, some researchers have focused on using unlabeled data which may not necessary to follow the same generative distribution as the labeled data to construct a high-level feature for improving performance on supervised learning tasks. In this paper, we investigate the impact of the relationship between unlabeled and labeled data for classification performance. Specifically, we will apply difference unlabeled data which have different degrees of relation to the labeled data for handwritten digit classification task based on MNIST dataset. Our experimental results show that the higher the degree of relation between unlabeled and labeled data, the better the classification performance. Although the unlabeled data that is completely from different generative distribution to the labeled data provides the lowest classification performance, we still achieve high classification performance. This leads to expanding the applicability of the supervised learning algorithms using unsupervised learning.

Keywords: Autoencoder, high-level feature, MNIST dataset, selftaught learning, supervised learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1832

23 Evaluating Generative Neural Attention Weights-Based Chatbot on Customer Support Twitter Dataset

Authors: Sinarwati Mohamad Suhaili, Naomie Salim, Mohamad Nazim Jambli

Abstract:

Sequence-to-sequence (seq2seq) models augmented with attention mechanisms are increasingly important in automated customer service. These models, adept at recognizing complex relationships between input and output sequences, are essential for optimizing chatbot responses. Central to these mechanisms are neural attention weights that determine the model’s focus during sequence generation. Despite their widespread use, there remains a gap in the comparative analysis of different attention weighting functions within seq2seq models, particularly in the context of chatbots utilizing the Customer Support Twitter (CST) dataset. This study addresses this gap by evaluating four distinct attention-scoring functions—dot, multiplicative/general, additive, and an extended multiplicative function with a tanh activation parameter — in neural generative seq2seq models. Using the CST dataset, these models were trained and evaluated over 10 epochs with the AdamW optimizer. Evaluation criteria included validation loss and BLEU scores implemented under both greedy and beam search strategies with a beam size of k = 3. Results indicate that the model with the tanh-augmented multiplicative function significantly outperforms its counterparts, achieving the lowest validation loss (1.136484) and the highest BLEU scores (0.438926 under greedy search, 0.443000 under beam search, k = 3). These findings emphasize the crucial influence of selecting an appropriate attention-scoring function to enhance the performance of seq2seq models for chatbots, particularly highlighting the model integrating tanh activation as a promising approach to improving chatbot quality in customer support contexts.

Keywords: Attention weight, chatbot, encoder-decoder, neural generative attention, score function, sequence-to-sequence.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 89

22 ECG-Based Heartbeat Classification Using Convolutional Neural Networks

Authors: Jacqueline R. T. Alipo-on, Francesca I. F. Escobar, Myles J. T. Tan, Hezerul Abdul Karim, Nouar AlDahoul

Abstract:

Electrocardiogram (ECG) signal analysis and processing are crucial in the diagnosis of cardiovascular diseases which are considered as one of the leading causes of mortality worldwide. However, the traditional rule-based analysis of large volumes of ECG data is time-consuming, labor-intensive, and prone to human errors. With the advancement of the programming paradigm, algorithms such as machine learning have been increasingly used to perform an analysis on the ECG signals. In this paper, various deep learning algorithms were adapted to classify five classes of heart beat types. The dataset used in this work is the synthetic MIT-Beth Israel Hospital (MIT-BIH) Arrhythmia dataset produced from generative adversarial networks (GANs). Various deep learning models such as ResNet-50 convolutional neural network (CNN), 1-D CNN, and long short-term memory (LSTM) were evaluated and compared. ResNet-50 was found to outperform other models in terms of recall and F1 score using a five-fold average score of 98.88% and 98.87%, respectively. 1-D CNN, on the other hand, was found to have the highest average precision of 98.93%.

Keywords: Heartbeat classification, convolutional neural network, electrocardiogram signals, ECG signals, generative adversarial networks, long short-term memory, LSTM, ResNet-50.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 188

21 Retrieval Augmented Generation against the Machine: Merging Human Cyber Security Expertise with Generative AI

Authors: Brennan Lodge

Abstract:

Amidst a complex regulatory landscape, Retrieval Augmented Generation (RAG) emerges as a transformative tool for Governance Risk and Compliance (GRC) officers. This paper details the application of RAG in synthesizing Large Language Models (LLMs) with external knowledge bases, offering GRC professionals an advanced means to adapt to rapid changes in compliance requirements. While the development for standalone LLMs is exciting, such models do have their downsides. LLMs cannot easily expand or revise their memory, and they cannot straightforwardly provide insight into their predictions, and may produce “hallucinations.” Leveraging a pre-trained seq2seq transformer and a dense vector index of domain-specific data, this approach integrates real-time data retrieval into the generative process, enabling gap analysis and the dynamic generation of compliance and risk management content. We delve into the mechanics of RAG, focusing on its dual structure that pairs parametric knowledge contained within the transformer model with non-parametric data extracted from an updatable corpus. This hybrid model enhances decision-making through context-rich insights, drawing from the most current and relevant information, thereby enabling GRC officers to maintain a proactive compliance stance. Our methodology aligns with the latest advances in neural network fine-tuning, providing a granular, token-level application of retrieved information to inform and generate compliance narratives. By employing RAG, we exhibit a scalable solution that can adapt to novel regulatory challenges and cybersecurity threats, offering GRC officers a robust, predictive tool that augments their expertise. The granular application of RAG’s dual structure not only improves compliance and risk management protocols but also informs the development of compliance narratives with pinpoint accuracy. It underscores AI’s emerging role in strategic risk mitigation and proactive policy formation, positioning GRC officers to anticipate and navigate the complexities of regulatory evolution confidently.

Keywords: Retrieval Augmented Generation, Governance Risk and Compliance, Cybersecurity, AI-driven Compliance, Risk Management, Generative AI.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 124

20 3D Modeling Approach for Cultural Heritage Structures: The Case of Virgin of Loreto Chapel in Cusco, Peru

Authors: Rony Reátegui, Cesar Chácara, Benjamin Castañeda, Rafael Aguilar

Abstract:

Nowadays, Heritage Building Information Modeling (HBIM) is considered an efficient tool to represent and manage information of Cultural Heritage (CH). The basis of this tool relies on a 3D model generally obtained from a Cloud-to-BIM procedure. There are different methods to create an HBIM model that goes from manual modeling based on the point cloud to the automatic detection of shapes and the creation of objects. The selection of these methods depends on the desired Level of Development (LOD), Level of Information (LOI), Grade of Generation (GOG) as well as on the availability of commercial software. This paper presents the 3D modeling of a stone masonry chapel using Recap Pro, Revit and Dynamo interface following a three-step methodology. The first step consists of the manual modeling of simple structural (e.g., regular walls, columns, floors, wall openings, etc.) and architectural (e.g., cornices, moldings and other minor details) elements using the point cloud as reference. Then, Dynamo is used for generative modeling of complex structural elements such as vaults, infills and domes. Finally, semantic information (e.g., materials, typology, state of conservation, etc.) and pathologies are added within the HBIM model as text parameters and generic models’ families respectively. The application of this methodology allows the documentation of CH following a relatively simple to apply process that ensures adequate LOD, LOI and GOG levels. In addition, the easy implementation of the method as well as the fact of using only one BIM software with its respective plugin for the scan-to-BIM modeling process means that this methodology can be adopted by a larger number of users with intermediate knowledge and limited resources, since the BIM software used has a free student license.

Keywords: Cloud-to-BIM, cultural heritage, generative modeling, HBIM, parametric modeling, Revit.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 927

19 Variational Explanation Generator: Generating Explanation for Natural Language Inference Using Variational Auto-Encoder

Authors: Zhen Cheng, Xinyu Dai, Shujian Huang, Jiajun Chen

Abstract:

Recently, explanatory natural language inference has attracted much attention for the interpretability of logic relationship prediction, which is also known as explanation generation for Natural Language Inference (NLI). Existing explanation generators based on discriminative Encoder-Decoder architecture have achieved noticeable results. However, we find that these discriminative generators usually generate explanations with correct evidence but incorrect logic semantic. It is due to that logic information is implicitly encoded in the premise-hypothesis pairs and difficult to model. Actually, logic information identically exists between premise-hypothesis pair and explanation. And it is easy to extract logic information that is explicitly contained in the target explanation. Hence we assume that there exists a latent space of logic information while generating explanations. Specifically, we propose a generative model called Variational Explanation Generator (VariationalEG) with a latent variable to model this space. Training with the guide of explicit logic information in target explanations, latent variable in VariationalEG could capture the implicit logic information in premise-hypothesis pairs effectively. Additionally, to tackle the problem of posterior collapse while training VariaztionalEG, we propose a simple yet effective approach called Logic Supervision on the latent variable to force it to encode logic information. Experiments on explanation generation benchmark—explanation-Stanford Natural Language Inference (e-SNLI) demonstrate that the proposed VariationalEG achieves significant improvement compared to previous studies and yields a state-of-the-art result. Furthermore, we perform the analysis of generated explanations to demonstrate the effect of the latent variable.

Keywords: Natural Language Inference, explanation generation, variational auto-encoder, generative model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 692

18 Fuel Cell/DC-DC Convertor Control by Sliding Mode Method

Authors: Farzad Abdous

Abstract:

Fuel cell's system requires regulating circuit for voltage and current in order to control power in case of connecting to other generative devices or load. In this paper Fuel cell system and convertor, which is a multi-variable system, are controlled using sliding mode method. Use of weighting matrix in design procedure made it possible to regulate speed of control. Simulation results show the robustness and accuracy of proposed controller for controlling desired of outputs.

Keywords: DC-DC converter, Fuel cell, PEM, Slides mode control.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1613

17 Generative Design of Acoustical Diffuser and Absorber Elements Using Large-Scale Additive Manufacturing

Authors: S. Aziz, B. Alexander, C. Gengnagel, S. Weinzierl

Abstract:

This paper explores a generative design, simulation, and optimization workflow for the integration of acoustical diffuser and/or absorber geometry with embedded coupled Helmholtz-resonators for full scale 3D printed building components. Large-scale additive manufacturing in conjunction with algorithmic CAD design tools enables a vast amount of control when creating geometry. This is advantageous regarding the increasing demands of comfort standards for indoor spaces and the use of more resourceful and sustainable construction methods and materials. The presented methodology highlights these new technological advancements and offers a multimodal and integrative design solution with the potential for an immediate application in the AEC-Industry. In principle, the methodology can be applied to a wide range of structural elements that can be manufactured by additive manufacturing processes. The current paper focuses on a case study of an application for a biaxial load-bearing beam grillage made of reinforced concrete, which allows for a variety of applications through the combination of additive prefabricated semi-finished parts and in-situ concrete supplementation. The semi-prefabricated parts or formwork bodies form the basic framework of the supporting structure and at the same time have acoustic absorption and diffusion properties that are precisely acoustically programmed for the space underneath the structure. To this end, a hybrid validation strategy is being explored using a digital and cross-platform simulation environment, verified with physical prototyping. The iterative workflow starts with the generation of a parametric design model for the acoustical geometry using the algorithmic visual scripting editor Grasshopper3D inside the Building Information Modeling (BIM) software Revit. Various geometric attributes (i.e., bottleneck and cavity dimensions) of the resonator are parameterized and fed to a numerical optimization algorithm which can modify the geometry with the goal of increasing absorption at resonance and increasing the bandwidth of the effective absorption range. Using Rhino.Inside and LiveLink for Revit the generative model was imported directly into the Multiphysics simulation environment COMSOL. The geometry was further modified and prepared for simulation in a semi-automated process. The incident and scattered pressure fields were simulated from which the surface normal absorption coefficients were calculated. This reciprocal process was repeated to further optimize the geometric parameters. Subsequently the numerical models were compared to a set of 3D concrete printed physical twin models which were tested in a .25 m x .25 m impedance tube. The empirical results served to improve the starting parameter settings of the initial numerical model. The geometry resulting from the numerical optimization was finally returned to grasshopper for further implementation in an interdisciplinary study.

Keywords: Acoustical design, additive manufacturing, computational design, multimodal optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 603

16 On Dialogue Systems Based on Deep Learning

Authors: Yifan Fan, Xudong Luo, Pingping Lin

Abstract:

Nowadays, dialogue systems increasingly become the way for humans to access many computer systems. So, humans can interact with computers in natural language. A dialogue system consists of three parts: understanding what humans say in natural language, managing dialogue, and generating responses in natural language. In this paper, we survey deep learning based methods for dialogue management, response generation and dialogue evaluation. Specifically, these methods are based on neural network, long short-term memory network, deep reinforcement learning, pre-training and generative adversarial network. We compare these methods and point out the further research directions.

Keywords: Dialogue management, response generation, reinforcement learning, deep learning, evaluation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 787

15 A Hybrid GMM/SVM System for Text Independent Speaker Identification

Authors: Rafik Djemili, Mouldi Bedda, Hocine Bourouba

Abstract:

This paper proposes a novel approach that combines statistical models and support vector machines. A hybrid scheme which appropriately incorporates the advantages of both the generative and discriminant model paradigms is described and evaluated. Support vector machines (SVMs) are trained to divide the whole speakers' space into small subsets of speakers within a hierarchical tree structure. During testing a speech token is assigned to its corresponding group and evaluation using gaussian mixture models (GMMs) is then processed. Experimental results show that the proposed method can significantly improve the performance of text independent speaker identification task. We report improvements of up to 50% reduction in identification error rate compared to the baseline statistical model.

Keywords: Speaker identification, Gaussian mixture model (GMM), support vector machine (SVM), hybrid GMM/SVM.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2236

14 Rejuvenate: Face and Body Retouching Using Image Inpainting

Authors: H. AbdelRahman, S. Rostom, Y. Lotfy, S. Salah Eldeen, R. Yassein, N. Awny

Abstract:

People are growing more concerned with their appearance in today's society. But they are terrified of what they will look like after a plastic surgery. People's mental health suffers when they have accidents, burns, or genetic issues that cause them to cleave certain body parts, which makes them feel uncomfortable and unappreciated. The method provides an innovative deep learning-based technique for image inpainting that analyzes different picture structures and fixes damaged images. This study proposes a model based on the Stable Diffusion Inpainting method for in-painting medical images. One significant advancement made possible by deep neural networks is image inpainting, which is the process of reconstructing damaged and missing portions of an image. The patient can see the outcome more easily since the system uses the user's input of an image to identify a problem. It then modifies the image and outputs a fixed image.

Keywords: Generative Adversarial Network, GAN, Large Mask Inpainting, LAMA, Stable Diffusion Inpainting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 107

13 Evaluation of Potential Production of Maize Genotypes of Early Maturity in Rainfed Lowland

Authors: St. Subaedah, A. Takdir, Netty, D. Hidrawati

Abstract:

Maize development at the rainfed lowland after rice is often confronted with the occurrence of drought stress at the time of entering the generative phase, which will cause be hampered crop production. Consequently, in the utilization of the rainfed lowland areas optimally, an effort that can be done using the varieties of early maturity to minimize crop failures due to its short rainy season. The aim of this research was evaluating the potential yield of genotypes of candidates of maize early maturity in the rainfed lowland areas. The study was conducted during May to August 2016 at South Sulawesi, Indonesia. The study used randomized block design to compare 12 treatments and consists of 8 genotypes namely CH1, CH2, CH3, CH4, CH5, CH6, CH7, CH8 and the use of four varieties, namely Bima 3, Bima 7, Lamuru and Gumarang. The results showed that genotype of CH2, CH3, CH5, CH 6, CH7 and CH8 harvesting has less than 90 days. There are two genotypes namely genotypes of CH7 and CH8 that have a fairly high production respectively of 7.16 tons / ha and 8.11 tons/ ha and significantly not different from the superior varieties Bima3.

Keywords: Evaluation, maize, early maturity, yield potential.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1290

12 Different Tillage Possibilities for Second Crop in Green Bean Farming

Authors: Yilmaz Bayhan, Emin Güzel, Ömer Barış Özlüoymak, Ahmet İnce, Abdullah Sessiz

Abstract:

In this study, determining of reduced tillage techniques in green bean farming as a second crop after harvesting wheat was targeted. To this aim, four different soil tillage methods namely, heavy-duty disc harrow (HD), rotary tiller (ROT), heavy-duty disc harrow plus rotary tiller (HD+ROT) and no-tillage (NT) (seeding by direct drill) were examined. Experiments were arranged in a randomized block design with three replications. The highest green beans yields were obtained in HD+ROT and NT as 5,862.1 and 5,829.3 Mg/ha, respectively. The lowest green bean yield was found in HD as 3,076.7 Mg/ha. The highest fuel consumption was measured 30.60 L ha^-1for HD+ROT whereas the lowest value was found 7.50 L ha^-1 for NT. No tillage method gave the best results for fuel consumption and effective power requirement. It is concluded that no-tillage method can be used in second crop green bean in the Thrace Region due to economic and erosion conditions.

Keywords: Soil tillage, green bean, vegetative, generative, yield.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1086

11 User Intention Generation with Large Language Models Using Chain-of-Thought Prompting

Authors: Gangmin Li, Fan Yang

Abstract:

Personalized recommendation is crucial for any recommendation system. One of the techniques for personalized recommendation is to identify the intention. Traditional user intention identification uses the user’s selection when facing multiple items. This modeling relies primarily on historical behavior data resulting in challenges such as the cold start, unintended choice, and failure to capture intention when items are new. Motivated by recent advancements in Large Language Models (LLMs) like ChatGPT, we present an approach for user intention identification by embracing LLMs with Chain-of-Thought (CoT) prompting. We use the initial user profile as input to LLMs and design a collection of prompts to align the LLM's response through various recommendation tasks encompassing rating prediction, search and browse history, user clarification, etc. Our tests on real-world datasets demonstrate the improvements in recommendation by explicit user intention identification and, with that intention, merged into a user model.

Keywords: Personalized recommendation, generative user modeling, user intention identification, large language models, chain-of-thought prompting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 87

10 Learning Algorithms for Fuzzy Inference Systems Composed of Double- and Single-Input Rule Modules

Authors: Hirofumi Miyajima, Kazuya Kishida, Noritaka Shigei, Hiromi Miyajima

Abstract:

Most of self-tuning fuzzy systems, which are automatically constructed from learning data, are based on the steepest descent method (SDM). However, this approach often requires a large convergence time and gets stuck into a shallow local minimum. One of its solutions is to use fuzzy rule modules with a small number of inputs such as DIRMs (Double-Input Rule Modules) and SIRMs (Single-Input Rule Modules). In this paper, we consider a (generalized) DIRMs model composed of double and single-input rule modules. Further, in order to reduce the redundant modules for the (generalized) DIRMs model, pruning and generative learning algorithms for the model are suggested. In order to show the effectiveness of them, numerical simulations for function approximation, Box-Jenkins and obstacle avoidance problems are performed.

Keywords: Box-Jenkins’s problem, Double-input rule module, Fuzzy inference model, Obstacle avoidance, Single-input rule module.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1957

9 Generative Syntaxes: Macro-Heterophony and the Form of ‘Synchrony’

Authors: Luminiţa Duţică, Gheorghe Duţică

Abstract:

One of the most powerful language innovation in the twentieth century music was the heterophony–hypostasis of the vertical syntax entered into the sphere of interest of many composers, such as George Enescu, Pierre Boulez, Mauricio Kagel, György Ligeti and others. The heterophonic syntax has a history of its growth, which means a succession of different concepts and writing techniques. The trajectory of settling this phenomenon does not necessarily take into account the chronology: there are highly complex primary stages and advanced stages of returning to the simple forms of writing. In folklore, the plurimelodic simultaneities are free or random and originate from the (unintentional) differences/‘deviations’ from the state of unison, through a variety of ornaments, melismas, imitations, elongations and abbreviations, all in a flexible rhythmic and non-periodic/immeasurable framework, proper to the parlando-rubato rhythmics. Within the general framework of the multivocal organization, the heterophonic syntax in elaborate (academic) version has imposed itself relatively late compared with polyphony and homophony. Of course, the explanation is simple, if we consider the causal relationship between the sound vocabulary elements – in this case, the modalism – and the typologies of vertical organization appropriate for it. Therefore, adding up the ‘classic’ pathway of the writing typologies (monody – polyphony – homophony), heterophony - applied equally to the structures of modal, serial or synthesis vocabulary – reclaims necessarily an own macrotemporal form, in the sense of the analogies enshrined by the evolution of the musical styles and languages: polyphony→fugue, homophony→sonata. Concerned about the prospect of edifying a new musical ontology, the composer Ştefan Niculescu experienced – along with the mathematical organization of heterophony according to his own original methods – the possibility of extrapolation of this phenomenon in macrostructural plan, reaching this way to the unique form of ‘synchrony’. Founded on coincidentia oppositorum principle (involving the ‘one-multiple’ binom), the sound architecture imagined by Ştefan Niculescu consists in one (temporal) model / algorithm of articulation of two sound states: 1. monovocality state (principle of identity) and 2. multivocality state (principle of difference). In this context, the heterophony becomes an (auto)generative mechanism, with macrotemporal amplitude, strategy that will be grown by the composer, practically throughout his creation (see the works: Ison I, Ison II, Unisonos I, Unisonos II, Duplum, Triplum, Psalmus, Héterophonies pour Montreux (Homages to Enescu and Bartók etc.). For the present demonstration, we selected one of the most edifying works of Ştefan Niculescu – Simphony II, Opus dacicum – where the form of (heterophony-)synchrony acquires monumental-symphonic features, representing an emblematic case for the complexity level achieved by this type of vertical syntax in the twentieth century music.

Keywords: Heterophony, modalism, serialism, synchrony, syntax.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 729

8 Adversarial Disentanglement Using Latent Classifier for Pose-Independent Representation

Authors: Hamed Alqahtani, Manolya Kavakli-Thorne

Abstract:

The large pose discrepancy is one of the critical challenges in face recognition during video surveillance. Due to the entanglement of pose attributes with identity information, the conventional approaches for pose-independent representation lack in providing quality results in recognizing largely posed faces. In this paper, we propose a practical approach to disentangle the pose attribute from the identity information followed by synthesis of a face using a classifier network in latent space. The proposed approach employs a modified generative adversarial network framework consisting of an encoder-decoder structure embedded with a classifier in manifold space for carrying out factorization on the latent encoding. It can be further generalized to other face and non-face attributes for real-life video frames containing faces with significant attribute variations. Experimental results and comparison with state of the art in the field prove that the learned representation of the proposed approach synthesizes more compelling perceptual images through a combination of adversarial and classification losses.

Keywords: Video surveillance, disentanglement, face detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 607

7 Automation System for Optimization of Electrical and Thermal Energy Production in Cogenerative Gas Power Plants

Authors: Ion Miciu

Abstract:

The system is made with main distributed components: First Level: Industrial Computers placed in Control Room (monitors thermal and electrical processes based on the data provided by the second level); Second Level: PLCs which collects data from process and transmits information on the first level; also takes commands from this level which are further, passed to execution elements from third level; Third Level: field elements consisting in 3 categories: data collecting elements; data transfer elements from the third level to the second; execution elements which take commands from the second level PLCs and executes them after which transmits the confirmation of execution to them. The purpose of the automatic functioning is the optimization of the co-generative electrical energy commissioning in the national energy system and the commissioning of thermal energy to the consumers. The integrated system treats the functioning of all the equipments and devices as a whole: Gas Turbine Units (GTU); MT 20kV Medium Voltage Station (MVS); 0,4 kV Low Voltage Station (LVS); Main Hot Water Boilers (MHW); Auxiliary Hot Water Boilers (AHW); Gas Compressor Unit (GCU); Thermal Agent Circulation Pumping Unit (TPU); Water Treating Station (WTS).

Keywords: Automation System, Cogenerative Power Plant, Control, Monitoring, Real Time

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1976

6 Generative Adversarial Network Based Fingerprint Anti-Spoofing Limitations

Authors: Yehjune Heo

Abstract:

Fingerprint Anti-Spoofing approaches have been actively developed and applied in real-world applications. One of the main problems for Fingerprint Anti-Spoofing is not robust to unseen samples, especially in real-world scenarios. A possible solution will be to generate artificial, but realistic fingerprint samples and use them for training in order to achieve good generalization. This paper contains experimental and comparative results with currently popular GAN based methods and uses realistic synthesis of fingerprints in training in order to increase the performance. Among various GAN models, the most popular StyleGAN is used for the experiments. The CNN models were first trained with the dataset that did not contain generated fake images and the accuracy along with the mean average error rate were recorded. Then, the fake generated images (fake images of live fingerprints and fake images of spoof fingerprints) were each combined with the original images (real images of live fingerprints and real images of spoof fingerprints), and various CNN models were trained. The best performances for each CNN model, trained with the dataset of generated fake images and each time the accuracy and the mean average error rate, were recorded. We observe that current GAN based approaches need significant improvements for the Anti-Spoofing performance, although the overall quality of the synthesized fingerprints seems to be reasonable. We include the analysis of this performance degradation, especially with a small number of samples. In addition, we suggest several approaches towards improved generalization with a small number of samples, by focusing on what GAN based approaches should learn and should not learn.

Keywords: Anti-spoofing, CNN, fingerprint recognition, GAN.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 593

5 NANCY: Combining Adversarial Networks with Cycle-Consistency for Robust Multi-Modal Image Registration

Authors: Mirjana Ruppel, Rajendra Persad, Amit Bahl, Sanja Dogramadzi, Chris Melhuish, Lyndon Smith

Abstract:

Multimodal image registration is a profoundly complex task which is why deep learning has been used widely to address it in recent years. However, two main challenges remain: Firstly, the lack of ground truth data calls for an unsupervised learning approach, which leads to the second challenge of defining a feasible loss function that can compare two images of different modalities to judge their level of alignment. To avoid this issue altogether we implement a generative adversarial network consisting of two registration networks GAB, GBA and two discrimination networks DA, DB connected by spatial transformation layers. GAB learns to generate a deformation field which registers an image of the modality B to an image of the modality A. To do that, it uses the feedback of the discriminator DB which is learning to judge the quality of alignment of the registered image B. GBA and DA learn a mapping from modality A to modality B. Additionally, a cycle-consistency loss is implemented. For this, both registration networks are employed twice, therefore resulting in images ˆA, ˆB which were registered to ˜B, ˜A which were registered to the initial image pair A, B. Thus the resulting and initial images of the same modality can be easily compared. A dataset of liver CT and MRI was used to evaluate the quality of our approach and to compare it against learning and non-learning based registration algorithms. Our approach leads to dice scores of up to 0.80 ± 0.01 and is therefore comparable to and slightly more successful than algorithms like SimpleElastix and VoxelMorph.

Keywords: Multimodal image registration, GAN, cycle consistency, deep learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 810

4 SMaTTS: Standard Malay Text to Speech System

Authors: Othman O. Khalifa, Zakiah Hanim Ahmad, Teddy Surya Gunawan

Abstract:

This paper presents a rule-based text- to- speech (TTS) Synthesis System for Standard Malay, namely SMaTTS. The proposed system using sinusoidal method and some pre- recorded wave files in generating speech for the system. The use of phone database significantly decreases the amount of computer memory space used, thus making the system very light and embeddable. The overall system was comprised of two phases the Natural Language Processing (NLP) that consisted of the high-level processing of text analysis, phonetic analysis, text normalization and morphophonemic module. The module was designed specially for SM to overcome few problems in defining the rules for SM orthography system before it can be passed to the DSP module. The second phase is the Digital Signal Processing (DSP) which operated on the low-level process of the speech waveform generation. A developed an intelligible and adequately natural sounding formant-based speech synthesis system with a light and user-friendly Graphical User Interface (GUI) is introduced. A Standard Malay Language (SM) phoneme set and an inclusive set of phone database have been constructed carefully for this phone-based speech synthesizer. By applying the generative phonology, a comprehensive letter-to-sound (LTS) rules and a pronunciation lexicon have been invented for SMaTTS. As for the evaluation tests, a set of Diagnostic Rhyme Test (DRT) word list was compiled and several experiments have been performed to evaluate the quality of the synthesized speech by analyzing the Mean Opinion Score (MOS) obtained. The overall performance of the system as well as the room for improvements was thoroughly discussed.

Keywords: Natural Language Processing, Text-To-Speech (TTS), Diphone, source filter, low-/ high- level synthesis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1972

3 A Prevalence of Phonological Disorder in Children with Specific Language Impairment

Authors: Etim, Victoria Enefiok, Dada, Oluseyi Akintunde, Bassey Okon

Abstract:

Phonological disorder is a serious and disturbing issue to many parents and teachers. Efforts towards resolving the problem have been undermined by other specific disabilities which were hidden to many regular and special education teachers. It is against this background that this study was motivated to provide data on the prevalence of phonological disorders in children with specific language impairment (CWSLI) as the first step towards critical intervention. The study was a survey of 15 CWSLI from St. Louise Inclusive schools, Ikot Ekpene in Akwa Ibom State of Nigeria. Phonological Processes Diagnostic Scale (PPDS) with 17 short sentences, which cut across the five phonological processes that were examined, were validated by experts in test measurement, phonology and special education. The respondents were made to read the sentences with emphasis on the targeted sounds. Their utterances were recorded and analyzed in the language laboratory using Praat Software. Data were also collected through friendly interactions at different times from the clients. The theory of generative phonology was adopted for the descriptive analysis of the phonological processes. Data collected were analyzed using simple percentage and composite bar chart for better understanding of the result. The study found out that CWSLI exhibited the five phonological processes under investigation. It was revealed that 66.7%, 80%, 73.3%, 80%, and 86.7% of the respondents have severe deficit in fricative stopping, velar fronting, liquid gliding, final consonant deletion and cluster reduction, respectively. It was therefore recommended that a nationwide survey should be carried out to have national statistics of CWSLI with phonological deficits and develop intervention strategies for effective therapy to remediate the disorder.

Keywords: Language disorders, phonology, phonological processes, specific language impairment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1074

2 A Game-Based Product Modelling Environment for Non-Engineer

Authors: Guolong Zhong, Venkatesh Chennam Vijay, Ilias Oraifige

Abstract:

In the last 20 years, Knowledge Based Engineering (KBE) has shown its advantages in product development in different engineering areas such as automation, mechanical, civil and aerospace engineering in terms of digital design automation and cost reduction by automating repetitive design tasks through capturing, integrating, utilising and reusing the existing knowledge required in various aspects of the product design. However, in primary design stages, the descriptive information of a product is discrete and unorganized while knowledge is in various forms instead of pure data. Thus, it is crucial to have an integrated product model which can represent the entire product information and its associated knowledge at the beginning of the product design. One of the shortcomings of the existing product models is a lack of required knowledge representation in various aspects of product design and its mapping to an interoperable schema. To overcome the limitation of the existing product model and methodologies, two key factors are considered. First, the product model must have well-defined classes that can represent the entire product information and its associated knowledge. Second, the product model needs to be represented in an interoperable schema to ensure a steady data exchange between different product modelling platforms and CAD software. This paper introduced a method to provide a general product model as a generative representation of a product, which consists of the geometry information and non-geometry information, through a product modelling framework. The proposed method for capturing the knowledge from the designers through a knowledge file provides a simple and efficient way of collecting and transferring knowledge. Further, the knowledge schema provides a clear view and format on the data that needed to be gathered in order to achieve a unified knowledge exchange between different platforms. This study used a game-based platform to make product modelling environment accessible for non-engineers. Further the paper goes on to test use case based on the proposed game-based product modelling environment to validate the effectiveness among non-engineers.

Keywords: Game-based learning, knowledge based engineering, product modelling, design automation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 742

1 An AI-Generated Semantic Communication Platform in Human-Computer Interaction Course

Authors: Yi Yang, Jiasong Sun

Abstract:

Almost every aspect of our daily lives is now intertwined with some degree of Human-Computer Interaction (HCI). HCI courses draw on knowledge from disciplines as diverse as computer science, psychology, design principles, anthropology and more. The HCI courses in the Department of Electronics at Tsinghua University, known as the Media and Cognition course, is constantly updated to reflect the most advanced technological advances, such as virtual reality, augmented reality and artificial intelligence-based interaction. For more than a decade, this course has used an interest-based approach to teaching, in which students proactively propose some research-based questions and collaborate with teachers, using course knowledge to explore potential solutions. Semantic communication plays a key role in facilitating understanding and interaction between users and computer systems, ultimately enhancing system usability and user experience. The advancements in AI-generated technology, which has gained significant attention from both academia and industry in recent years, are exemplified by language models like GPT-3 that generate human-like dialogues from given prompts. The latest version of the HCI course practices a semantic communication platform based on AI-generated techniques. We explored a student-centered model and proposed an interest-based teaching method. Students are no longer just recipients of knowledge, but become active participants in the learning process driven by personal interests, thereby encouraging students to take responsibility for their own education. One of the latest results of this teaching approach in the course "Media and Cognition" is a student proposal to develop a semantic communication platform rooted in artificial intelligence generative technologies. The platform solves a key challenge in communications technology: the ability to preserve visual signals. The interest-based approach emphasizes personal curiosity and active participation, and the proposal of an artificial intelligence-generated semantic communication platform is an example and successful result of how students can exert greater creativity when they have the power to control their own learning.

Keywords: Human-computer interaction, media and cognition course, semantic communication, retain ability, prompts.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 164