Search results for: large language models (LLMS)
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 9592

9592 User Intention Generation with Large Language Models Using Chain-of-Thought Prompting

Authors: Gangmin Li, Fan Yang

Abstract:

Personalized recommendation is crucial for any recommendation system. One of the techniques for personalized recommendation is to identify the user's intention. Traditional user intention identification uses the user’s selection when facing multiple items. This modeling relies primarily on historical behaviour data, resulting in challenges such as the cold start, unintended choice, and failure to capture intention when items are new. Motivated by recent advancements in Large Language Models (LLMs) like ChatGPT, we present an approach for user intention identification by embracing LLMs with Chain-of-Thought (CoT) prompting. We use the initial user profile as input to LLMs and design a collection of prompts to align the LLM's response through various recommendation tasks encompassing rating prediction, search and browse history, user clarification, etc. Our tests on real-world datasets demonstrate improvements in recommendation when user intention is identified explicitly and merged into a user model.

Keywords: personalized recommendation, generative user modelling, user intention identification, large language models, chain-of-thought prompting

Procedia PDF Downloads 53
9591 Exploring Tweet Geolocation: Leveraging Large Language Models for Post-Hoc Explanations

Authors: Sarra Hasni, Sami Faiz

Abstract:

In recent years, location prediction on social networks has gained significant attention, with short and unstructured texts like tweets posing additional challenges. Advanced geolocation models have been proposed, increasing the need to explain their predictions. In this paper, we provide explanations for a geolocation black-box model using LIME and SHAP, two state-of-the-art XAI (eXplainable Artificial Intelligence) methods. We extend our evaluations to Large Language Models (LLMs) as post hoc explainers for tweet geolocation. Our preliminary results show that LLMs outperform LIME and SHAP by generating more accurate explanations. Additionally, we demonstrate that prompts with examples and meta-prompts containing phonetic spelling rules improve the interpretability of these models, even with informal input data. This approach highlights the potential of advanced prompt engineering techniques to enhance the effectiveness of black-box models in geolocation tasks on social networks.

Keywords: large language model, post hoc explainer, prompt engineering, local explanation, tweet geolocation

Procedia PDF Downloads 25
9590 Predictive Analysis of Chest X-rays Using NLP and Large Language Models with the Indiana University Dataset and Random Forest Classifier

Authors: Azita Ramezani, Ghazal Mashhadiagha, Bahareh Sanabakhsh

Abstract:

This study researches the combination of Random Forest classifiers with large language models (LLMs) and natural language processing (NLP) to improve diagnostic accuracy in chest X-ray analysis using the Indiana University dataset. Utilizing advanced NLP techniques, the research preprocesses textual data from radiological reports to extract key features, which are then merged with image-derived data. This improved dataset is analyzed with Random Forest classifiers to predict specific clinical results, focusing on the identification of health issues and the estimation of case urgency. The findings reveal that the combination of NLP, LLMs, and machine learning not only increases diagnostic precision but also reliability, especially in quickly identifying critical conditions. Achieving an accuracy of 99.35%, the model shows significant advancements over conventional diagnostic techniques. The results emphasize the large potential of machine learning in medical imaging, suggesting that these technologies could greatly enhance clinician judgment and patient outcomes by offering quicker and more precise diagnostic approximations.

Keywords: natural language processing (NLP), large language models (LLMs), random forest classifier, chest x-ray analysis, medical imaging, diagnostic accuracy, indiana university dataset, machine learning in healthcare, predictive modeling, clinical decision support systems

Procedia PDF Downloads 44
9589 Large Language Model Powered Chatbots Need End-to-End Benchmarks

Authors: Debarag Banerjee, Pooja Singh, Arjun Avadhanam, Saksham Srivastava

Abstract:

Autonomous conversational agents, i.e., chatbots, are becoming an increasingly common mechanism for enterprises to provide support to customers and partners. In order to rate chatbots, especially ones powered by Generative AI tools like Large Language Models (LLMs), we need to be able to accurately assess their performance. This is where chatbot benchmarking becomes important. In this paper, the authors propose the use of a benchmark that they call the E2E (End to End) benchmark and show how the E2E benchmark can be used to evaluate the accuracy and usefulness of the answers provided by chatbots, especially ones powered by LLMs. The authors evaluate an example chatbot at different levels of sophistication based on both the E2E benchmark and other available metrics commonly used in the state of the art and observe that the proposed benchmark shows better results compared to others. In addition, while some metrics proved to be unpredictable, the metric associated with the E2E benchmark, which uses cosine similarity, performed well in evaluating chatbots. The performance of the best models shows that there are several benefits of using the cosine similarity score as a metric in the E2E benchmark.
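The cosine similarity score mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' benchmark: the function names are hypothetical, and a simple bag-of-words count vector stands in for whatever sentence embedding the E2E benchmark actually uses, which the abstract does not specify.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Hypothetical stand-in for a sentence embedding: lowercase token counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty answer shares nothing with the reference
    return dot / (norm_a * norm_b)


def e2e_score(chatbot_answer, reference_answer):
    """Score a chatbot answer against a human-written reference answer."""
    return cosine_similarity(bag_of_words(chatbot_answer),
                             bag_of_words(reference_answer))


if __name__ == "__main__":
    reference = "open settings and choose reset password"
    print(e2e_score("choose reset password in settings", reference))
    print(e2e_score("our office is closed today", reference))
```

A score near 1 means the answer and the reference point in almost the same direction in the vector space, and a score of 0 means they share no terms. With real embeddings the same formula would also reward paraphrases, which token counts cannot do.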

Keywords: chatbot benchmarking, end-to-end (E2E) benchmarking, large language model, user centric evaluation

Procedia PDF Downloads 66
9588 Improving Student Programming Skills in Introductory Computer and Data Science Courses Using Generative AI

Authors: Genady Grabarnik, Serge Yaskolko

Abstract:

Generative Artificial Intelligence (AI) has significantly expanded its applicability with the incorporation of Large Language Models (LLMs) and has become a promising technology for automating areas that were previously very difficult to automate. The paper describes the introduction of generative AI into introductory computer and data science courses and analyzes the effect of that introduction. Generative AI is incorporated into the educational process in two ways. For the instructors, we create prompt templates for generating tasks and grading students' work, including feedback on submitted assignments. For the students, we introduce basic prompt engineering, which is then used for generating test cases from problem descriptions, generating code snippets for programming tasks of single-block complexity, and partitioning programming tasks of average complexity into such blocks. The above-mentioned classes are run using Large Language Models, and feedback from instructors and students, along with course outcomes, is collected. The analysis shows a statistically significant positive effect and a preference for the approach among both stakeholder groups.

Keywords: introductory computer and data science education, generative AI, large language models, application of LLMs to computer and data science education

Procedia PDF Downloads 58
9587 Project Progress Prediction in Software Development Integrating Time Prediction Algorithms and Large Language Modeling

Authors: Dong Wu, Michael Grenn

Abstract:

Managing software projects effectively is crucial for meeting deadlines, ensuring quality, and managing resources well. Traditional methods often struggle with predicting project timelines accurately due to uncertain schedules and complex data. This study addresses these challenges by combining time prediction algorithms with Large Language Models (LLMs). It makes use of real-world software project data to construct and validate a model. The model takes detailed project progress data such as task completion dynamics, team interaction, and development metrics as its input and outputs predictions of project timelines. To evaluate the effectiveness of this model, a comprehensive methodology is employed, involving simulations and practical applications in a variety of real-world software project scenarios. This multifaceted evaluation strategy is designed to validate the model's significant role in enhancing forecast accuracy and elevating overall management efficiency, particularly in complex software project environments. The results indicate that the integration of time prediction algorithms with LLMs has the potential to optimize software project progress management. These quantitative results suggest the effectiveness of the method in practical applications. In conclusion, this study demonstrates that integrating time prediction algorithms with LLMs can significantly improve the predictive accuracy and efficiency of software project management. This offers an advanced project management tool for the industry, with the potential to improve operational efficiency, optimize resource allocation, and ensure timely project completion.

Keywords: software project management, time prediction algorithms, large language models (LLMs), forecast accuracy, project progress prediction

Procedia PDF Downloads 79
9586 Domain-Specific Ontology-Based Knowledge Extraction Using R-GNN and Large Language Models

Authors: Andrey Khalov

Abstract:

The rapid proliferation of unstructured data in IT infrastructure management demands innovative approaches for extracting actionable knowledge. This paper presents a framework for ontology-based knowledge extraction that combines relational graph neural networks (R-GNN) with large language models (LLMs). The proposed method leverages the DOLCE framework as the foundational ontology, extending it with concepts from ITSMO for domain-specific applications in IT service management and outsourcing. A key component of this research is the use of transformer-based models, such as DeBERTa-v3-large, for automatic entity and relationship extraction from unstructured texts. Furthermore, the paper explores how transfer learning techniques can be applied to fine-tune large language models (LLaMA) to generate synthetic datasets that improve precision in BERT-based entity recognition and ontology alignment. The resulting IT Ontology (ITO) serves as a comprehensive knowledge base that integrates domain-specific insights from ITIL processes, enabling more efficient decision-making. Experimental results demonstrate significant improvements in knowledge extraction and relationship mapping, offering a cutting-edge solution for enhancing cognitive computing in IT service environments.

Keywords: ontology mapping, R-GNN, knowledge extraction, large language models, NER, knowledge graph

Procedia PDF Downloads 16
9585 Life Stage Customer Segmentation by Fine-Tuning Large Language Models

Authors: Nikita Katyal, Shaurya Uppal

Abstract:

This paper tackles the significant challenge of accurately classifying customers within a retailer’s customer base. Accurate classification is essential for developing targeted marketing strategies that effectively engage this important demographic. To address this issue, we propose a method that utilizes Large Language Models (LLMs). By employing LLMs, we analyze the metadata associated with product purchases derived from historical data to identify key product categories that act as distinguishing factors. These categories, such as baby food, eldercare products, or family-sized packages, offer valuable insights into the likely household composition of customers, including families with babies, families with kids/teenagers, families with pets, households caring for elders, or mixed households. We segment high-confidence customers into distinct categories by integrating historical purchase behavior with LLM-powered product classification. This paper asserts that life stage segmentation can significantly enhance e-commerce businesses’ ability to target the appropriate customers with tailored products and campaigns, thereby augmenting sales and improving customer retention. Additionally, the paper details the data sources, model architecture, and evaluation metrics employed for the segmentation task.

Keywords: LLMs, segmentation, product tags, fine-tuning, target segments, marketing communication

Procedia PDF Downloads 23
9584 Enhancing Large Language Models' Data Analysis Capability with Planning-and-Execution and Code Generation Agents: A Use Case for Southeast Asia Real Estate Market Analytics

Authors: Kien Vu, Jien Min Soh, Mohamed Jahangir Abubacker, Piyawut Pattamanon, Soojin Lee, Suvro Banerjee

Abstract:

Recent advances in Generative Artificial Intelligence (GenAI), in particular Large Language Models (LLMs), have shown promise to disrupt multiple industries at scale. However, LLMs also present unique challenges, notably so-called "hallucination", the generation of outputs that are not grounded in the input data, which hinders their adoption in production. A common practice for mitigating the hallucination problem is to use a Retrieval Augmented Generation (RAG) system to ground LLMs' responses in ground truth. RAG converts the grounding documents into embeddings, retrieves the relevant parts using vector similarity between the user's query and the documents, then generates a response that is based not only on its pre-trained knowledge but also on the specific information from the retrieved documents. However, the RAG system is not suitable for tabular data and subsequent data analysis tasks for multiple reasons, such as information loss, data format, and retrieval mechanism. In this study, we have explored a novel methodology that combines planning-and-execution and code generation agents to enhance LLMs' data analysis capabilities. The approach enables LLMs to autonomously dissect a complex analytical task into simpler sub-tasks and requirements, then convert them into executable segments of code. In the final step, it generates the complete response from the output of the executed code. When deployed as a beta version on DataSense, the property insight tool of PropertyGuru, the approach yielded promising results, as it was able to meet market insight and data visualization needs with high accuracy and extensive coverage by abstracting the complexities for real-estate agents and developers from non-programming backgrounds. In essence, the methodology not only refines the analytical process but also serves as a strategic tool for real estate professionals, aiding in market understanding and enhancement without the need for programming skills. The implication extends beyond immediate analytics, paving the way for a new era in the real estate industry characterized by efficiency and advanced data utilization.
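The control flow described in this abstract (decompose the task, generate code for each sub-task, execute it, compose a response) might look roughly like the Python sketch below. It is an illustration under strong assumptions, not the DataSense implementation: `plan` and `generate_code` are hypothetical stubs that return hard-coded values where a real agent would call an LLM, and the listing rows are toy data invented for the example.

```python
import statistics


def plan(question):
    """Hypothetical planner stub. A real agent would ask an LLM to decompose
    the question; here the sub-tasks are hard-coded for illustration."""
    return ["filter_by_district", "median_price"]


def generate_code(subtask):
    """Hypothetical code-generation stub. A real agent would ask an LLM to
    write this source; here it is looked up from a fixed table."""
    snippets = {
        "filter_by_district":
            "rows = [r for r in rows if r['district'] == district]",
        "median_price":
            "result = statistics.median(r['price'] for r in rows)",
    }
    return snippets[subtask]


def answer(question, rows, district):
    """Plan, generate code per sub-task, execute it, then compose a response."""
    # Shared namespace that carries intermediate results between sub-tasks.
    scope = {"rows": list(rows), "district": district, "statistics": statistics}
    for subtask in plan(question):
        # exec keeps the sketch short; generated code should be sandboxed
        # in any real deployment.
        exec(generate_code(subtask), scope)
    return f"Median price in {district}: {scope['result']}"


if __name__ == "__main__":
    toy_rows = [  # invented data, for illustration only
        {"district": "D1", "price": 100},
        {"district": "D1", "price": 300},
        {"district": "D1", "price": 200},
        {"district": "D2", "price": 900},
    ]
    print(answer("What is the median price in D1?", toy_rows, "D1"))
```

Because the table is processed by executed code and not by embedding retrieval, the arithmetic is done exactly on the full data, which is the advantage over RAG that the abstract claims for tabular analysis.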

Keywords: large language model, reasoning, planning and execution, code generation, natural language processing, prompt engineering, data analysis, real estate, data sense, PropertyGuru

Procedia PDF Downloads 87
9583 Cross-Dialect Sentence Transformation: A Comparative Analysis of Language Models for Adapting Sentences to British English

Authors: Shashwat Mookherjee, Shruti Dutta

Abstract:

This study explores linguistic distinctions among American, Indian, and Irish English dialects and assesses various Large Language Models (LLMs) in their ability to generate British English translations from these dialects. Using cosine similarity analysis, the study measures the linguistic proximity between original British English translations and those produced by LLMs for each dialect. The findings reveal that Indian and Irish English translations maintain notably high similarity scores, suggesting strong linguistic alignment with British English. In contrast, American English exhibits slightly lower similarity, reflecting its distinct linguistic traits. Additionally, the choice of LLM significantly impacts translation quality, with Llama-2-70b consistently demonstrating superior performance. The study underscores the importance of selecting the right model for dialect translation, emphasizing the role of linguistic expertise and contextual understanding in achieving accurate translations.

Keywords: cross-dialect translation, language models, linguistic similarity, multilingual NLP

Procedia PDF Downloads 75
9582 Translation Training in the AI Era

Authors: Min Gao

Abstract:

In the past year, the advent of large language models (LLMs) has brought about a revolution in the language service industry, making it possible to efficiently produce more satisfactory and higher-quality translations. This is groundbreaking news for commercial companies involved in language services since much of a translator's work can now be completed by machines. However, it may be bad news for universities that provide translation training programs. They need to confront the challenges posed by AI in education by reconsidering issues such as the reform of traditional teaching methods, the translation ethics of students, and the new demands of the job market for their graduates. This article is an exploratory study of these issues based on the author's experiences in translation teaching. The research combines methods in the form of questionnaires and interviews. The findings include: (1) students may lose their motivation to learn in the AI era, but this can be compensated for by encouragement from the lecturer; (2) translation ethics are not a serious problem in schools, considering the strict policies and regulations in place; (3) the role of translators has evolved in the new era, necessitating a reform of the traditional teaching methods.

Keywords: job market of translation, large language model, translation ethics, translation training

Procedia PDF Downloads 68
9581 A Large Language Model-Driven Method for Automated Building Energy Model Generation

Authors: Yake Zhang, Peng Xu

Abstract:

The development of building energy models (BEM) required for architectural design and analysis is a time-consuming and complex process, demanding a deep understanding and proficient use of simulation software. To streamline the generation of complex building energy models, this study proposes an automated method for generating building energy models using a large language model and the BEM library aimed at improving the efficiency of model generation. This method leverages a large language model to parse user-specified requirements for target building models, extracting key features such as building location, window-to-wall ratio, and thermal performance of the building envelope. The BEM library is utilized to retrieve energy models that match the target building’s characteristics, serving as reference information for the large language model to enhance the accuracy and relevance of the generated model, allowing for the creation of a building energy model that adapts to the user’s modeling requirements. This study enables the automatic creation of building energy models based on natural language inputs, reducing the professional expertise required for model development while significantly decreasing the time and complexity of manual configuration. In summary, this study provides an efficient and intelligent solution for building energy analysis and simulation, demonstrating the potential of a large language model in the field of building simulation and performance modeling.

Keywords: artificial intelligence, building energy modelling, building simulation, large language model

Procedia PDF Downloads 26
9580 Coupling Large Language Models with Disaster Knowledge Graphs for Intelligent Construction

Authors: Zhengrong Wu, Haibo Yang

Abstract:

In the context of escalating global climate change and environmental degradation, the complexity and frequency of natural disasters are continually increasing. Confronted with an abundance of information regarding natural disasters, traditional knowledge graph construction methods, which heavily rely on grammatical rules and prior knowledge, demonstrate suboptimal performance in processing complex, multi-source disaster information. This study, drawing upon past natural disaster reports, disaster-related literature in both English and Chinese, and data from various disaster monitoring stations, constructs question-answer templates based on large language models. Utilizing the P-Tune method, the ChatGLM2-6B model is fine-tuned, leading to the development of a disaster knowledge graph based on large language models. This serves as a knowledge database support for disaster emergency response.

Keywords: large language model, knowledge graph, disaster, deep learning

Procedia PDF Downloads 56
9579 TutorBot+: Automatic Programming Assistant with Positive Feedback based on LLMs

Authors: Claudia Martínez-Araneda, Mariella Gutiérrez, Pedro Gómez, Diego Maldonado, Alejandra Segura, Christian Vidal-Castro

Abstract:

The purpose of this document is to showcase the preliminary work in developing an EduChatbot-type tool and measuring the effects of its use aimed at providing effective feedback to students in programming courses. This bot, hereinafter referred to as tutorBot+, was constructed based on ChatGPT and is tasked with assisting and delivering timely positive feedback to students in the field of computer science at the Universidad Católica de Concepción. The proposed working method consists of four stages: (1) Immersion in the domain of Large Language Models (LLMs), (2) Development of the tutorBot+ prototype and integration, (3) Experiment design, and (4) Intervention. The first stage involves a literature review on the use of artificial intelligence in education and the evaluation of intelligent tutors, as well as research on types of feedback for learning and the domain of ChatGPT. The second stage encompasses the development of tutorBot+, and the final stage involves a quasi-experimental study with students from the Programming and Database labs, where the learning outcome involves the development of computational thinking skills, enabling the use and measurement of the tool's effects. The preliminary results of this work are promising, as a functional chatbot prototype has been developed in both conversational and non-conversational versions integrated into an open-source online judge and programming contest platform system. There is also an exploration of the possibility of generating a custom model based on a pre-trained one tailored to the domain of programming. This includes the integration of the created tool and the design of the experiment to measure its utility.

Keywords: assessment, ChatGPT, learning strategies, LLMs, timely feedback

Procedia PDF Downloads 68
9578 Models and Metamodels for Computer-Assisted Natural Language Grammar Learning

Authors: Evgeny Pyshkin, Maxim Mozgovoy, Vladislav Volkov

Abstract:

The paper follows a discourse on computer-assisted language learning. We examine problems of foreign language teaching and learning and introduce a metamodel that can be used to define learning models of language grammar structures in order to support teacher/student interaction. Special attention is paid to the concept of a virtual language lab. Our approach to language education encourages learners to experiment with a language and to learn by discovering patterns of grammatically correct structures created and managed by a language expert.

Keywords: computer-assisted instruction, language learning, natural language grammar models, HCI

Procedia PDF Downloads 519
9577 Comparison Analysis of Multi-Channel Echo Cancellation Using Adaptive Filters

Authors: Sahar Mobeen, Anam Rafique, Irum Baig

Abstract:

Multichannel acoustic echo cancellation is a system identification application. In a real-time environment, the signal changes very rapidly, which requires adaptive algorithms that are stable and converge quickly, such as Least Mean Square (LMS), Leaky Least Mean Square (LLMS), Normalized Least Mean Square (NLMS), and average (AFA) algorithms. LMS and NLMS are widely used adaptive algorithms because of their low computational complexity, while AFA is used for its high convergence rate. This research compares the cancellation of acoustic echo (generated in a room) through LMS, LLMS, NLMS, AFA, and the newly proposed average normalized leaky least mean square (ANLLMS) adaptive filters.
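The three standard update rules named in this abstract differ only in how the weight vector is adjusted at each sample, which a short Python sketch can make concrete. This is a single-channel textbook illustration, not the paper's multichannel setup: the function names, the white-noise input, and the three-tap echo path are all assumptions, and AFA and the proposed ANLLMS are not implemented because the abstract does not define their update equations.

```python
import random


def adaptive_filter(x, d, taps, mu, leak=0.0, normalized=False, eps=1e-8):
    """Run an LMS-family adaptive filter and return (weights, errors).

    leak=0, normalized=False -> standard LMS
    leak>0                   -> leaky LMS (weights shrink slightly each step)
    normalized=True          -> NLMS (step divided by input power)
    """
    w = [0.0] * taps
    errors = []
    for n in range(len(x)):
        # Newest-first window of the last `taps` inputs, zero-padded at the start.
        u = [x[n - k] if n >= k else 0.0 for k in range(taps)]
        y = sum(wk * uk for wk, uk in zip(w, u))  # echo estimate
        e = d[n] - y                              # residual echo
        step = mu / (eps + sum(uk * uk for uk in u)) if normalized else mu
        w = [(1.0 - mu * leak) * wk + step * e * uk for wk, uk in zip(w, u)]
        errors.append(e)
    return w, errors


def simulate_echo(echo_path, length, seed=0):
    """Toy data: white-noise far-end signal and its echo through a known FIR path."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(length)]
    d = [sum(h * x[n - k] for k, h in enumerate(echo_path) if n >= k)
         for n in range(length)]
    return x, d


if __name__ == "__main__":
    echo_path = [0.6, -0.3, 0.1]  # assumed room response, illustration only
    x, d = simulate_echo(echo_path, 2000)
    for name, kwargs in [("LMS", {"mu": 0.05}),
                         ("leaky LMS", {"mu": 0.05, "leak": 0.01}),
                         ("NLMS", {"mu": 0.5, "normalized": True})]:
        w, _ = adaptive_filter(x, d, taps=len(echo_path), **kwargs)
        print(name, [round(wk, 3) for wk in w])
```

In this noise-free toy setting, LMS and NLMS weights should approach the assumed echo path, while the leaky variant settles slightly short of it because the leakage term pulls the weights toward zero in exchange for better numerical stability.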

Keywords: LMS, LLMS, NLMS, AFA, ANLLMS

Procedia PDF Downloads 566
9576 Effect of Large English Studies Classes on Linguistic Achievement and Classroom Discourse at Junior Secondary Level in Yobe State

Authors: Clifford Irikefe Gbeyonron

Abstract:

Applied linguists concur that there is low-level achievement in English language use among Nigerian secondary school students. One of the factors that exacerbate this is classroom features, of which large class size is an obvious one. This study investigated the impact of large classes on learning English as a second language (ESL) at junior secondary school (JSS) in Yobe State. To achieve this, a Solomon four-group experimental design was used. 382 subjects were divided into four groups and taught ESL for thirteen weeks. 356 subjects wrote the post-test. Data from the systematic observation and post-test were analyzed via chi-square and ANOVA. Results indicated that learners in large classes (LLC) attain lower linguistic progress than learners in small classes (LSC). Furthermore, LSC have more chances to access teacher evaluation and participate actively in classroom discourse than LLC. In consequence, large classes have adverse effects on learning ESL in Yobe State. This is inimical to English language education given that each learner of ESL has their individual peculiarity within each class. It is recommended that strategies that prioritize individualization, grouping, use of language teaching aids, and theorization of innovative models in respect of large classes be considered.

Keywords: large classes, achievement, classroom discourse

Procedia PDF Downloads 409
9575 The Content-Based Classroom: Perspectives on Integrating Language and Content

Authors: Mourad Ben Bennani

Abstract:

Views of language and language learning have undergone a tremendous change over the last decades. Language is no longer seen as a set of structured rules. It is rather viewed as a tool of interaction and communication. This shift in views has resulted in a change in how language learning is viewed, which gave birth to various approaches and methodologies of language teaching. Two of these approaches are content-based instruction (CBI) and content and language integrated learning (CLIL). These are similar approaches which integrate content and foreign/second language learning through various methodologies and models as a result of different implementations around the world. This presentation deals with the sociocultural view of CBI and CLIL. It also defines language and content as vital components of CBI and CLIL. Next, it reviews the origins of CBI and the continuum perspectives and CLIL definitions and models featured in the literature. Finally, it summarizes current aspects around research in program evaluation with a focus on the benefits and challenges of these innovative approaches for second language teaching.

Keywords: CBI, CLIL, CBI continuum, CLIL models

Procedia PDF Downloads 435
9574 Dual Language Immersion Models in Theory and Practice

Authors: S. Gordon

Abstract:

Dual language immersion is growing fast in language teaching today. This study provides an overview and evaluation of the different models of Dual language immersion programs in US K-12 schools. First, the paper provides a brief current literature review on the theory of Dual Language Immersion (DLI) in Second Language Acquisition (SLA) studies. Second, examples of several types of DLI language teaching models in US K-12 public schools are presented (including 50/50 models, 90/10 models, etc.). Third, we focus on the unique example of DLI education in the state of Utah, a successful, growing program in K-12 schools that includes: French, Chinese, Spanish, and Portuguese. The project investigates the theory and practice particularly of the case of public elementary and secondary school children that study half their school day in the L1 and the other half in the chosen L2, from kindergarten (age 5-6) through high school (age 17-18). Finally, the project takes the observations of Utah French DLI elementary through secondary programs as a case study. To conclude, we look at the principal challenges, pedagogical objectives and outcomes, and important implications for other US states and other countries (such as France currently) that are in the process of developing similar language learning programs.

Keywords: dual language immersion, second language acquisition, language teaching, pedagogy, teaching, French

Procedia PDF Downloads 175
9573 Bridging the Data Gap for Sexism Detection in Twitter: A Semi-Supervised Approach

Authors: Adeep Hande, Shubham Agarwal

Abstract:

This paper presents a study on identifying sexism in online texts using various state-of-the-art deep learning models based on BERT. We experimented with different feature sets and model architectures and evaluated their performance using precision, recall, F1 score, and accuracy metrics. We also explored the use of a pseudolabeling technique to improve model performance. Our experiments show that the best-performing models were based on BERT, and the multilingual model achieved an F1 score of 0.83. Furthermore, the use of pseudolabeling significantly improved the performance of the BERT-based models, with the best results achieved using the pseudolabeling technique. Our findings suggest that BERT-based models with pseudolabeling hold great promise for identifying sexism in online texts with high accuracy.
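Pseudolabeling, the semi-supervised technique this abstract relies on, can be sketched in a few lines of Python. The sketch below is a generic self-training loop, not the authors' pipeline: a toy nearest-centroid classifier stands in for BERT, the 2-D feature points stand in for text representations, and the function names, confidence score, threshold, and round count are all assumptions made for illustration.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length feature tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def train(examples):
    """Fit a nearest-centroid classifier: one mean vector per label."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(pts) for label, pts in by_label.items()}


def predict(model, features):
    """Return (label, confidence) using a distance-based confidence score."""
    dists = {label: sum((a - b) ** 2 for a, b in zip(features, c)) ** 0.5
             for label, c in model.items()}
    best = min(dists, key=dists.get)
    total = sum(dists.values())
    confidence = 1.0 - dists[best] / total if total > 0 else 1.0
    return best, confidence


def pseudo_label(labeled, unlabeled, threshold=0.8, rounds=3):
    """Self-training: promote confident predictions to training labels."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        model = train(labeled)
        confident, remaining = [], []
        for features in pool:
            label, conf = predict(model, features)
            (confident if conf >= threshold else remaining).append((features, label))
        if not confident:
            break  # nothing new to learn from
        labeled.extend(confident)
        pool = [features for features, _ in remaining]
    return train(labeled), labeled


if __name__ == "__main__":
    seed = [((0.0, 0.0), "not_sexist"), ((4.0, 4.0), "sexist")]  # toy labels
    pool = [(0.5, 0.2), (3.8, 4.1), (2.0, 2.0), (0.1, 0.4), (4.2, 3.7)]
    model, expanded = pseudo_label(seed, pool)
    print(len(expanded), "training examples after pseudolabeling")
```

The confidence threshold is the key design choice: points the model is unsure about stay unlabeled, so mistakes made on ambiguous inputs are not fed back into training.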

Keywords: large language models, semi-supervised learning, sexism detection, data sparsity

Procedia PDF Downloads 70
9572 Harnessing the Power of Large Language Models in Orthodontics: AI-Generated Insights on Class II and Class III Orthopedic Appliances: A Cross-Sectional Study

Authors: Laiba Amin, Rashna H. Sukhia, Mubassar Fida

Abstract:

Introduction: This study evaluates the accuracy of responses from ChatGPT, Google Bard, and Microsoft Copilot regarding dentofacial orthopedic appliances. As artificial intelligence (AI) increasingly enhances various fields, including healthcare, understanding its reliability in specialized domains like orthodontics becomes crucial. By comparing the accuracy of different AI models, this study aims to shed light on their effectiveness and potential limitations in providing technical insights. Materials and Methods: A total of 110 questions focused on dentofacial orthopedic appliances were posed to each AI model. The responses were then evaluated by five experienced orthodontists using a modified 5-point Likert scale to ensure a thorough assessment of accuracy. This structured approach allowed for consistent and objective rating, facilitating a meaningful comparison between the AI systems. Results: The results revealed that Google Bard demonstrated the highest accuracy at 74%, followed by Microsoft Copilot, with an accuracy of 72.2%. In contrast, ChatGPT was found to be the least accurate, achieving only 52.2%. These results highlight significant differences in the performance of the AI models when addressing orthodontic queries. Conclusions: Our study highlights the need for caution in relying on AI for orthodontic insights. The overall accuracy of the three chatbots was 66%, with Google Bard performing best for removable Class II appliances. Microsoft Copilot was more accurate than ChatGPT, which, despite its popularity, was the least accurate. This variability emphasizes the importance of human expertise in interpreting AI-generated information. Further research is necessary to improve the reliability of AI models in specialized healthcare settings.

Keywords: artificial intelligence, large language models, orthodontics, dentofacial orthopaedic appliances, accuracy assessment

Procedia PDF Downloads 8
9571 Recurrent Neural Networks with Deep Hierarchical Mixed Structures for Chinese Document Classification

Authors: Zhaoxin Luo, Michael Zhu

Abstract:

In natural languages, there are always complex semantic hierarchies. Obtaining the feature representation based on these complex semantic hierarchies becomes the key to the success of the model. Several RNN models have recently been proposed to use latent indicators to obtain the hierarchical structure of documents. However, a model that uses only a single-layer latent indicator cannot capture the true hierarchical structure of the language, especially a complex language like Chinese. In this paper, we propose a deep layered model that stacks arbitrarily many RNN layers equipped with latent indicators. After using EM and training it hierarchically, our model solves the computational problem of stacking RNN layers and makes it possible to stack arbitrarily many RNN layers. Our deep hierarchical model not only achieves comparable results to large pre-trained models on the Chinese short text classification problem but also achieves state-of-the-art results on the Chinese long text classification problem.

Keywords: natural language processing, recurrent neural network, hierarchical structure, document classification, Chinese

Procedia PDF Downloads 68
9570 Language Switching Errors of Bilinguals: Role of Top down and Bottom up Process

Authors: Numra Qayyum, Samina Sarwat, Noor ul Ain

Abstract:

Bilingual speakers can generally speak both languages with the same competency without intentionally mixing them or making mistakes, but errors in language selection sometimes occur. This quantitative study deals with the language errors made by Urdu-English bilinguals. The researchers give special attention to the part played by bottom-up priming and top-down cognitive control in these errors. Unstable Urdu-English bilingual participants named pictures and were prompted to shift from one language to another under time pressure. Different situations were given to manipulate the participants. Long and short runs of same-language trials were also given before switching to another language. The study concludes that bilinguals made more errors when switching to their first language from their second language, and that these errors were especially numerous when a speaker switched from L2 (second language) to L1 (first language) after a long run. When the switching was reversed, i.e., from L1 to L2, it had no effect at all. These results clearly attribute these errors to top-down cognitive control.

Keywords: bottom up priming, language error, language switching, top down cognitive control

Procedia PDF Downloads 137
9569 A Comparative Study of Approaches in User-Centred Health Information Retrieval

Authors: Harsh Thakkar, Ganesh Iyer

Abstract:

In this paper, we survey various user-centered or context-based biomedical health information retrieval systems. We present and discuss the performance of systems submitted to CLEF eHealth 2014 Task 3 for this purpose. We classify and compare the two most prevalent retrieval models in biomedical information retrieval, namely the Language Model (LM) and the Vector Space Model (VSM). We also report on the effectiveness of using external medical resources and ontologies such as MeSH, MetaMap, and UMLS. We observed that the LM-based retrieval systems outperform VSM-based systems on various fronts. From the results, we conclude that the state-of-the-art system scores were 0.4146 for MAP, 0.7560 for P@10, and 0.7445 for NDCG@10. All of these scores were reported by systems built on language modeling approaches.
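
For readers unfamiliar with the three reported measures, the following is a minimal sketch of how P@k, average precision (whose mean over queries gives MAP), and NDCG@k are commonly computed. It is an illustration only, not the CLEF eHealth evaluation tooling: the function names are hypothetical, and the ideal ranking for NDCG is assumed to come from the retrieved list alone, whereas an official evaluation would normally use all judged documents.

```python
import math

def precision_at_k(relevances, k):
    """Fraction of the top-k results that are relevant (relevance > 0)."""
    return sum(1 for r in relevances[:k] if r > 0) / k

def average_precision(relevances):
    """Mean of the precision values at each rank where a relevant result occurs.

    Assumption: every relevant document appears in `relevances`.
    """
    hits = 0
    total = 0.0
    for rank, r in enumerate(relevances, start=1):
        if r > 0:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def mean_average_precision(runs):
    """MAP: average precision averaged over several queries."""
    return sum(average_precision(r) for r in runs) / len(runs)

def dcg_at_k(relevances, k):
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(r / math.log2(rank + 1)
               for rank, r in enumerate(relevances[:k], start=1))

def ndcg_at_k(relevances, k):
    """DCG normalised by the DCG of the ideally sorted list."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal else 0.0
```

Each input list holds the relevance grades of one query's results in ranked order, for example `precision_at_k([1, 0, 1, 0], 2)` for a query whose first and third results are relevant.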

Keywords: clinical document retrieval, concept-based information retrieval, query expansion, language models, vector space models

Procedia PDF Downloads 320
9568 Transportation Language Register as One of Language Community

Authors: Diyah Atiek Mustikawati

Abstract:

Language register refers to a variety of a language used for a particular purpose or in a particular social setting. Language register also means adapting one’s use of language to conform to the standards or traditions of a given professional or social situation. This descriptive study discusses the forms of language register in transportation, the factors behind its use, and its functions. Language register in transportation mostly uses short sentences in an informal register. The factors behind the register are the speaker, word choice, and language background. The functions of language register in transportation are to ease communication among crew members and to maintain safety in adverse conditions. Transportation language register developed naturally as one variety of language use.

Keywords: language register, language variety, communication, transportation

Procedia PDF Downloads 487
9567 A Mathematical Agent-Based Model to Examine Two Patterns of Language Change

Authors: Gareth Baxter

Abstract:

We use a mathematical model of language change to examine two recently observed patterns of language change: one in which most speakers change gradually, following the mean of the community change, and one in which most individuals use predominantly one variant or another, and change rapidly if they change at all. The model is based on Croft’s Utterance Selection account of language change, which views language change as an evolutionary process, in which different variants (different ‘ways of saying the same thing’) compete for usage in a population of speakers. Language change occurs when a new variant replaces an older one as the convention within a given population. The present model extends a previous simpler model to include effects related to speaker aging and interspeaker variation in behaviour. The two patterns of individual change (one more centralized and the other more polarized) were recently observed in historical language changes, and it was further observed that slower changes were more associated with the centralized pattern, while quicker changes were more polarized. Our model suggests that the two patterns of change can be explained by different balances between the preference of speakers to use one variant over another and the degree of accommodation to (propensity to adapt towards) other speakers. The correlation with the rate of change appears naturally in our model, and results from the fact that both differential weighting of variants and the degree of accommodation affect the time for change to occur, while also determining the patterns of change. This work represents part of an ongoing effort to examine phenomena in language change through the use of mathematical models. This offers another way to evaluate qualitative explanations that cannot be practically tested (or cannot be tested at all) in a real-world, large-scale speech community.
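
The interplay between variant preference and accommodation described above can be illustrated with a toy simulation. This is a loose sketch in the spirit of utterance-selection models, not the authors' model: all names and parameter values are hypothetical, the update rule is an assumed simple form, and speaker aging and interspeaker variation are left out.

```python
import random

def biased(freq, bias):
    """Reweight an observed frequency in favour of the new variant (bias > 0)."""
    return (1 + bias) * freq / (1 + bias * freq)

def interact(freqs, accommodation, bias, rate, tokens, rng):
    """Two random speakers talk; each shifts usage towards what was heard."""
    i, j = rng.sample(range(len(freqs)), 2)
    # Each speaker produces `tokens` utterances; record the share using the new variant.
    said = {s: sum(rng.random() < freqs[s] for _ in range(tokens)) / tokens
            for s in (i, j)}
    for me, other in ((i, j), (j, i)):
        # Accommodation sets how much weight the other speaker's usage receives.
        heard = (1 - accommodation) * said[me] + accommodation * said[other]
        freqs[me] = (1 - rate) * freqs[me] + rate * biased(heard, bias)

def simulate(n_speakers=20, steps=2000, accommodation=0.5, bias=0.1,
             rate=0.05, tokens=10, seed=0):
    """Return each speaker's final probability of using the new variant."""
    rng = random.Random(seed)
    freqs = [0.05] * n_speakers  # assumed starting point: the new variant is rare
    for _ in range(steps):
        interact(freqs, accommodation, bias, rate, tokens, rng)
    return freqs
```

Varying `bias` against `accommodation` in such a sketch is one way to explore how individual trajectories might cluster around the community mean or split apart; the outcome depends on the assumed update rule.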

Keywords: agent based modeling, cultural evolution, language change, social behavior modeling, social influence

Procedia PDF Downloads 235
9566 Recurrent Patterns of Netspeak among Selected Nigerians on WhatsApp Platform: A Quest for Standardisation

Authors: Lily Chimuanya, Esther Ajiboye, Emmanuel Uba

Abstract:

One of the consequences of online communication is the birth of new orthography genres characterised by novel conventions of abbreviation and acronyms usually referred to as Netspeak. Netspeak, also known as internet slang, is a style of writing mainly used in online communication to limit the length of text characters and to save time. The aim of this study is to evaluate how second language users of the English language have internalised this new convention of writing; identify the recurrent patterns of Netspeak; and assess the consistency of the use of the identified patterns in relation to their meanings. The study is corpus-based, and data drawn from WhatsApp chat pages of selected groups of Nigerian English speakers show a large occurrence of inconsistencies in the patterns of Netspeak and their meanings. The study argues that rather than emphasise the negative impact of Netspeak on the communicative competence of second language users, studies should focus on suggesting models as yardsticks for standardising the usage of Netspeak and indeed all other emerging language conventions resulting from online communication. This stance stems from the inevitable global language transformation that is imminent with the coming of age of information technology.

Keywords: abbreviation, acronyms, Netspeak, online communication, standardisation

Procedia PDF Downloads 391
9565 Diagonal Vector Autoregressive Models and Their Properties

Authors: Usoro Anthony E., Udoh Emediong

Abstract:

Diagonal Vector Autoregressive Models are special classes of the general vector autoregressive models identified under certain conditions, where parameters are restricted to the diagonal elements in the coefficient matrices. Variance, autocovariance, and autocorrelation properties of the upper and lower diagonal VAR models are derived. The new set of VAR models is verified with empirical data and is found to perform favourably with the general VAR models. The advantage of the diagonal models over the existing models is that the new models are parsimonious, given the reduction in the interactive coefficients of the general VAR models.
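
The parsimony argument can be made concrete with a small sketch. It assumes the simplest case, a first-order model whose coefficient matrix is restricted to the main diagonal, so that each series depends only on its own previous value. The upper and lower diagonal variants mentioned in the abstract are not reproduced, and all function names are hypothetical.

```python
import random

def coefficient_count(k, p, diagonal):
    """Autoregressive coefficients in a k-variable VAR(p): k*k*p in general, k*p if diagonal."""
    return k * p if diagonal else k * k * p

def simulate_diagonal_var1(phis, n_steps, seed=0):
    """Simulate x_t[i] = phis[i] * x_{t-1}[i] + e_t[i] with standard normal noise."""
    rng = random.Random(seed)
    state = [0.0] * len(phis)
    series = [state[:]]
    for _ in range(n_steps):
        state = [phi * x + rng.gauss(0.0, 1.0) for phi, x in zip(phis, state)]
        series.append(state[:])
    return series

def fit_diagonal_var1(series):
    """Least-squares estimate of each diagonal coefficient, one series at a time."""
    estimates = []
    for i in range(len(series[0])):
        num = sum(series[t][i] * series[t - 1][i] for t in range(1, len(series)))
        den = sum(series[t - 1][i] ** 2 for t in range(1, len(series)))
        estimates.append(num / den)
    return estimates
```

For three variables and two lags, `coefficient_count(3, 2, False)` gives 18 coefficients against 6 for the diagonal restriction, which is the reduction in interactive coefficients that the abstract refers to.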

Keywords: VAR models, diagonal VAR models, variance, autocovariance, autocorrelations

Procedia PDF Downloads 116
9564 Efficient Layout-Aware Pretraining for Multimodal Form Understanding

Authors: Armineh Nourbakhsh, Sameena Shah, Carolyn Rose

Abstract:

Layout-aware language models have been used to create multimodal representations for documents that are in image form, achieving relatively high accuracy in document understanding tasks. However, the large number of parameters in the resulting models makes building and using them prohibitive without access to high-performing processing units with large memory capacity. We propose an alternative approach that can create efficient representations without the need for a neural visual backbone. This leads to an 80% reduction in the number of parameters compared to the smallest SOTA model, widely expanding applicability. In addition, our layout embeddings are pre-trained on spatial and visual cues alone and only fused with text embeddings in downstream tasks, which can facilitate applicability to low-resource or multi-lingual domains. Despite using 2.5% of training data, we show competitive performance on two form understanding tasks: semantic labeling and link prediction.

Keywords: layout understanding, form understanding, multimodal document understanding, bias-augmented attention

Procedia PDF Downloads 148
9563 Exploring Teachers’ Beliefs about Diagnostic Language Assessment Practices in a Large-Scale Assessment Program

Authors: Oluwaseun Ijiwade, Chris Davison, Kelvin Gregory

Abstract:

In Australia, as in other parts of the world, the debate on how to enhance teachers’ use of assessment data to inform the teaching and learning of English as an Additional Language (EAL, Australia) or English as a Foreign Language (EFL, United States) has occupied the centre of academic scholarship. Traditionally, this approach was conceptualised as ‘Formative Assessment’ and, in recent times, ‘Assessment for Learning (AfL)’. The central problem is that teacher-made tests are limited in providing data that can inform teaching and learning due to the variability of classroom assessments, which are hindered by teachers’ characteristics and assessment literacy. To address this concern, scholars in language education and testing have proposed a uniform large-scale computer-based assessment program to meet the needs of teachers and promote AfL in language education. In Australia, for instance, the Victoria state government commissioned a large-scale project called 'Tools to Enhance Assessment Literacy (TEAL) for Teachers of English as an additional language'. As part of the TEAL project, a tool called ‘Reading and Vocabulary assessment for English as an Additional Language (RVEAL)’, a diagnostic language assessment (DLA), was developed by language experts at the University of New South Wales for teachers in Victorian schools to guide EAL pedagogy in the classroom. Therefore, this study aims to provide qualitative evidence for understanding beliefs about the diagnostic language assessment (DLA) among EAL teachers in primary and secondary schools in Victoria, Australia. To realize this goal, this study raises the following questions: (a) How do teachers use large-scale assessment data for diagnostic purposes? (b) What skills do language teachers think are necessary for using assessment data for instruction in the classroom? and (c) What factors, if any, contribute to teachers’ beliefs about diagnostic assessment in a large-scale assessment?
A semi-structured interview method was used to collect data from at least 15 professional teachers who were selected through purposeful sampling. The findings from the resulting data analysis (thematic analysis) provide an understanding of teachers’ beliefs about DLA in a classroom context and identify how these beliefs are crystallised in language teachers. The discussion shows how the findings can be used to inform professional development processes for language teachers and to highlight teacher cognition as an important factor in the pedagogic processes of language assessment. This, hopefully, will help test developers and testing organisations to align their test development processes with the outcome of this study and design assessments that can enhance AfL in language education.

Keywords: beliefs, diagnostic language assessment, English as an additional language, teacher cognition

Procedia PDF Downloads 199