Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 15672

Search results for: large language models

15642 Challenges of Teaching English Language in Polytechnics

Abstract:

The 21st century is marked by increased industrialization and a great spurt of technical institutes in almost all parts of the country. In this changing scenario, teaching English language to the students of polytechnic institutes, situated in the small towns of the country is a great challenge as well as responsibility. The learners have very strong vernacular roots and their adaptation to the English language is really slow, as a result teaching English language to them is a herculean task. The students of polytechnics get admission despite of low grades, the base of English has to be prepared at the plus two level, the influence of the local language looms large and the reluctance to learn the English language is obvious. However, the needs of the industries have to be kept in mind and the prospective engineers have to be taught the language. There is an urgent need to devise new ways of teaching the language keeping in mind the requirements of the industry, the capability of the students and maintaining the sanctity of the language. A way has to be carved out.

Keywords: industrialization, herculean, prospective, sanctity, vernacular

Procedia PDF Downloads 414

15641 Project Progress Prediction in Software Devlopment Integrating Time Prediction Algorithms and Large Language Modeling

Authors: Dong Wu, Michael Grenn

Abstract:

Managing software projects effectively is crucial for meeting deadlines, ensuring quality, and managing resources well. Traditional methods often struggle with predicting project timelines accurately due to uncertain schedules and complex data. This study addresses these challenges by combining time prediction algorithms with Large Language Models (LLMs). It makes use of real-world software project data to construct and validate a model. The model takes detailed project progress data such as task completion dynamic, team Interaction and development metrics as its input and outputs predictions of project timelines. To evaluate the effectiveness of this model, a comprehensive methodology is employed, involving simulations and practical applications in a variety of real-world software project scenarios. This multifaceted evaluation strategy is designed to validate the model's significant role in enhancing forecast accuracy and elevating overall management efficiency, particularly in complex software project environments. The results indicate that the integration of time prediction algorithms with LLMs has the potential to optimize software project progress management. These quantitative results suggest the effectiveness of the method in practical applications. In conclusion, this study demonstrates that integrating time prediction algorithms with LLMs can significantly improve the predictive accuracy and efficiency of software project management. This offers an advanced project management tool for the industry, with the potential to improve operational efficiency, optimize resource allocation, and ensure timely project completion.

Keywords: software project management, time prediction algorithms, large language models (LLMS), forecast accuracy, project progress prediction

Procedia PDF Downloads 52

15640 Transportation Language Register as One of Language Community

Authors: Diyah Atiek Mustikawati

Abstract:

Language register refers to a variety of a language used for particular purpose or in a particular social setting. Language register also means as a concept of adapting one’s use of language to conform to standards or tradition in a given professional or social situation. This descriptive study tends to discuss about the form of language register in transportation aspect, factors, also the function of use it. Mostly, language register in transportation aspect uses short sentences in form of informal register. The factor caused language register used are speaker, word choice, background of language. The functions of language register in transportations aspect are to make communication between crew easily, also to keep safety when they were in bad condition. Transportation language register developed naturally as one of variety of language used.

Keywords: language register, language variety, communication, transportation

Procedia PDF Downloads 448

15639 Computational Linguistic Implications of Gender Bias: Machines Reflect Misogyny in Society

Authors: Irene Yi

Abstract:

Machine learning, natural language processing, and neural network models of language are becoming more and more prevalent in the fields of technology and linguistics today. Training data for machines are at best, large corpora of human literature and at worst, a reflection of the ugliness in society. Computational linguistics is a growing field dealing with such issues of data collection for technological development. Machines have been trained on millions of human books, only to find that in the course of human history, derogatory and sexist adjectives are used significantly more frequently when describing females in history and literature than when describing males. This is extremely problematic, both as training data, and as the outcome of natural language processing. As machines start to handle more responsibilities, it is crucial to ensure that they do not take with them historical sexist and misogynistic notions. This paper gathers data and algorithms from neural network models of language having to deal with syntax, semantics, sociolinguistics, and text classification. Computational analysis on such linguistic data is used to find patterns of misogyny. Results are significant in showing the existing intentional and unintentional misogynistic notions used to train machines, as well as in developing better technologies that take into account the semantics and syntax of text to be more mindful and reflect gender equality. Further, this paper deals with the idea of non-binary gender pronouns and how machines can process these pronouns correctly, given its semantic and syntactic context. This paper also delves into the implications of gendered grammar and its effect, cross-linguistically, on natural language processing. Languages such as French or Spanish not only have rigid gendered grammar rules, but also historically patriarchal societies. The progression of society comes hand in hand with not only its language, but how machines process those natural languages. These ideas are all extremely vital to the development of natural language models in technology, and they must be taken into account immediately.

Keywords: computational analysis, gendered grammar, misogynistic language, neural networks

Procedia PDF Downloads 96

15638 Probing Syntax Information in Word Representations with Deep Metric Learning

Authors: Bowen Ding, Yihao Kuang

Abstract:

In recent years, with the development of large-scale pre-trained lan-guage models, building vector representations of text through deep neural network models has become a standard practice for natural language processing tasks. From the performance on downstream tasks, we can know that the text representation constructed by these models contains linguistic information, but its encoding mode and extent are unclear. In this work, a structural probe is proposed to detect whether the vector representation produced by a deep neural network is embedded with a syntax tree. The probe is trained with the deep metric learning method, so that the distance between word vectors in the metric space it defines encodes the distance of words on the syntax tree, and the norm of word vectors encodes the depth of words on the syntax tree. The experiment results on ELMo and BERT show that the syntax tree is encoded in their parameters and the word representations they produce.

Keywords: deep metric learning, syntax tree probing, natural language processing, word representations

Procedia PDF Downloads 40

15637 Document-level Sentiment Analysis: An Exploratory Case Study of Low-resource Language Urdu

Authors: Ammarah Irum, Muhammad Ali Tahir

Abstract:

Document-level sentiment analysis in Urdu is a challenging Natural Language Processing (NLP) task due to the difficulty of working with lengthy texts in a language with constrained resources. Deep learning models, which are complex neural network architectures, are well-suited to text-based applications in addition to data formats like audio, image, and video. To investigate the potential of deep learning for Urdu sentiment analysis, we implemented five different deep learning models, including Bidirectional Long Short Term Memory (BiLSTM), Convolutional Neural Network (CNN), Convolutional Neural Network with Bidirectional Long Short Term Memory (CNN-BiLSTM), and Bidirectional Encoder Representation from Transformer (BERT). In this study, we developed a hybrid deep learning model called BiLSTM-Single Layer Multi Filter Convolutional Neural Network (BiLSTM-SLMFCNN) by fusing BiLSTM and CNN architecture. The proposed and baseline techniques are applied on Urdu Customer Support data set and IMDB Urdu movie review data set by using pre-trained Urdu word embedding that are suitable for sentiment analysis at the document level. Results of these techniques are evaluated and our proposed model outperforms all other deep learning techniques for Urdu sentiment analysis. BiLSTM-SLMFCNN outperformed the baseline deep learning models and achieved 83%, 79%, 83% and 94% accuracy on small, medium and large sized IMDB Urdu movie review data set and Urdu Customer Support data set respectively.

Keywords: urdu sentiment analysis, deep learning, natural language processing, opinion mining, low-resource language

Procedia PDF Downloads 44

15636 A Comparative Study of Approaches in User-Centred Health Information Retrieval

Authors: Harsh Thakkar, Ganesh Iyer

Abstract:

In this paper, we survey various user-centered or context-based biomedical health information retrieval systems. We present and discuss the performance of systems submitted in CLEF eHealth 2014 Task 3 for this purpose. We classify and focus on comparing the two most prevalent retrieval models in biomedical information retrieval namely: Language Model (LM) and Vector Space Model (VSM). We also report on the effectiveness of using external medical resources and ontologies like MeSH, Metamap, UMLS, etc. We observed that the LM based retrieval systems outperform VSM based systems on various fronts. From the results we conclude that the state-of-art system scores for MAP was 0.4146, P@10 was 0.7560 and NDCG@10 was 0.7445, respectively. All of these score were reported by systems built on language modeling approaches.

Keywords: clinical document retrieval, concept-based information retrieval, query expansion, language models, vector space models

Procedia PDF Downloads 294

15635 A Mathematical Agent-Based Model to Examine Two Patterns of Language Change

Authors: Gareth Baxter

Abstract:

We use a mathematical model of language change to examine two recently observed patterns of language change: one in which most speakers change gradually, following the mean of the community change, and one in which most individuals use predominantly one variant or another, and change rapidly if they change at all. The model is based on Croft’s Utterance Selection account of language change, which views language change as an evolutionary process, in which different variants (different ‘ways of saying the same thing’) compete for usage in a population of speakers. Language change occurs when a new variant replaces an older one as the convention within a given population. The present model extends a previous simpler model to include effects related to speaker aging and interspeaker variation in behaviour. The two patterns of individual change (one more centralized and the other more polarized) were recently observed in historical language changes, and it was further observed that slower changes were more associated with the centralized pattern, while quicker changes were more polarized. Our model suggests that the two patterns of change can be explained by different balances between the preference of speakers to use one variant over another and the degree of accommodation to (propensity to adapt towards) other speakers. The correlation with the rate of change appears naturally in our model, and results from the fact that both differential weighting of variants and the degree of accommodation affect the time for change to occur, while also determining the patterns of change. This work represents part of an ongoing effort to examine phenomena in language change through the use of mathematical models. This offers another way to evaluate qualitative explanations that cannot be practically tested (or cannot be tested at all) in a real-world, large-scale speech community.

Keywords: agent based modeling, cultural evolution, language change, social behavior modeling, social influence

Procedia PDF Downloads 211

15634 Dialect as a Means of Identification among Hausa Speakers

Authors: Hassan Sabo

Abstract:

Language is a system of conventionally spoken, manual and written symbols by human beings that members of a certain social group and participants in its culture express themselves. Communication, expression of identity and imaginative expression are among the functions of language. Dialect is a form of language, or a regional variety of language that is spoken in a particular geographical setting by a particular group of people. Hausa is one of the major languages in Africa, in terms of large number of people for whom it is the first language. Hausa is one of the western Chadic groups of languages. It constitutes one of the five or six branches of Afro-Asiatic family. The predominant Hausa speakers are in Nigeria and they live in different geographical locations which resulted to variety of dialects within the Hausa language apart of the standard Hausa language, the Hausa language has a variety of dialect that distinguish from one another by such features as phonology, grammar and vocabulary. This study intends to examine such features that serve as means of identification among Hausa speakers who are set off from others, geographically or socially.

Keywords: dialect, features, geographical location, Hausa language

Procedia PDF Downloads 169

15633 Recurrent Patterns of Netspeak among Selected Nigerians on WhatsApp Platform: A Quest for Standardisation

Authors: Lily Chimuanya, Esther Ajiboye, Emmanuel Uba

Abstract:

One of the consequences of online communication is the birth of new orthography genres characterised by novel conventions of abbreviation and acronyms usually referred to as Netspeak. Netspeak, also known as internet slang, is a style of writing mainly used in online communication to limit the length of text characters and to save time. The aim of this study is to evaluate how second language users of the English language have internalised this new convention of writing; identify the recurrent patterns of Netspeak; and assess the consistency of the use of the identified patterns in relation to their meanings. The study is corpus-based, and data drawn from WhatsApp chart pages of selected groups of Nigerian English speakers show a large occurrence of inconsistencies in the patterns of Netspeak and their meanings. The study argues that rather than emphasise the negative impact of Netspeak on the communicative competence of second language users, studies should focus on suggesting models as yardsticks for standardising the usage of Netspeak and indeed all other emerging language conventions resulting from online communication. This stance stems from the inevitable global language transformation that is eminent with the coming of age of information technology.

Keywords: abbreviation, acronyms, Netspeak, online communication, standardisation

Procedia PDF Downloads 358

15632 Evaluation of Modern Natural Language Processing Techniques via Measuring a Company's Public Perception

Authors: Burak Oksuzoglu, Savas Yildirim, Ferhat Kutlu

Abstract:

Opinion mining (OM) is one of the natural language processing (NLP) problems to determine the polarity of opinions, mostly represented on a positive-neutral-negative axis. The data for OM is usually collected from various social media platforms. In an era where social media has considerable control over companies’ futures, it’s worth understanding social media and taking actions accordingly. OM comes to the fore here as the scale of the discussion about companies increases, and it becomes unfeasible to gauge opinion on individual levels. Thus, the companies opt to automize this process by applying machine learning (ML) approaches to their data. For the last two decades, OM or sentiment analysis (SA) has been mainly performed by applying ML classification algorithms such as support vector machines (SVM) and Naïve Bayes to a bag of n-gram representations of textual data. With the advent of deep learning and its apparent success in NLP, traditional methods have become obsolete. Transfer learning paradigm that has been commonly used in computer vision (CV) problems started to shape NLP approaches and language models (LM) lately. This gave a sudden rise to the usage of the pretrained language model (PTM), which contains language representations that are obtained by training it on the large datasets using self-supervised learning objectives. The PTMs are further fine-tuned by a specialized downstream task dataset to produce efficient models for various NLP tasks such as OM, NER (Named-Entity Recognition), Question Answering (QA), and so forth. In this study, the traditional and modern NLP approaches have been evaluated for OM by using a sizable corpus belonging to a large private company containing about 76,000 comments in Turkish: SVM with a bag of n-grams, and two chosen pre-trained models, multilingual universal sentence encoder (MUSE) and bidirectional encoder representations from transformers (BERT). The MUSE model is a multilingual model that supports 16 languages, including Turkish, and it is based on convolutional neural networks. The BERT is a monolingual model in our case and transformers-based neural networks. It uses a masked language model and next sentence prediction tasks that allow the bidirectional training of the transformers. During the training phase of the architecture, pre-processing operations such as morphological parsing, stemming, and spelling correction was not used since the experiments showed that their contribution to the model performance was found insignificant even though Turkish is a highly agglutinative and inflective language. The results show that usage of deep learning methods with pre-trained models and fine-tuning achieve about 11% improvement over SVM for OM. The BERT model achieved around 94% prediction accuracy while the MUSE model achieved around 88% and SVM did around 83%. The MUSE multilingual model shows better results than SVM, but it still performs worse than the monolingual BERT model.

Keywords: BERT, MUSE, opinion mining, pretrained language model, SVM, Turkish

Procedia PDF Downloads 115

15631 The Pen Is Mightier than the Sword: Kurdish Language Policy in Turkey

Authors: Irene Yi

Abstract:

This paper analyzes the development of Kurdish language endangerment in Turkey and Kurdish language education over time. It examines the historical context of the Turkish state, as well as reasons for the Turkish language hegemony. From a linguistic standpoint, the Kurdish language is in danger of extinction despite a large number of speakers, lest Kurdish language education is more widely promoted. The paper argues that Kurdish is no longer in a stable diglossic state; if the current trends continue, the language will lose its vitality. This paper recognizes the importance of education in preserving the language while discussing the changing political and institutional regard for Kurdish education. Lastly, the paper outlines solutions to the issue by looking at a variety of proposals, from creating a Kurdistan to merely changing the linguistic landscape in Turkey. After analysis of possible solutions in terms of realistic ability and effectiveness, the paper concludes that changing linguistic landscape and increasing Kurdish language education are the most ideal first steps in a long fight for Kurdish linguistic equality.

Keywords: endangered, Kurdish, oppression, policy

Procedia PDF Downloads 127

15630 Exploring Teachers’ Beliefs about Diagnostic Language Assessment Practices in a Large-Scale Assessment Program

Authors: Oluwaseun Ijiwade, Chris Davison, Kelvin Gregory

Abstract:

In Australia, like other parts of the world, the debate on how to enhance teachers using assessment data to inform teaching and learning of English as an Additional Language (EAL, Australia) or English as a Foreign Language (EFL, United States) have occupied the centre of academic scholarship. Traditionally, this approach was conceptualised as ‘Formative Assessment’ and, in recent times, ‘Assessment for Learning (AfL)’. The central problem is that teacher-made tests are limited in providing data that can inform teaching and learning due to variability of classroom assessments, which are hindered by teachers’ characteristics and assessment literacy. To address this concern, scholars in language education and testing have proposed a uniformed large-scale computer-based assessment program to meet the needs of teachers and promote AfL in language education. In Australia, for instance, the Victoria state government commissioned a large-scale project called 'Tools to Enhance Assessment Literacy (TEAL) for Teachers of English as an additional language'. As part of the TEAL project, a tool called ‘Reading and Vocabulary assessment for English as an Additional Language (RVEAL)’, as a diagnostic language assessment (DLA), was developed by language experts at the University of New South Wales for teachers in Victorian schools to guide EAL pedagogy in the classroom. Therefore, this study aims to provide qualitative evidence for understanding beliefs about the diagnostic language assessment (DLA) among EAL teachers in primary and secondary schools in Victoria, Australia. To realize this goal, this study raises the following questions: (a) How do teachers use large-scale assessment data for diagnostic purposes? (b) What skills do language teachers think are necessary for using assessment data for instruction in the classroom? and (c) What factors, if any, contribute to teachers’ beliefs about diagnostic assessment in a large-scale assessment? Semi-structured interview method was used to collect data from at least 15 professional teachers who were selected through a purposeful sampling. The findings from the resulting data analysis (thematic analysis) provide an understanding of teachers’ beliefs about DLA in a classroom context and identify how these beliefs are crystallised in language teachers. The discussion shows how the findings can be used to inform professional development processes for language teachers as well as informing important factor of teacher cognition in the pedagogic processes of language assessment. This, hopefully, will help test developers and testing organisations to align the outcome of this study with their test development processes to design assessment that can enhance AfL in language education.

Keywords: beliefs, diagnostic language assessment, English as an additional language, teacher cognition

Procedia PDF Downloads 175

15629 Towards Efficient Reasoning about Families of Class Diagrams Using Union Models

Authors: Tejush Badal, Sanaa Alwidian

Abstract:

Class diagrams are useful tools within the Unified Modelling Language (UML) to model and visualize the relationships between, and properties of objects within a system. As a system evolves over time and space (e.g., products), a series of models with several commonalities and variabilities create what is known as a model family. In circumstances where there are several versions of a model, examining each model individually, becomes expensive in terms of computation resources. To avoid performing redundant operations, this paper proposes an approach for representing a family of class diagrams into Union Models to represent model families using a single generic model. The paper aims to analyze and reason about a family of class diagrams using union models as opposed to individual analysis of each member model in the family. The union algorithm provides a holistic view of the model family, where the latter cannot be otherwise obtained from an individual analysis approach, this in turn, enhances the analysis performed in terms of speeding up the time needed to analyze a family of models together as opposed to analyzing individual models, one model at a time.

Keywords: analysis, class diagram, model family, unified modeling language, union model

Procedia PDF Downloads 49

15628 Neural Machine Translation for Low-Resource African Languages: Benchmarking State-of-the-Art Transformer for Wolof

Authors: Cheikh Bamba Dione, Alla Lo, Elhadji Mamadou Nguer, Siley O. Ba

Abstract:

In this paper, we propose two neural machine translation (NMT) systems (French-to-Wolof and Wolof-to-French) based on sequence-to-sequence with attention and transformer architectures. We trained our models on a parallel French-Wolof corpus of about 83k sentence pairs. Because of the low-resource setting, we experimented with advanced methods for handling data sparsity, including subword segmentation, back translation, and the copied corpus method. We evaluate the models using the BLEU score and find that transformer outperforms the classic seq2seq model in all settings, in addition to being less sensitive to noise. In general, the best scores are achieved when training the models on word-level-based units. For subword-level models, using back translation proves to be slightly beneficial in low-resource (WO) to high-resource (FR) language translation for the transformer (but not for the seq2seq) models. A slight improvement can also be observed when injecting copied monolingual text in the target language. Moreover, combining the copied method data with back translation leads to a substantial improvement of the translation quality.

Keywords: backtranslation, low-resource language, neural machine translation, sequence-to-sequence, transformer, Wolof

Procedia PDF Downloads 118

15627 Named Entity Recognition System for Tigrinya Language

Authors: Sham Kidane, Fitsum Gaim, Ibrahim Abdella, Sirak Asmerom, Yoel Ghebrihiwot, Simon Mulugeta, Natnael Ambassager

Abstract:

The lack of annotated datasets is a bottleneck to the progress of NLP in low-resourced languages. The work presented here consists of large-scale annotated datasets and models for the named entity recognition (NER) system for the Tigrinya language. Our manually constructed corpus comprises over 340K words tagged for NER, with over 118K of the tokens also having parts-of-speech (POS) tags, annotated with 12 distinct classes of entities, represented using several types of tagging schemes. We conducted extensive experiments covering convolutional neural networks and transformer models; the highest performance achieved is 88.8% weighted F1-score. These results are especially noteworthy given the unique challenges posed by Tigrinya’s distinct grammatical structure and complex word morphologies. The system can be an essential building block for the advancement of NLP systems in Tigrinya and other related low-resourced languages and serve as a bridge for cross-referencing against higher-resourced languages.

Keywords: Tigrinya NER corpus, TiBERT, TiRoBERTa, BiLSTM-CRF

Procedia PDF Downloads 76

15626 Efficient Layout-Aware Pretraining for Multimodal Form Understanding

Authors: Armineh Nourbakhsh, Sameena Shah, Carolyn Rose

Abstract:

Layout-aware language models have been used to create multimodal representations for documents that are in image form, achieving relatively high accuracy in document understanding tasks. However, the large number of parameters in the resulting models makes building and using them prohibitive without access to high-performing processing units with large memory capacity. We propose an alternative approach that can create efficient representations without the need for a neural visual backbone. This leads to an 80% reduction in the number of parameters compared to the smallest SOTA model, widely expanding applicability. In addition, our layout embeddings are pre-trained on spatial and visual cues alone and only fused with text embeddings in downstream tasks, which can facilitate applicability to low-resource of multi-lingual domains. Despite using 2.5% of training data, we show competitive performance on two form understanding tasks: semantic labeling and link prediction.

Keywords: layout understanding, form understanding, multimodal document understanding, bias-augmented attention

Procedia PDF Downloads 125

15625 Diagonal Vector Autoregressive Models and Their Properties

Authors: Usoro Anthony E., Udoh Emediong

Abstract:

Diagonal Vector Autoregressive Models are special classes of the general vector autoregressive models identified under certain conditions, where parameters are restricted to the diagonal elements in the coefficient matrices. Variance, autocovariance, and autocorrelation properties of the upper and lower diagonal VAR models are derived. The new set of VAR models is verified with empirical data and is found to perform favourably with the general VAR models. The advantage of the diagonal models over the existing models is that the new models are parsimonious, given the reduction in the interactive coefficients of the general VAR models.

Keywords: VAR models, diagonal VAR models, variance, autocovariance, autocorrelations

Procedia PDF Downloads 87

15624 Models of Bilingual Education in Majority Language Contexts: An Exploratory Study of Bilingual Programmes in Qatari Primary Schools

Authors: Fatma Al-Maadheed

Abstract:

Following an ethnographic approach this study explored bilingual programmes offered by two types of primary schools in Qatar: international and Independent schools. Qatar with its unique linguistic and socio-economic situation launched a new initiative for educatiobnal development in 2001 but with hardly any research linked to theses changes. The study reveals that the Qatari bilingual schools context was one of heteroglossia, with three codes in operation: Modern Standard Arabic, Colloquial Arabic dialects and English. The two schools adopted different models of bilingualism. The international school adopted a strict separation policy between the two languages following a monoglossic belief. The independent school was found to apply a flexible language policy. The study also highlighted the daily challnges produced from the diglossia situation in Qatar, the difference between students and teacher dialect as well as acquiring literacy in the formal language. In addition to an abscence of a clear language policy in Schools, the study brought attention to the instructional methods utilised in language teaching which are mostly associated with successful bilingual education.

Keywords: diglossia, instructional methods, language policy, qatari primary schools

Procedia PDF Downloads 453

15623 Improving Academic Literacy in the Secondary History Classroom

Authors: Wilhelmina van den Berg

Abstract:

Through intentionally developing the Register Continuum and the Functional Model of Language in the secondary history classroom, teachers can effectively build a teaching and learning cycle geared towards literacy improvement and EAL differentiation. Developing an understanding of and engaging students in the field, tenor, and tone of written and spoken language, allows students to build the foundation for greater academic achievement due to integrated literacy skills in the history classroom. Building a variety of scaffolds during lessons within these models means students can improve their academic language and communication skills.

Keywords: academic language, EAL, functional model of language, international baccalaureate, literacy skills

Procedia PDF Downloads 42

15622 CFD Simulation of a Large Scale Unconfined Hydrogen Deflagration

Authors: I. C. Tolias, A. G. Venetsanos, N. Markatos

Abstract:

In the present work, CFD simulations of a large scale open deflagration experiment are performed. Stoichiometric hydrogen-air mixture occupies a 20 m hemisphere. Two combustion models are compared and are evaluated against the experiment. The Eddy Dissipation Model and a Multi-physics combustion model which is based on Yakhot’s equation for the turbulent flame speed. The values of models’ critical parameters are investigated. The effect of the turbulence model is also examined. k-ε model and LES approach were tested.

Keywords: CFD, deflagration, hydrogen, combustion model

Procedia PDF Downloads 475

15621 Generating Insights from Data Using a Hybrid Approach

Authors: Allmin Susaiyah, Aki Härmä, Milan Petković

Abstract:

Automatic generation of insights from data using insight mining systems (IMS) is useful in many applications, such as personal health tracking, patient monitoring, and business process management. Existing IMS face challenges in controlling insight extraction, scaling to large databases, and generalising to unseen domains. In this work, we propose a hybrid approach consisting of rule-based and neural components for generating insights from data while overcoming the aforementioned challenges. Firstly, a rule-based data 2CNL component is used to extract statistically significant insights from data and represent them in a controlled natural language (CNL). Secondly, a BERTSum-based CNL2NL component is used to convert these CNLs into natural language texts. We improve the model using task-specific and domain-specific fine-tuning. Our approach has been evaluated using statistical techniques and standard evaluation metrics. We overcame the aforementioned challenges and observed significant improvement with domain-specific fine-tuning.

Keywords: data mining, insight mining, natural language generation, pre-trained language models

Procedia PDF Downloads 81

15620 A Syntactic Approach to Applied and Socio-Linguistics in Arabic Language in Modern Communications

Authors: Adeyemo Abduljeeel Taiwo

Abstract:

This research is an attempt that creates a conducive atmosphere of a phonological and morphological compendium of Arabic language in Modern Standard Arabic (MSA) for modern day communications. The research is carried out with the chief aim of grammatical analysis of the two broad fields of Arabic linguistics namely: Applied and Socio-Linguistics. It draws a pictorial record of Applied and Socio-Linguistics in Arabic phonology and morphology. Thematically, it postulates and contemplates to a large degree, the theory of concord in contemporary modern Arabic language acquisition. It utilizes an analytical method while it portrays Arabic as a Semitic language that promotes linguistics and syntax among the scholars of the fields.

Keywords: Arabic language, applied linguistics, socio-linguistics, modern communications

Procedia PDF Downloads 301

15619 Hand Motion Trajectory Analysis for Dynamic Hand Gestures Used in Indian Sign Language

Authors: Daleesha M. Viswanathan, Sumam Mary Idicula

Abstract:

Dynamic hand gestures are an intrinsic component in sign language communication. Extracting spatial temporal features of the hand gesture trajectory plays an important role in a dynamic gesture recognition system. Finding a discrete feature descriptor for the motion trajectory based on the orientation feature is the main concern of this paper. Kalman filter algorithm and Hidden Markov Models (HMM) models are incorporated with this recognition system for hand trajectory tracking and for spatial temporal classification, respectively.

Keywords: orientation features, discrete feature vector, HMM., Indian sign language

Procedia PDF Downloads 344

15618 Enhancing Technical Trading Strategy on the Bitcoin Market using News Headlines and Language Models

Authors: Mohammad Hosein Panahi, Naser Yazdani

Abstract:

we present a technical trading strategy that leverages the FinBERT language model and financial news analysis with a focus on news related to a subset of Nasdaq 100 stocks. Our approach surpasses the baseline Range Break-out strategy in the Bitcoin market, yielding a remarkable 24.8% increase in the win ratio for all Friday trades and an impressive 48.9% surge in short trades specifically on Fridays. Moreover, we conduct rigorous hypothesis testing to establish the statistical significance of these improvements. Our findings underscore considerable potential of our NLP-driven approach in enhancing trading strategies and achieving greater profitability within financial markets.

Keywords: quantitative finance, technical analysis, bitcoin market, NLP, language models, FinBERT, technical trading

Procedia PDF Downloads 37

15617 Enhancing English Language Learning through Learners Cultural Background

Authors: A. Attahiru, Rabi Abdullahi Danjuma, Fatima Bint

Abstract:

Language and culture are two concepts which are closely related that one affects the other. This paper attempts to examine the definition of language and culture by discussing the relationship between them. The paper further presents some instructional strategies for the teaching of language and culture as well as the influence of culture on language. It also looks at its implication to language education and finally some recommendation and conclusion were drawn.

Keywords: culture, language, relationship, strategies, teaching

Procedia PDF Downloads 383

15616 ‘Non-Legitimate’ Voices as L2 Models: Towards Becoming a Legitimate L2 Speaker

Authors: M. Rilliard

Abstract:

Based on a Multiliteracies-inspired and sociolinguistically-informed advanced French composition class, this study employed autobiographical narratives from speakers traditionally considered non-legitimate models for L2 teaching purposes of inspiring students to develop an authentic L2 voice and to see themselves as legitimate L2 speakers. Students explored their L2 identities in French through a self-inspired fictional character. Two autobiographical narratives of identity quest by non-traditional French speakers provided them guidance through this process: the novel Le Bleu des Abeilles (2013) and the film Qu’Allah Bénisse la France (2014). Written and French oral productions for different genres, as well as metalinguistic reflections in English, were collected and analyzed. Results indicate that ideas and materials that were relatable to students, namely relatable experiences and relatable language, were most useful to them in developing their L2 voices and achieving authentic and legitimate L2 speakership. These results point towards the benefits of using non-traditional speakers as pedagogical models, as they serve to legitimize students’ sense of their own L2-speakership, which ultimately leads them towards a better, more informed, mastery of the language.

Keywords: foreign language classroom, L2 identity, L2 learning and teaching, L2 writing, sociolinguistics

Procedia PDF Downloads 110

15615 Aspects of Diglossia in Arabic Language Learning

Authors: Adil Ishag

Abstract:

Diglossia emerges in a situation where two distinctive varieties of a language are used alongside within a certain community. In this case, one is considered as a high or standard variety and the second one as a low or colloquial variety. Arabic is an extreme example of a highly diglossic language. This diglossity is due to the fact that Arabic is one of the most spoken languages and spread over 22 Countries in two continents as a mother tongue, and it is also widely spoken in many other Islamic countries as a second language or simply the language of Quran. The geographical variation between the countries where the language is spoken and the duality of the classical Arabic and daily spoken dialects in the Arab world on the other hand; makes the Arabic language one of the most diglossic languages. This paper tries to investigate this phenomena and its relation to learning Arabic as a first and second language.

Keywords: Arabic language, diglossia, first and second language, language learning

Procedia PDF Downloads 535

15614 Cross-Dialect Sentence Transformation: A Comparative Analysis of Language Models for Adapting Sentences to British English

Authors: Shashwat Mookherjee, Shruti Dutta

Abstract:

This study explores linguistic distinctions among American, Indian, and Irish English dialects and assesses various Language Models (LLMs) in their ability to generate British English translations from these dialects. Using cosine similarity analysis, the study measures the linguistic proximity between original British English translations and those produced by LLMs for each dialect. The findings reveal that Indian and Irish English translations maintain notably high similarity scores, suggesting strong linguistic alignment with British English. In contrast, American English exhibits slightly lower similarity, reflecting its distinct linguistic traits. Additionally, the choice of LLM significantly impacts translation quality, with Llama-2-70b consistently demonstrating superior performance. The study underscores the importance of selecting the right model for dialect translation, emphasizing the role of linguistic expertise and contextual understanding in achieving accurate translations.

Keywords: cross-dialect translation, language models, linguistic similarity, multilingual NLP

Procedia PDF Downloads 30

15613 Evolution of Classroom Languaging over the Years: Prospects for Teaching Mathematics Differently

Authors: Jabulani Sibanda, Clemence Chikiwa

Abstract:

This paper traces diverse language practices representative of equally diverse conceptions of language. To be dynamic with languaging practices, one needs to appreciate nuanced languaging practices, their challenges, prospects, and opportunities. The paper presents what we envision as three major conceptions of language that give impetus to diverse language practices. It examines theoretical models of the bilingual mental lexicon and how they inform language practices. The paper explores classroom languaging practices that have been promulgated and experimented with. The paper advocates the deployment of multisensory semiotic systems to complement linguistic classroom communication and the acknowledgement of learners’ linguistic and semiotic resources as valid in the learning enterprise. It recommends the enactment of specific clauses on language in education policies and curriculum documents that empower classroom interactants to exercise discretion in languaging practices.

Keywords: languaging, monolingual, multilingual, semiotic and linguistic repertoire

Procedia PDF Downloads 43