Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 11627

Search results for: natural language generation

11597 The Lexical Eidos as an Invariant of a Polysemantic Word

Abstract:

Phenomenological analysis is based not on natural language but on an ideal language that can carry ideal meanings – eidos representing typical structures or essences. To this end, it is necessary to set aside the spatio-temporal definiteness of a subject and then state its noetic essence (eidos) by means of free fantasy generation. In this way a totally new objectness, the universal, is created, confirming the thesis that thinking proceeds in generalizations, passing by numerous means through the specific to the general and from the general through the specific to the singular.

Keywords: lexical eidos, phenomenology, noema, polysemantic word, semantic core

Procedia PDF Downloads 247

11596 Literacy in First and Second Language: Implication for Language Education

Authors: Inuwa Danladi Bawa

Abstract:

One of the challenges African states have faced in developing education, past and present, is the problem of literacy. Literacy in the first language is seen as a strong base for the development of the second language, which in most cases is the language of education. Language development is an offshoot of language planning, so the need to develop literacy in both the first and the second language affects language education and predicts the extent of achievement of the entire education sector. Balancing literacy acquisition in the first language so that it properly conditions the acquisition of the second language is paramount. Likely constraints include non-standardization and underdeveloped or undeveloped first languages, among many others. Solutions to some of these include the development of materials and the use of the stages and levels of literacy acquisition. This rests on the belief that a child writes well in a second language if he or she is literate in the first language.

Keywords: first language, second language, literacy, English language, linguistics

Procedia PDF Downloads 405

11595 Towards Creative Movie Title Generation Using Deep Neural Models

Authors: Simon Espigolé, Igor Shalyminov, Helen Hastie

Abstract:

Deep machine learning techniques including deep neural networks (DNN) have been used to model language and dialogue for conversational agents to perform tasks, such as giving technical support, and also for general chit-chat. They have been shown to be capable of generating long, diverse and coherent sentences in end-to-end dialogue systems and natural language generation. However, these systems tend to imitate the training data and will only generate the concepts and language within the scope of what they have been trained on. This work explores how deep neural networks can be used in a task that would normally require human creativity, whereby the human would read the movie description and/or watch the movie and come up with a compelling, interesting movie title. This task differs from simple summarization in that the movie title may not necessarily be derivable from the content or semantics of the movie description. Here, we train a type of DNN called a sequence-to-sequence model (seq2seq) that takes as input a short textual movie description and some additional information, e.g. the genre of the movie. It then learns to output a movie title. The idea is that the DNN will learn certain techniques and approaches that the human movie titler may deploy that may not be immediately obvious to the human eye. To give an example of a generated movie title, for the movie synopsis: ‘A hitman concludes his legacy with one more job, only to discover he may be the one getting hit.’; the original, true title is ‘The Driver’ and the one generated by the model is ‘The Masquerade’. A human evaluation was conducted where the DNN output was compared to the true human-generated title, as well as a number of baselines, on three 5-point Likert scales: ‘creativity’, ‘naturalness’ and ‘suitability’. Subjects were also asked which of the two systems they preferred. The scores of the DNN model were comparable to the scores of the human-generated movie title, with means m=3.11, m=3.12, respectively.
There is room for improvement in these models as they were rated significantly less ‘natural’ and ‘suitable’ when compared to the human title. In addition, the human-generated title was preferred overall 58% of the time when pitted against the DNN model. These results, however, are encouraging given the comparison with a highly-considered, well-crafted human-generated movie title. Movie titles go through a rigorous process of assessment by experts and focus groups, who have watched the movie. This process is in place due to the large amount of money at stake and the importance of creating an effective title that captures the audiences’ attention. Our work shows progress towards automating this process, which in turn may lead to a better understanding of creativity itself.

Keywords: creativity, deep machine learning, natural language generation, movies

Procedia PDF Downloads 301

11594 Revitalization of Sign Language through Deaf Theatre: A Linguistic Analysis of an Art Form Which Combines Physical Theatre, Poetry, and Sign Language

Authors: Gal Belsitzman, Rose Stamp, Atay Citron, Wendy Sandler

Abstract:

Sign languages are considered endangered. The vitality of sign languages is compromised by their unique sociolinguistic situation, in which hearing parents of deaf children usually choose a cochlear implant for their child. As a result, these children do not acquire their natural language – sign language. Despite this, many sign languages, such as Israeli Sign Language (ISL), are thriving. The continued survival of similar languages under threat has been associated with the remarkable resilience of the language community. In particular, deaf literary traditions are central in reminding the community of the importance of the language. One example of a deaf literary tradition which has gained popularity in recent years is deaf theatre. The Ebisu Sign Language Theatre Laboratory, developed as part of the multidisciplinary Grammar of the Body Research Project, is the first deaf theatre company in Israel. Ebisu Theatre combines physical theatre and sign language research to provide a natural laboratory for analyzing the creative use of the body. In this presentation, we focus on the recent theatre production called ‘Their language’, which tells of the struggle faced by the deaf community to use their own natural language in the education system. A thorough analysis unravels how linguistic properties are integrated with the use of poetic devices and physical theatre techniques in this performance, enabling wider access by both deaf and hearing audiences, without interpretation. Interviews with the audience illustrate the significance of this art form, which serves a dual purpose: it is empowering for the deaf community and educational for both hearing and deaf audiences, raising awareness of community-related issues.

Keywords: deaf theatre, empowerment, language revitalization, sign language

Procedia PDF Downloads 142

11593 An Experimental Study on Evacuated Tube Solar Collector for Steam Generation in India

Authors: Avadhesh Yadav, Anunaya Saraswat

Abstract:

An evacuated tube solar collector is experimentally studied for steam generation. When solar radiation falls on the evacuated tubes, the energy is absorbed by the tubes and transferred to the water by natural conduction and convection. Natural circulation of water occurs due to the inclination of the tubes and header. In this experimental study, the efficiency of the collector has been calculated. The results show that the collector attains a maximum efficiency of 46.26% between 14:00 and 15:00 h. Steam was generated for two hours, from 13:30 to 15:30 h, on a winter day. The maximum solar intensity and maximum ambient temperature on this day are 795 W/m² and 19 °C, respectively.

Keywords: evacuated tube, solar collector, hot water, steam generation

Procedia PDF Downloads 272

11592 Evaluation of Heat Transfer and Entropy Generation by Al2O3-Water Nanofluid

Authors: Houda Jalali, Hassan Abbassi

Abstract:

In this numerical work, natural convection and entropy generation of an Al₂O₃–water nanofluid in a square cavity are studied. Two-dimensional steady laminar natural convection in a differentially heated square cavity of length L, filled with a nanofluid, is investigated numerically. The horizontal walls are considered adiabatic. The vertical walls at x=0 and x=L are maintained at the hot temperature, T_h, and the cold temperature, T_c, respectively. The solution is computed with the CFD code "FLUENT" in combination with GAMBIT as the mesh generator. The simulations cover Rayleigh numbers in the range 10³ ≤ Ra ≤ 10⁶, solid volume fractions from 1% to 5%, a fixed particle size of dp=33 nm, and temperatures from 20 to 70 °C. To study the effect of adding solid particles to water on natural convection heat transfer and entropy generation of the nanofluid, we used models of the nanofluid's thermophysical properties based on experimental measurements, such as models of thermal conductivity and dynamic viscosity that depend on solid volume fraction, particle size and temperature. The average Nusselt number is calculated at the hot wall of the cavity for different solid volume fractions. The most important result is that at low temperatures (less than 40 °C), the addition of Al₂O₃ nanosolids to water leads to a decrease in heat transfer and entropy generation instead of the expected increase, whereas at high temperatures, heat transfer and entropy generation increase with the addition of nanosolids. This behavior is due to the opposing effects of the viscosity and thermal conductivity of the nanofluid. These effects are discussed in this work.

Keywords: entropy generation, heat transfer, nanofluid, natural convection

Procedia PDF Downloads 247

11591 Automatic MC/DC Test Data Generation from Software Module Description

Authors: Sekou Kangoye, Alexis Todoskoff, Mihaela Barreau

Abstract:

Modified Condition/Decision Coverage (MC/DC) is a structural coverage criterion that is highly recommended or required for safety-critical software. Many testing standards therefore include this criterion and require it to be satisfied at a particular level of testing (e.g. validation and unit levels). However, meeting those requirements takes a significant amount of time. In this paper, we propose to automate MC/DC test data generation. We present an approach that automatically generates MC/DC test data from a software module description written in a dedicated language. We introduce a new merging approach that provides high MC/DC coverage for the description with only a small number of test cases.
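To make the criterion concrete, the following is a minimal sketch of unique-cause MC/DC for a single Boolean decision. It is a brute-force illustration, not the merging approach of the paper, whose details the abstract does not give. The function names and the example decision `a and (b or c)` are assumptions introduced here.

```python
from itertools import product

def independence_pairs(decision, n):
    """For each condition index, list pairs of input vectors that differ only
    in that condition and flip the decision outcome (unique-cause MC/DC)."""
    pairs = {i: [] for i in range(n)}
    for vec in product([False, True], repeat=n):
        for i in range(n):
            if vec[i]:
                continue  # visit each pair once, starting from its False side
            flipped = vec[:i] + (True,) + vec[i + 1:]
            if decision(*vec) != decision(*flipped):
                pairs[i].append((vec, flipped))
    return pairs

def greedy_mcdc_tests(decision, n):
    """Pick one independence pair per condition, preferring pairs that reuse
    vectors already chosen, to keep the test set small."""
    chosen = set()
    for i, candidates in independence_pairs(decision, n).items():
        if not candidates:
            raise ValueError(f"condition {i} cannot independently affect the outcome")
        best = max(candidates, key=lambda p: (p[0] in chosen) + (p[1] in chosen))
        chosen.update(best)
    return sorted(chosen)

# Hypothetical decision used only for illustration.
def example_decision(a, b, c):
    return a and (b or c)

tests = greedy_mcdc_tests(example_decision, 3)
for vec in tests:
    print(vec, "->", example_decision(*vec))
```

For this three-condition decision the greedy pass settles on four test vectors, against eight for exhaustive testing. Exhaustive enumeration is only practical for small decisions; a real generator would need a smarter search.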

Keywords: domain-specific language, MC/DC, test data generation, safety-critical software coverage

Procedia PDF Downloads 409

11590 Coupling Large Language Models with Disaster Knowledge Graphs for Intelligent Construction

Authors: Zhengrong Wu, Haibo Yang

Abstract:

In the context of escalating global climate change and environmental degradation, the complexity and frequency of natural disasters are continually increasing. Confronted with an abundance of information regarding natural disasters, traditional knowledge graph construction methods, which heavily rely on grammatical rules and prior knowledge, demonstrate suboptimal performance in processing complex, multi-source disaster information. This study, drawing upon past natural disaster reports, disaster-related literature in both English and Chinese, and data from various disaster monitoring stations, constructs question-answer templates based on large language models. Utilizing the P-Tune method, the ChatGLM2-6B model is fine-tuned, leading to the development of a disaster knowledge graph based on large language models. This serves as a knowledge database support for disaster emergency response.

Keywords: large language model, knowledge graph, disaster, deep learning

Procedia PDF Downloads 23

11589 Detecting Paraphrases in Arabic Text

Authors: Amal Alshahrani, Allan Ramsay

Abstract:

Paraphrasing, i.e. expressing the same concept in alternative ways by using different words or phrases, is one of the important tasks in natural language processing. Paraphrases can be used in many natural language applications, such as Information Retrieval, Machine Translation, Question Answering, Text Summarization, or Information Extraction. To obtain pairs of sentences that are paraphrases, we create a system that automatically extracts paraphrases from a corpus built from different sources of news articles, since these are likely to contain paraphrases when they report the same event on the same day. There are existing simple standard approaches (e.g. TF-IDF vector space, cosine similarity) and alignment techniques (e.g. Dynamic Time Warping (DTW)) for extracting paraphrases, which have been applied to English. However, the performance of these approaches could be affected when they are applied to another language, for instance Arabic, due to the presence of phenomena which are not present in English, such as free word order, zero copula, and pro-drop. Thus, if we can analyze how the existing algorithms for English fail for Arabic, then we can find a solution for Arabic. The results are promising.
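As an illustration of the TF-IDF and cosine similarity baseline mentioned above, here is a small self-contained sketch. The smoothed IDF formula, the whitespace tokenization, and the toy English sentences are assumptions made for the example. The abstract does not specify the authors' weighting scheme, and Arabic text would need its own tokenization.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one sparse {term: weight} vector per tokenized document.
    Uses a smoothed IDF, log((1 + N) / (1 + df)) + 1 (an assumed variant)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Toy sentences: the first two report the same event, the third does not.
sentences = [
    "the minister visited the flooded city on monday".split(),
    "on monday the minister toured the flooded city".split(),
    "stock markets closed higher after the announcement".split(),
]
vecs = tfidf_vectors(sentences)
sim_paraphrase = cosine(vecs[0], vecs[1])
sim_unrelated = cosine(vecs[0], vecs[2])
print(sim_paraphrase > sim_unrelated)
```

Because the vectors are bags of words, word order plays no role in the score. A candidate pair would then typically be accepted when its similarity exceeds a threshold tuned on held-out data.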

Keywords: natural language processing, TF-IDF, cosine similarity, dynamic time warping (DTW)

Procedia PDF Downloads 355

11588 From User's Requirements to UML Class Diagram

Authors: Zeineb Ben Azzouz, Wahiba Ben Abdessalem Karaa

Abstract:

The automated extraction of UML class diagrams from natural language requirements is a highly challenging task. Many approaches, frameworks and tools have been presented in this field. Nonetheless, experiments with these tools have shown that no approach works best all the time. In this context, we propose a new, accurate approach to facilitate the automatic mapping from textual requirements to a UML class diagram. Our approach integrates the best properties of statistical Natural Language Processing (NLP) techniques to reduce ambiguity when analysing natural language requirements text. In addition, our approach follows the best practices defined by conceptual modelling experts to determine some patterns indispensable for the extraction of the basic elements and concepts of the class diagram. Once the relevant information of the class diagram is captured, an XMI document is generated and imported with a CASE tool to build the corresponding UML class diagram.

Keywords: class diagram, user’s requirements, XMI, software engineering

Procedia PDF Downloads 443

11587 Analyzing the Perception of Identity in Bilingual Communities: Case Study of Eritrean Immigrants in Switzerland

Authors: Warsa Melles

Abstract:

This study examines the way second-generation Eritrean immigrants living in the French-speaking part of Switzerland behave linguistically and culturally. The aim of this research is to demonstrate how the participants deal with their bilingualism (Tigrinya and French). More precisely, how does their language use correlate with their socio-cultural attitudes, and how do these aspects (re)construct their identity? Data for this research were collected via questionnaires and semi-structured interviews. Participants were asked to answer questions regarding their linguistic habits, their perception of being bilingual and their cultural identity. The major findings demonstrate that generation 2 relates more to the host country’s language, since French is used as the main language in their daily interactions. On the other hand, because they have never lived in Eritrea yet were raised by Eritrean-born parents in a foreign country, it is more difficult for them to identify unequivocally with just one culture. In that sense, intergenerational transmission plays a major role in the perception of identity. All the participants have at least a basic knowledge of Tigrinya, but the use of languages varies according to the purpose. Proficiency in the native language and sense of belonging can be correlated with the frequency of visits to Eritrea. In conclusion, the question of identity in the second-generation Eritrean community cannot be given a categorical and clear-cut answer; instead, the new self-image that this social group aims to build is shaped by different factors that are essential to take into consideration.

Keywords: biculturalism, identity, language, migration

Procedia PDF Downloads 228

11586 Probing Language Models for Multiple Linguistic Information

Authors: Bowen Ding, Yihao Kuang

Abstract:

In recent years, large-scale pre-trained language models have achieved state-of-the-art performance on a variety of natural language processing tasks. The word vectors produced by these language models can be viewed as dense encoded representations of natural language in text form. However, it is unknown how much linguistic information is encoded and how. In this paper, we construct several probing tasks for multiple kinds of linguistic information to clarify the encoding capabilities of different language models, and we visualize the results. We first obtain word representations in vector form from different language models, including BERT, ELMo, RoBERTa and GPT. Classifiers with a small number of parameters and unsupervised tasks are then applied to these word vectors to assess their capability to encode the corresponding linguistic information. The constructed probing tasks cover both semantic and syntactic aspects. The semantic aspect includes the ability of the model to understand semantic entities such as numbers, time, and characters, and the syntactic aspect includes the ability of the language model to understand grammatical structures such as dependency relationships and reference relationships. We also compare the encoding capabilities of different layers in the same language model to infer how linguistic information is encoded in the model.
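The idea of a small-parameter probing classifier can be sketched as follows. This is only an illustration on synthetic vectors, in which one dimension is made to carry a binary property by construction. It does not use BERT, ELMo, RoBERTa or GPT outputs, and the data generator, the learning rate and the number of epochs are all assumptions.

```python
import math
import random

def make_synthetic_vectors(n, dim=8, seed=0):
    """Toy stand-in for word vectors: dimension 0 carries a binary property,
    the remaining dimensions are noise."""
    rng = random.Random(seed)
    vectors, labels = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        vec[0] = (1.0 if label else -1.0) + rng.gauss(0.0, 0.5)
        vectors.append(vec)
        labels.append(label)
    return vectors, labels

def score(weights, bias, vec):
    return sum(w * x for w, x in zip(weights, vec)) + bias

def train_probe(vectors, labels, epochs=30, lr=0.1):
    """Logistic-regression probe trained with plain stochastic gradient descent."""
    weights, bias = [0.0] * len(vectors[0]), 0.0
    for _ in range(epochs):
        for vec, label in zip(vectors, labels):
            z = max(-30.0, min(30.0, score(weights, bias, vec)))  # avoid overflow
            error = 1.0 / (1.0 + math.exp(-z)) - label
            weights = [w - lr * error * x for w, x in zip(weights, vec)]
            bias -= lr * error
    return weights, bias

def accuracy(weights, bias, vectors, labels):
    hits = sum((score(weights, bias, v) > 0) == bool(y) for v, y in zip(vectors, labels))
    return hits / len(labels)

vectors, labels = make_synthetic_vectors(300)
train_x, train_y = vectors[:200], labels[:200]
test_x, test_y = vectors[200:], labels[200:]
weights, bias = train_probe(train_x, train_y)
probe_accuracy = accuracy(weights, bias, test_x, test_y)
print(f"held-out probe accuracy: {probe_accuracy:.2f}")
```

High held-out accuracy from such a low-capacity classifier is usually read as evidence that the property is linearly recoverable from the vectors. Repeating the probe on vectors from each layer gives the kind of layer-wise comparison the abstract describes.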

Keywords: language models, probing task, text representation, linguistic information

Procedia PDF Downloads 70

11585 User Guidance for Effective Query Interpretation in Natural Language Interfaces to Ontologies

Authors: Aliyu Isah Agaie, Masrah Azrifah Azmi Murad, Nurfadhlina Mohd Sharef, Aida Mustapha

Abstract:

Natural Language Interfaces typically support a restricted language and also have scopes and limitations that naïve users are unaware of, resulting in errors when the users attempt to retrieve information from ontologies. To overcome this challenge, an auto-suggest feature is introduced into the querying process, where users are guided by an interactive query construction system. Guiding users to formulate their queries, while providing them with an unconstrained (or almost unconstrained) way to query the ontology, results in better interpretation of the query and ultimately leads to an effective search. The approach described in this paper is unobtrusive and subtly guides the users, so that they have a choice of either selecting from the suggestion list or typing in full. The user is not coerced into accepting system suggestions and can express himself or herself using fragments or full sentences.
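A minimal sketch of such an auto-suggest step is shown below, assuming the suggestions are drawn from a list of ontology labels by prefix match. The class name, the labels and the limit of five suggestions are hypothetical; the abstract does not describe the actual system's data structures or ranking.

```python
from bisect import bisect_left

class AutoSuggest:
    """Prefix-based suggestions over a sorted list of ontology labels."""

    def __init__(self, labels):
        self._labels = sorted({label.lower() for label in labels})

    def suggest(self, typed, limit=5):
        """Return up to `limit` labels starting with what the user has typed.
        An empty list means the user simply keeps typing in full."""
        prefix = typed.lower()
        start = bisect_left(self._labels, prefix)
        matches = []
        for label in self._labels[start:]:
            if len(matches) == limit or not label.startswith(prefix):
                break
            matches.append(label)
        return matches

# Hypothetical ontology labels used only for illustration.
suggester = AutoSuggest(["River", "River basin", "Rivalry", "City", "Capital city"])
print(suggester.suggest("riv"))
print(suggester.suggest("xyz"))
```

Returning an empty list instead of raising an error mirrors the non-coercive design in the abstract: the user can ignore the list and type the query in full.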

Keywords: auto-suggest, expressiveness, habitability, natural language interface, query interpretation, user guidance

Procedia PDF Downloads 452

11584 Transportation Language Register as One of Language Community

Authors: Diyah Atiek Mustikawati

Abstract:

A language register is a variety of a language used for a particular purpose or in a particular social setting. The term also denotes adapting one’s use of language to conform to the standards or traditions of a given professional or social situation. This descriptive study discusses the forms of language register used in transportation, the factors behind them, and the functions they serve. Mostly, the language register in transportation uses short sentences in an informal register. The factors that shape the register used are the speaker, word choice, and language background. The functions of the language register in transportation are to ease communication among crew members and to maintain safety in adverse conditions. The transportation language register developed naturally as one variety of language use.

Keywords: language register, language variety, communication, transportation

Procedia PDF Downloads 443

11583 Anti-Fables and Their Linguo Cultural Characteristics

Authors: Tamila Dilaverova

Abstract:

In our era of globalization, unhindered intercultural communication is an essential element of development. To be proficient in a language, one needs to become acquainted with the cultural and national peculiarities of its native speakers. Cultural peculiarities are explicitly reflected in a nation’s cultural heritage, monuments, literary works, tales, and even clothes. Folk texts play a specific role in evaluating cultural performances and in establishing, transmitting, and preserving behavioral norms, and fables occupy one of the most important places among them. The fable as a genre has existed since ancient times. Fables are universal because they suit any century and any society. Even in the era of the internet, fables have remained relevant. The internet offers a wide range of remade fables. Generally, they are new interpretations of Aesop’s fables, but in some cases they are original. These fables became the subject of our research because they contain modern slang and jargon and their language is not very literary. Besides the changes in language, there are changes in the occupations, everyday activities, and ways of making money that they depict. Because of these numerous changes, the new fables can be called 'anti-fables.' Anti-fables are a very new kind of fable that suits the internet generation and reflects modern reality. All these changes reflect the thoughts and actions of the new generation, and anti-fables may become a new internet literary genre.

Keywords: intercultural, fable, language, internet

Procedia PDF Downloads 186

11582 Methodological Proposal, Archival Thesaurus in Colombian Sign Language

Authors: Pedro A. Medina-Rios, Marly Yolie Quintana-Daza

Abstract:

Being able to communicate in social, academic and work contexts matters to any individual, and even more to a deaf person, for whom oral language is not the natural language and written language is a second language. Currently, to the best of our knowledge, Colombia has no specialized sign language dictionary for the archival field. Archival work is one of the areas in which the deaf community has a greater chance of working. Adding new signs to dictionaries for deaf people increases the likelihood that they have the appropriate signs to communicate and improve their performance. The aim of this work was to illustrate the importance of designing pedagogical and technological knowledge management strategies for the academic inclusion of deaf people through lexicon proposals in Colombian Sign Language (LSC) in the archival field. As a method, an analytical study was used to identify relevant words in the technical archival field and their counterparts in LSC; 30 deaf apprentices and students in the Documentary or Archival Management programs of the Servicio Nacional de Aprendizaje (SENA) were evaluated through direct interviews in LSC. For the analysis, tools were used to evaluate correlation patterns, together with linguistic methods of visual and gestural analysis and corpus analysis; linear regression methods were also used. Among the results, significant data were found for the variables socioeconomic stratum, academic level, and work location. The results also show the need to generate new signs for archival topics to improve communication among the deaf person, the hearing person and the sign language interpreter. It is concluded that generating new signs to enrich the LSC dictionary in archival subjects is necessary to improve the labor inclusion of deaf people in Colombia.

Keywords: archival, inclusion, deaf, thesaurus

Procedia PDF Downloads 246

11581 Three-Dimensional Unsteady Natural Convection and Entropy Generation in an Inclined Cubical Trapezoidal Cavity Subjected to Uniformly Heated Bottom Wall

Authors: Farshid Fathinia

Abstract:

Numerical computation of unsteady laminar three-dimensional natural convection and entropy generation in an inclined cubical trapezoidal air-filled cavity is performed for the first time in this work. The vertical right and left sidewalls of the cavity are maintained at constant cold temperatures. The lower wall is subjected to a constant hot temperature, while the upper one is considered insulated. Computations are performed for Rayleigh numbers in the range 10³ ≤ Ra ≤ 10⁵, while the trapezoidal cavity inclination angle is varied as 0° ≤ ϕ ≤ 180°. The Prandtl number is held constant at Pr = 0.71. The second law of thermodynamics is applied to obtain the thermodynamic losses inside the cavity due to both heat transfer and fluid friction irreversibilities. The variations of the local and average Nusselt numbers are presented and discussed, while streamlines, isotherms and entropy contours are presented in both two- and three-dimensional patterns. The results show that when the Rayleigh number increases, the flow patterns change, especially in the three-dimensional results, and the flow circulation increases. Also, the effect of the inclination angle on the total entropy generation becomes insignificant when the Rayleigh number is low. Moreover, when the Rayleigh number increases, the average Nusselt number increases.

Keywords: transient natural convection, trapezoidal cavity, three-dimensional flow, entropy generation, second law

Procedia PDF Downloads 321

11580 An Experimental Study of Scalar Implicature Processing in Chinese

Authors: Liu Si, Wang Chunmei, Liu Huangmei

Abstract:

A prominent component of the semantic versus pragmatic debate, scalar implicature (SI) has been gaining great attention ever since it was proposed by Horn. The constant debate is between the structural and the pragmatic approach. The former claims that generation of SI is costless, automatic, and dependent mostly on the structural properties of sentences, whereas the latter advocates both that such generation is largely dependent upon context, and that the process is costly. Many experiments, among which Katsos’s text comprehension experiments are influential, have been designed and conducted in order to verify these views, but the results are not conclusive. Besides, most of the experiments were conducted with English-language materials. Katsos conducted one off-line and three on-line text comprehension experiments, in which the previous shortcomings were addressed to a certain extent and the conclusion was in favor of the pragmatic approach. We intend to test the results of Katsos’s experiments on Chinese scalar implicature. Four experiments in both off-line and on-line conditions, examining the generation and response time of SI in Chinese "yixie" (some) and "quanbu (dou)" (all), will be conducted in order to find out whether the structural or the pragmatic approach can be sustained. The study mainly aims to answer the following questions: (1) Can SI be generated in the upper- and lower-bound contexts, as Katsos confirmed, when Chinese language materials are used in the experiment? (2) Can SI be first generated, then cancelled, as the default view claims, or can it not be generated in a neutral context when Chinese language materials are used in the experiment? (3) Is SI generation costless or costly in terms of processing resources? (4) In line with the SI generation process, what conclusion can be drawn about the cognitive processing model of language meaning? Is it a parallel model or a linear model? Or is it a dynamic and hierarchical model?
Based on previous theoretical debates and conflicting experimental results, we presume that SI in Chinese might be generated in upper-bound contexts. Besides, the response time might be faster in upper-bound than in lower-bound contexts. SI generation in a neutral context might be the slowest. Finally, we expect to conclude that the processing model of SI cannot be verified by either a purely structural or a purely pragmatic approach. It is, rather, a dynamic and complex processing mechanism, in which the interaction of language forms, ad hoc context, mental context, background knowledge, speakers’ interaction, etc. is involved.

Keywords: cognitive linguistics, pragmatics, scalar implicature, experimental study, Chinese language

Procedia PDF Downloads 338

11579 From the “Movement Language” to Communication Language

Authors: Mahmudjon Kuchkarov, Marufjon Kuchkarov

Abstract:

The origin of ‘Human Language’ is still a mystery and one of the most interesting subjects of historical linguistics. The core element is the nature of labeling or coding things or processes with symbols and sounds. In this paper, we investigate humans’ involuntary Paired Sounds and Shape Production (PSSP) and its contribution to the development of early human communication. Working with twenty-six volunteers who performed many physical movements of varying difficulty, the research team investigated the natural, repeatable, and paired sounds and shape productions during human activities. The paper claims the involvement of PSSP in the phonetic origin of some modern words and the existence of similarities between elements of PSSP and characters of the classic Latin alphabet. The results may be used not only to support existing theories but also to take a closer look at the fundamental nature of the origin of languages.

Keywords: body shape, body language, coding, Latin alphabet, merging method, movement language, movement sound, natural sound, origin of language, pairing, phonetics, sound and shape production, word origin, word semantic

Procedia PDF Downloads 160

11578 Learning Grammars for Detection of Disaster-Related Micro Events

Authors: Josef Steinberger, Vanni Zavarella, Hristo Tanev

Abstract:

Natural disasters cause tens of thousands of victims and massive material damage. We refer to all the events caused by natural disasters, such as damage to people, infrastructure, vehicles, services and resource supply, as micro-events. This paper addresses the problem of micro-event detection in online media sources. We present a natural language grammar learning algorithm and apply it to online news. The algorithm is based on distributional clustering and the detection of word collocations. We also explore the extraction of micro-events from social media and describe a Twitter mining robot, which uses combinations of keywords to detect tweets that talk about the effects of disasters.
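The keyword-combination idea can be sketched as a simple filter: a message is kept only if it contains at least one disaster term and at least one effect term. The two keyword sets and the function names below are hypothetical, and the sketch does not call any Twitter API; the abstract does not list the actual keywords or describe how the robot retrieves tweets.

```python
import re

# Hypothetical keyword groups; a real system would learn or curate these.
DISASTER_TERMS = {"flood", "earthquake", "storm"}
EFFECT_TERMS = {"collapsed", "injured", "destroyed", "outage"}

def tokens(text):
    """Lowercase alphabetic tokens of a message."""
    return set(re.findall(r"[a-z]+", text.lower()))

def mentions_disaster_effect(text):
    """True when the text combines a disaster term with an effect term."""
    words = tokens(text)
    return bool(words & DISASTER_TERMS) and bool(words & EFFECT_TERMS)

messages = [
    "Bridge collapsed after the flood",
    "Flood of new movies this weekend",
    "Power outage downtown",
]
for message in messages:
    print(mentions_disaster_effect(message), "-", message)
```

Requiring a term from both groups filters out figurative uses such as the second message, at the cost of missing reports that describe an effect without naming the disaster.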

Keywords: online news, natural language processing, machine learning, event extraction, crisis computing, disaster effects, Twitter

Procedia PDF Downloads 455

11577 Exploring Factors Affecting Electricity Production in Malaysia

Authors: Endang Jati Mat Sahid, Hussain Ali Bekhet

Abstract:

The ability to supply reliable and secure electricity has been one of the crucial components of economic development for any country. Forecasting electricity production is therefore very important for accurate investment planning of power generation plants. In this study, we aim to examine and analyze the factors that affect electricity generation. Multiple regression models were used to find the relationship between various variables and electricity production. The models simultaneously determine the effects of the variables on electricity generation. Many variables influencing electricity generation, i.e. natural gas (NG), coal (CO), fuel oil (FO), renewable energy (RE), gross domestic product (GDP) and fuel prices (FP), were examined for Malaysia. The results demonstrate that NG, CO, and FO were the main factors influencing electricity generation growth. The study then identifies a number of policy implications arising from the empirical results.
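A multiple regression of this kind can be sketched with ordinary least squares solved through the normal equations. The numbers below are synthetic and noise-free, chosen only so that the fit can be checked by eye. They are not Malaysian data, and the three regressors merely borrow the abbreviations NG, CO and FO from the abstract.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_ols(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via (X^T X) b = X^T y."""
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    xtx = [[sum(x[i] * x[j] for x in design) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)

# Synthetic observations of (NG, CO, FO); units are arbitrary.
inputs = [(10, 5, 2), (12, 6, 1), (11, 8, 3), (15, 7, 2), (14, 9, 4), (18, 10, 3)]
# Synthetic response generated from known coefficients, without noise.
production = [2.0 + 0.5 * ng + 0.3 * co + 0.1 * fo for ng, co, fo in inputs]

coefficients = fit_ols(inputs, production)
print([round(c, 6) for c in coefficients])
```

With real data the response would contain noise, so the coefficients would be estimates, and their significance would need to be tested before drawing conclusions about which factors matter.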

Keywords: energy policy, energy security, electricity production, Malaysia, the regression model

Procedia PDF Downloads 125

11576 Genomic Sequence Representation Learning: An Analysis of K-Mer Vector Embedding Dimensionality

Authors: James Jr. Mashiyane, Risuna Nkolele, Stephanie J. Müller, Gciniwe S. Dlamini, Rebone L. Meraba, Darlington S. Mapiye

Abstract:

When performing language tasks in natural language processing (NLP), the dimensionality of word embeddings is either chosen ad hoc or calculated by optimizing the Pairwise Inner Product (PIP) loss. The PIP loss is a metric that measures the dissimilarity between word embeddings, and it is obtained through matrix perturbation theory by utilizing the unitary invariance of word embeddings. Unlike in natural language processing, in genomics, and especially in genome sequence processing, there is no notion of a “word”; rather, there are sequence substrings of length k called k-mers. K-mer sizes matter, and they vary depending on the goal of the task at hand. The dimensionality of word embeddings in NLP has been studied using matrix perturbation theory and the PIP loss. In this paper, the sufficiency and reliability of applying word-embedding algorithms to various genomic sequence datasets are investigated to understand the relationship between the k-mer size and the embedding dimension. This is done by studying the scaling capability of three embedding algorithms, namely Latent Semantic Analysis (LSA), Word2Vec, and Global Vectors (GloVe), with respect to the k-mer size. Utilising the PIP loss as a metric to train embeddings on different datasets, we also show that Word2Vec outperforms LSA and GloVe in accurately computing embeddings as both the k-mer size and vocabulary increase. Finally, the shortcomings of natural language processing embedding algorithms in performing genomic tasks are discussed.
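The two building blocks named in this abstract, k-mers and the PIP loss, can be sketched as follows. This assumes overlapping k-mers with a stride of 1 and takes the PIP loss to be the Frobenius norm of the difference between the two PIP matrices (each embedding matrix multiplied by its own transpose). The toy sequence, the toy embeddings, and the function names are hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def kmers(sequence: str, k: int) -> list[str]:
    """Return all overlapping substrings of length k (stride 1)."""
    if k <= 0:
        raise ValueError("k must be positive")
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def pip_matrix(E):
    """PIP matrix E @ E^T: inner products between every pair of embedding rows."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in E] for r1 in E]


def pip_loss(E1, E2):
    """Frobenius norm of the difference between two PIP matrices."""
    P1, P2 = pip_matrix(E1), pip_matrix(E2)
    return math.sqrt(
        sum((x - y) ** 2 for r1, r2 in zip(P1, P2) for x, y in zip(r1, r2))
    )


# Toy sequence (assumption): the k-mer "vocabulary" depends on the choice of k.
seq = "ATGCGATG"
for k in (2, 3, 4):
    tokens = kmers(seq, k)
    print(k, len(tokens), len(Counter(tokens)))

# Two embeddings of the same two tokens that differ only by a rotation.
E_a = [[1.0, 0.0], [0.0, 1.0]]
E_b = [[0.0, 1.0], [-1.0, 0.0]]
print(pip_loss(E_a, E_b))  # rotation leaves all inner products unchanged
```

Because the PIP matrix depends only on inner products between rows, the two embeddings can have different dimensions as long as they cover the same vocabulary, which is what makes the loss usable for comparing embedding dimensionalities.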

Keywords: word embeddings, k-mer embedding, dimensionality reduction

Procedia PDF Downloads 96

11575 Semantic Textual Similarity on Contracts: Exploring Multiple Negative Ranking Losses for Sentence Transformers

Authors: Yogendra Sisodia

Abstract:

Researchers are becoming more interested in extracting useful information from legal documents thanks to the development of large-scale language models in natural language processing (NLP), and deep learning has accelerated the creation of powerful text mining models. Legal fields like contracts benefit greatly from semantic text search since it makes it quick and easy to find related clauses. After collecting sentence embeddings, it is relatively simple to locate sentences with a comparable meaning throughout the entire legal corpus. The author investigated two pre-trained language models for this task, MiniLM and RoBERTa, and further fine-tuned them on legal contracts. The author used Multiple Negative Ranking Loss for the creation of sentence transformers. The fine-tuned language models and sentence transformers showed promising results.
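The loss named in this abstract (often written "multiple negatives ranking loss") can be sketched in plain Python as follows. The sketch assumes the usual in-batch formulation: each anchor is paired with one positive, the positives of all other pairs in the batch serve as negatives, and the loss is the cross-entropy over scaled cosine similarities. The scale value of 20, the function names, and the toy vectors are assumptions for illustration, not details from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        scores = [scale * cosine(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over all candidates in the batch.
        top = max(scores)
        log_denominator = top + math.log(sum(math.exp(s - top) for s in scores))
        losses.append(log_denominator - scores[i])
    return sum(losses) / len(losses)


# Toy embeddings (assumption): stand-ins for encoded contract clauses.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]     # each anchor matches its own positive
misaligned = [[0.0, 1.0], [1.0, 0.0]]  # positives swapped between pairs
print(mnr_loss(anchors, aligned), mnr_loss(anchors, misaligned))
```

The loss is close to zero when each anchor is most similar to its own positive and grows when another item in the batch scores higher, which is why larger batches supply more negatives per training step.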

Keywords: legal contracts, multiple negative ranking loss, natural language inference, sentence transformers, semantic textual similarity

Procedia PDF Downloads 69

11574 An Intergenerational Study of Iranian Migrant Families in Australia: Exploring Language, Identity, and Acculturation

Authors: Alireza Fard Kashani

Abstract:

This study reports on the experiences and attitudes of six Iranian migrant families, from two groups of asylum seekers and skilled workers, with regard to their language, identity, and acculturation in Australia. The participants included first-generation parents and 1.5-generation adolescents, who had lived in Australia for a minimum of three years. For this investigation, Mendoza’s (1984, 2016) acculturation model, as well as poststructuralist views of identity, were employed. The semi-structured interview results highlight that Iranian parents and adolescents face a low degree of intergenerational conflict in most domains of their acculturation. However, the structural and legal patterns in Australia have caused some internal conflicts for the parents, especially fathers (e.g., over their power status within the family or their children’s freedom). Furthermore, while most participants reported ‘cultural eclecticism’ as their preferred acculturation orientation, female participants seemed to be more eclectic than their male counterparts, who showed an inclination towards keeping more aspects of their home culture. This finding, however, highlights a meaningful effort on the part of husbands, who recognize that, to keep their married lives going well in Australia, they need to reconsider the traditional male-dominated customs they were used to in Iran. As for identity, not only the parents but also the adolescents proudly identified themselves as Persians. In addition, with respect to linguistic behaviour, almost all adolescents showed enthusiasm to retain the Persian language at home to be able to maintain contact with their relatives and friends in Iran and to enjoy many other benefits the language may offer them in the future.

Keywords: acculturation, asylum seekers, identity, intergenerational conflicts, language, skilled workers, 1.5 generation

Procedia PDF Downloads 205

11573 Research on the Risks of Railroad Receiving and Dispatching Trains Operators: Natural Language Processing Risk Text Mining

Authors: Yangze Lan, Ruihua Xv, Feng Zhou, Yijia Shan, Longhao Zhang, Qinghui Xv

Abstract:

Receiving and dispatching trains is an important part of railroad organization, yet the risk evaluation of operating personnel is still reflected only by scores, without deeper analysis of wrong answers and operating accidents. With natural language processing (NLP) technology, this study extracts the keywords and key phrases of 40 relevant risk events about receiving and dispatching trains and reclassifies the risk events into 8 categories, such as train approach and signal risks, dispatching command risks, and so on. Based on the historical risk data of personnel, the K-Means clustering method is used to classify the risk level of personnel. The results indicate that high-risk operating personnel need to strengthen their training in train receiving and dispatching operations for essential trains and abnormal situations.
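The clustering step mentioned in this abstract can be illustrated with a minimal K-Means sketch (Lloyd's algorithm). The two features per person, the toy values, the choice of three clusters, and the deterministic initialization from the first k points are all assumptions made for the example; the abstract does not specify them.

```python
def kmeans(points, k, max_iters=100):
    """Plain K-Means: returns (labels, centroids) for a list of numeric tuples."""
    # Deterministic initialization (assumption): use the first k points.
    centroids = [list(map(float, p)) for p in points[:k]]
    labels = []
    for _ in range(max_iters):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(centroids[c])  # keep an empty cluster in place
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Hypothetical per-person features: (wrong answers, recorded risk events).
personnel = [(1, 0), (2, 1), (8, 5), (9, 6), (20, 12), (22, 14)]
labels, centroids = kmeans(personnel, k=3)
print(labels)
print(centroids)
```

Ordering the resulting centroids by magnitude gives a natural mapping from cluster index to a low, medium, or high risk level; in practice, random restarts or k-means++ initialization would be preferable to taking the first k points.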

Keywords: receiving and dispatching trains, natural language processing, risk evaluation, K-means clustering

Procedia PDF Downloads 43

11572 Enhancing Large Language Models' Data Analysis Capability with Planning-and-Execution and Code Generation Agents: A Use Case for Southeast Asia Real Estate Market Analytics

Authors: Kien Vu, Jien Min Soh, Mohamed Jahangir Abubacker, Piyawut Pattamanon, Soojin Lee, Suvro Banerjee

Abstract:

Recent advances in Generative Artificial Intelligence (GenAI), in particular Large Language Models (LLMs), have shown promise to disrupt multiple industries at scale. However, LLMs also present unique challenges, notably so-called "hallucination", the generation of outputs that are not grounded in the input data, which hinders their adoption in production. A common practice to mitigate the hallucination problem is to use a Retrieval-Augmented Generation (RAG) system to ground LLMs' responses in ground truth. RAG converts the grounding documents into embeddings, retrieves the relevant parts using vector similarity between the user's query and the documents, then generates a response that is based not only on the model's pre-trained knowledge but also on the specific information from the retrieved documents. However, the RAG system is not suitable for tabular data and subsequent data analysis tasks for multiple reasons, such as information loss, data format, and retrieval mechanism. In this study, we have explored a novel methodology that combines planning-and-execution and code generation agents to enhance LLMs' data analysis capabilities. The approach enables LLMs to autonomously dissect a complex analytical task into simpler sub-tasks and requirements, then convert them into executable segments of code. In the final step, it generates the complete response from the output of the executed code. When a beta version was deployed on DataSense, the property insight tool of PropertyGuru, the approach yielded promising results, as it was able to provide market insights and meet data visualization needs with high accuracy and extensive coverage by abstracting the complexities for real-estate agents and developers from non-programming backgrounds. In essence, the methodology not only refines the analytical process but also serves as a strategic tool for real estate professionals, aiding in market understanding and enhancement without the need for programming skills. The implication extends beyond immediate analytics, paving the way for a new era in the real estate industry characterized by efficiency and advanced data utilization.
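The retrieval step of RAG described in this abstract (embed the documents, then rank them by vector similarity to the query) can be sketched as follows. To keep the example self-contained, a bag-of-words count vector stands in for a learned embedding model; that substitution, the toy documents, and the function names are assumptions, not part of the described system.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding (assumption): lowercase bag-of-words term counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:top_k]


# Hypothetical grounding documents.
docs = [
    "condo price trends in Bangkok",
    "office rental supply in Singapore",
    "landed house demand in Ho Chi Minh City",
]
context = retrieve("median condo price in Bangkok", docs)
print(context)
```

The retrieved text would then be placed in the prompt so that the generated answer is grounded in it. The sketch also hints at the limitation the abstract raises: similarity search returns passages of text, not the result of an aggregation over rows of a table.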

Keywords: large language model, reasoning, planning and execution, code generation, natural language processing, prompt engineering, data analysis, real estate, data sense, PropertyGuru

Procedia PDF Downloads 40

11571 Migration and Identity Erosion: An Exploratory Study of First Generation Nigerian-Americans

Authors: Lolade Siyonbola

Abstract:

Nigerians are often celebrated as being the most educated cultural group in America. The cultural values and history that have led to this reality are particular to a generation that came of age after colonialism. Many of these cultural values have been passed down from post-colonial parent to millennial child, but most have not. This study, based on interviews and surveys of Nigerian millennials and their parents in the United States, explores the degree to which identity has been eroded in the millennial generation due to a lack of imparted cultural values and knowledge from the previous generation. Most of the subjects do not speak their native language or identify with their cultural heritage sufficiently to build ties with their native land. Most are experiencing some degree of identity crisis, and therefore limited self-actualization, with little to no support, as there are few successful tools available to this population. If governmental programs to reverse these trends are not implemented within this generation, the implications for the individual, family and home nation (Nigeria) will be felt for generations to come.

Keywords: identity, culture, self-actualization, social identity theory, migration, transnationalism, value systems

Procedia PDF Downloads 347

11570 Intelligent Chatbot Generating Dynamic Responses Through Natural Language Processing

Authors: Aarnav Singh, Jatin Moolchandani

Abstract:

The proposed research work aims to build a query-based AI chatbot that can answer any question related to any topic. A chatbot is software that converses with users via text messages. In the proposed system, we aim to build a chatbot that generates a response based on the user’s query. For this, we use natural language processing to analyze the query and a set of texts to form a concise answer. The texts are obtained through web scraping and by filtering for credible sources from a web search. The objective of this project is to provide a chatbot that is able to provide simple and accurate answers without the user having to read through a large number of articles and websites. Creating an AI chatbot that can answer a variety of user questions on a variety of topics is the goal of the proposed research project. This chatbot uses natural language processing to comprehend user inquiries and provides succinct responses by examining a collection of writings that were scraped from the internet. The texts are carefully selected from reliable websites that are found via internet searches. This project aims to provide users with a chatbot that provides clear and precise responses, removing the need to go through several articles and web pages in great detail. In addition to exploring the reasons for their broad acceptance and their usefulness across many industries, this article offers an overview of the interest in chatbots throughout the world.

Keywords: Chatbot, Artificial Intelligence, natural language processing, web scraping

Procedia PDF Downloads 33

11569 A Controlled Natural Language Assisted Approach for the Design and Automated Processing of Service Level Agreements

Authors: Christopher Schwarz, Katrin Riegler, Erwin Zinser

Abstract:

The management of outsourcing relationships between IT service providers and their customers proves to be a critical issue that has to be stipulated by means of Service Level Agreements (SLAs). Since service requirements differ from customer to customer, SLA content and language structures vary widely; as a result, standardized SLA templates cannot be used and automated processing of SLA content is not possible. Hence, SLA management is usually a time-consuming and inefficient manual process. To overcome these challenges, this paper presents an innovative, ITIL V3-compliant approach for automated SLA design and management using controlled natural language in enterprise collaboration portals. The proposed concept is based on a self-developed controlled natural language that follows a subject-predicate-object approach to specify well-defined SLA content structures that act as templates for customized contracts and support automated SLA processing. The derived results enable IT service providers to automate several SLA request, approval and negotiation processes by means of workflows and business rules within an enterprise collaboration portal. The illustrated prototypical realization gives evidence of the practical relevance in service-oriented scenarios as well as the high flexibility and adaptability of the presented model. Thus, the prototype enables the automated creation of well-defined, customized SLA documents, providing a knowledge representation that is both human-understandable and machine-processable.

Keywords: automated processing, controlled natural language, knowledge representation, information technology outsourcing, service level management

Procedia PDF Downloads 396

11568 Selecting Answers for Questions with Multiple Answer Choices in Arabic Question Answering Based on Textual Entailment Recognition

Authors: Anes Enakoa, Yawei Liang

Abstract:

Question Answering (QA) is one of the most important and demanding tasks in the field of Natural Language Processing (NLP). In QA systems, the answer generation task generates a list of candidate answers to the user's question, of which only one is correct. Answer selection is one of the main components of QA; it is concerned with selecting the best answer choice from the candidate answers suggested by the system. However, the selection process can be very challenging, especially in Arabic, due to the particularities of the language. To address this challenge, an approach is proposed to answer questions with multiple answer choices for Arabic QA systems based on Textual Entailment (TE) recognition. The developed approach employs a Support Vector Machine that considers lexical, semantic and syntactic features in order to recognize the entailment between the generated hypotheses (H) and the text (T). A set of experiments has been conducted for performance evaluation, and the overall performance of the proposed method reached an accuracy of 67.5% with a C@1 score of 80.46%. The obtained results are promising and demonstrate that the proposed method is effective for the TE recognition task.
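For readers unfamiliar with the C@1 measure reported in this abstract, the following sketch shows how it is commonly defined in QA evaluation: unanswered questions are credited at the system's overall accuracy rate instead of being counted as wrong. The abstract does not give the formula, so this definition is an assumption about what the authors used, and the counts below are made up.

```python
def c_at_1(n_correct: int, n_unanswered: int, n_total: int) -> float:
    """c@1 = (n_correct + n_unanswered * n_correct / n_total) / n_total."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return (n_correct + n_unanswered * n_correct / n_total) / n_total


# Hypothetical counts: 10 questions, 5 answered correctly.
print(c_at_1(n_correct=5, n_unanswered=0, n_total=10))  # every question attempted
print(c_at_1(n_correct=5, n_unanswered=2, n_total=10))  # two left unanswered
```

When no question is left unanswered, the measure reduces to plain accuracy; declining to answer instead of answering wrongly raises it, which is one way a C@1 score can exceed the accuracy figure.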

Keywords: information retrieval, machine learning, natural language processing, question answering, textual entailment

Procedia PDF Downloads 120