Search results for: text processing
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 4634

Search results for: text processing

4454 JaCoText: A Pretrained Model for Java Code-Text Generation

Authors: Jessica Lopez Espejel, Mahaman Sanoussi Yahaya Alassan, Walid Dahhane, El Hassane Ettifouri

Abstract:

Pretrained transformer-based models have shown high performance in natural language generation tasks. However, a new wave of interest has surged: automatic programming language code generation. This task consists of translating natural language instructions to a source code. Despite the fact that well-known pre-trained models on language generation have achieved good performance in learning programming languages, effort is still needed in automatic code generation. In this paper, we introduce JaCoText, a model based on Transformer neural network. It aims to generate java source code from natural language text. JaCoText leverages the advantages of both natural language and code generation models. More specifically, we study some findings from state of the art and use them to (1) initialize our model from powerful pre-trained models, (2) explore additional pretraining on our java dataset, (3) lead experiments combining the unimodal and bimodal data in training, and (4) scale the input and output length during the fine-tuning of the model. Conducted experiments on CONCODE dataset show that JaCoText achieves new state-of-the-art results.

Keywords: java code generation, natural language processing, sequence-to-sequence models, transformer neural networks

Procedia PDF Downloads 236
4453 Probing Language Models for Multiple Linguistic Information

Authors: Bowen Ding, Yihao Kuang

Abstract:

In recent years, large-scale pre-trained language models have achieved state-of-the-art performance on a variety of natural language processing tasks. The word vectors produced by these language models can be viewed as dense encoded presentations of natural language that in text form. However, it is unknown how much linguistic information is encoded and how. In this paper, we construct several corresponding probing tasks for multiple linguistic information to clarify the encoding capabilities of different language models and performed a visual display. We firstly obtain word presentations in vector form from different language models, including BERT, ELMo, RoBERTa and GPT. Classifiers with a small scale of parameters and unsupervised tasks are then applied on these word vectors to discriminate their capability to encode corresponding linguistic information. The constructed probe tasks contain both semantic and syntactic aspects. The semantic aspect includes the ability of the model to understand semantic entities such as numbers, time, and characters, and the grammatical aspect includes the ability of the language model to understand grammatical structures such as dependency relationships and reference relationships. We also compare encoding capabilities of different layers in the same language model to infer how linguistic information is encoded in the model.

Keywords: language models, probing task, text presentation, linguistic information

Procedia PDF Downloads 69
4452 IoT Based Information Processing and Computing

Authors: Mannan Ahmad Rasheed, Sawera Kanwal, Mansoor Ahmad Rasheed

Abstract:

The Internet of Things (IoT) has revolutionized the way we collect and process information, making it possible to gather data from a wide range of connected devices and sensors. This has led to the development of IoT-based information processing and computing systems that are capable of handling large amounts of data in real time. This paper provides a comprehensive overview of the current state of IoT-based information processing and computing, as well as the key challenges and gaps that need to be addressed. This paper discusses the potential benefits of IoT-based information processing and computing, such as improved efficiency, enhanced decision-making, and cost savings. Despite the numerous benefits of IoT-based information processing and computing, several challenges need to be addressed to realize the full potential of these systems. These challenges include security and privacy concerns, interoperability issues, scalability and reliability of IoT devices, and the need for standardization and regulation of IoT technologies. Moreover, this paper identifies several gaps in the current research related to IoT-based information processing and computing. One major gap is the lack of a comprehensive framework for designing and implementing IoT-based information processing and computing systems.

Keywords: IoT, computing, information processing, Iot computing

Procedia PDF Downloads 146
4451 A Supervised Approach for Word Sense Disambiguation Based on Arabic Diacritics

Authors: Alaa Alrakaf, Sk. Md. Mizanur Rahman

Abstract:

Since the last two decades’ Arabic natural language processing (ANLP) has become increasingly much more important. One of the key issues related to ANLP is ambiguity. In Arabic language different pronunciation of one word may have a different meaning. Furthermore, ambiguity also has an impact on the effectiveness and efficiency of Machine Translation (MT). The issue of ambiguity has limited the usefulness and accuracy of the translation from Arabic to English. The lack of Arabic resources makes ambiguity problem more complicated. Additionally, the orthographic level of representation cannot specify the exact meaning of the word. This paper looked at the diacritics of Arabic language and used them to disambiguate a word. The proposed approach of word sense disambiguation used Diacritizer application to Diacritize Arabic text then found the most accurate sense of an ambiguous word using Naïve Bayes Classifier. Our Experimental study proves that using Arabic Diacritics with Naïve Bayes Classifier enhances the accuracy of choosing the appropriate sense by 23% and also decreases the ambiguity in machine translation.

Keywords: Arabic natural language processing, machine learning, machine translation, Naive bayes classifier, word sense disambiguation

Procedia PDF Downloads 327
4450 A Text-Oriented Study on Treatises and the End of the Struggles in Silius

Authors: Arianna Sacerdoti

Abstract:

This paper is original and fills, to our best Knowledge, a gap in secondary literature. It analyzes the presence of treatises in Silius Italicus’ Punica and what happens in the plot when a struggle ends. As a result, we will understand if treatises are stipulated or broken, and which narrative devices go with the presence of treatises and the end of the battles. Methodology will be text-oriented, and all the passages will be presented in the Latin language and discussed. In concluding, it is important to understand – in a poem based on war – the role of treatises and the end of battles in Silius Italicus.

Keywords: Flavian Epic, Silius Italicus, Punica, treatises

Procedia PDF Downloads 102
4449 Investigating the Effectiveness of Multilingual NLP Models for Sentiment Analysis

Authors: Othmane Touri, Sanaa El Filali, El Habib Benlahmar

Abstract:

Natural Language Processing (NLP) has gained significant attention lately. It has proved its ability to analyze and extract insights from unstructured text data in various languages. It is found that one of the most popular NLP applications is sentiment analysis which aims to identify the sentiment expressed in a piece of text, such as positive, negative, or neutral, in multiple languages. While there are several multilingual NLP models available for sentiment analysis, there is a need to investigate their effectiveness in different contexts and applications. In this study, we aim to investigate the effectiveness of different multilingual NLP models for sentiment analysis on a dataset of online product reviews in multiple languages. The performance of several NLP models, including Google Cloud Natural Language API, Microsoft Azure Cognitive Services, Amazon Comprehend, Stanford CoreNLP, spaCy, and Hugging Face Transformers are being compared. The models based on several metrics, including accuracy, precision, recall, and F1 score, are being evaluated and compared to their performance across different categories of product reviews. In order to run the study, preprocessing of the dataset has been performed by cleaning and tokenizing the text data in multiple languages. Then training and testing each model has been applied using a cross-validation approach where randomly dividing the dataset into training and testing sets and repeating the process multiple times has been used. A grid search approach to optimize the hyperparameters of each model and select the best-performing model for each category of product reviews and language has been applied. The findings of this study provide insights into the effectiveness of different multilingual NLP models for Multilingual Sentiment Analysis and their suitability for different languages and applications. The strengths and limitations of each model were identified, and recommendations for selecting the most performant model based on the specific requirements of a project were provided. This study contributes to the advancement of research methods in multilingual NLP and provides a practical guide for researchers and practitioners in the field.

Keywords: NLP, multilingual, sentiment analysis, texts

Procedia PDF Downloads 58
4448 Math and Religion in Arvo Pärt's Out of the Depths

Authors: Ismael Lins Patriota

Abstract:

Arvo Pärt is an Estonian composer who started his musical career under the influence of twelve-tone music and dodecaphonism. From 1968 to 1976, he isolated himself to search for a new path as a composer. In this period, he converted to Russian orthodoxy and changed his composing to tintinnabuli, a musical technique combining triadic chords with simple melodies. The recent analysis of Pärt’s output demonstrates that mathematics remained an influence after the invention of tintinnabuli. The present discussion deals with the relationship between math and religion in his work Out of the Depths (1980), proposing a musical-text approach and examining the minimum elements of the piece, such as motives and sub-phrases, which is the main focus of this work, considering text patterns and the role of the organ, which also uses the tintinnabuli system. The analysis of these elements demonstrates that Pärt uses math as a formal element, and the composer combines musical parameters to execute a personal and innovative interpretation of the text.

Keywords: Arvo Pärt, Out of the Depths, math, religion, analysis

Procedia PDF Downloads 52
4447 A Hebbian Neural Network Model of the Stroop Effect

Authors: Vadim Kulikov

Abstract:

The classical Stroop effect is the phenomenon that it takes more time to name the ink color of a printed word if the word denotes a conflicting color than if it denotes the same color. Over the last 80 years, there have been many variations of the experiment revealing various mechanisms behind semantic, attentional, behavioral and perceptual processing. The Stroop task is known to exhibit asymmetry. Reading the words out loud is hardly dependent on the ink color, but naming the ink color is significantly influenced by the incongruent words. This asymmetry is reversed, if instead of naming the color, one has to point at a corresponding color patch. Another debated aspects are the notions of automaticity and how much of the effect is due to semantic and how much due to response stage interference. Is automaticity a continuous or an all-or-none phenomenon? There are many models and theories in the literature tackling these questions which will be discussed in the presentation. None of them, however, seems to capture all the findings at once. A computational model is proposed which is based on the philosophical idea developed by the author that the mind operates as a collection of different information processing modalities such as different sensory and descriptive modalities, which produce emergent phenomena through mutual interaction and coherence. This is the framework theory where ‘framework’ attempts to generalize the concepts of modality, perspective and ‘point of view’. The architecture of this computational model consists of blocks of neurons, each block corresponding to one framework. In the simplest case there are four: visual color processing, text reading, speech production and attention selection modalities. In experiments where button pressing or pointing is required, a corresponding block is added. In the beginning, the weights of the neural connections are mostly set to zero. The network is trained using Hebbian learning to establish connections (corresponding to ‘coherence’ in framework theory) between these different modalities. The amount of data fed into the network is supposed to mimic the amount of practice a human encounters, in particular it is assumed that converting written text into spoken words is a more practiced skill than converting visually perceived colors to spoken color-names. After the training, the network performs the Stroop task. The RT’s are measured in a canonical way, as these are continuous time recurrent neural networks (CTRNN). The above-described aspects of the Stroop phenomenon along with many others are replicated. The model is similar to some existing connectionist models but as will be discussed in the presentation, has many advantages: it predicts more data, the architecture is simpler and biologically more plausible.

Keywords: connectionism, Hebbian learning, artificial neural networks, philosophy of mind, Stroop

Procedia PDF Downloads 239
4446 A Novel Probabilistic Spatial Locality of Reference Technique for Automatic Cleansing of Digital Maps

Authors: A. Abdullah, S. Abushalmat, A. Bakshwain, A. Basuhail, A. Aslam

Abstract:

GIS (Geographic Information System) applications require geo-referenced data, this data could be available as databases or in the form of digital or hard-copy agro-meteorological maps. These parameter maps are color-coded with different regions corresponding to different parameter values, converting these maps into a database is not very difficult. However, text and different planimetric elements overlaid on these maps makes an accurate image to database conversion a challenging problem. The reason being, it is almost impossible to exactly replace what was underneath the text or icons; thus, pointing to the need for inpainting. In this paper, we propose a probabilistic inpainting approach that uses the probability of spatial locality of colors in the map for replacing overlaid elements with underlying color. We tested the limits of our proposed technique using non-textual simulated data and compared text removing results with a popular image editing tool using public domain data with promising results.

Keywords: noise, image, GIS, digital map, inpainting

Procedia PDF Downloads 322
4445 Role of Gender in Apparel Stores' Consumer Review: A Sentiment Analysis

Authors: Sarif Ullah Patwary, Matthew Heinrich, Brandon Payne

Abstract:

The ubiquity of web 2.0 platforms, in the form of wikis, social media (e.g., Facebook, Twitter, etc.) and online review portals (e.g., Yelp), helps shape today’s apparel consumers’ purchasing decision. Online reviews play important role towards consumers’ apparel purchase decision. Each of the consumer reviews carries a sentiment (positive, negative or neutral) towards products. Commercially, apparel brands and retailers analyze sentiment of this massive amount of consumer review data to update their inventory and bring new products in the market. The purpose of this study is to analyze consumer reviews of selected apparel stores with a view to understand, 1) the difference of sentiment expressed through men’s and woman’s text reviews, 2) the difference of sentiment expressed through men’s and woman’s star-based reviews, and 3) the difference of sentiment between star-based reviews and text-based reviews. A total of 9,363 reviews (1,713 men and 7,650 women) were collected using Yelp Dataset Challenge. Sentiment analysis of collected reviews was carried out in two dimensions: star-based reviews and text-based reviews. Sentiment towards apparel stores expressed through star-based reviews was deemed: 1) positive for 3 or 4 stars 2) negative for 1 or 2 stars and 3) neutral for 3 stars. Sentiment analysis of text-based reviews was carried out using Bing Liu dictionary. The analysis was conducted in IPyhton 5.0. Space. The sentiment analysis results revealed the percentage of positive text reviews by men (80%) and women (80%) were identical. Women reviewers (12%) provided more neutral (e.g., 3 out of 5 stars) star reviews than men (6%). Star-based reviews were more negative than the text-based reviews. In other words, while 80% men and women wrote positive reviews for the stores, less than 70% ended up giving 4 or 5 stars in those reviews. One of the key takeaways of the study is that star reviews provide slightly negative sentiment of the consumer reviews. Therefore, in order to understand sentiment towards apparel products, one might need to combine both star and text aspects of consumer reviews. This study used a specific dataset consisting of selected apparel stores from particular geographical locations (the information was not given for privacy concern). Future studies need to include more data from more stores and locations to generalize the findings of the study.

Keywords: apparel, consumer review, sentiment analysis, gender

Procedia PDF Downloads 138
4444 A Novel Machine Learning Approach to Aid Agrammatism in Non-fluent Aphasia

Authors: Rohan Bhasin

Abstract:

Agrammatism in non-fluent Aphasia Cases can be defined as a language disorder wherein a patient can only use content words ( nouns, verbs and adjectives ) for communication and their speech is devoid of functional word types like conjunctions and articles, generating speech of with extremely rudimentary grammar . Past approaches involve Speech Therapy of some order with conversation analysis used to analyse pre-therapy speech patterns and qualitative changes in conversational behaviour after therapy. We describe this approach as a novel method to generate functional words (prepositions, articles, ) around content words ( nouns, verbs and adjectives ) using a combination of Natural Language Processing and Deep Learning algorithms. The applications of this approach can be used to assist communication. The approach the paper investigates is : LSTMs or Seq2Seq: A sequence2sequence approach (seq2seq) or LSTM would take in a sequence of inputs and output sequence. This approach needs a significant amount of training data, with each training data containing pairs such as (content words, complete sentence). We generate such data by starting with complete sentences from a text source, removing functional words to get just the content words. However, this approach would require a lot of training data to get a coherent input. The assumptions of this approach is that the content words received in the inputs of both text models are to be preserved, i.e, won't alter after the functional grammar is slotted in. This is a potential limit to cases of severe Agrammatism where such order might not be inherently correct. The applications of this approach can be used to assist communication mild Agrammatism in non-fluent Aphasia Cases. Thus by generating these function words around the content words, we can provide meaningful sentence options to the patient for articulate conversations. Thus our project translates the use case of generating sentences from content-specific words into an assistive technology for non-Fluent Aphasia Patients.

Keywords: aphasia, expressive aphasia, assistive algorithms, neurology, machine learning, natural language processing, language disorder, behaviour disorder, sequence to sequence, LSTM

Procedia PDF Downloads 140
4443 Importance of Punctuation in Communicative Competence

Authors: Khayriniso Bakhtiyarovna Ganiyeva

Abstract:

The article explores the significance of punctuation in achieving communicative competence. It underscores that effective communication goes beyond simply using punctuation correctly. In the successful completion of a communicative activity, it is important not that the writer correctly uses punctuation marks but that he was able to achieve a goal aimed at expressing a certain meaning. The unanimity of the writer and the reader in the mutual understanding of the text is of primary importance. It should also be taken into account that situational communication provides special informative content and expressiveness of speech. Also, the norms of the situation are determined by the nature of the information in the text, and the punctuation marks expressed in accordance with the norm perform logical-semantic, highlighting expressive-emotional and signaling functions. It is a mistake to classify the signs subject to the norm of the situation as created by the author because they functionally reflect the general stylistic features of different texts. Such signs are among the common signs that are codified only by the semantics and structure of the created text.

Keywords: communicative-pragmatic approach, expressiveness of speech, stylistic features, comparative analysis

Procedia PDF Downloads 30
4442 Text as Reader Device Improving Subjectivity on the Role of Attestation between Interpretative Semiotics and Discursive Linguistics

Authors: Marco Castagna

Abstract:

Proposed paper is aimed to inquire about the relation between text and reader, focusing on the concept of ‘attestation’. Indeed, despite being widely accepted in semiotic research, even today the concept of text remains uncertainly defined. So, it seems to be undeniable that what is called ‘text’ offers an image of internal cohesion and coherence, that makes it possible to analyze it as an object. Nevertheless, this same object remains problematic when it is pragmatically activated by the act of reading. In fact, as for the T.A.R:D.I.S., that is the unique space-temporal vehicle used by the well-known BBC character Doctor Who in his adventures, every text appears to its own readers not only “bigger inside than outside”, but also offering spaces that change according to the different traveller standing in it. In a few words, as everyone knows, this singular condition raises the questions about the gnosiological relation between text and reader. How can a text be considered the ‘same’, even if it can be read in different ways by different subjects? How can readers can be previously provided with knowledge required for ‘understanding’ a text, but at the same time learning something more from it? In order to explain this singular condition it seems useful to start thinking about text as a device more than an object. In other words, this unique status is more clearly understandable when ‘text’ ceases to be considered as a box designed to move meaning from a sender to a recipient (marking the semiotic priority of the “code”) and it starts to be recognized as performative meaning hypothesis, that is discursively configured by one or more forms and empirically perceivable by means of one or more substances. Thus, a text appears as a “semantic hanger”, potentially offered to the “unending deferral of interpretant", and from time to time fixed as “instance of Discourse”. In this perspective, every reading can be considered as an answer to the continuous request for confirming or denying the meaning configuration (the meaning hypothesis) expressed by text. Finally, ‘attestation’ is exactly what regulates this dynamic of request and answer, through which the reader is able to confirm his previous hypothesis on reality or maybe acquire some new ones.Proposed paper is aimed to inquire about the relation between text and reader, focusing on the concept of ‘attestation’. Indeed, despite being widely accepted in semiotic research, even today the concept of text remains uncertainly defined. So, it seems to be undeniable that what is called ‘text’ offers an image of internal cohesion and coherence, that makes it possible to analyze it as an object. Nevertheless, this same object remains problematic when it is pragmatically activated by the act of reading. In fact, as for the T.A.R:D.I.S., that is the unique space-temporal vehicle used by the well-known BBC character Doctor Who in his adventures, every text appears to its own readers not only “bigger inside than outside”, but also offering spaces that change according to the different traveller standing in it. In a few words, as everyone knows, this singular condition raises the questions about the gnosiological relation between text and reader. How can a text be considered the ‘same’, even if it can be read in different ways by different subjects? How can readers can be previously provided with knowledge required for ‘understanding’ a text, but at the same time learning something more from it? In order to explain this singular condition it seems useful to start thinking about text as a device more than an object. In other words, this unique status is more clearly understandable when ‘text’ ceases to be considered as a box designed to move meaning from a sender to a recipient (marking the semiotic priority of the “code”) and it starts to be recognized as performative meaning hypothesis, that is discursively configured by one or more forms and empirically perceivable by means of one or more substances. Thus, a text appears as a “semantic hanger”, potentially offered to the “unending deferral of interpretant", and from time to time fixed as “instance of Discourse”. In this perspective, every reading can be considered as an answer to the continuous request for confirming or denying the meaning configuration (the meaning hypothesis) expressed by text. Finally, ‘attestation’ is exactly what regulates this dynamic of request and answer, through which the reader is able to confirm his previous hypothesis on reality or maybe acquire some new ones.

Keywords: attestation, meaning, reader, text

Procedia PDF Downloads 217
4441 Music Reading Expertise Facilitates Implicit Statistical Learning of Sentence Structures in a Novel Language: Evidence from Eye Movement Behavior

Authors: Sara T. K. Li, Belinda H. J. Chung, Jeffery C. N. Yip, Janet H. Hsiao

Abstract:

Music notation and text reading both involve statistical learning of music or linguistic structures. However, it remains unclear how music reading expertise influences text reading behavior. The present study examined this issue through an eye-tracking study. Chinese-English bilingual musicians and non-musicians read English sentences, Chinese sentences, musical phrases, and sentences in Tibetan, a language novel to the participants, with their eye movement recorded. Each set of stimuli consisted of two conditions in terms of structural regularity: syntactically correct and syntactically incorrect musical phrases/sentences. They then completed a sentence comprehension (for syntactically correct sentences) or a musical segment/word recognition task afterwards to test their comprehension/recognition abilities. The results showed that in reading musical phrases, as compared with non-musicians, musicians had a higher accuracy in the recognition task, and had shorter reading time, fewer fixations, and shorter fixation duration when reading syntactically correct (i.e., in diatonic key) than incorrect (i.e., in non-diatonic key/atonal) musical phrases. This result reflects their expertise in music reading. Interestingly, in reading Tibetan sentences, which was novel to both participant groups, while non-musicians did not show any behavior differences between reading syntactically correct or incorrect Tibetan sentences, musicians showed a shorter reading time and had marginally fewer fixations when reading syntactically correct sentences than syntactically incorrect ones. However, none of the musicians reported discovering any structural regularities in the Tibetan stimuli after the experiment when being asked explicitly, suggesting that they may have implicitly acquired the structural regularities in Tibetan sentences. This group difference was not observed when they read English or Chinese sentences. This result suggests that music reading expertise facilities reading texts in a novel language (i.e., Tibetan), but not in languages that the readers are already familiar with (i.e., English and Chinese). This phenomenon may be due to the similarities between reading music notations and reading texts in a novel language, as in both cases the stimuli follow particular statistical structures but do not involve semantic or lexical processing. Thus, musicians may transfer their statistical learning skills stemmed from music notation reading experience to implicitly discover structures of sentences in a novel language. This speculation is consistent with a recent finding showing that music reading expertise modulates the processing of English nonwords (i.e., words that do not follow morphological or orthographic rules) but not pseudo- or real words. These results suggest that the modulation of music reading expertise on language processing depends on the similarities in the cognitive processes involved. It also has important implications for the benefits of music education on language and cognitive development.

Keywords: eye movement behavior, eye-tracking, music reading expertise, sentence reading, structural regularity, visual processing

Procedia PDF Downloads 351
4440 Valence and Arousal-Based Sentiment Analysis: A Comparative Study

Authors: Usama Shahid, Muhammad Zunnurain Hussain

Abstract:

This research paper presents a comprehensive analysis of a sentiment analysis approach that employs valence and arousal as its foundational pillars, in comparison to traditional techniques. Sentiment analysis is an indispensable task in natural language processing that involves the extraction of opinions and emotions from textual data. The valence and arousal dimensions, representing the intensity and positivity/negativity of emotions, respectively, enable the creation of four quadrants, each representing a specific emotional state. The study seeks to determine the impact of utilizing these quadrants to identify distinct emotional states on the accuracy and efficiency of sentiment analysis, in comparison to traditional techniques. The results reveal that the valence and arousal-based approach outperforms other approaches, particularly in identifying nuanced emotions that may be missed by conventional methods. The study's findings are crucial for applications such as social media monitoring and market research, where the accurate classification of emotions and opinions is paramount. Overall, this research highlights the potential of using valence and arousal as a framework for sentiment analysis and offers invaluable insights into the benefits of incorporating specific types of emotions into the analysis. These findings have significant implications for researchers and practitioners in the field of natural language processing, as they provide a basis for the development of more accurate and effective sentiment analysis tools.

Keywords: sentiment analysis, valence and arousal, emotional states, natural language processing, machine learning, text analysis, sentiment classification, opinion mining

Procedia PDF Downloads 55
4439 Differences in the Processing of Sentences with Lexical Ambiguity and Structural Ambiguity: An Experimental Study

Authors: Mariana T. Teixeira, Joana P. Luz

Abstract:

This paper is based on assumptions of psycholinguistics and investigates the processing of ambiguous sentences in Brazilian Portuguese. Specifically, it aims to verify if there is a difference in processing time between sentences with lexical ambiguity and sentences with structural (or syntactic) ambiguity. We hypothesize, based on the Garden Path Theory, that the two types of ambiguity entail different cognitive efforts, since sentences with structural ambiguity require that two structures be processed, whereas ambiguous phrases whose root of ambiguity is in a word require the processing of a single structure, which admits a variation of punctual meaning, within the scope of only one lexical item. In order to test this hypothesis, 25 undergraduate students, whose average age was 27.66 years, native speakers of Brazilian Portuguese, performed a self-monitoring reading task of ambiguous sentences, which had lexical and structural ambiguity. The results suggest that unambiguous sentence processing is faster than ambiguous sentence processing, whether it has lexical or structural ambiguity. In addition, participants presented a mean reading time greater for sentences with syntactic ambiguity than for sentences with lexical ambiguity, evidencing a greater cognitive effort in sentence processing with structural ambiguity.

Keywords: Brazilian portuguese, lexical ambiguity, sentence processing, syntactic ambiguity

Procedia PDF Downloads 194
4438 An Event-Related Potentials Study on the Processing of English Subjunctive Mood by Chinese ESL Learners

Authors: Yan Huang

Abstract:

Event-related potentials (ERPs) technique helps researchers to make continuous measures on the whole process of language comprehension, with an excellent temporal resolution at the level of milliseconds. The research on sentence processing has developed from the behavioral level to the neuropsychological level, which brings about a variety of sentence processing theories and models. However, the applicability of these models to L2 learners is still under debate. Therefore, the present study aims to investigate the neural mechanisms underlying English subjunctive mood processing by Chinese ESL learners. To this end, English subject clauses with subjunctive moods are used as the stimuli, all of which follow the same syntactic structure, “It is + adjective + that … + (should) do + …” Besides, in order to examine the role that language proficiency plays on L2 processing, this research deals with two groups of Chinese ESL learners (18 males and 22 females, mean age=21.68), namely, high proficiency group (Group H) and low proficiency group (Group L). Finally, the behavioral and neurophysiological data analysis reveals the following findings: 1) Syntax and semantics interact with each other on the SECOND phase (300-500ms) of sentence processing, which is partially in line with the Three-phase Sentence Model; 2) Language proficiency does affect L2 processing. Specifically, for Group H, it is the syntactic processing that plays the dominant role in sentence processing while for Group L, semantic processing also affects the syntactic parsing during the THIRD phase of sentence processing (500-700ms). Besides, Group H, compared to Group L, demonstrates a richer native-like ERPs pattern, which further demonstrates the role of language proficiency in L2 processing. Based on the research findings, this paper also provides some enlightenment for the L2 pedagogy as well as the L2 proficiency assessment.

Keywords: Chinese ESL learners, English subjunctive mood, ERPs, L2 processing

Procedia PDF Downloads 107
4437 Making Sense of Places: A Comparative Study of Three Contexts in Thailand

Authors: Thirayu Jumsai Na Ayudhya

Abstract:

The study of what architecture means to people in their everyday lives inadequately addresses the contextualized and holistic theoretical framework. This article succinctly presents theoretical framework obtained from the comparative study of how people experience the everyday architecture in three different contexts including 1) Bangkok CBD, 2) Phuket island old-town, and 3) Nan province old-town. The way people make sense of the everyday architecture can be addressed in four super-ordinate themes; (1) building in urban (text), (2) building in (text), (3) building in human (text), (4) and building in time (text). In this article, these super-ordinate themes were verified whether they recur in three studied-contexts. In each studied-context, the participants were divided into two groups, 1) local people, 2) visitors. Participants were asked to take photographs of the everyday architecture during the everyday routine and to participate the elicit-interview with photographs produced by themselves. Interpretative phenomenological analysis (IPA) was adopted to interpret elicit-interview data. Sub-themes emerging in each studied-context were brought into the cross-comparison among three studied- contexts. It is found that four super-ordinate themes recur with additional distinctive sub-themes. Further studies in other different contexts, such as socio-political, economic, cultural differences, are recommended to complete the theoretical framework.

Keywords: sense of place, the everyday architecture, architectural experience, the everyday

Procedia PDF Downloads 130
4436 Recognizing Customer Preferences Using Review Documents: A Hybrid Text and Data Mining Approach

Authors: Oshin Anand, Atanu Rakshit

Abstract:

The vast increment in the e-commerce ventures makes this area a prominent research stream. Besides several quantified parameters, the textual content of reviews is a storehouse of many information that can educate companies and help them earn profit. This study is an attempt in this direction. The article attempts to categorize data based on a computed metric that quantifies the influencing capacity of reviews rendering two categories of high and low influential reviews. Further, each of these document is studied to conclude several product feature categories. Each of these categories along with the computed metric is converted to linguistic identifiers and are used in an association mining model. The article makes a novel attempt to combine feature attraction with quantified metric to categorize review text and finally provide frequent patterns that depict customer preferences. Frequent mentions in a highly influential score depict customer likes or preferred features in the product whereas prominent pattern in low influencing reviews highlights what is not important for customers. This is achieved using a hybrid approach of text mining for feature and term extraction, sentiment analysis, multicriteria decision-making technique and association mining model.

Keywords: association mining, customer preference, frequent pattern, online reviews, text mining

Procedia PDF Downloads 363
4435 Investigating the Relationship and Interaction between Auditory Processing Disorder and Auditory Attention

Authors: Amirreza Razzaghipour Sorkhab

Abstract:

The exploration of the connection between cognition and Auditory Processing Disorder (APD) holds significant value. Individuals with APD experience challenges in processing auditory information through the central auditory nervous system's varied pathways. Understanding the importance of auditory attention in individuals with APD, as well as the primary diagnostic tools such as language and auditory attention tests, highlights the critical need for assessing their auditory attention abilities. While not all children with Auditory Processing Disorder (APD) show deficits in auditory attention, there are often deficiencies in cognitive and attentional performance. The link between various types of attention deficits and APD suggests impairments in sustained and divided auditory attention. Research into the origins of APD should also encompass higher-level processes, such as auditory attention. It is evident that investigating the interaction between APD and auditory and cognitive functions holds significant value. Furthermore, it was demonstrated that APD tests may be influenced by cognitive factors, but despite signs of auditory attention interaction with auditory processing skills and the influence of cognitive factors on tests for this disorder, auditory attention measures are not typically included in APD diagnostic protocols. Therefore, incorporating attention assessment tests into the battery of tests for individuals with auditory processing disorder will be beneficial for obtaining useful insights into their attentional abilities.

Keywords: auditory processing disorder, auditory attention, central auditory processing disorder, top-down pathway

Procedia PDF Downloads 26
4434 Evaluation Means in English and Russian Academic Discourse: Through Comparative Analysis towards Translation

Authors: Albina Vodyanitskaya

Abstract:

Given the culture- and language-specific nature of evaluation, this phenomenon is widely studied around the linguistic world and may be regarded as a challenge for translators. Evaluation penetrates all the levels of a scientific text, influences its composition and the reader’s attitude towards the information presented. One of the most challenging and rarely studied phenomena is the individual style of the scientific writer, which is mostly reflected in the use of evaluative language means. The evaluative and expressive potential of a scientific text is becoming more and more welcoming area for researchers, which stems in the shift towards anthropocentric paradigm in linguistics. Other reasons include: the cognitive and psycholinguistic processes that accompany knowledge acquisition, a genre-determined nature of a scientific text, the increasing public concern about the quality of scientific papers and some such. One more important issue, is the fact that linguists all over the world still argue about the definition of evaluation and its functions in the text. The author analyzes various approaches towards the study of evaluation and scientific texts. A comparative analysis of English and Russian dissertations and other scientific papers with regard to evaluative language means reveals major differences and similarities between English and Russian scientific style. Though standardized and genre-specific, English scientific texts contain more figurative and expressive evaluative means than the Russian ones, which should be taken into account while translating scientific papers. The processes that evaluation undergoes while being expressed by means of a target language are also analyzed. The author offers a target-language-dependent strategy for the translation of evaluation in English and Russian scientific texts. The findings may contribute to the theory and practice of translation and can increase scientific writers’ awareness of inter-language and intercultural differences in evaluative language means.

Keywords: academic discourse, evaluation, scientific text, scientific writing, translation

Procedia PDF Downloads 322
4433 Leaf Image Processing: Review

Authors: T. Vijayashree, A. Gopal

Abstract:

The aim of the work is to classify and authenticate medicinal plant materials and herbs widely used for Indian herbal medicinal preparation. The quality and authenticity of these raw materials are to be ensured for the preparation of herbal medicines. These raw materials are to be carefully screened, analyzed and documented due to mistaken of look-alike materials which do not have medicinal characteristics.

Keywords: authenticity, standardization, principal component analysis, imaging processing, signal processing

Procedia PDF Downloads 215
4432 The Syntactic Features of Islamic Legal Texts and Their Implications for Translation

Authors: Rafat Y. Alwazna

Abstract:

Certain religious texts are deemed part of legal texts that are characterised by high sensitivity and sacredness. Amongst such religious texts are Islamic legal texts that are replete with Islamic legal terms that designate particular legal concepts peculiar to Islamic legal system and legal culture. However, from the syntactic perspective, Islamic legal texts prove lengthy, condensed and convoluted, with little use of punctuation system, but with an extensive use of subordinations and co-ordinations, which separate the main verb from the subject, and which, of course, carry a heavy load of legal detail. The present paper seeks to examine the syntactic features of Islamic legal texts through analysing a short text of Islamic jurisprudence in an attempt at exploring the syntactic features that characterise this type of legal text. A translation of this text into legal English is then exercised to find the translation implications that have emerged as a result of the English translation. Based on these implications, the paper compares and contrasts the syntactic features of Islamic legal texts to those of legal English texts. Finally, the present paper argues that there are a number of syntactic features of Islamic legal texts, such as nominalisation, passivisation, little use of punctuation system, the use of the Arabic cohesive device, etc., which are also possessed by English legal texts except for the last feature and with some variations. The paper also claims that when rendering an Islamic legal text into legal English, certain implications emerge, such as the necessity of a sentence break, the omission of the cohesive device concerned and the increase in the use of nominalisation, passivisation, passive participles, and so on.

Keywords: English legal texts, Islamic legal texts, nominalisation, participles, passivisation, syntactic features, translation implications

Procedia PDF Downloads 186
4431 Communication through Technology: SMS Taking Most of the Time Impacting the Standard English

Authors: Nazia Sulemna, Sadia Gul

Abstract:

With the invade of mobile phones text messaging has become a popular medium of communication. Its users are multiplying with every passing day. Its use is not only limites to informal but to formal communication as well. Students are the advent users of mobile phones and of SMS as well. The present study manifests the fact that students are practicing SMS for a number of reasons and a good amount of time is spent upon it which is resulting in typographical features, graphones and rebus writing. Data was collected through questionnaires and came to the conclusion that its effect is obvious in the L2 users and in exam as well.

Keywords: text messaging, technology, exams, formal writing

Procedia PDF Downloads 710
4430 A Methodology for Developing New Technology Ideas to Avoid Patent Infringement: F-Term Based Patent Analysis

Authors: Kisik Song, Sungjoo Lee

Abstract:

With the growing importance of intangible assets recently, the impact of patent infringement on the business of a company has become more evident. Accordingly, it is essential for firms to estimate the risk of patent infringement risk before developing a technology and create new technology ideas to avoid the risk. Recognizing the needs, several attempts have been made to help develop new technology opportunities and most of them have focused on identifying emerging vacant technologies from patent analysis. In these studies, the IPC (International Patent Classification) system or keywords from text-mining application to patent documents was generally used to define vacant technologies. Unlike those studies, this study adopted F-term, which classifies patent documents according to the technical features of the inventions described in them. Since the technical features are analyzed by various perspectives by F-term, F-term provides more detailed information about technologies compared to IPC while more systematic information compared to keywords. Therefore, if well utilized, it can be a useful guideline to create a new technology idea. Recognizing the potential of F-term, this paper aims to suggest a novel approach to developing new technology ideas to avoid patent infringement based on F-term. For this purpose, we firstly collected data about F-term and then applied text-mining to the descriptions about classification criteria and attributes. From the text-mining results, we could identify other technologies with similar technical features of the existing one, the patented technology. Finally, we compare the technologies and extract the technical features that are commonly used in other technologies but have not been used in the existing one. These features are presented in terms of “purpose”, “function”, “structure”, “material”, “method”, “processing and operation procedure” and “control means” and so are useful for creating new technology ideas that help avoid infringing patent rights of other companies. Theoretically, this is one of the earliest attempts to adopt F-term to patent analysis; the proposed methodology can show how to best take advantage of F-term with the wealth of technical information. In practice, the proposed methodology can be valuable in the ideation process for successful product and service innovation without infringing the patents of other companies.

Keywords: patent infringement, new technology ideas, patent analysis, F-term

Procedia PDF Downloads 245
4429 A BERT-Based Model for Financial Social Media Sentiment Analysis

Authors: Josiel Delgadillo, Johnson Kinyua, Charles Mutigwe

Abstract:

The purpose of sentiment analysis is to determine the sentiment strength (e.g., positive, negative, neutral) from a textual source for good decision-making. Natural language processing in domains such as financial markets requires knowledge of domain ontology, and pre-trained language models, such as BERT, have made significant breakthroughs in various NLP tasks by training on large-scale un-labeled generic corpora such as Wikipedia. However, sentiment analysis is a strong domain-dependent task. The rapid growth of social media has given users a platform to share their experiences and views about products, services, and processes, including financial markets. StockTwits and Twitter are social networks that allow the public to express their sentiments in real time. Hence, leveraging the success of unsupervised pre-training and a large amount of financial text available on social media platforms could potentially benefit a wide range of financial applications. This work is focused on sentiment analysis using social media text on platforms such as StockTwits and Twitter. To meet this need, SkyBERT, a domain-specific language model pre-trained and fine-tuned on financial corpora, has been developed. The results show that SkyBERT outperforms current state-of-the-art models in financial sentiment analysis. Extensive experimental results demonstrate the effectiveness and robustness of SkyBERT.

Keywords: BERT, financial markets, Twitter, sentiment analysis

Procedia PDF Downloads 123
4428 Influence of Processing Regime and Contaminants on the Properties of Postconsumer Thermoplastics

Authors: Fares Alsewailem

Abstract:

Material recycling of thermoplastic waste offers practical solution for municipal solid waste reduction. Post-consumer plastics such as polyethylene (PE), polyethyleneterephtalate (PET), and polystyrene (PS) may be separated from each other by physical methods such as density difference and hence processed as single plastic, however one should be cautious about the contaminants presence in the waste stream inform of paper, glue, etc. since these articles even in trace amount may deteriorate properties of the recycled plastics especially the mechanical properties. furthermore, melt processing methods used to recycle thermoplastics such as extrusion and compression molding may induce degradation of some of the recycled plastics such as PET and PS. In this research, it is shown that care should be taken when processing recycled plastics by melt processing means in two directions, first contaminants should be extremely minimized, and secondly melt processing steps should also be minimum.

Keywords: Recycling, PET, PS, HDPE, mechanical

Procedia PDF Downloads 259
4427 Amharic Text News Classification Using Supervised Learning

Authors: Misrak Assefa

Abstract:

The Amharic language is the second most widely spoken Semitic language in the world. There are several new overloaded on the web. Searching some useful documents from the web on a specific topic, which is written in the Amharic language, is a challenging task. Hence, document categorization is required for managing and filtering important information. In the classification of Amharic text news, there is still a gap in the domain of information that needs to be launch. This study attempts to design an automatic Amharic news classification using a supervised learning mechanism on four un-touch classes. To achieve this research, 4,182 news articles were used. Naive Bayes (NB) and Decision tree (j48) algorithms were used to classify the given Amharic dataset. In this paper, k-fold cross-validation is used to estimate the accuracy of the classifier. As a result, it shows those algorithms can be applicable in Amharic news categorization. The best average accuracy result is achieved by j48 decision tree and naïve Bayes is 95.2345 %, and 94.6245 % respectively using three categories. This research indicated that a typical decision tree algorithm is more applicable to Amharic news categorization.

Keywords: text categorization, supervised machine learning, naive Bayes, decision tree

Procedia PDF Downloads 159
4426 Google Translate: AI Application

Authors: Shaima Almalhan, Lubna Shukri, Miriam Talal, Safaa Teskieh

Abstract:

Since artificial intelligence is a rapidly evolving topic that has had a significant impact on technical growth and innovation, this paper examines people's awareness, use, and engagement with the Google Translate application. To see how familiar aware users are with the app and its features, quantitative and qualitative research was conducted. The findings revealed that consumers have a high level of confidence in the application and how far people they benefit from this sort of innovation and how convenient it makes communication.

Keywords: artificial intelligence, google translate, speech recognition, language translation, camera translation, speech to text, text to speech

Procedia PDF Downloads 119
4425 Graph Similarity: Algebraic Model and Its Application to Nonuniform Signal Processing

Authors: Nileshkumar Vishnav, Aditya Tatu

Abstract:

A recent approach of representing graph signals and graph filters as polynomials is useful for graph signal processing. In this approach, the adjacency matrix plays pivotal role; instead of the more common approach involving graph-Laplacian. In this work, we follow the adjacency matrix based approach and corresponding algebraic signal model. We further expand the theory and introduce the concept of similarity of two graphs. The similarity of graphs is useful in that key properties (such as filter-response, algebra related to graph) get transferred from one graph to another. We demonstrate potential applications of the relation between two similar graphs, such as nonuniform filter design, DTMF detection and signal reconstruction.

Keywords: graph signal processing, algebraic signal processing, graph similarity, isospectral graphs, nonuniform signal processing

Procedia PDF Downloads 318