Search results for: BERT–BiLSTM–Attention
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 4069

4069 Document-level Sentiment Analysis: An Exploratory Case Study of Low-resource Language Urdu

Authors: Ammarah Irum, Muhammad Ali Tahir

Abstract:

Document-level sentiment analysis in Urdu is a challenging Natural Language Processing (NLP) task due to the difficulty of working with lengthy texts in a language with constrained resources. Deep learning models, which are complex neural network architectures, are well-suited to text-based applications in addition to data formats like audio, image, and video. To investigate the potential of deep learning for Urdu sentiment analysis, we implemented five different deep learning models, including Bidirectional Long Short Term Memory (BiLSTM), Convolutional Neural Network (CNN), Convolutional Neural Network with Bidirectional Long Short Term Memory (CNN-BiLSTM), and Bidirectional Encoder Representation from Transformer (BERT). In this study, we developed a hybrid deep learning model called BiLSTM-Single Layer Multi Filter Convolutional Neural Network (BiLSTM-SLMFCNN) by fusing BiLSTM and CNN architecture. The proposed and baseline techniques are applied on Urdu Customer Support data set and IMDB Urdu movie review data set by using pre-trained Urdu word embedding that are suitable for sentiment analysis at the document level. Results of these techniques are evaluated and our proposed model outperforms all other deep learning techniques for Urdu sentiment analysis. BiLSTM-SLMFCNN outperformed the baseline deep learning models and achieved 83%, 79%, 83% and 94% accuracy on small, medium and large sized IMDB Urdu movie review data set and Urdu Customer Support data set respectively.

Keywords: urdu sentiment analysis, deep learning, natural language processing, opinion mining, low-resource language

Procedia PDF Downloads 28
4068 Improving Subjective Bias Detection Using Bidirectional Encoder Representations from Transformers and Bidirectional Long Short-Term Memory

Authors: Ebipatei Victoria Tunyan, T. A. Cao, Cheol Young Ock

Abstract:

Detecting subjectively biased statements is a vital task. This is because this kind of bias, when present in the text or other forms of information dissemination media such as news, social media, scientific texts, and encyclopedias, can weaken trust in the information and stir conflicts amongst consumers. Subjective bias detection is also critical for many Natural Language Processing (NLP) tasks like sentiment analysis, opinion identification, and bias neutralization. Having a system that can adequately detect subjectivity in text will boost research in the above-mentioned areas significantly. It can also come in handy for platforms like Wikipedia, where the use of neutral language is of importance. The goal of this work is to identify the subjectively biased language in text on a sentence level. With machine learning, we can solve complex AI problems, making it a good fit for the problem of subjective bias detection. A key step in this approach is to train a classifier based on BERT (Bidirectional Encoder Representations from Transformers) as upstream model. BERT by itself can be used as a classifier; however, in this study, we use BERT as data preprocessor as well as an embedding generator for a Bi-LSTM (Bidirectional Long Short-Term Memory) network incorporated with attention mechanism. This approach produces a deeper and better classifier. We evaluate the effectiveness of our model using the Wiki Neutrality Corpus (WNC), which was compiled from Wikipedia edits that removed various biased instances from sentences as a benchmark dataset, with which we also compare our model to existing approaches. Experimental analysis indicates an improved performance, as our model achieved state-of-the-art accuracy in detecting subjective bias. This study focuses on the English language, but the model can be fine-tuned to accommodate other languages.
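As a rough illustration of the attention step mentioned in the abstract above, the sketch below pools a sequence of per-token vectors (standing in for Bi-LSTM outputs computed over BERT embeddings) into a single sentence vector using dot-product attention. It is not the authors' implementation: the function names, the dot-product scoring, and the toy vectors are assumptions made for illustration, and a real model would learn the query vector during training.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Pool per-token vectors into one vector with dot-product attention.

    hidden_states: list of equal-length float lists (hypothetical Bi-LSTM outputs).
    query: float list of the same length (learned in a real model, fixed here).
    Returns (attention weights, pooled context vector).
    """
    # Score each token by its dot product with the query vector.
    scores = [sum(h * q for h, q in zip(state, query)) for state in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # The context vector is the attention-weighted sum of the token vectors.
    context = [
        sum(w * state[d] for w, state in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return weights, context


if __name__ == "__main__":
    # Toy two-token "sentence" with two-dimensional states.
    weights, context = attention_pool([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0])
    print(weights, context)
```

In a full classifier, the pooled context vector would typically be passed to a small feed-forward layer that outputs the biased/neutral decision.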

Keywords: subjective bias detection, machine learning, BERT–BiLSTM–Attention, text classification, natural language processing

Procedia PDF Downloads 91
4067 An Event Relationship Extraction Method Incorporating Deep Feedback Recurrent Neural Network and Bidirectional Long Short-Term Memory

Authors: Yin Yuanling

Abstract:

A Deep Feedback Recurrent Neural Network (DFRNN) and Bidirectional Long Short-Term Memory (BiLSTM) are designed to address the problem of low accuracy of traditional relationship extraction models. This method combines a deep feedback-based recurrent neural network (DFRNN) with a bi-directional long short-term memory (BiLSTM) approach. The method combines DFRNN, which extracts local features of text based on deep feedback recurrent mechanism, BiLSTM, which better extracts global features of text, and Self-Attention, which extracts semantic information. Experiments show that the method achieves an F1 value of 76.69% on the CEC dataset, which is 0.0652 better than the BiLSTM+Self-ATT model, thus optimizing the performance of the deep learning method in the event relationship extraction task.

Keywords: event relations, deep learning, DFRNN models, bi-directional long and short-term memory networks

Procedia PDF Downloads 87
4066 BERT-Based Chinese Coreference Resolution

Authors: Li Xiaoge, Wang Chaodong

Abstract:

We introduce the first Chinese Coreference Resolution Model based on BERT (CCRM-BERT) and show that it significantly outperforms all previous work. The key idea is to consider the features of the mention, such as part of speech, width of spans, and distance between spans. The influence of each feature on the model is also analyzed. The model computes mention embeddings that combine BERT with these features. Compared to the existing state-of-the-art span-ranking approach, our model significantly improves accuracy on the Chinese OntoNotes benchmark.

Keywords: BERT, coreference resolution, deep learning, natural language processing

Procedia PDF Downloads 165
4065 A Context-Centric Chatbot for Cryptocurrency Using the Bidirectional Encoder Representations from Transformers Neural Networks

Authors: Qitao Xie, Qingquan Zhang, Xiaofei Zhang, Di Tian, Ruixuan Wen, Ting Zhu, Ping Yi, Xin Li

Abstract:

Inspired by the recent movement of digital currency, we are building a question answering system concerning the subject of cryptocurrency using Bidirectional Encoder Representations from Transformers (BERT). The motivation behind this work is to properly assist digital currency investors by directing them to the corresponding knowledge bases that can offer them help and increase the querying speed. BERT, one of the newest language models in natural language processing, was investigated to improve the quality of generated responses. We studied different combinations of hyperparameters of the BERT model to obtain the best-fit responses. Further, we created an intelligent chatbot for cryptocurrency using BERT. A chatbot using BERT shows great potential for the further advancement of a cryptocurrency market tool. We show that BERT neural networks generalize well to other tasks by applying them successfully to cryptocurrency.

Keywords: bidirectional encoder representations from transformers, BERT, chatbot, cryptocurrency, deep learning

Procedia PDF Downloads 104
4064 A Grey-Box Text Attack Framework Using Explainable AI

Authors: Esther Chiramal, Kelvin Soh Boon Kai

Abstract:

Explainable AI is a strong strategy implemented to understand complex black-box model predictions in a human-interpretable language. It provides the evidence required to deploy trustworthy and reliable AI systems. On the other hand, it also opens the door to locating possible vulnerabilities in an AI model. Traditional adversarial text attacks use word substitution, data augmentation techniques, and gradient-based attacks on powerful pre-trained Bidirectional Encoder Representations from Transformers (BERT) variants to generate adversarial sentences. These attacks are generally white-box in nature and not practical, as they can be easily detected by humans, e.g., changing the word “Poor” to “Rich”. We propose a simple yet effective grey-box cum black-box approach that does not require knowledge of the model, while using a set of surrogate Transformer/BERT models to perform the attack using Explainable AI techniques. As Transformers are the current state-of-the-art models for almost all Natural Language Processing (NLP) tasks, an attack generated from BERT1 is transferable to BERT2. This transferability is made possible by the attention mechanism in the transformer, which allows the model to capture long-range dependencies in a sequence. Using the power of BERT generalisation via attention, we attempt to exploit how transformers learn by attacking a few surrogate transformer variants which are all based on a different architecture. We demonstrate that this approach is highly effective at generating semantically good sentences by changing as little as one word, in a way that is not detectable by humans, while still fooling other BERT models.

Keywords: BERT, explainable AI, Grey-box text attack, transformer

Procedia PDF Downloads 103
4063 D3Advert: Data-Driven Decision Making for Ad Personalization through Personality Analysis Using BiLSTM Network

Authors: Sandesh Achar

Abstract:

Personalized advertising holds greater potential for higher conversion rates compared to generic advertisements. However, its widespread application in the retail industry faces challenges due to complex implementation processes. These complexities impede the swift adoption of personalized advertisement on a large scale. Personalized advertisement, being a data-driven approach, necessitates consumer-related data, adding to its complexity. This paper introduces an innovative data-driven decision-making framework, D3Advert, which personalizes advertisements by analyzing personalities using a BiLSTM network. The framework utilizes the Myers–Briggs Type Indicator (MBTI) dataset for development. The employed BiLSTM network, specifically designed and optimized for D3Advert, classifies user personalities into one of the sixteen MBTI categories based on their social media posts. The classification accuracy is 86.42%, with precision, recall, and F1-Score values of 85.11%, 84.14%, and 83.89%, respectively. The D3Advert framework personalizes advertisements based on these personality classifications. Experimental implementation and performance analysis of D3Advert demonstrate a 40% improvement in impressions. D3Advert’s innovative and straightforward approach has the potential to transform personalized advertising and foster widespread personalized advertisement adoption in marketing.

Keywords: personalized advertisement, deep Learning, MBTI dataset, BiLSTM network, NLP.

Procedia PDF Downloads 8
4062 Exploring Bidirectional Encoder Representations from the Transformers’ Capabilities to Detect English Preposition Errors

Authors: Dylan Elliott, Katya Pertsova

Abstract:

Preposition errors are some of the most common errors made by L2 speakers. In addition, improving error correction and detection methods remains an open issue in the realm of Natural Language Processing (NLP). This research investigates whether the bidirectional encoder representations from transformers model (BERT) has the potential to correct preposition errors accurately enough to be useful in error correction software. This research finds that BERT performs strongly when the scope of its error correction is limited to preposition choice. The researchers used an open-source BERT model and over three hundred thousand edited sentences from Wikipedia, tagged for part of speech, where only a preposition edit had occurred. To test BERT’s ability to detect errors, a technique known as multi-level masking was used to generate suggestions based on sentence context for every prepositional environment in the test data. These suggestions were compared with the original errors in the data and their known corrections to evaluate BERT’s performance. The suggestions were further analyzed to determine if BERT more often agreed with the judgements of the Wikipedia editors. Both the untrained and fine-tuned models were compared. Fine-tuning led to a greater rate of error detection, which significantly improved recall but lowered precision due to an increase in false positives, or falsely flagged errors. However, in most cases, these false positives were not errors in preposition usage but merely cases where more than one preposition was possible. Furthermore, when BERT correctly identified an error, the model largely agreed with the Wikipedia editors, suggesting that BERT’s ability to detect misused prepositions is better than previously believed. To evaluate to what extent BERT’s false positives were grammatical suggestions, we plan to do a further crowd-sourcing study to test the grammaticality of BERT’s suggested sentence corrections against native speakers’ judgments.
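The trade-off described in the abstract above (more detected errors raise recall, while extra false positives lower precision) can be made concrete with a small calculation. The counts below are hypothetical and chosen only to show the direction of the effect; they are not results from the study, and the function name is an assumption for illustration.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from raw detection counts.

    tp: errors correctly flagged, fp: non-errors flagged, fn: errors missed.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for 100 true preposition errors (not data from the study).
# Before fine-tuning: few flags, so few false positives but many missed errors.
before = precision_recall_f1(tp=60, fp=10, fn=40)
# After fine-tuning: more flags, so more errors caught but more false positives.
after = precision_recall_f1(tp=85, fp=45, fn=15)

if __name__ == "__main__":
    print("before fine-tuning:", before)
    print("after fine-tuning: ", after)
```

With these illustrative counts, recall rises while precision falls, which is the pattern the abstract reports.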

Keywords: BERT, grammatical error correction, preposition error detection, prepositions

Procedia PDF Downloads 108
4061 Bridging the Data Gap for Sexism Detection in Twitter: A Semi-Supervised Approach

Authors: Adeep Hande, Shubham Agarwal

Abstract:

This paper presents a study on identifying sexism in online texts using various state-of-the-art deep learning models based on BERT. We experimented with different feature sets and model architectures and evaluated their performance using precision, recall, F1 score, and accuracy metrics. We also explored the use of pseudolabeling technique to improve model performance. Our experiments show that the best-performing models were based on BERT, and their multilingual model achieved an F1 score of 0.83. Furthermore, the use of pseudolabeling significantly improved the performance of the BERT-based models, with the best results achieved using the pseudolabeling technique. Our findings suggest that BERT-based models with pseudolabeling hold great promise for identifying sexism in online texts with high accuracy.
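A minimal sketch of the pseudolabeling idea mentioned above, assuming the common confidence-threshold variant: a model trained on labeled data predicts labels for unlabeled texts, and only confident predictions are added to the training set. The abstract does not specify the selection rule, so the threshold, the function names, and the toy predictor below are all hypothetical stand-ins.

```python
def select_pseudo_labels(unlabeled_texts, predict_proba, threshold=0.9):
    """Return (text, label) pairs whose top predicted probability meets the threshold.

    predict_proba: callable mapping a text to a dict of label -> probability
    (a stand-in for a fine-tuned BERT classifier).
    """
    selected = []
    for text in unlabeled_texts:
        probs = predict_proba(text)
        # Pick the most probable label and its confidence.
        label, confidence = max(probs.items(), key=lambda item: item[1])
        if confidence >= threshold:
            selected.append((text, label))
    return selected


def toy_predict_proba(text):
    """Hypothetical predictor: confident only when a marker word is present."""
    if "marker" in text:
        return {"positive": 0.95, "negative": 0.05}
    return {"positive": 0.45, "negative": 0.55}


if __name__ == "__main__":
    pool = ["text with marker", "ambiguous text"]
    # Only the confident prediction is kept as a pseudo-labeled example.
    print(select_pseudo_labels(pool, toy_predict_proba, threshold=0.9))
```

The selected pairs would then be merged with the labeled data before retraining; lowering the threshold adds more examples at the cost of noisier labels.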

Keywords: large language models, semi-supervised learning, sexism detection, data sparsity

Procedia PDF Downloads 28
4060 Relation between Low Thermal Stress and Antioxidant Enzymes Activity in a Sweetening Plant: Stevia Rebaudiana Bert

Authors: T. Bettaieb, S. Soufi, S. Arbaoui

Abstract:

Stevia rebaudiana Bert. is a naturally sweet plant. The leaves contain the diterpene glycosides stevioside, rebaudiosides A-F, steviolbioside and dulcoside, which are responsible for its sweet taste and have commercial value all over the world as a sugar substitute in foods and medicines. Stevia rebaudiana Bert. is sensitive to temperatures lower than 9°C. Its outdoor cultivation in Tunisian conditions therefore demands genotypes tolerant to low temperatures. In order to evaluate the low temperature tolerance of eight genotypes of Stevia rebaudiana, the activities of superoxide dismutase (SOD), ascorbate peroxidase (APX) and catalases (CAT) were measured. Before carrying out the analyses, three genotypes of Stevia were exposed for 1 month to a temperature regime of 18°C during the day and 7°C at night, similar to winter conditions in Tunisia. In response to the stress generated by low temperature, antioxidant enzyme activity, revealed on native gel and quantified by spectrophotometry, showed variable levels according to the genotypes' degree of tolerance to low temperatures.

Keywords: chilling tolerance, enzymatic activity, stevia rebaudiana bert, low thermal stress

Procedia PDF Downloads 405
4059 One-Shot Text Classification with Multilingual-BERT

Authors: Hsin-Yang Wang, K. M. A. Salam, Ying-Jia Lin, Daniel Tan, Tzu-Hsuan Chou, Hung-Yu Kao

Abstract:

Detecting user intent from natural language expression has a wide variety of use cases in different natural language processing applications. Recently few-shot training has a spike of usage on commercial domains. Due to the lack of significant sample features, the downstream task performance has been limited or leads to an unstable result across different domains. As a state-of-the-art method, the pre-trained BERT model gathering the sentence-level information from a large text corpus shows improvement on several NLP benchmarks. In this research, we are proposing a method to change multi-class classification tasks into binary classification tasks, then use the confidence score to rank the results. As a language model, BERT performs well on sequence data. In our experiment, we change the objective from predicting labels into finding the relations between words in sequence data. Our proposed method achieved 71.0% accuracy in the internal intent detection dataset and 63.9% accuracy in the HuffPost dataset. Acknowledgment: This work was supported by NCKU-B109-K003, which is the collaboration between National Cheng Kung University, Taiwan, and SoftBank Corp., Tokyo.
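The reformulation described above, turning a multi-class problem into per-label binary decisions and ranking labels by confidence, can be sketched as follows. This is an illustrative outline, not the authors' method: the pair scorer here is a simple word-overlap function standing in for a BERT-based binary classifier, and all names and example texts are hypothetical.

```python
def rank_labels(text, label_descriptions, pair_scorer):
    """Score each (text, label description) pair and rank labels by confidence.

    label_descriptions: dict of label -> short description or example utterance.
    pair_scorer: callable returning a confidence in [0, 1] that the pair matches.
    """
    scored = [
        (label, pair_scorer(text, description))
        for label, description in label_descriptions.items()
    ]
    # Highest-confidence label first.
    return sorted(scored, key=lambda item: item[1], reverse=True)


def overlap_scorer(text, description):
    """Toy stand-in for a binary classifier: Jaccard overlap of word sets."""
    words_a = set(text.lower().split())
    words_b = set(description.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


if __name__ == "__main__":
    labels = {
        "travel": "book flight or hotel",
        "weather": "weather forecast rain",
    }
    ranking = rank_labels("book a flight to Paris", labels, overlap_scorer)
    print(ranking)
```

Because each label is scored independently, new labels can be added by supplying one description each, which is what makes the formulation attractive when only one example per class is available.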

Keywords: OSML, BERT, text classification, one shot

Procedia PDF Downloads 73
4058 Detecting Covid-19 Fake News Using Deep Learning Technique

Authors: Anjali A. Prasad

Abstract:

Nowadays, social media plays an important role in spreading misinformation or fake news. This study analyzes the fake news related to the COVID-19 pandemic spread on social media. This paper aims at evaluating and comparing different approaches that are used to mitigate this issue, including popular deep learning approaches such as the CNN, RNN, LSTM, and BERT algorithms for classification. To evaluate the models’ performance, we used accuracy, precision, recall, and F1-score as the evaluation metrics. Finally, we compare which of the four algorithms shows the best result.

Keywords: BERT, CNN, LSTM, RNN

Procedia PDF Downloads 164
4057 A Review of Research on Pre-training Technology for Natural Language Processing

Authors: Moquan Gong

Abstract:

In recent years, with the rapid development of deep learning, pre-training technology for natural language processing has made great progress. The early field of natural language processing long used word vector methods such as Word2Vec to encode text. These word vector methods can also be regarded as static pre-training techniques. However, this context-free text representation brings very limited improvement to subsequent natural language processing tasks and cannot solve the problem of word polysemy. ELMo proposes a context-sensitive text representation method that can effectively handle polysemy problems. Since then, pre-trained language models such as GPT and BERT have been proposed one after another. Among them, the BERT model has significantly improved performance on many typical downstream tasks, greatly promoting technological development in the field of natural language processing, which has since entered the era of dynamic pre-training technology. Since then, a large number of pre-trained language models based on BERT and XLNet have continued to emerge, and pre-training technology has become an indispensable mainstream technology in the field of natural language processing. This article first gives an overview of pre-training technology and its development history, and introduces in detail the classic pre-training technology in the field of natural language processing, including early static pre-training technology and classic dynamic pre-training technology. It then briefly sorts out a series of enlightening pre-training technologies, including improved models based on BERT and XLNet. On this basis, it analyzes the problems faced by current pre-training technology research. Finally, it looks forward to the future development trend of pre-training technology.

Keywords: natural language processing, pre-training, language model, word vectors

Procedia PDF Downloads 11
4056 Bidirectional Encoder Representations from Transformers Sentiment Analysis Applied to Three Presidential Pre-Candidates in Costa Rica

Authors: Félix David Suárez Bonilla

Abstract:

A sentiment analysis service to detect polarity (positive, neutral, and negative), based on transfer learning, was built using a Spanish version of BERT and applied to tweets written in Spanish. The dataset that was used consisted of 11975 reviews, which were extracted from Google Play using the google-play-scrapper package. The BETO model was trained with the AdamW optimizer, a batch size of 16, a learning rate of 2×10⁻⁵, and 10 epochs. The system was tested using tweets of three presidential pre-candidates from Costa Rica. The system was finally validated using human-labeled examples, achieving an accuracy of 83.3%.
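The training settings reported in the abstract above can be written down as a small configuration object, which also makes the implied number of optimizer steps easy to compute. The class and function names are hypothetical, and the step count assumes that all 11975 reviews are used for training, which the abstract does not state.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters as reported in the abstract."""
    optimizer: str = "AdamW"
    batch_size: int = 16
    learning_rate: float = 2e-5
    epochs: int = 10


def total_training_steps(num_examples, config):
    """Number of optimizer steps, counting a final partial batch in each epoch."""
    steps_per_epoch = math.ceil(num_examples / config.batch_size)
    return steps_per_epoch * config.epochs


if __name__ == "__main__":
    config = FineTuneConfig()
    # Assumption: the full set of 11975 reviews is used for training.
    print(total_training_steps(11975, config))
```

If part of the data were held out for validation, the step count would be correspondingly smaller.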

Keywords: NLP, transfer learning, BERT, sentiment analysis, social media, opinion mining

Procedia PDF Downloads 131
4055 Benchmarking Bert-Based Low-Resource Language: Case Uzbek NLP Models

Authors: Jamshid Qodirov, Sirojiddin Komolov, Ravilov Mirahmad, Olimjon Mirzayev

Abstract:

Nowadays, natural language processing tools play a crucial role in our daily lives, including various text processing techniques. Very advanced models exist for languages such as English and Russian. For some languages, such as Uzbek, however, NLP models have been developed only recently, so only a few NLP models exist for the Uzbek language. Moreover, no existing work shows how the Uzbek NLP models behave in different situations and when to use each of them. This work tries to close this gap and compares the Uzbek NLP models existing as of the time this article was written. The authors compare the NLP models in two different scenarios, sentiment analysis and sentence similarity, which are implementations of the two most common problems in the industry: classification and similarity. Another outcome of this work is two datasets for classification and sentence similarity in the Uzbek language, which we generated ourselves and which can be useful in both industry and academia.

Keywords: NLP, benchmark, BERT, vectorization

Procedia PDF Downloads 13
4054 A BERT-Based Model for Financial Social Media Sentiment Analysis

Authors: Josiel Delgadillo, Johnson Kinyua, Charles Mutigwe

Abstract:

The purpose of sentiment analysis is to determine the sentiment strength (e.g., positive, negative, neutral) from a textual source for good decision-making. Natural language processing in domains such as financial markets requires knowledge of domain ontology, and pre-trained language models, such as BERT, have made significant breakthroughs in various NLP tasks by training on large-scale un-labeled generic corpora such as Wikipedia. However, sentiment analysis is a strong domain-dependent task. The rapid growth of social media has given users a platform to share their experiences and views about products, services, and processes, including financial markets. StockTwits and Twitter are social networks that allow the public to express their sentiments in real time. Hence, leveraging the success of unsupervised pre-training and a large amount of financial text available on social media platforms could potentially benefit a wide range of financial applications. This work is focused on sentiment analysis using social media text on platforms such as StockTwits and Twitter. To meet this need, SkyBERT, a domain-specific language model pre-trained and fine-tuned on financial corpora, has been developed. The results show that SkyBERT outperforms current state-of-the-art models in financial sentiment analysis. Extensive experimental results demonstrate the effectiveness and robustness of SkyBERT.

Keywords: BERT, financial markets, Twitter, sentiment analysis

Procedia PDF Downloads 116
4053 Topic Sentiments toward the COVID-19 Vaccine on Twitter

Authors: Melissa Vang, Raheyma Khan, Haihua Chen

Abstract:

The coronavirus disease 2019 (COVID‐19) pandemic has changed people's lives from all over the world. More people have turned to Twitter to engage online and discuss the COVID-19 vaccine. This study aims to present a text mining approach to identify people's attitudes towards the COVID-19 vaccine on Twitter. To achieve this purpose, we collected 54,268 COVID-19 vaccine tweets from September 01, 2020, to November 01, 2020, then the BERT model is used for the sentiment and topic analysis. The results show that people had more negative than positive attitudes about the vaccine, and countries with an increasing number of confirmed cases had a higher percentage of negative attitudes. Additionally, the topics discussed in positive and negative tweets are different. The tweet datasets can be helpful to information professionals to inform the public about vaccine-related informational resources. Our findings may have implications for understanding people's cognitions and feelings about the vaccine.

Keywords: BERT, COVID-19 vaccine, sentiment analysis, topic modeling

Procedia PDF Downloads 113
4052 Feature Engineering Based Detection of Buffer Overflow Vulnerability in Source Code Using Deep Neural Networks

Authors: Mst Shapna Akter, Hossain Shahriar

Abstract:

One of the most important challenges in the field of software code audit is the presence of vulnerabilities in software source code. Every year, more and more software flaws are found, either internally in proprietary code or revealed publicly. These flaws are highly likely to be exploited and lead to system compromise, data leakage, or denial of service. C and C++ open-source code is now available for creating a large-scale machine-learning system for function-level vulnerability identification. We assembled a sizable dataset of millions of open-source functions that point to potential exploits. We developed an efficient and scalable vulnerability detection method based on deep neural network models that learn features extracted from the source code. The source code is first converted into a minimal intermediate representation to remove the pointless components and shorten the dependency. Moreover, we keep the semantic and syntactic information using state-of-the-art word embedding algorithms such as GloVe and fastText. The embedded vectors are subsequently fed into deep learning networks such as LSTM, BiLSTM, LSTM-Autoencoder, word2vec, BERT, and GPT-2 to classify the possible vulnerabilities. Furthermore, we propose a neural network model which can overcome issues associated with traditional neural networks. Evaluation metrics such as F1 score, precision, recall, accuracy, and total execution time have been used to measure the performance. We made a comparative analysis between results derived from features containing a minimal text representation and features containing semantic and syntactic information. We found that all of the deep learning models provide comparatively higher accuracy when we use semantic and syntactic information as the features, but require higher execution time, as the word embedding algorithm adds some complexity to the overall system.

Keywords: cyber security, vulnerability detection, neural networks, feature extraction

Procedia PDF Downloads 40
4051 Evaluation of Modern Natural Language Processing Techniques via Measuring a Company's Public Perception

Authors: Burak Oksuzoglu, Savas Yildirim, Ferhat Kutlu

Abstract:

Opinion mining (OM) is one of the natural language processing (NLP) problems to determine the polarity of opinions, mostly represented on a positive-neutral-negative axis. The data for OM is usually collected from various social media platforms. In an era where social media has considerable control over companies’ futures, it’s worth understanding social media and taking actions accordingly. OM comes to the fore here as the scale of the discussion about companies increases, and it becomes unfeasible to gauge opinion on individual levels. Thus, the companies opt to automize this process by applying machine learning (ML) approaches to their data. For the last two decades, OM or sentiment analysis (SA) has been mainly performed by applying ML classification algorithms such as support vector machines (SVM) and Naïve Bayes to a bag of n-gram representations of textual data. With the advent of deep learning and its apparent success in NLP, traditional methods have become obsolete. Transfer learning paradigm that has been commonly used in computer vision (CV) problems started to shape NLP approaches and language models (LM) lately. This gave a sudden rise to the usage of the pretrained language model (PTM), which contains language representations that are obtained by training it on the large datasets using self-supervised learning objectives. The PTMs are further fine-tuned by a specialized downstream task dataset to produce efficient models for various NLP tasks such as OM, NER (Named-Entity Recognition), Question Answering (QA), and so forth. In this study, the traditional and modern NLP approaches have been evaluated for OM by using a sizable corpus belonging to a large private company containing about 76,000 comments in Turkish: SVM with a bag of n-grams, and two chosen pre-trained models, multilingual universal sentence encoder (MUSE) and bidirectional encoder representations from transformers (BERT). 
The MUSE model is a multilingual model that supports 16 languages, including Turkish, and it is based on convolutional neural networks. BERT, monolingual in our case, is based on transformer neural networks. It uses masked language modeling and next sentence prediction tasks that allow the bidirectional training of the transformers. During the training phase of the architecture, pre-processing operations such as morphological parsing, stemming, and spelling correction were not used, since the experiments showed that their contribution to model performance was insignificant even though Turkish is a highly agglutinative and inflective language. The results show that deep learning methods with pre-trained models and fine-tuning achieve an improvement of about 11 percentage points over SVM for OM. The BERT model achieved around 94% prediction accuracy, while the MUSE model achieved around 88% and SVM around 83%. The MUSE multilingual model shows better results than SVM, but it still performs worse than the monolingual BERT model.

Keywords: BERT, MUSE, opinion mining, pretrained language model, SVM, Turkish

Procedia PDF Downloads 107
4050 Legal Judgment Prediction through Indictments via Data Visualization in Chinese

Authors: Kuo-Chun Chien, Chia-Hui Chang, Ren-Der Sun

Abstract:

Legal Judgment Prediction (LJP) is a subtask of legal AI. Its main purpose is to use the facts of a case to predict the judgment result. In Taiwan's criminal procedure, when prosecutors complete the investigation of a case, they decide whether to prosecute the suspect and which article of criminal law should be applied, based on the facts and evidence of the case. In this study, we collected 305,240 indictments from the public inquiry system of the procuratorate of the Ministry of Justice, which included 169 charges and 317 articles from 21 laws. We take the crime facts in the indictments as the main input to jointly learn a prediction model for law source, article, and charge based on the pre-trained BERT model. For single-article cases where the frequencies of the charge and article are greater than 50, the prediction performance for law sources, articles, and charges reaches 97.66, 92.22, and 60.52 macro-F1, respectively. To understand the big performance gap between articles and charges, we used a bipartite graph to visualize the relationship between the articles and charges, and found that the poor prediction performance was actually due to the wording precision of the charges. Some charges use the simplest words, while others may include the perpetrator or the result to make the charges more specific. For example, Article 284 of the Criminal Law may be indicted as “negligent injury”, “negligent death”, “business injury”, “driving business injury”, or “non-driving business injury”. As another example, Article 10 of the Drug Hazard Control Regulations can be charged as “Drug Control Regulations” or “Drug Hazard Control Regulations”. In order to solve the above problems and more accurately predict the article and charge, we plan to include the article content or charge names in the input, and use the sentence-pair classification method for question-answer problems in the BERT model to improve the performance.
We will also consider a sequence-to-sequence approach to charge prediction.
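The macro-F1 scores reported in this abstract average the per-class F1 without weighting by class frequency, so rarely used charge wordings count as much as common ones. The following is a minimal sketch of that metric, not the authors' evaluation code; the function name `macro_f1` and the toy labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (hypothetical helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels (illustrative only): one frequent charge predicted perfectly,
# two rare charge wordings confused with each other.
y_true = ["negligent injury"] * 4 + ["business injury", "driving business injury"]
y_pred = ["negligent injury"] * 4 + ["driving business injury", "business injury"]
score = macro_f1(y_true, y_pred)
```

In this toy case four of six predictions are correct, yet the macro-F1 is only one third, because the two rare classes each score zero. This is the kind of effect that fine-grained charge wording can have on a macro-averaged metric.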

Keywords: legal judgment prediction, deep learning, natural language processing, BERT, data visualization

Procedia PDF Downloads 88
4049 Benefits of Therapeutic Climbing on Multiple Components of Attention in Attention Deficit Hyperactivity Disorder Children

Authors: Elaheh Hosseini, Otmar Bock, Monika Thomas

Abstract:

The purpose of the present study was to determine the effect of climbing therapy on the components of attention of children with attention-deficit hyperactivity disorder (ADHD). Forty children with ADHD were assigned to either an intervention group or a control group. The intervention group participated in a climbing therapy program for ten weeks, whereas no intervention was administered to the control group. Both groups were then assessed with the same battery of attention tests used in our earlier study. We found that compared to the ‘control’ group, performance was higher in the ‘intervention’ group on tests of sustained, divided and distributed attention, on all four tests. The intervention group showed a significant improvement in components of attention after ten weeks. From this we conclude that climbing therapy can improve the attention of children with ADHD and can be considered as a promising intervention and a standalone treatment for children with ADHD.

Keywords: ADHD, climbing therapy, distributed attention, divided attention, selective attention, sustained attention

Procedia PDF Downloads 126
4048 Investigating the Relationship and Interaction between Auditory Processing Disorder and Auditory Attention

Authors: Amirreza Razzaghipour Sorkhab

Abstract:

The exploration of the connection between cognition and Auditory Processing Disorder (APD) holds significant value. Individuals with APD experience challenges in processing auditory information through the central auditory nervous system's varied pathways. Understanding the importance of auditory attention in individuals with APD, as well as the primary diagnostic tools such as language and auditory attention tests, highlights the critical need for assessing their auditory attention abilities. While not all children with Auditory Processing Disorder (APD) show deficits in auditory attention, there are often deficiencies in cognitive and attentional performance. The link between various types of attention deficits and APD suggests impairments in sustained and divided auditory attention. Research into the origins of APD should also encompass higher-level processes, such as auditory attention. It is evident that investigating the interaction between APD and auditory and cognitive functions holds significant value. Furthermore, it was demonstrated that APD tests may be influenced by cognitive factors, but despite signs of auditory attention interaction with auditory processing skills and the influence of cognitive factors on tests for this disorder, auditory attention measures are not typically included in APD diagnostic protocols. Therefore, incorporating attention assessment tests into the battery of tests for individuals with auditory processing disorder will be beneficial for obtaining useful insights into their attentional abilities.

Keywords: auditory processing disorder, auditory attention, central auditory processing disorder, top-down pathway

Procedia PDF Downloads 24
4047 A Framework for Chinese Domain-Specific Distant Supervised Named Entity Recognition

Authors: Qin Long, Li Xiaoge

Abstract:

Knowledge graphs have become a new form of knowledge representation. However, there is no consensus regarding a plausible definition of entities and relationships in domain-specific knowledge graphs. Further, owing to several limitations and deficiencies, existing approaches to recognizing domain-specific entities and relationships are far from perfect. Specifically, named entity recognition in Chinese domains is a critical task for natural language processing applications. However, a bottleneck for Chinese named entity recognition in new domains is the lack of annotated data. To address this challenge, a domain distant supervised named entity recognition framework is proposed. The framework is divided into two stages: first, the distant supervised corpus is generated based on an entity linking model built on a graph attention neural network; second, the generated corpus is used as the input to train the distant supervised named entity recognition model, which then extracts named entities. The linking model is verified on the ccks2019 entity linking corpus, and its F1 value is 2% higher than that of the benchmark method. The re-pre-trained BERT language model is added to the benchmark method, and the results show that it is more suitable for distant supervised named entity recognition tasks. Finally, the framework is applied in the computer field, and the results show that it can obtain domain named entities.

Keywords: distant named entity recognition, entity linking, knowledge graph, graph attention neural network

Procedia PDF Downloads 62
4046 Multimodal Deep Learning for Human Activity Recognition

Authors: Ons Slimene, Aroua Taamallah, Maha Khemaja

Abstract:

In recent years, human activity recognition (HAR) has been a key area of research due to its diverse applications. It has garnered increasing attention in the field of computer vision. HAR plays an important role in people’s daily lives as it has the ability to learn advanced knowledge about human activities from data. In HAR, activities are usually represented by exploiting different types of sensors, such as embedded sensors or visual sensors. However, these sensors have limitations, such as local obstacles, image-related obstacles, sensor unreliability, and consumer concerns. Recently, several deep learning-based approaches have been proposed for HAR, and these approaches are classified into two categories based on the type of data used: vision-based approaches and sensor-based approaches. This research paper highlights the importance of multimodal fusion of skeleton data obtained from videos and data generated by embedded sensors, using deep neural networks to achieve HAR. We propose a deep multimodal fusion network based on a two-stream architecture. The two streams use a Convolutional Neural Network combined with a Bidirectional LSTM (CNN-BiLSTM) to process the skeleton data and the data generated by embedded sensors, and fusion is performed at the feature level. The proposed model was evaluated on the public OPPORTUNITY++ dataset and produced an accuracy of 96.77%.

Keywords: human activity recognition, action recognition, sensors, vision, human-centric sensing, deep learning, context-awareness

Procedia PDF Downloads 57
4045 Self-Supervised Learning for Hate-Speech Identification

Authors: Shrabani Ghosh

Abstract:

Automatic offensive language detection in social media has become a pressing task in today's NLP. Manual offensive language detection is tedious and laborious work, for which automatic methods based on machine learning are the only alternative. Previous works have performed sentiment analysis over social media in different ways, such as in supervised, semi-supervised, and unsupervised manners. Domain adaptation in a semi-supervised way has also been explored in NLP, where the source domain and the target domain are different. In domain adaptation, the source domain usually has a large amount of labeled data, while only a limited amount of labeled data is available in the target domain. Pretrained transformers such as BERT and RoBERTa are further pre-trained on masked language modeling (MLM) tasks in an unsupervised manner and then fine-tuned to perform text classification. In previous work, hate speech detection has been explored on Gab.ai, a free speech platform described as a platform for extremists of varying degrees in online social media. In the domain adaptation process, Twitter data is used as the source domain, and Gab data is used as the target domain. The performance of domain adaptation also depends on the cross-domain similarity. Different distance measures such as L2 distance, cosine distance, Maximum Mean Discrepancy (MMD), Fisher Linear Discriminant (FLD), and CORAL have been used to estimate domain similarity. As expected, in-domain distances are small, and between-domain distances are large. The previous work shows that a pretrained masked language model (MLM) fine-tuned with a mixture of posts from the source and target domains gives higher accuracy. However, the in-domain accuracy of the hate classifier on Twitter data is 71.78%, and its out-of-domain accuracy on Gab data drops to 56.53%. Recently, self-supervised learning has received a lot of attention as it is more applicable when labeled data are scarce. A few works have already explored applying self-supervised learning to NLP tasks such as sentiment classification. The self-supervised language representation model ALBERT focuses on modeling inter-sentence coherence and helps downstream tasks with multi-sentence inputs. The self-supervised attention learning approach shows better performance as it exploits extracted context words in the training process. In this work, a self-supervised attention mechanism is proposed to detect hate speech on Gab.ai. This framework initially classifies the Gab dataset in an attention-based self-supervised manner. In the next step, a semi-supervised classifier is trained on the combination of labeled data from the first step and unlabeled data. The performance of the proposed framework will be compared with the results described earlier and also with optimized outcomes obtained from different optimization techniques.
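Three of the domain-similarity measures named in this abstract (L2 distance, cosine distance, and MMD) can be illustrated with a short sketch. This is not the authors' implementation: it assumes each post is already represented as a fixed-length embedding vector, it compares domain centroids, and it uses a linear kernel for MMD, in which case the squared MMD reduces to the squared L2 distance between the two domain means. All function names and toy vectors are hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def l2_distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """One minus cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def linear_mmd2(source, target):
    """Squared MMD under a linear kernel (assumed here, not stated in the
    abstract): the squared L2 distance between the two domain means."""
    return l2_distance(centroid(source), centroid(target)) ** 2


# Toy 2-dimensional "embeddings" (illustrative only).
source_domain = [[1.0, 0.0], [1.0, 2.0]]  # centroid is [1.0, 1.0]
target_domain = [[4.0, 5.0], [4.0, 5.0]]  # centroid is [4.0, 5.0]

centroid_l2 = l2_distance(centroid(source_domain), centroid(target_domain))
centroid_cos = cosine_distance(centroid(source_domain), centroid(target_domain))
mmd2 = linear_mmd2(source_domain, target_domain)
```

Larger values indicate less similar domains. A kernel such as the Gaussian RBF would make MMD sensitive to differences beyond the means, at the cost of a bandwidth parameter to choose.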

Keywords: attention learning, language model, offensive language detection, self-supervised learning

Procedia PDF Downloads 75
4044 Exploring Relationship between Attention and Consciousness

Authors: Aarushi Agarwal, Tara Singh, Anju Lata Singh, Trayambak Tiwari, Indramani Lal Singh

Abstract:

The interdependent relationship between attention and consciousness has long been debated. To test its nature, a dual-task paradigm was used to simultaneously manipulate awareness and attention. Along with a central discrimination task that is attentionally demanding, participants also performed a simple discrimination task in the periphery in the near absence of attention. Individual-based analysis of performance accuracy in the single and dual conditions showed above-chance performance, i.e., more than 80%. To widen the understanding of the extent of discrimination carried out in the near absence of attention, a natural image and its geometric equivalent shape were presented in the periphery; synthetic objects yielded a lower level of performance than natural objects in the dual condition. The gaze plot and heatmap indicate that peripheral performance does not necessarily involve a saccade every time, verifying that the discrimination in the periphery took place in the near absence of attention. Thus, our studies show an interdependent nature of attention and awareness.

Keywords: attention, awareness, dual task paradigm, natural and geometric images

Procedia PDF Downloads 480
4043 Attention Problems among Adolescents: Examining Educational Environments

Authors: Zhidong Zhang, Zhi-Chao Zhang, Georgianna Duarte

Abstract:

This study investigated attention problems using the Achenbach System of Empirically Based Assessment (ASEBA). Two thousand eight hundred and ninety-four adolescents were surveyed using a stratified sampling method. We examined the relationships between relevant background variables and attention problems. Multiple regression models were applied to analyze the data. Relevant variables such as sports activities, hobbies, age, grade and the number of close friends were included in this study as predictive variables. The analysis results indicated that educational environments and extracurricular activities are important factors which influence students’ attention problems.

Keywords: adolescents, ASEBA, attention problems, educational environments, stratified sampling

Procedia PDF Downloads 240
4042 Investigating the Effect of the Pedagogical Agent on Visual Attention in Attention Deficit Hyperactivity Disorder Students

Authors: Nasrin Mohammadhasani, Rosa Angela Fabio

Abstract:

Attention to relevant information is a key element of learning. However, students with Attention Deficit Hyperactivity Disorder (ADHD) have a fuzzy visual pattern that prevents them from attending to and remembering the learning subject. The present study aimed to test the hypothesis that the presence of a pedagogical agent can effectively support ADHD learners' attention and learning outcomes in a multimedia learning environment. The learning environment was integrated with a pedagogical agent, named Koosha, as a social peer. This study employed a pretest and posttest experimental design with a control group. The statistical population was 30 male students aged 10-11 with ADHD, who were randomly assigned to learn with or without an agent in a well-designed environment for mathematics. The results suggested that the experimental and control groups showed a significant difference in participation time and mathematics achievement. According to this research, using the pedagogical agent can enhance the learning of ADHD students by gaining and guiding their attention to the relevant information on the display, so it can be considered a social cue that provides them with cognitive support.

Keywords: attention, computer assisted instruction, multimedia learning environment, pedagogical agent

Procedia PDF Downloads 270
4041 Affirming Students’ Attention and Perceptions on Prezi Presentation via Eye Tracking System

Authors: Mona Masood, Norshazlina Shaik Othman

Abstract:

The purpose of this study was to investigate graduate students’ visual attention and perceptions of a Prezi presentation. Ten postgraduate master's students were presented with a Prezi presentation at the Centre for Instructional Technology and Multimedia, Universiti Sains Malaysia (USM). Eye movement indicators such as dwell time, average fixation on the areas of interest, heat maps and focus maps were extracted to indicate the students’ visual attention. Descriptive statistics were employed to analyze the students’ perception of the Prezi presentation in terms of text, slide design, images, layout and overall presentation. The results revealed that the students paid the most attention to the text, followed by the images and subheadings presented in the Prezi presentation.

Keywords: eye tracking, Prezi, visual attention, visual perception

Procedia PDF Downloads 400
4040 An Experiment Research on the Effect of Brain-Break in the Classroom on Elementary School Students’ Selective Attention

Authors: Hui Liu, Xiaozan Wang, Jiarong Zhong, Ziming Shao

Abstract:

Introduction: Related research shows that students do not concentrate on the teacher's speech in the classroom. The d2 attention test is a time-limited test of selective attention that can be used to evaluate individual selective attention. Purpose: To use the d2 attention test to measure the difference between the attention levels of the experimental class and the control class before and after Brain-Break, and to explore the effect of Brain-Break in the classroom on students' selective attention. Methods: According to the principle of no difference in pre-test data, two fourth-grade classes at Shenzhen Longhua Central Primary School were selected. After 20 minutes of class in the third class of the morning and the third class of the afternoon, a Brain-Break intervention of about 3 minutes was performed in the experimental class for 10 weeks. The control class had normal lessons without intervention. Before and after the experiment, the d2 attention test was used to measure the attention level of the students in both classes. Paired-sample and independent-sample t-tests in SPSS 23.0 were used to test the change in the attention level of the two classes over the 10 weeks. This article only presents results with significant differences. Results: The independent-sample t-test results showed that after ten weeks of Brain-Break, the missed errors (E1, t = -2.165, p = 0.042), concentration performance (CP, t = 1.866, p = 0.05), and the degree of omissions (Epercent, t = -2.375, p = 0.029) in the experimental class showed significant differences compared with the control class. The students’ error level decreased and their concentration increased. Conclusions: Adding Brain-Break interventions in the classroom can improve the attention level of fourth-grade primary school students to a certain extent; in particular, it can improve concentration and decrease the error rate in tasks. The new sports learning model is worth promoting.

Keywords: cultural class, micromotor, attention, D2 test

Procedia PDF Downloads 98