Search results for: corpus analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 26935

Search results for: corpus analysis

26815 A Corpus Study of English Verbs in Chinese EFL Learners’ Academic Writing Abstracts

Authors: Shuaili Ji

Abstract:

The correct use of verbs is an important element of high-quality research articles, and thus for Chinese EFL learners, it is significant to master characteristics of verbs and to precisely use verbs. However, some researches have shown that there are differences in using verbs between learners and native speakers and learners have difficulty in using English verbs. This corpus-based quantitative research can enhance learners’ knowledge of English verbs and promote the quality of research article abstracts even of the whole academic writing. The aim of this study is to find the differences between learners’ and native speakers’ use of verbs and to study the factors that contribute to those differences. To this end, the research question is as follows: What are the differences between most frequently used verbs by learners and those by native speakers? The research question is answered through a study that uses corpus-based data-driven approach to analyze the verbs used by learners in their abstract writings in terms of collocation, colligation and semantic prosody. The results show that: (1) EFL learners obviously overused ‘be, can, find, make’ and underused ‘investigate, examine, may’. As to modal verbs, learners obviously overused ‘can’ while underused ‘may’. (2) Learners obviously overused ‘we find + object clauses’ while underused ‘nouns (results, findings, data) + suggest/indicate/reveal + object clauses’ when expressing research results. (3) Learners tended to transfer the collocation, colligation and semantic prosody of shǐ and zuò to make. (4) Learners obviously overused ‘BE+V-ed’ and used BE as the main verb. They also obviously overused the basic forms of BE such as be, is, are, while obviously underused its inflections (was, were). These results manifested learners’ lack of accuracy and idiomatic property in verb usage. Due to the influence of the concept transfer of Chinese, the verbs in learners’ abstracts showed obvious transfer of mother language. In addition, learners have not fully mastered the use of verbs, avoiding using complex colligations to prevent errors. Based on these findings, the present study has implications for English teaching, seeking to have implications for English academic abstract writing in China. Further research could be undertaken to study the use of verbs in the whole dissertation to find out whether the characteristic of the verbs in abstracts can apply in the whole dissertation or not.

Keywords: academic writing abstracts, Chinese EFL learners, corpus-based, data-driven, verbs

Procedia PDF Downloads 302
26814 MicroRNA in Bovine Corpus Luteum during Early Pregnancy

Authors: Rreze Gecaj, Corina Schanzenbach, Benedikt Kirchner, Michael Pfaffl, Bajram Berisha

Abstract:

The maintenance of corpus lutem (CL) during early pregnancy in cattle is a critical and multifarious process. A luteotrophic mechanism originating from the embryo is widely accepted as the triggering signal for the CL maintenance. In the cattle, it is the interferon-tau (IFNT) secretion form conceptus that prevents CL regression and ensures progesterone production for the establishment of pregnancy. In addition to endocrine and paracrine signals, microRNA (miRNA) can also support CL sustainability during early pregnancy. MiRNA are small non-coding nucleic acids that regulate gene expression post-transcriptionally and are shown to be involved in the modulation of CL function. However, the examination of miRNAs in corpus luteum function at the early pregnancy still remains largely uncovered. This study aims at profiling the expression of miRNA in CL during the early pregnancy in cattle by comparing it with the CL form late cycle and with the regressed CL. Corpora lutea were assigned in two different groups during the cycle (C13 group, late CL: days 13-18 and C18, regressed CL group: day >18) and during the early pregnancy (group P: 1-2 month). The estrous cycle was determined by macroscopic examination and to age the fetus crown-rump length measurement was applied. A total of 9 corpora lutea from individual animals were included in the study, three corpora lutea for each group. MiRNAs population was profiled using small RNA next-generation sequencing and biologically significant miRNAs were evaluated for their differential expression using the DESeq2-methodology. We show that 6 differentially expressed miRNAs (bta-mir-2890, -2332, -2441-3p, -148b, -1248 and -29c) are common to both comparisons, P vs C13 and P vs C18. While for each stage individually we have identified unique miRNAs differentially expressed only for the given comparison. bta-miR-23a and -769 were unique miRNAs differentially expressed in P vs C13, whereas forty-four unique miRNAs were identified as differentially expressed in P vs C18. These data confirm that miRNAs are highly abundant in luteal tissue during early pregnancy and potentially regulate the CL maintenance at this stage of fetus development.

Keywords: bovine, corpus luteum, microRNA, pregnancy, RNA-Seq

Procedia PDF Downloads 230
26813 Wavelets Contribution on Textual Data Analysis

Authors: Habiba Ben Abdessalem

Abstract:

The emergence of giant set of textual data was the push that has encouraged researchers to invest in this field. The purpose of textual data analysis methods is to facilitate access to such type of data by providing various graphic visualizations. Applying these methods requires a corpus pretreatment step, whose standards are set according to the objective of the problem studied. This step determines the forms list contained in contingency table by keeping only those information carriers. This step may, however, lead to noisy contingency tables, so the use of wavelet denoising function. The validity of the proposed approach is tested on a text database that offers economic and political events in Tunisia for a well definite period.

Keywords: textual data, wavelet, denoising, contingency table

Procedia PDF Downloads 255
26812 Corpus-Based Neural Machine Translation: Empirical Study Multilingual Corpus for Machine Translation of Opaque Idioms - Cloud AutoML Platform

Authors: Khadija Refouh

Abstract:

Culture bound-expressions have been a bottleneck for Natural Language Processing (NLP) and comprehension, especially in the case of machine translation (MT). In the last decade, the field of machine translation has greatly advanced. Neural machine translation NMT has recently achieved considerable development in the quality of translation that outperformed previous traditional translation systems in many language pairs. Neural machine translation NMT is an Artificial Intelligence AI and deep neural networks applied to language processing. Despite this development, there remain some serious challenges that face neural machine translation NMT when translating culture bounded-expressions, especially for low resources language pairs such as Arabic-English and Arabic-French, which is not the case with well-established language pairs such as English-French. Machine translation of opaque idioms from English into French are likely to be more accurate than translating them from English into Arabic. For example, Google Translate Application translated the sentence “What a bad weather! It runs cats and dogs.” to “يا له من طقس سيء! تمطر القطط والكلاب” into the target language Arabic which is an inaccurate literal translation. The translation of the same sentence into the target language French was “Quel mauvais temps! Il pleut des cordes.” where Google Translate Application used the accurate French corresponding idioms. This paper aims to perform NMT experiments towards better translation of opaque idioms using high quality clean multilingual corpus. This Corpus will be collected analytically from human generated idiom translation. AutoML translation, a Google Neural Machine Translation Platform, is used as a custom translation model to improve the translation of opaque idioms. The automatic evaluation of the custom model will be compared to the Google NMT using Bilingual Evaluation Understudy Score BLEU. BLEU is an algorithm for evaluating the quality of text which has been machine-translated from one natural language to another. Human evaluation is integrated to test the reliability of the Blue Score. The researcher will examine syntactical, lexical, and semantic features using Halliday's functional theory.

Keywords: multilingual corpora, natural language processing (NLP), neural machine translation (NMT), opaque idioms

Procedia PDF Downloads 107
26811 A Bayesian Approach for Analyzing Academic Article Structure

Authors: Jia-Lien Hsu, Chiung-Wen Chang

Abstract:

Research articles may follow a simple and succinct structure of organizational patterns, called move. For example, considering extended abstracts, we observe that an extended abstract usually consists of five moves, including Background, Aim, Method, Results, and Conclusion. As another example, when publishing articles in PubMed, authors are encouraged to provide a structured abstract, which is an abstract with distinct and labeled sections (e.g., Introduction, Methods, Results, Discussions) for rapid comprehension. This paper introduces a method for computational analysis of move structures (i.e., Background-Purpose-Method-Result-Conclusion) in abstracts and introductions of research documents, instead of manually time-consuming and labor-intensive analysis process. In our approach, sentences in a given abstract and introduction are automatically analyzed and labeled with a specific move (i.e., B-P-M-R-C in this paper) to reveal various rhetorical status. As a result, it is expected that the automatic analytical tool for move structures will facilitate non-native speakers or novice writers to be aware of appropriate move structures and internalize relevant knowledge to improve their writing. In this paper, we propose a Bayesian approach to determine move tags for research articles. The approach consists of two phases, training phase and testing phase. In the training phase, we build a Bayesian model based on a couple of given initial patterns and the corpus, a subset of CiteSeerX. In the beginning, the priori probability of Bayesian model solely relies on initial patterns. Subsequently, with respect to the corpus, we process each document one by one: extract features, determine tags, and update the Bayesian model iteratively. In the testing phase, we compare our results with tags which are manually assigned by the experts. In our experiments, the promising accuracy of the proposed approach reaches 56%.

Keywords: academic English writing, assisted writing, move tag analysis, Bayesian approach

Procedia PDF Downloads 304
26810 Understanding the Top Questions Asked about Hong Kong by Travellers Worldwide through a Corpus-Based Discourse Analytic Approach

Authors: Phoenix W. Y. Lam

Abstract:

As one of the most important service-oriented industries in contemporary society, tourism has increasingly seen the influence of the Internet on all aspects of travelling. Travellers nowadays habitually research online before making travel-related decisions. One platform on which such research is conducted is destination forums. The emergence of such online destination forums in the last decade has allowed tourists to share their travel experiences quickly and easily with a large number of online users around the world. As such, these destination forums also provide invaluable data for tourism bodies to better understand travellers’ views on their destinations. Collecting posts from the Hong Kong travel forum on the world’s largest travel website TripAdvisor®, the present study identifies the top questions asked by TripAdvisor users about Hong Kong through a corpus-based discourse analytic approach. Based on questions posted on the forum and their associated meta-data gathered in a one-year period, the study examines the top questions asked by travellers around the world to identify the key geographical locations in which users have shown the greatest interest in the city. Questions raised by travellers from different geographical locations are also compared to see if traveller communities by location vary in terms of their areas of interest. This analysis involves the study of key words and concordance of frequently-occurring items and a close reading of representative examples in context. Findings from the present study show that travellers who asked the most questions about Hong Kong are from North America and Asia, and that travellers from different locations have different concerns and interests, which are clearly reflected in the language of the questions asked on the travel forum. These findings can therefore provide tourism organisations with useful information about the key markets that should be targeted for promotional purposes, and can also allow such organisations to design advertising campaigns which better address the specific needs of such markets. The present study thus demonstrates the value of applying linguistic knowledge and methodologies to the domain of tourism to address practical issues.

Keywords: corpus, hong kong, online travel forum, tourism, TripAdvisor

Procedia PDF Downloads 154
26809 Number Variation of the Personal Pronoun We in American Spoken English

Authors: Qiong Hu, Ming Yue

Abstract:

Language variation signals the newest usage of language community, which might become the developmental trend of that language. The personal pronoun we is prescribed as a plural pronoun in grammar, but its number value is more flexible in actual use. Based on the homemade Friends corpus, the present research explores the number value of the first person pronoun we in nowadays American spoken English. With consideration of the subjectivity of we, this paper used ‘we+ PCU (Perception-cognation-utterance) verbs’ collocations and ‘we+ plural categories’ as the parameters. Results from corpus data and manual annotation show that: 1) the overall frequency of we has been increasing; 2) we has been increasingly used with other plural categories, indicating a weakening of its plural reference; and 3) we has been increasingly used with PCU (perception-cognition-utterance) verbs of strong subjectivity, indicating a strengthening of its singular reference. All these seem to support our hypothesis that we is undergoing the process of further grammaticalization towards a singular reference, though future evidence is needed to attest the bold prediction.

Keywords: number, PCU verbs, personal pronoun we,

Procedia PDF Downloads 203
26808 Critical Discourse Analysis of President Mamnoon Hussain Speech in the Joint Session of Parliament.

Authors: Saeed Qaisrani

Abstract:

This article briefly reviews the rise of Critical Discourse Analysis about the Pakistani President Mamnoon Hussain speech which delivered in the joint session of Parliament and teases out a detailed analysis of the various critiques that have been levelled at CDA and its practitioners over the last twenty years, both by scholars working within the “critical” paradigm and by other critics. A range of criticisms are discussed which target the underlying premises, the analytical methodology and the disputed areas of reader response and the integration of contextual factors. Controversial issues such as the predominantly negative focus of much CDA scholarship, and the status of CDA as an emergent “intellectual orthodoxy”, are also reviewed. The conclusions offer a summary of the principal criticisms that emerge from this overview, and suggest some ways in which these problems could be attenuated. It also focused on the different views about president speech and how it is presented in the Pakistani print and electronic media.

Keywords: Critical Discourse Analysis, Analytical methodology, Corpus linguistics, Reader response theory, Critical paradigm, Contextualization.

Procedia PDF Downloads 452
26807 Neologisms and Word-Formation Processes in Board Game Rulebook Corpus: Preliminary Results

Authors: Athanasios Karasimos, Vasiliki Makri

Abstract:

This research focuses on the design and development of the first text Corpus based on Board Game Rulebooks (BGRC) with direct application on the morphological analysis of neologisms and tendencies in word-formation processes. Corpus linguistics is a dynamic field that examines language through the lens of vast collections of texts. These corpora consist of diverse written and spoken materials, ranging from literature and newspapers to transcripts of everyday conversations. By morphologically analyzing these extensive datasets, morphologists can gain valuable insights into how language functions and evolves, as these extensive datasets can reflect the byproducts of inflection, derivation, blending, clipping, compounding, and neology. This entails scrutinizing how words are created, modified, and combined to convey meaning in a corpus of challenging, creative, and straightforward texts that include rules, examples, tutorials, and tips. Board games teach players how to strategize, consider alternatives, and think flexibly, which are critical elements in language learning. Their rulebooks reflect not only their weight (complexity) but also the language properties of each genre and subgenre of these games. Board games are a captivating realm where strategy, competition, and creativity converge. Beyond the excitement of gameplay, board games also spark the art of word creation. Word games, like Scrabble, Codenames, Bananagrams, Wordcraft, Alice in the Wordland, Once uUpona Time, challenge players to construct words from a pool of letters, thus encouraging linguistic ingenuity and vocabulary expansion. These games foster a love for language, motivating players to unearth obscure words and devise clever combinations. On the other hand, the designers and creators produce rulebooks, where they include their joy of discovering the hidden potential of language, igniting the imagination, and playing with the beauty of words, making these games a delightful fusion of linguistic exploration and leisurely amusement. In this research, more than 150 rulebooks in English from all types of modern board games, either language-independent or language-dependent, are used to create the BGRC. A representative sample of each genre (family, party, worker placement, deckbuilding, dice, and chance games, strategy, eurogames, thematic, role-playing, among others) was selected based on the score from BoardGameGeek, the size of the texts and the level of complexity (weight) of the game. A morphological model with morphological networks, multi-word expressions, and word-creation mechanics based on the complexity of the textual structure, difficulty, and board game category will be presented. In enabling the identification of patterns, trends, and variations in word formation and other morphological processes, this research aspires to make avail of this creative yet strict text genre so as to (a) give invaluable insight into morphological creativity and innovation that (re)shape the lexicon of the English language and (b) test morphological theories. Overall, it is shown that corpus linguistics empowers us to explore the intricate tapestry of language, and morphology in particular, revealing its richness, flexibility, and adaptability in the ever-evolving landscape of human expression.

Keywords: board game rulebooks, corpus design, morphological innovations, neologisms, word-formation processes

Procedia PDF Downloads 55
26806 Sentence Structure for Free Word Order Languages in Context with Anaphora Resolution: A Case Study of Hindi

Authors: Pardeep Singh, Kamlesh Dutta

Abstract:

Many languages have fixed sentence structure and others are free word order. The accuracy of anaphora resolution of syntax based algorithm depends on structure of the sentence. So, it is important to analyze the structure of any language before implementing these algorithms. In this study, we analyzed the sentence structure exploiting the case marker in Hindi as well as some special tag for subject and object. We also investigated the word order for Hindi. Word order typology refers to the study of the order of the syntactic constituents of a language. We analyzed 165 news items of Ranchi Express from EMILEE corpus of plain text. It consisted of 1745 sentences. Eight file of dialogue based from the same corpus has been analyzed which will have 1521 sentences. The percentages of subject object verb structure (SOV) and object subject verb (OSV) are 66.90 and 33.10, respectively.

Keywords: anaphora resolution, free word order languages, SOV, OSV

Procedia PDF Downloads 442
26805 Displaying Compostela: Literature, Tourism and Cultural Representation, a Cartographic Approach

Authors: Fernando Cabo Aseguinolaza, Víctor Bouzas Blanco, Alberto Martí Ezpeleta

Abstract:

Santiago de Compostela became a stable object of literary representation during the period between 1840 and 1915, approximately. This study offers a partial cartographical look at this process, suggesting that a cultural space like Compostela’s becoming an object of literary representation paralleled the first stages of its becoming a tourist destination. We use maps as a method of analysis to show the interaction between a corpus of novels and the emerging tradition of tourist guides on Compostela during the selected period. Often, the novels constitute ways to present a city to the outside, marking it for the gaze of others, as guidebooks do. That leads us to examine the ways of constructing and rendering communicable the local in other contexts. For that matter, we should also acknowledge the fact that a good number of the narratives in the corpus evoke the representation of the city through the figure of one who comes from elsewhere: a traveler, a student or a professor. The guidebooks coincide in this with the emerging fiction, of which the mimesis of a city is a key characteristic. The local cannot define itself except through a process of symbolic negotiation, in which recognition and self-recognition play important roles. Cartography shows some of the forms that these processes of symbolic representation take through the treatment of space. The research uses GIS to find significant models of representation. We used the program ArcGIS for the mapping, defining the databases starting from an adapted version of the methodology applied by Barbara Piatti and Lorenz Hurni’s team at the University of Zurich. First, we designed maps that emphasize the peripheral position of Compostela from a historical and institutional perspective using elements found in the texts of our corpus (novels and tourist guides). Second, other maps delve into the parallels between recurring techniques in the fictional texts and characteristic devices of the guidebooks (sketching itineraries and the selection of zones and indexicalization), like a foreigner’s visit guided by someone who knows the city or the description of one’s first entrance into the city’s premises. Last, we offer a cartography that demonstrates the connection between the best known of the novels in our corpus (Alejandro Pérez Lugín’s 1915 novel La casa de la Troya) and the first attempt to create package tourist tours with Galicia as a destination, in a joint venture of Galician and British business owners, in the years immediately preceding the Great War. Literary cartography becomes a crucial instrument for digging deeply into the methods of cultural production of places. Through maps, the interaction between discursive forms seemingly so far removed from each other as novels and tourist guides becomes obvious and suggests the need to go deeper into a complex process through which a city like Compostela becomes visible on the contemporary cultural horizon.

Keywords: compostela, literary geography, literary cartography, tourism

Procedia PDF Downloads 367
26804 Studying Language of Immediacy and Language of Distance from a Corpus Linguistic Perspective: A Pilot Study of Evaluation Markers in French Television Weather Reports

Authors: Vince Liégeois

Abstract:

Language of immediacy and distance: Within their discourse theory, Koch & Oesterreicher establish a distinction between a language of immediacy and a language of distance. The former refers to those discourses which are oriented more towards a spoken norm, whereas the latter entails discourses oriented towards a written norm, regardless of whether they are realised phonically or graphically. This means that an utterance can be realised phonically but oriented more towards the written language norm (e.g., a scientific presentation or eulogy) or realised graphically but oriented towards a spoken norm (e.g., a scribble or chat messages). Research desiderata: The methodological approach from Koch & Oesterreicher has often been criticised for not providing a corpus-linguistic methodology, which makes it difficult to work with quantitative data or address large text collections within this research paradigm. Consequently, the Koch & Oesterreicher approach has difficulties gaining ground in those research areas which rely more on corpus linguistic research models, like text linguistics and LSP-research. A combinatory approach: Accordingly, we want to establish a combinatory approach with corpus-based linguistic methodology. To this end, we propose to (i) include data about the context of an utterance (e.g., monologicity/dialogicity, familiarity with the speaker) – which were called “conditions of communication” in the original work of Koch & Oesterreicher – and (ii) correlate the linguistic phenomenon at the centre of the inquiry (e.g., evaluation markers) to a group of linguistic phenomena deemed typical for either distance- or immediacy-language. Based on these two parameters, linguistic phenomena and texts could then be mapped on an immediacy-distance continuum. Pilot study: To illustrate the benefits of this approach, we will conduct a pilot study on evaluation phenomena in French television weather reports, a form of domain-sensitive discourse which has often been cited as an example of a “text genre”. Within this text genre, we will look at so-called “evaluation markers,” e.g., fixed strings like bad weather, stifling hot, and “no luck today!”. These evaluation markers help to communicate the coming weather situation towards the lay audience but have not yet been studied within the Koch & Oesterreicher research paradigm. Accordingly, we want to figure out whether said evaluation markers are more typical for those weather reports which tend more towards immediacy or those which tend more towards distance. To this aim, we collected a corpus with different kinds of television weather reports,e.g., as part of the news broadcast, including dialogue. The evaluation markers themselves will be studied according to the explained methodology, by correlating them to (i) metadata about the context and (ii) linguistic phenomena characterising immediacy-language: repetition, deixis (personal, spatial, and temporal), a freer choice of tense and right- /left-dislocation. Results: Our results indicate that evaluation markers are more dominantly present in those weather reports inclining towards immediacy-language. Based on the methodology established above, we have gained more insight into the working of evaluation markers in the domain-sensitive text genre of (television) weather reports. For future research, it will be interesting to determine whether said evaluation markers are also typical for immediacy-language-oriented in other domain-sensitive discourses.

Keywords: corpus-based linguistics, evaluation markers, language of immediacy and distance, weather reports

Procedia PDF Downloads 179
26803 Ideological Manipulations and Cultural-Norm Constraints

Authors: Masoud Hassanzade Novin, Bahloul Salmani

Abstract:

Translation cannot be considered as a simple linguistic act. Through the rise of descriptive approach in the late 1970s and 1980s, translation process managed to meet the requirements of social aspects as well as linguistic approaches. To have the translation considered as the cross-cultural communication through which various cultures communicate in ideological and cultural constraints, the contrastive analysis was conducted in this paper to reveal the distortions imposed in the translated texts. The corpus of the study involved the novel 1984 written by George Orwell and its Persian translated texts which were analyzed through the qualitative type of the research based on critical discourse analysis (CDA) and Toury's norms as well as Lefever's concepts of ideology. Results of the study revealed the point that ideology and the cultural constraints were considered as an important stimulus which can control the process of the translation.

Keywords: critical discourse analysis, ideology, norms, translated texts

Procedia PDF Downloads 310
26802 Comparison of Verb Complementation Patterns in Selected Pakistani and British English Newspaper Social Columns: A Corpus-Based Study

Authors: Zafar Iqbal Bhatti

Abstract:

The present research aims to examine and evaluate the frequencies and practices of verb complementation patterns in English newspaper social columns published in Pakistan and Britain. The research will demonstrate that Pakistani English is a non-native variety of English having its own unique usual and logical characteristics, affected by way of the native languages and the culture, upon syntactic levels, making the variety users aware that any differences from British or American English that are systematic and regular, or another English language, are not even if they are unique, erroneous forms and typical characteristics of several kinds. The objectives are to examine the verb complementation patterns that British and Pakistani social columnists use in relation to their syntactic categories. Secondly, to compare the verb complementation patterns used in Pakistani and British English newspapers social columns. This study will figure out various verb complementation patterns in Pakistani and British English newspaper social columns and their occurrence and distribution. The word classes express different functions of words, such as action, event, or state of being. This research aims to evaluate whether there are any appreciable differences in the verb complementation patterns used in Pakistani and British English newspaper social columns. The results will show the number of varieties of verb complementation patterns in selected English newspapers social columns. This study will fill the gap of previous studies conducted in this field as they only explore a little about the differences between Pakistani and British English newspapers. It will also figure out a variety of languages used in Pakistani and British English journals, as well as regional and cultural values and variations. The researcher will use AntConc software in this study to extract the data for analysis. The researcher will use a concordance tool to identify verb complementation patterns in selected data. Then the researcher will manually categorize them because the same type of adverb can sometimes be used for various purposes. From 1st June 2022 to 30th Sep. 2022, a four-month written corpus of the social columns of PE and BE newspapers will be collected and analyzed. For the analysis of the research questions, 50 social columns will be selected from Pakistani newspapers and 50 from British newspapers. The researcher will collect a representative sample of data from Pakistani and British English newspaper social columns. The researcher will manually analyze the complementation patterns of each verb in each sentence, and then the researcher will determine how frequently each pattern occurs. The researcher will use syntactic characteristics of the verb complementation elements according to the description by Downing and Locke (2006). The researcher will examine all of the verb complementation patterns in the data, and the frequency and distribution of each verb complementation pattern will be evaluated using the software. The researcher will explore every possible verb complementation pattern in Pakistani and British English before calculating the occurrence and abundance of each verb pattern. The researcher will explore every possible verb complementation pattern in Pakistani English before calculating the frequency and distribution of each pattern.

Keywords: verb complementation, syntactic categories, newspaper social columns, corpus

Procedia PDF Downloads 17
26801 A Corpus-Based Approach to Understanding Market Access in Fisheries and Aquaculture: A Systematic Literature Review

Authors: Cheryl Marie Cordeiro

Abstract:

Although fisheries and aquaculture studies might seem marginal to international business (IB) studies in general, fisheries and aquaculture IB (FAIB) management is currently facing increasing pressure to meet global demand and consumption for fish in the next coming decades. In part address to this challenge, the purpose of this systematic review of literature (SLR) study is to investigate the use of the term ‘market access’ in its context of use in the generic literature and business sector discourse, in comparison to the more specific literature and discourse in fisheries, aquaculture and seafood. This SLR aims to uncover the knowledge/interest gaps between the academic subject discourses and business sector practices. Corpus driven in methodology and using a triangulation method of three different text analysis software including AntConc, VOSviewer and Web of Science (WoS) analytics, the SLR results indicate a gap in conceptual knowledge and business practices in how ‘market access’ is conceived and used in the context of the pharmaceutical healthcare industry and FAIB research and practice. While it is acknowledged that the product orientation of different business sectors might differ, this SLR study works with the assumption that both business sectors are global in orientation. These business sectors are complex in their operations from product to market. This SLR suggests a conceptual model in understanding the challenges, the potential barriers as well as avenues for solutions to developing market access for FAIB.

Keywords: market access, fisheries and aquaculture, international business, systematic literature review

Procedia PDF Downloads 121
26800 L2 Reading in Distance Education: Analysis of Students' Reading Attitude and Interests

Authors: Ma. Junithesmer, D. Rosales

Abstract:

The study is a baseline description of students’ attitude and interests about L2 reading in a state university in the Philippines that uses distance education as a delivery mode. Most research conducted on this area dealt with the analysis of reading in a traditional school set-up. For this reason, this research was written to discover if there are implications as regards students’ preferences, interests and attitude reveal about L2 reading in a non-traditional set-up. To form the corpus of this study, it included the literature and studies about reading, preferred technological devices, titles of books and authors, reading medium traditional/ print and electronic books that juxtapose with students’ interest and feelings when reading at home and in school; and their views about their strengths and weaknesses as readers.

Keywords: distance education, L2 reading, reading, reading attitude

Procedia PDF Downloads 316
26799 Preparation on Sentimental Analysis on Social Media Comments with Bidirectional Long Short-Term Memory Gated Recurrent Unit and Model Glove in Portuguese

Authors: Leonardo Alfredo Mendoza, Cristian Munoz, Marco Aurelio Pacheco, Manoela Kohler, Evelyn Batista, Rodrigo Moura

Abstract:

Natural Language Processing (NLP) techniques are increasingly more powerful to be able to interpret the feelings and reactions of a person to a product or service. Sentiment analysis has become a fundamental tool for this interpretation but has few applications in languages other than English. This paper presents a classification of sentiment analysis in Portuguese with a base of comments from social networks in Portuguese. A word embedding's representation was used with a 50-Dimension GloVe pre-trained model, generated through a corpus completely in Portuguese. To generate this classification, the bidirectional long short-term memory and bidirectional Gated Recurrent Unit (GRU) models are used, reaching results of 99.1%.

Keywords: natural processing language, sentiment analysis, bidirectional long short-term memory, BI-LSTM, gated recurrent unit, GRU

Procedia PDF Downloads 129
26798 Spanish Language Violence Corpus: An Analysis of Offensive Language in Twitter

Authors: Beatriz Botella-Gil, Patricio Martínez-Barco, Lea Canales

Abstract:

The Internet and ICT are an integral element of and omnipresent in our daily lives. Technologies have changed the way we see the world and relate to it. The number of companies in the ICT sector is increasing every year, and there has also been an increase in the work that occurs online, from sending e-mails to the way companies promote themselves. In social life, ICT’s have gained momentum. Social networks are useful for keeping in contact with family or friends that live far away. This change in how we manage our relationships using electronic devices and social media has been experienced differently depending on the age of the person. According to currently available data, people are increasingly connected to social media and other forms of online communication. Therefore, it is no surprise that violent content has also made its way to digital media. One of the important reasons for this is the anonymity provided by social media, which causes a sense of impunity in the victim. Moreover, it is not uncommon to find derogatory comments, attacking a person’s physical appearance, hobbies, or beliefs. This is why it is necessary to develop artificial intelligence tools that allow us to keep track of violent comments that relate to violent events so that this type of violent online behavior can be deterred. The objective of our research is to create a guide for detecting and recording violent messages. Our annotation guide begins with a study on the problem of violent messages. First, we consider the characteristics that a message should contain for it to be categorized as violent. Second, the possibility of establishing different levels of aggressiveness. To download the corpus, we chose the social network Twitter for its ease of obtaining free messages. We chose two recent, highly visible violent cases that occurred in Spain. Both of them experienced a high degree of social media coverage and user comments. Our corpus has a total of 633 messages, manually tagged, according to the characteristics we considered important, such as, for example, the verbs used, the presence of exclamations or insults, and the presence of negations. We consider it necessary to create wordlists that are present in violent messages as indicators of violence, such as lists of negative verbs, insults, negative phrases. As a final step, we will use automatic learning systems to check the data obtained and the effectiveness of our guide.

Keywords: human language technologies, language modelling, offensive language detection, violent online content

Procedia PDF Downloads 97
26797 Named Entity Recognition System for Tigrinya Language

Authors: Sham Kidane, Fitsum Gaim, Ibrahim Abdella, Sirak Asmerom, Yoel Ghebrihiwot, Simon Mulugeta, Natnael Ambassager

Abstract:

The lack of annotated datasets is a bottleneck to the progress of NLP in low-resourced languages. The work presented here consists of large-scale annotated datasets and models for the named entity recognition (NER) system for the Tigrinya language. Our manually constructed corpus comprises over 340K words tagged for NER, with over 118K of the tokens also having parts-of-speech (POS) tags, annotated with 12 distinct classes of entities, represented using several types of tagging schemes. We conducted extensive experiments covering convolutional neural networks and transformer models; the highest performance achieved is 88.8% weighted F1-score. These results are especially noteworthy given the unique challenges posed by Tigrinya’s distinct grammatical structure and complex word morphologies. The system can be an essential building block for the advancement of NLP systems in Tigrinya and other related low-resourced languages and serve as a bridge for cross-referencing against higher-resourced languages.

Keywords: Tigrinya NER corpus, TiBERT, TiRoBERTa, BiLSTM-CRF

Procedia PDF Downloads 64
26796 Concepts of Creation and Destruction as Cognitive Instruments in World View Study

Authors: Perizat Balkhimbekova

Abstract:

Evolutionary changes in cognitive world view taking place in the last decades are followed by changes in perception of the key concepts which are related to the certain lingua-cultural sphere. Also, such concepts reflect the person’s attitude to essential processes in the sphere of concepts, e.g. the opposite operations like creation and destruction. These changes in people’s life and thinking are displayed in a language world view. In order to open the maintenance of mental structures and concepts we should use language means as observable results of people’s cognitive activity. Semantics of words, free phrases and idioms should be considered as an authoritative source of information concerning concepts. The regularized set of concepts in people consciousness forms the sphere of concepts. Cognitive linguistics widely discusses the sphere of concepts as its crucial category defining it as the field of knowledge which is made of concepts. It is considered that a sphere of concepts comprises the various types of association and forms conceptual fields. As a material for the given research, the data from Russian National Corpus and British National Corpus were used. In is necessary to point out that data provided by computational studies, are intrinsic and verifiable; so that we have used them in order to get the reliable results. The procedure of study was based on such techniques as extracting of the context containing concepts of creation|destruction from the Russian National Corpus (RNC), and British National Corpus (BNC); analyzing and interpreting of those context on the basis of cognitive approach; finding of correspondence between the given concepts in the Russian and English world view. The key problem of our study is to find the correspondence between the elements of world view represented by opposite concepts such as creation and destruction. Findings: The concept of "destruction" indicates a process which leads to full or partial destruction of an object. In other words, it is a loss of the object primary essence: structures, properties, distinctive signs and its initial integrity. The concept of "creation", on the contrary, comprises positive characteristics, represents the activity aimed at improvement of the certain object, at the creation of ideal models of the world. On the other hand, destruction is represented much more widely in RNC than creation (1254 cases of the first concept by comparison to 192 cases for the second one). Our hypothesis consists in the antinomy represented by the aforementioned concepts. Being opposite both in respect of semantics and pragmatics, and from the point of view of axiology, they are at the same time complementary and interrelated concepts.

Keywords: creation, destruction, concept, world view

Procedia PDF Downloads 317
26795 'Wandering Uterus': An Analogy of Perception of Women in Hippocratic Corpus and Post-Modern Times

Authors: Ankita Sharma

Abstract:

The study proposes to review the perception of women in the Classical Age (500-336 BC) when Greek Philosophy was in bloom. It was observed that women had very few rights and were still under the control of men. One of the possible reasons for this exclusion was woman’s biology that had a huge influence on her being seen as inferior to men. The text ‘Hippocratic Corpus’ focuses on the biological construct of the female body in classical Greek science that perpetuated the idea of women as second-class citizens and were considered inherently weaker than men. The research highlights the significance of the text that was used to encourage women of that time to get married and produce children and how till today the perception remains the same. The Greek belief of need for confinement and control of 'wandering uterus' has led to superior understanding of men. The pivotal emphasis of this research is to women and their bodies that are depicted in a misogynistic way which paved the way for Hippocratic writers to influence the society’s attitude towards women in their writings. It is intended to draw attention to the prevailing cultural assumptions and preconceived notions about female anatomy that had a pervasive influence in the following centuries with its roots being in ancient science.

Keywords: classical Greek theory, women, wandering womb, modern ideology

Procedia PDF Downloads 168
26794 The Grammar of the Content Plane as a Style Marker in Forensic Authorship Attribution

Authors: Dayane de Almeida

Abstract:

This work aims at presenting a study that demonstrates the usability of categories of analysis from Discourse Semiotics – also known as Greimassian Semiotics in authorship cases in forensic contexts. It is necessary to know if the categories examined in semiotic analysis (the ‘grammar’ of the content plane) can distinguish authors. Thus, a study with 4 sets of texts from a corpus of ‘not on demand’ written samples (those texts differ in formality degree, purpose, addressees, themes, etc.) was performed. Each author contributed with 20 texts, separated into 2 groups of 10 (Author1A, Author1B, and so on). The hypothesis was that texts from a single author were semiotically more similar to each other than texts from different authors. The assumptions and issues that led to this idea are as follows: -The features analyzed in authorship studies mostly relate to the expression plane: they are manifested on the ‘surface’ of texts. If language is both expression and content, content would also have to be considered for more accurate results. Style is present in both planes. -Semiotics postulates the content plane is structured in a ‘grammar’ that underlies expression, and that presents different levels of abstraction. This ‘grammar’ would be a style marker. -Sociolinguistics demonstrates intra-speaker variation: an individual employs different linguistic uses in different situations. Then, how to determine if someone is the author of several texts, distinct in nature (as it is the case in most forensic sets), when it is known intra-speaker variation is dependent on so many factors?-The idea is that the more abstract the level in the content plane, the lower the intra-speaker variation, because there will be a greater chance for the author to choose the same thing. If two authors recurrently chose the same options, differently from one another, it means each one’s option has discriminatory power. -Size is another issue for various attribution methods. Since most texts in real forensic settings are short, methods relying only on the expression plane tend to fail. The analysis of the content plane as proposed by greimassian semiotics would be less size-dependable. -The semiotic analysis was performed using the software Corpus Tool, generating tags to allow the counting of data. Then, similarities and differences were quantitatively measured, through the application of the Jaccard coefficient (a statistical measure that compares the similarities and differences between samples). The results showed the hypothesis was confirmed and, hence, the grammatical categories of the content plane may successfully be used in questioned authorship scenarios.

Keywords: authorship attribution, content plane, forensic linguistics, greimassian semiotics, intraspeaker variation, style

Procedia PDF Downloads 215
26793 A Small Graphic Lie. The Photographic Quality of Pierre Bourdieu’s Correspondance Analysis

Authors: Lene Granzau Juel-Jacobsen

Abstract:

The problem of beautification is an obvious concern of photography, claiming reference to reality, but it also lies at the very heart of social theory. As we become accustomed to sophisticated visualizations of statistical data in pace with the development of software programs, we should not only be inclined to ask new types of research questions, but we also need to confront social theories based on such visualization techniques with new types of questions. Correspondence Analysis, GIS analysis, Social Network Analysis, and Perceptual Maps are current examples of visualization techniques popular within the social sciences and neighboring disciplines. This article discusses correspondence analysis, arguing that the graphic plot of correspondence analysis is to be interpreted much similarly to a photograph. It refers no more evidently or univocally to reality than a photograph, representing social life no more truthfully than a photograph documents. Pierre Bourdieu’s theoretical corpus, especially his theory of fields, relies heavily on correspondence analysis. While much attention has been directed towards critiquing the somewhat vague conceptualization of habitus, limited focus has been placed on the equally problematic concepts of social space and field. Based on a re-reading of the Distinction, the article argues that the concepts rely on ‘a small graphic lie’ very similar to a photograph. Like any other piece of art, as Bourdieu himself recognized, the graphic display is a politically and morally loaded representation technique. However, the correspondence analysis does not necessarily serve the purpose he intended. In fact, it tends towards the pitfalls he strove to overcome.

Keywords: datavisualization, correspondance analysis, bourdieu, Field, visual representation

Procedia PDF Downloads 37
26792 Agenesis of the Corpus Callosum: The Role of Neuropsychological Assessment with Implications to Psychosocial Rehabilitation

Authors: Ron Dick, P. S. D. V. Prasadarao, Glenn Coltman

Abstract:

Agenesis of the corpus callosum (ACC) is a failure to develop corpus callosum - the large bundle of fibers of the brain that connects the two cerebral hemispheres. It can occur as a partial or complete absence of the corpus callosum. In the general population, its estimated prevalence rate is 1 in 4000 and a wide range of genetic, infectious, vascular, and toxic causes have been attributed to this heterogeneous condition. The diagnosis of ACC is often achieved by neuroimaging procedures. Though persons with ACC can perform normally on intelligence tests they generally present with a range of neuropsychological and social deficits. The deficit profile is characterized by poor coordination of motor movements, slow reaction time, processing speed and, poor memory. Socially, they present with deficits in communication, language processing, the theory of mind, and interpersonal relationships. The present paper illustrates the role of neuropsychological assessment with implications to psychosocial management in a case of agenesis of the corpus callosum. Method: A 27-year old left handed Caucasian male with a history of ACC was self-referred for a neuropsychological assessment to assist him in his employment options. Parents noted significant difficulties with coordination and balance at an early age of 2-3 years and he was diagnosed with dyspraxia at the age of 14 years. History also indicated visual impairment, hypotonia, poor muscle coordination, and delayed development of motor milestones. MRI scan indicated agenesis of the corpus callosum with ventricular morphology, widely spaced parallel lateral ventricles and mild dilatation of the posterior horns; it also showed colpocephaly—a disproportionate enlargement of the occipital horns of the lateral ventricles which might be affecting his motor abilities and visual defects. The MRI scan ruled out other structural abnormalities or neonatal brain injury. At the time of assessment, the subject presented with such problems as poor coordination, slowed processing speed, poor organizational skills and time management, and difficulty with social cues and facial expressions. A comprehensive neuropsychological assessment was planned and conducted to assist in identifying the current neuropsychological profile to facilitate the formulation of a psychosocial and occupational rehabilitation programme. Results: General intellectual functioning was within the average range and his performance on memory-related tasks was adequate. Significant visuospatial and visuoconstructional deficits were evident across tests; constructional difficulties were seen in tasks such as copying a complex figure, building a tower and manipulating blocks. Poor visual scanning ability and visual motor speed were evident. Socially, the subject reported heightened social anxiety, difficulty in responding to cues in the social environment, and difficulty in developing intimate relationships. Conclusion: Persons with ACC are known to present with specific cognitive deficits and problems in social situations. Findings from the current neuropsychological assessment indicated significant visuospatial difficulties, poor visual scanning and problems in social interactions. His general intellectual functioning was within the average range. Based on the findings from the comprehensive neuropsychological assessment, a structured psychosocial rehabilitation programme was developed and recommended.

Keywords: agenesis, callosum, corpus, neuropsychology, psychosocial, rehabilitation

Procedia PDF Downloads 257
26791 A Linguistic Product of K-Pop: A Corpus-Based Study on the Korean-Originated Chinese Neologism Simida

Authors: Hui Shi

Abstract:

This article examines the online popularity of Chinese neologism simida, which is a loanword derived from Korean declarative sentence-final suffix seumnida. Facilitated by corpus data obtained from Weibo, the Chinese counterpart of Twitter, this study analyzes the morphological and syntactical processes behind simida’s coinage, as well as the causes of its prevalence on Chinese social media. The findings show that simida is used by Weibo bloggers in two manners: (1) as an alternative word of 'Korea' and 'Korean'; (2) as a redundant sentence-final particle which adds a Korean-like speech style to a statement. Additionally, Weibo user profile analysis further reveals demographical distribution patterns concerning this neologism and highlights young Weibo users in the third-tier cities as the leading adopters of simida. These results are accounted for under the theoretical framework of social indexicality, especially how variations generate style in the indexical field. This article argues that the creation of such an ethnically-targeted neologism is a linguistic demonstration of Chinese netizen’s two-sided attitudes toward the previously heated Korean-wave. The exotic suffix seumnida is borrowed to Chinese as simida due to its high-frequency in Korean cultural exports. Therefore, it gradually becomes a replacement of Korea-related lexical items due to markedness, regardless of semantic prosody. Its innovative implantation to Chinese syntax, on the other hand, reflects Chinese netizens’ active manipulation of language for their online identity building. This study has implications for research on the linguistic construction of identity and style and lays the groundwork for linguistic creativity in the Chinese new media.

Keywords: Chinese neologism, loanword, humor, new media

Procedia PDF Downloads 152
26790 The Effects of Culture and Language on Social Impression Formation from Voice Pleasantness: A Study with French and Iranian People

Authors: L. Bruckert, A. Mansourzadeh

Abstract:

The voice has a major influence on interpersonal communication in everyday life via the perception of pleasantness. The evolutionary perspective postulates that the mechanisms underlying the pleasantness judgments are universal adaptations that have evolved in the service of choosing a mate (through the process of sexual selection). From this point of view, the favorite voices would be those with more marked sexually dimorphic characteristics; for example, in men with lower voice pitch, pitch is the main criterion. On the other hand, one can postulate that the mechanisms involved are gradually established since childhood through exposure to the environment, and thus the prosodic elements could take precedence in everyday life communication as it conveys information about the speaker's attitude (willingness to communicate, interest toward the interlocutors). Our study focuses on voice pleasantness and its relationship with social impression formation, exploring both the spectral aspects (pitch, timbre) and the prosodic ones. In our study, we recorded the voices through two vocal corpus (five vowels and a reading text) of 25 French males speaking French and 25 Iranian males speaking Farsi. French listeners (40 male/40 female) listened to the French voices and made a judgment either on the voice's pleasantness or on the speaker (judgment about his intelligence, honesty, sociability). The regression analyses from our acoustic measures showed that the prosodic elements (for example, the intonation and the speech rate) are the most important criteria concerning pleasantness, whatever the corpus or the listener's gender. Moreover, the correlation analyses showed that the speakers with the voices judged as the most pleasant are considered the most intelligent, sociable, and honest. The voices in Farsi have been judged by 80 other French listeners (40 male/40 female), and we found the same effect of intonation concerning the judgment of pleasantness with the corpus «vowel» whereas with the corpus «text» the pitch is more important than the prosody. It may suggest that voice perception contains some elements invariant across culture/language, whereas others are influenced by the cultural/linguistic background of the listener. Shortly in the future, Iranian people will be asked to listen either to the French voices for half of them or to the Farsi voices for the other half and produce the same judgments as the French listeners. This experimental design could potentially make it possible to distinguish what is linked to culture and what is linked to language in the case of differences in voice perception.

Keywords: cross-cultural psychology, impression formation, pleasantness, voice perception

Procedia PDF Downloads 41
26789 The Philippines’ War on Drugs: a Pragmatic Analysis on Duterte's Commemorative Speeches

Authors: Ericson O. Alieto, Aprillete C. Devanadera

Abstract:

The main objective of the study is to determine the dominant speech acts in five commemorative speeches of President Duterte. This study employed Speech Act Theory and Discourse analysis to determine how the speech acts features connote the pragmatic meaning of Duterte’s speeches. Identifying the speech acts is significant in elucidating the underlying message or the pragmatic meaning of the speeches. From the 713 sentences or utterances from the speeches, assertive with 208 occurrences from the corpus or 29% is the dominant speech acts. It was followed by expressive with 177 or 25% occurrences, directive accounts for 152 or 15% occurrences. While commisive accounts for 104 or 15% occurrences and declarative got the lowest percentage of occurrences with 72 or 10% only. These sentences when uttered by Duterte carry a certain power of language to move or influence people. Thus, the present study shows the fundamental message perceived by the listeners. Moreover, the frequent use of assertive and expressive not only explains the pragmatic message of the speeches but also reflects the personality of President Duterte.

Keywords: commemorative speech, discourse analysis, duterte, pragmatics

Procedia PDF Downloads 254
26788 Achieving Maximum Performance through the Practice of Entrepreneurial Ethics: Evidence from SMEs in Nigeria

Authors: S. B. Tende, H. L. Abubakar

Abstract:

It is acknowledged that small and medium enterprises (SMEs) may encounter different ethical issues and pressures that could affect the way in which they strategize or make decisions concerning the outcome of their business. Therefore, this research aimed at assessing entrepreneurial ethics in the business of SMEs in Nigeria. Secondary data were adopted as source of corpus for the analysis. The findings conclude that a sound entrepreneurial ethics system has a significant effect on the level of performance of SMEs in Nigeria. The Nigerian Government needs to provide both guiding and physical structures; as well as learning systems that could inculcate these entrepreneurial ethics.

Keywords: culture, entrepreneurial ethics, performance, SME

Procedia PDF Downloads 348
26787 VIAN-DH: Computational Multimodal Conversation Analysis Software and Infrastructure

Authors: Teodora Vukovic, Christoph Hottiger, Noah Bubenhofer

Abstract:

The development of VIAN-DH aims at bridging two linguistic approaches: conversation analysis/interactional linguistics (IL), so far a dominantly qualitative field, and computational/corpus linguistics and its quantitative and automated methods. Contemporary IL investigates the systematic organization of conversations and interactions composed of speech, gaze, gestures, and body positioning, among others. These highly integrated multimodal behaviour is analysed based on video data aimed at uncovering so called “multimodal gestalts”, patterns of linguistic and embodied conduct that reoccur in specific sequential positions employed for specific purposes. Multimodal analyses (and other disciplines using videos) are so far dependent on time and resource intensive processes of manual transcription of each component from video materials. Automating these tasks requires advanced programming skills, which is often not in the scope of IL. Moreover, the use of different tools makes the integration and analysis of different formats challenging. Consequently, IL research often deals with relatively small samples of annotated data which are suitable for qualitative analysis but not enough for making generalized empirical claims derived quantitatively. VIAN-DH aims to create a workspace where many annotation layers required for the multimodal analysis of videos can be created, processed, and correlated in one platform. VIAN-DH will provide a graphical interface that operates state-of-the-art tools for automating parts of the data processing. The integration of tools that already exist in computational linguistics and computer vision, facilitates data processing for researchers lacking programming skills, speeds up the overall research process, and enables the processing of large amounts of data. The main features to be introduced are automatic speech recognition for the transcription of language, automatic image recognition for extraction of gestures and other visual cues, as well as grammatical annotation for adding morphological and syntactic information to the verbal content. In the ongoing instance of VIAN-DH, we focus on gesture extraction (pointing gestures, in particular), making use of existing models created for sign language and adapting them for this specific purpose. In order to view and search the data, VIAN-DH will provide a unified format and enable the import of the main existing formats of annotated video data and the export to other formats used in the field, while integrating different data source formats in a way that they can be combined in research. VIAN-DH will adapt querying methods from corpus linguistics to enable parallel search of many annotation levels, combining token-level and chronological search for various types of data. VIAN-DH strives to bring crucial and potentially revolutionary innovation to the field of IL, (that can also extend to other fields using video materials). It will allow the processing of large amounts of data automatically and, the implementation of quantitative analyses, combining it with the qualitative approach. It will facilitate the investigation of correlations between linguistic patterns (lexical or grammatical) with conversational aspects (turn-taking or gestures). Users will be able to automatically transcribe and annotate visual, spoken and grammatical information from videos, and to correlate those different levels and perform queries and analyses.

Keywords: multimodal analysis, corpus linguistics, computational linguistics, image recognition, speech recognition

Procedia PDF Downloads 73
26786 On the Semantics and Pragmatics of 'Be Able To': Modality and Actualisation

Authors: Benoît Leclercq, Ilse Depraetere

Abstract:

The goal of this presentation is to shed new light on the semantics and pragmatics of be able to. It presents the results of a corpus analysis based on data from the BNC (British National Corpus), and discusses these results in light of a specific stance on the semantics-pragmatics interface taking into account recent developments. Be able to is often discussed in relation to can and could, all of which can be used to express ability. Such an onomasiological approach often results in the identification of usage constraints for each expression. In the case of be able to, it is the formal properties of the modal expression (unlike can and could, be able to has non-finite forms) that are in the foreground, and the modal expression is described as the verb that conveys future ability. Be able to is also argued to expressed actualised ability in the past (I was able/could to open the door). This presentation aims to provide a more accurate pragmatic-semantic profile of be able to, based on extensive data analysis and one that is embedded in a very explicit view on the semantics-pragmatics interface. A random sample of 3000 examples (1000 for each modal verb) extracted from the BNC was analysed to account for the following issues. First, the challenge is to identify the exact semantic range of be able to. The results show that, contrary to general assumption, be able to does not only express ability but it shares most of the root meanings usually associated with the possibility modals can and could. The data reveal that what is called opportunity is, in fact, the most frequent meaning of be able to. Second, attention will be given to the notion of actualisation. It is commonly argued that be able to is the preferred form when the residue actualises: (1) The only reason he was able to do that was because of the restriction (BNC, spoken) (2) It is only through my imaginative shuffling of the aces that we are able to stay ahead of the pack. (BNC, written) Although this notion has been studied in detail within formal semantic approaches, empirical data is crucially lacking and it is unclear whether actualisation constitutes a conventional (and distinguishing) property of be able to. The empirical analysis provides solid evidence that actualisation is indeed a conventional feature of the modal. Furthermore, the dataset reveals that be able to expresses actualised 'opportunities' and not actualised 'abilities'. In the final part of this paper, attention will be given to the theoretical implications of the empirical findings, and in particular to the following paradox: how can the same expression encode both modal meaning (non-factual) and actualisation (factual)? It will be argued that this largely depends on one's conception of the semantics-pragmatics interface, and that this need not be an issue when actualisation (unlike modality) is analysed as a generalised conversational implicature and thus is considered part of the conventional pragmatic layer of be able to.

Keywords: Actualisation, Modality, Pragmatics, Semantics

Procedia PDF Downloads 98