Search results for: conversational corpora
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 167

Search results for: conversational corpora

17 Differences in Assessing Hand-Written and Typed Student Exams: A Corpus-Linguistic Study

Authors: Jutta Ransmayr

Abstract:

The digital age has long arrived at Austrian schools, so both society and educationalists demand that digital means should be integrated accordingly to day-to-day school routines. Therefore, the Austrian school-leaving exam (A-levels) can now be written either by hand or by using a computer. However, the choice of writing medium (pen and paper or computer) for written examination papers, which are considered 'high-stakes' exams, raises a number of questions that have not yet been adequately investigated and answered until recently, such as: What effects do the different conditions of text production in the written German A-levels have on the component of normative linguistic accuracy? How do the spelling skills of German A-level papers written with a pen differ from those that the students wrote on the computer? And how is the teacher's assessment related to this? Which practical desiderata for German didactics can be derived from this? In a trilateral pilot project of the Austrian Center for Digital Humanities (ACDH) of the Austrian Academy of Sciences and the University of Vienna in cooperation with the Austrian Ministry of Education and the Council for German Orthography, these questions were investigated. A representative Austrian learner corpus, consisting of around 530 German A-level papers from all over Austria (pen and computer written), was set up in order to subject it to a quantitative (corpus-linguistic and statistical) and qualitative investigation with regard to the spelling and punctuation performance of the high school graduates and the differences between pen- and computer-written papers and their assessments. Relevant studies are currently available mainly from the Anglophone world. These have shown that writing on the computer increases the motivation to write, has positive effects on the length of the text, and, in some cases, also on the quality of the text. Depending on the writing situation and other technical aids, better results in terms of spelling and punctuation could also be found in the computer-written texts as compared to the handwritten ones. Studies also point towards a tendency among teachers to rate handwritten texts better than computer-written texts. In this paper, the first comparable results from the German-speaking area are to be presented. Research results have shown that, on the one hand, there are significant differences between handwritten and computer-written work with regard to performance in orthography and punctuation. On the other hand, the corpus linguistic investigation and the subsequent statistical analysis made it clear that not only the teachers' assessments of the students’ spelling performance vary enormously but also the overall assessments of the exam papers – the factor of the production medium (pen and paper or computer) also seems to play a decisive role.

Keywords: exam paper assessment, pen and paper or computer, learner corpora, linguistics

Procedia PDF Downloads 139
16 A Model for Language Intervention: Toys & Picture-Books as Early Pedagogical Props for the Transmission of Lazuri

Authors: Peri Ozlem Yuksel-Sokmen, Irfan Cagtay

Abstract:

Oral languages are destined to disappear rapidly in the absence of interventions aimed at encouraging their usage by young children. The seminal language preservation model proposed by Fishman (1991) stresses the importance of multiple generations using the endangered L1 while engaged in daily routines with younger children. Over the last two decades Fishman (2001) has used his intergenerational transmission model in documenting the revitalization of Basque languages, providing evidence that families are transmitting Euskara as a first language to their children with success. In our study, to motivate usage of Lazuri, we asked caregivers to speak the language while engaged with their toddlers (12 to 48 months) in semi-structured play, and included both parents (N=32) and grandparents (N=30) as play partners. This unnatural prompting to speak only in Lazuri was greeted with reluctance, as 90% of our families indicated that they had stopped using Lazuri with their children. Nevertheless, caregivers followed instructions and produced 67% of their utterances in Lazuri, with another 14% of utterances using a combination of Lazuri and Turkish (Codeswitch). Although children spoke mostly in Turkish (83% of utterances), frequencies of caregiver utterances in Lazuri or Codeswitch predicted the extent to which their children used the minority language in return. This trend suggests that home interventions aimed at encouraging dyads to communicate in a non-preferred, endangered language can effectively increase children’s usage of the language. Alternatively, this result suggests than any use of the minority language on the part of the children will promote its further usage by caregivers. For researchers examining links between play, culture, and child development, structured play has emerged as a critical methodology (e.g., Frost, Wortham, Reifel, 2007, Lilliard et al., 2012; Sutton-Smith, 1986; Gaskins & Miller, 2009), allowing investigation of cultural and individual variation in parenting styles, as well as the role of culture in constraining the affordances of toys. Toy props, as well as picture-books in native languages, can be used as tools in the transmission and preservation of endangered languages by allowing children to explore adult roles through enactment of social routines and conversational patterns modeled by caregivers. Through adult-guided play children not only acquire scripts for culturally significant activities, but also develop skills in expressing themselves in culturally relevant ways that may continue to develop over their lives through community engagement. Further pedagogical tools, such as language games and e-learning, will be discussed in this proposed oral talk.

Keywords: language intervention, pedagogical tools, endangered languages, Lazuri

Procedia PDF Downloads 304
15 Comparing Deep Architectures for Selecting Optimal Machine Translation

Authors: Despoina Mouratidis, Katia Lida Kermanidis

Abstract:

Machine translation (MT) is a very important task in Natural Language Processing (NLP). MT evaluation is crucial in MT development, as it constitutes the means to assess the success of an MT system, and also helps improve its performance. Several methods have been proposed for the evaluation of (MT) systems. Some of the most popular ones in automatic MT evaluation are score-based, such as the BLEU score, and others are based on lexical similarity or syntactic similarity between the MT outputs and the reference involving higher-level information like part of speech tagging (POS). This paper presents a language-independent machine learning framework for classifying pairwise translations. This framework uses vector representations of two machine-produced translations, one from a statistical machine translation model (SMT) and one from a neural machine translation model (NMT). The vector representations consist of automatically extracted word embeddings and string-like language-independent features. These vector representations used as an input to a multi-layer neural network (NN) that models the similarity between each MT output and the reference, as well as between the two MT outputs. To evaluate the proposed approach, a professional translation and a "ground-truth" annotation are used. The parallel corpora used are English-Greek (EN-GR) and English-Italian (EN-IT), in the educational domain and of informal genres (video lecture subtitles, course forum text, etc.) that are difficult to be reliably translated. They have tested three basic deep learning (DL) architectures to this schema: (i) fully-connected dense, (ii) Convolutional Neural Network (CNN), and (iii) Long Short-Term Memory (LSTM). Experiments show that all tested architectures achieved better results when compared against those of some of the well-known basic approaches, such as Random Forest (RF) and Support Vector Machine (SVM). Better accuracy results are obtained when LSTM layers are used in our schema. In terms of a balance between the results, better accuracy results are obtained when dense layers are used. The reason for this is that the model correctly classifies more sentences of the minority class (SMT). For a more integrated analysis of the accuracy results, a qualitative linguistic analysis is carried out. In this context, problems have been identified about some figures of speech, as the metaphors, or about certain linguistic phenomena, such as per etymology: paronyms. It is quite interesting to find out why all the classifiers led to worse accuracy results in Italian as compared to Greek, taking into account that the linguistic features employed are language independent.

Keywords: machine learning, machine translation evaluation, neural network architecture, pairwise classification

Procedia PDF Downloads 105
14 Management of Caverno-Venous Leakage: A Series of 133 Patients with Symptoms, Hemodynamic Workup, and Results of Surgery

Authors: Allaire Eric, Hauet Pascal, Floresco Jean, Beley Sebastien, Sussman Helene, Virag Ronald

Abstract:

Background: Caverno-venous leakage (CVL) is devastating, although barely known disease, the first cause of major physical impairment in men under 25, and responsible for 50% of resistances to phosphodiesterase 5-inhibitors (PDE5-I), affecting 30 to 40% of users in this medication class. In this condition, too early blood drainage from corpora cavernosa prevents penile rigidity and penetration during sexual intercourse. The role of conservative surgery in this disease remains controversial. Aim: Assess complications and results of combined open surgery and embolization for CVL. Method: Between June 2016 and September 2021, 133 consecutive patients underwent surgery in our institution for CVL, causing severe erectile dysfunction (ED) resistance to oral medical treatment. Procedures combined vein embolization and ligation with microsurgical techniques. We performed a pre-and post-operative clinical (Erection Harness Scale: EHS) hemodynamic evaluation by duplex sonography in all patients. Before surgery, the CVL network was visualized by computed tomography cavernography. Penile EMG was performed in case of diabetes or suspected other neurological conditions. All patients were optimized for hormonal status—data we prospectively recorded. Results: Clinical signs suggesting CVL were ED since age lower than 25, loss of erection when changing position, penile rigidity varying according to the position. Main complications were minor pulmonary embolism in 2 patients, one after airline travel, one with Factor V Leiden heterozygote mutation, one infection and three hematomas requiring reoperation, one decreased gland sensitivity lasting for more than one year. Mean pre-operative pharmacologic EHS was 2.37+/-0.64, mean pharmacologic post-operative EHS was 3.21+/-0.60, p<0.0001 (paired t-test). The mean EHS variation was 0.87+/-0.74. After surgery, 81.5% of patients had a pharmacologic EHS equal to or over 3, allowing for intercourse with penetration. Three patients (2.2%) experienced lower post-operative EHS. The main cause of failure was leakage from the deep dorsal aspect of the corpus cavernosa. In a 14 months follow-up, 83.2% of patients had a clinical EHS equal to or over 3, allowing for sexual intercourse with penetration, one-third of them without any medication. 5 patients had a penile implant after unsuccessful conservative surgery. Conclusion: Open surgery combined with embolization for CVL is an efficient approach to CVL causing severe erectile dysfunction.

Keywords: erectile dysfunction, cavernovenous leakage, surgery, embolization, treatment, result, complications, penile duplex sonography

Procedia PDF Downloads 120
13 The Conflict of Grammaticality and Meaningfulness of the Corrupt Words: A Cross-lingual Sociolinguistic Study

Authors: Jayashree Aanand, Gajjam

Abstract:

The grammatical tradition in Sanskrit literature emphasizes the importance of the correct use of Sanskrit words or linguistic units (sādhu śabda) that brings the meritorious values, denying the attribution of the same religious merit to the incorrect use of Sanskrit words (asādhu śabda) or the vernacular or corrupt forms (apa-śabda or apabhraṁśa), even though they may help in communication. The current research, the culmination of the doctoral research on sentence definition, studies the difference among the comprehension of both correct and incorrect word forms in Sanskrit and Marathi languages in India. Based on the total of 19 experiments (both web-based and classroom-controlled) on approximately 900 Indian readers, it is found that while the incorrect forms in Sanskrit are comprehended with lesser accuracy than the correct word forms, no such difference can be seen for the Marathi language. It is interpreted that the incorrect word forms in the native language or in the language which is spoken daily (such as Marathi) will pose a lesser cognitive load as compared to the language that is not spoken on a daily basis but only used for reading (such as Sanskrit). The theoretical base for the research problem is as follows: among the three main schools of Language Science in ancient India, the Vaiyākaraṇas (Grammarians) hold that the corrupt word forms do have their own expressive power since they convey meaning, while as the Mimāṁsakas (the Exegesists) and the Naiyāyikas (the Logicians) believe that the corrupt forms can only convey the meaning indirectly, by recalling their association and similarity with the correct forms. The grammarians argue that the vernaculars that are born of the speaker’s inability to speak proper Sanskrit are regarded as degenerate versions or fallen forms of the ‘divine’ Sanskrit language and speakers who could not use proper Sanskrit or the standard language were considered as Śiṣṭa (‘elite’). The different ideas of different schools strictly adhere to their textual dispositions. For the last few years, sociolinguists have agreed that no variety of language is inherently better than any other; they are all the same as long as they serve the need of people that use them. Although the standard form of a language may offer the speakers some advantages, the non-standard variety is considered the most natural style of speaking. This is visible in the results. If the incorrect word forms incur the recall of the correct word forms in the reader as the theory suggests, it would have added one extra step in the process of sentential cognition leading to more cognitive load and less accuracy. This has not been the case for the Marathi language. Although speaking and listening to the vernaculars is the common practice and reading the vernacular is not, Marathi readers have readily and accurately comprehended the incorrect word forms in the sentences, as against the Sanskrit readers. The primary reason being Sanskrit is spoken and also read in the standard form only and the vernacular forms in Sanskrit are not found in the conversational data.

Keywords: experimental sociolinguistics, grammaticality and meaningfulness, Marathi, Sanskrit

Procedia PDF Downloads 103
12 Spatial Conceptualization in French and Italian Speakers: A Contrastive Approach in the Context of the Linguistic Relativity Theory

Authors: Camilla Simoncelli

Abstract:

The connection between language and cognition has been one of the main interests of linguistics from several years. According to the Sapir-Whorf Linguistic Relativity Theory, the way we perceive reality depends on the language we speak which in turn has a central role in the human cognition. This paper is in line with this research work with the aim of analyzing how language structures reflect on our cognitive abilities even in the description of space, which is generally considered as a human natural and universal domain. The main objective is to identify the differences in the encoding of spatial inclusion relationships in French and Italian speakers to make evidence that a significant variation exists at various levels even in two similar systems. Starting from the constitution a corpora, the first step of the study has been to establish the relevant complex prepositions marking an inclusion relation in French and Italian: au centre de, au cœur de, au milieu de, au sein de, à l'intérieur de and the opposition entre/parmi in French; al centro di, al cuore di, nel mezzo di, in seno a, all'interno di and the fra/tra contrast in Italian. These prepositions had been classified on the base of the type of Noun following them (e.g. mass nouns, concrete nouns, abstract nouns, body-parts noun, etc.) following the Collostructional Analysis of lexemes with the purpose of analyzing the preferred construction of each preposition comparing the relations construed. Comparing the Italian and the French results it has been possible to define the degree of representativeness of each target Noun for the chosen preposition studied. Lexicostatistics and Statistical Association Measures showed the values of attraction or repulsion between lexemes and a given preposition, highlighting which words are over-represented or under-represented in a specific context compared to the expected results. For instance, a Noun as Dibattiti has a negative value for the Italian Al cuore di (-1,91), but it has a strong positive representativeness for the corresponding French Au cœur de (+677,76). The value, positive or negative, is the result of a hypergeometric distribution law which displays the current use of some relevant nouns in relations of spatial inclusion by French and Italian speakers. Differences on the kind of location conceptualization denote syntactic and semantic constraints based on spatial features as well as on linguistic peculiarity, too. The aim of this paper is to demonstrate that the domain of spatial relations is basic to human experience and is linked to universally shared perceptual mechanisms which create mental representations depending on the language use. Therefore, linguistic coding strongly correlates with the way spatial distinctions are conceptualized for non-verbal tasks even in close language systems, like Italian and French.

Keywords: cognitive semantics, cross-linguistic variations, locational terms, non-verbal spatial representations

Procedia PDF Downloads 87
11 Identifying Confirmed Resemblances in Problem-Solving Engineering, Both in the Past and Present

Authors: Colin Schmidt, Adrien Lecossier, Pascal Crubleau, Simon Richir

Abstract:

Introduction:The widespread availability of artificial intelligence, exemplified by Generative Pre-trained Transformers (GPT) relying on large language models (LLM), has caused a seismic shift in the realm of knowledge. Everyone now has the capacity to swiftly learn how these models can either serve them well or not. Today, conversational AI like ChatGPT is grounded in neural transformer models, a significant advance in natural language processing facilitated by the emergence of renowned LLMs constructed using neural transformer architecture. Inventiveness of an LLM : OpenAI's GPT-3 stands as a premier LLM, capable of handling a broad spectrum of natural language processing tasks without requiring fine-tuning, reliably producing text that reads as if authored by humans. However, even with an understanding of how LLMs respond to questions asked, there may be lurking behind OpenAI’s seemingly endless responses an inventive model yet to be uncovered. There may be some unforeseen reasoning emerging from the interconnection of neural networks here. Just as a Soviet researcher in the 1940s questioned the existence of Common factors in inventions, enabling an Under standing of how and according to what principles humans create them, it is equally legitimate today to explore whether solutions provided by LLMs to complex problems also share common denominators. Theory of Inventive Problem Solving (TRIZ) : We will revisit some fundamentals of TRIZ and how Genrich ALTSHULLER was inspired by the idea that inventions and innovations are essential means to solve societal problems. It's crucial to note that traditional problem-solving methods often fall short in discovering innovative solutions. The design team is frequently hampered by psychological barriers stemming from confinement within a highly specialized knowledge domain that is difficult to question. We presume ChatGPT Utilizes TRIZ 40. Hence, the objective of this research is to decipher the inventive model of LLMs, particularly that of ChatGPT, through a comparative study. This will enhance the efficiency of sustainable innovation processes and shed light on how the construction of a solution to a complex problem was devised. Description of the Experimental Protocol : To confirm or reject our main hypothesis that is to determine whether ChatGPT uses TRIZ, we will follow a stringent protocol that we will detail, drawing on insights from a panel of two TRIZ experts. Conclusion and Future Directions : In this endeavor, we sought to comprehend how an LLM like GPT addresses complex challenges. Our goal was to analyze the inventive model of responses provided by an LLM, specifically ChatGPT, by comparing it to an existing standard model: TRIZ 40. Of course, problem solving is our main focus in our endeavours.

Keywords: artificial intelligence, Triz, ChatGPT, inventiveness, problem-solving

Procedia PDF Downloads 34
10 Development and Validation of a Quantitative Measure of Engagement in the Analysing Aspect of Dialogical Inquiry

Authors: Marcus Goh Tian Xi, Alicia Chua Si Wen, Eunice Gan Ghee Wu, Helen Bound, Lee Liang Ying, Albert Lee

Abstract:

The Map of Dialogical Inquiry provides a conceptual look at the underlying nature of future-oriented skills. According to the Map, learning is learner-oriented, with conversational time shifted from teachers to learners, who play a strong role in deciding what and how they learn. For example, in courses operating on the principles of Dialogical Inquiry, learners were able to leave the classroom with a deeper understanding of the topic, broader exposure to differing perspectives, and stronger critical thinking capabilities, compared to traditional approaches to teaching. Despite its contributions to learning, the Map is grounded in a qualitative approach both in its development and its application for providing feedback to learners and educators. Studies hinge on openended responses by Map users, which can be time consuming and resource intensive. The present research is motivated by this gap in practicality by aiming to develop and validate a quantitative measure of the Map. In addition, a quantifiable measure may also strengthen applicability by making learning experiences trackable and comparable. The Map outlines eight learning aspects that learners should holistically engage. This research focuses on the Analysing aspect of learning. According to the Map, Analysing has four key components: liking or engaging in logic, using interpretative lenses, seeking patterns, and critiquing and deconstructing. Existing scales of constructs (e.g., critical thinking, rationality) related to these components were identified so that the current scale could adapt items from. Specifically, items were phrased beginning with an “I”, followed by an action phrase, to fulfil the purpose of assessing learners' engagement with Analysing either in general or in classroom contexts. Paralleling standard scale development procedure, the 26-item Analysing scale was administered to 330 participants alongside existing scales with varying levels of association to Analysing, to establish construct validity. Subsequently, the scale was refined and its dimensionality, reliability, and validity were determined. Confirmatory factor analysis (CFA) revealed if scale items loaded onto the four factors corresponding to the components of Analysing. To refine the scale, items were systematically removed via an iterative procedure, according to their factor loadings and results of likelihood ratio tests at each step. Eight items were removed this way. The Analysing scale is better conceptualised as unidimensional, rather than comprising the four components identified by the Map, for three reasons: 1) the covariance matrix of the model specified for the CFA was not positive definite, 2) correlations among the four factors were high, and 3) exploratory factor analyses did not yield an easily interpretable factor structure of Analysing. Regarding validity, since the Analysing scale had higher correlations with conceptually similar scales than conceptually distinct scales, with minor exceptions, construct validity was largely established. Overall, satisfactory reliability and validity of the scale suggest that the current procedure can result in a valid and easy-touse measure for each aspect of the Map.

Keywords: analytical thinking, dialogical inquiry, education, lifelong learning, pedagogy, scale development

Procedia PDF Downloads 66
9 Teachers’ Language Insecurity in English as a Second Language Instruction: Developing Effective In-Service Training

Authors: Mamiko Orii

Abstract:

This study reports on primary school second language teachers’ sources of language insecurity. Furthermore, it aims to develop an in-service training course to reduce anxiety and build sufficient English communication skills. Language/Linguistic insecurity refers to a lack of confidence experienced by language speakers. In particular, second language/non-native learners often experience insecurity, influencing their learning efficacy. While language learner insecurity has been well-documented, research on the insecurity of language teaching professionals is limited. Teachers’ language insecurity or anxiety in target language use may adversely affect language instruction. For example, they may avoid classroom activities requiring intensive language use. Therefore, understanding teachers’ language insecurity and providing continuing education to help teachers to improve their proficiency is vital to improve teaching quality. This study investigated Japanese primary school teachers’ language insecurity. In Japan, teachers are responsible for teaching most subjects, including English, which was recently added as compulsory. Most teachers have never been professionally trained in second language instruction during college teacher certificate preparation, leading to low confidence in English teaching. Primary source of language insecurity is a lack of confidence regarding English communication skills. Their actual use of English in classrooms remains unclear. Teachers’ classroom speech remains a neglected area requiring improvement. A more refined programme for second language teachers could be constructed if we can identify areas of need. Two questionnaires were administered to primary school teachers in Tokyo: (1) Questionnaire A: 396 teachers answered questions (using a 5-point scale) concerning classroom teaching anxiety and general English use and needs for in-service training (Summer 2021); (2) Questionnaire B: 20 teachers answered detailed questions concerning their English use (Autumn 2022). Questionnaire A’s responses showed that over 80% of teachers have significant language insecurity and anxiety, mainly when speaking English in class or teaching independently. Most teachers relied on a team-teaching partner (e.g., ALT) and avoided speaking English. Over 70% of the teachers said they would like to participate in training courses in classroom English. Questionnaire B’s results showed that teachers could use simple classroom English, such as greetings and basic instructions (e.g., stand up, repeat after me), and initiate conversation (e.g., asking questions). In contrast, teachers reported that conversations were mainly carried on in a simple question-answer style. They had difficulty continuing conversations. Responding to learners’ ‘on-the-spot’ utterances was particularly difficult. Instruction in turn-taking patterns suitable in the classroom communication context is needed. Most teachers received grammar-based instruction during their entire English education. They were predominantly exposed to displayed questions and form-focused corrective feedback. Therefore, strategies such as encouraging teachers to ask genuine questions (i.e., referential questions) and responding to students with content feedback are crucial. When learners’ utterances are incorrect or unsatisfactory, teachers should rephrase or extend (recast) them instead of offering explicit corrections. These strategies support a continuous conversational flow. These results offer benefits beyond Japan’s English as a second Language context. They will be valuable in any context where primary school teachers are underprepared but must provide English-language instruction.

Keywords: english as a second/non-native language, in-service training, primary school, teachers’ language insecurity

Procedia PDF Downloads 46
8 The End Justifies the Means: Using Programmed Mastery Drill to Teach Spoken English to Spanish Youngsters, without Relying on Homework

Authors: Robert Pocklington

Abstract:

Most current language courses expect students to be ‘vocational’, sacrificing their free time in order to learn. However, pupils with a full-time job, or bringing up children, hardly have a spare moment. Others just need the language as a tool or a qualification, as if it were book-keeping or a driving license. Then there are children in unstructured families whose stressful life makes private study almost impossible. And the countless parents whose evenings and weekends have become a nightmare, trying to get the children to do their homework. There are many arguments against homework being a necessity (rather than an optional extra for more ambitious or dedicated students), making a clear case for teaching methods which facilitate full learning of the key content within the classroom. A methodology which could be described as Programmed Mastery Learning has been used at Fluency Language Academy (Spain) since 1992, to teach English to over 4000 pupils yearly, with a staff of around 100 teachers, barely requiring homework. The course is structured according to the tenets of Programmed Learning: small manageable teaching steps, immediate feedback, and constant successful activity. For the Mastery component (not stopping until everyone has learned), the memorisation and practice are entrusted to flashcard-based drilling in the classroom, leading all students to progress together and develop a permanently growing knowledge base. Vocabulary and expressions are memorised using flashcards as stimuli, obliging the brain to constantly recover words from the long-term memory and converting them into reflex knowledge, before they are deployed in sentence building. The use of grammar rules is practised with ‘cue’ flashcards: the brain refers consciously to the grammar rule each time it produces a phrase until it comes easily. This automation of lexicon and correct grammar use greatly facilitates all other language and conversational activities. The full B2 course consists of 48 units each of which takes a class an average of 17,5 hours to complete, allowing the vast majority of students to reach B2 level in 840 class hours, which is corroborated by an 85% pass-rate in the Cambridge University B2 exam (First Certificate). In the past, studying for qualifications was just one of many different options open to young people. Nowadays, youngsters need to stay at school and obtain qualifications in order to get any kind of job. There are many students in our classes who have little intrinsic interest in what they are studying; they just need the certificate. In these circumstances and with increasing government pressure to minimise failure, teachers can no longer think ‘If they don’t study, and fail, its their problem’. It is now becoming the teacher’s problem. Teachers are ever more in need of methods which make their pupils successful learners; this means assuring learning in the classroom. Furthermore, homework is arguably the main divider between successful middle-class schoolchildren and failing working-class children who drop out: if everything important is learned at school, the latter will have a much better chance, favouring inclusiveness in the language classroom.

Keywords: flashcard drilling, fluency method, mastery learning, programmed learning, teaching English as a foreign language

Procedia PDF Downloads 84
7 A Corpus-Based Analysis of "MeToo" Discourse in South Korea: Coverage Representation in Korean Newspapers

Authors: Sun-Hee Lee, Amanda Kraley

Abstract:

The “MeToo” movement is a social movement against sexual abuse and harassment. Though the hashtag went viral in 2017 following different cultural flashpoints in different countries, the initial response was quiet in South Korea. This radically changed in January 2018, when a high-ranking senior prosecutor, Seo Ji-hyun, gave a televised interview discussing being sexually assaulted by a colleague. Acknowledging public anger, particularly among women, on the long-existing problems of sexual harassment and abuse, the South Korean media have focused on several high-profile cases. Analyzing the media representation of these cases is a window into the evolving South Korean discourse around “MeToo.” This study presents a linguistic analysis of “MeToo” discourse in South Korea by utilizing a corpus-based approach. The term corpus (pl. corpora) is used to refer to electronic language data, that is, any collection of recorded instances of spoken or written language. A “MeToo” corpus has been collected by extracting newspaper articles containing the keyword “MeToo” from BIGKinds, big data analysis, and service and Nexis Uni, an online academic database search engine, to conduct this language analysis. The corpus analysis explores how Korean media represent accusers and the accused, victims and perpetrators. The extracted data includes 5,885 articles from four broadsheet newspapers (Chosun, JoongAng, Hangyore, and Kyunghyang) and 88 articles from two Korea-based English newspapers (Korea Times and Korea Herald) between January 2017 and November 2020. The information includes basic data analysis with respect to keyword frequency and network analysis and adds refined examinations of select corpus samples through naming strategies, semantic relations, and pragmatic properties. Along with the exponential increase of the number of articles containing the keyword “MeToo” from 104 articles in 2017 to 3,546 articles in 2018, the network and keyword analysis highlights ‘US,’ ‘Harvey Weinstein’, and ‘Hollywood,’ as keywords for 2017, with articles in 2018 highlighting ‘Seo Ji-Hyun, ‘politics,’ ‘President Moon,’ ‘An Ui-Jeong, ‘Lee Yoon-taek’ (the names of perpetrators), and ‘(Korean) society.’ This outcome demonstrates the shift of media focus from international affairs to domestic cases. Another crucial finding is that word ‘defamation’ is widely distributed in the “MeToo” corpus. This relates to the South Korean legal system, in which a person who defames another by publicly alleging information detrimental to their reputation—factual or fabricated—is punishable by law (Article 307 of the Criminal Act of Korea). If the defamation occurs on the internet, it is subject to aggravated punishment under the Act on Promotion of Information and Communications Network Utilization and Information Protection. These laws, in particular, have been used against accusers who have publicly come forward in the wake of “MeToo” in South Korea, adding an extra dimension of risk. This corpus analysis of “MeToo” newspaper articles contributes to the analysis of the media representation of the “MeToo” movement and sheds light on the shifting landscape of gender relations in the public sphere in South Korea.

Keywords: corpus linguistics, MeToo, newspapers, South Korea

Procedia PDF Downloads 189
6 Even When the Passive Resistance Is Obligatory: Civil Intellectuals’ Solidarity Activism in Tea Workers Movement

Authors: Moshreka Aditi Huq

Abstract:

This study shows how a progressive portion of civil intellectuals in Bangladesh contributed as the solidarity activist entities in a movement of tea workers that became the symbol of their unique moral struggle. Their passive yet sharp way of resistance, with the integration of mass tea workers of a tea estate, got demonstrated against certain private companies and government officials who approached to establish a special economic zone inside the tea garden without offering any compensation and rehabilitation for poor tea workers. Due to massive protests and rebellion, the authorized entrepreneurs had to step back and called off the project immediately. The extraordinary features of this movement generated itself from the deep core social need of indigenous tea workers who are still imprisoned in the colonial cage. Following an anthropological and ethnographic perspective, this study adopted the main three techniques of intensive interview, focus group discussion, and laborious observation, to extract empirical data. The intensive interviews were undertaken informally using a mostly conversational approach. Focus group discussions were piloted among various representative groups where observations prevailed as part of the regular documentation process. These were conducted among civil intellectual entities, tea workers, tea estate authorities, civil service authorities, and business officials to obtain a holistic view of the situation. The fieldwork was executed in capital Dhaka city, along with northern areas like Chandpur-Begumkhan Tea Estate of Chunarughat Upazilla and Habiganj city of Habiganj District of Bangladesh. Correspondingly, secondary data were accessed through books, scholarly papers, archives, newspapers, reports, leaflets, posters, writing blog, and electronic pages of social media. The study results find that: (1) civil intellectuals opposed state-sponsored business impositions by producing counter-discourse and struggled against state hegemony through the phases of the movement; (2) instead of having the active physical resistance, civil intellectuals’ strength was preferably in passive form which was portrayed through their intellectual labor; (3) the combined movement of tea workers and civil intellectuals reflected on social security of ethnic worker communities that contrasts state’s pseudo-development motives which ultimately supports offensive and oppressive neoliberal growths of economy; (4) civil intellectuals are revealed as having certain functional limitations in the process of movement organization as well as resource mobilization; (5) in specific contexts, the genuine need of protest by indigenous subaltern can overshadow intellectual elitism and helps to raise the voices of ‘subjugated knowledge’. This study is quite likely to represent two sets of apparent protagonist entities in the discussion of social injustice and oppressive development intervention. On the one, hand it may help us to find the basic functional characteristics of civil intellectuals in Bangladesh when they are in a passive mode of resistance in social movement issues. On the other hand, it represents the community ownership and inherent protest tendencies of indigenous workers when they feel threatened and insecure. The study seems to have the potential to understand the conditions of ‘subjugated knowledge’ of subalterns. Furthermore, being the memory and narratives, these ‘activism mechanisms’ of social entities broadens the path to understand ‘power’ and ‘resistance’ in more fascinating ways.

Keywords: civil intellectuals, resistance, subjugated knowledge, indigenous

Procedia PDF Downloads 103
5 VIAN-DH: Computational Multimodal Conversation Analysis Software and Infrastructure

Authors: Teodora Vukovic, Christoph Hottiger, Noah Bubenhofer

Abstract:

The development of VIAN-DH aims at bridging two linguistic approaches: conversation analysis/interactional linguistics (IL), so far a dominantly qualitative field, and computational/corpus linguistics and its quantitative and automated methods. Contemporary IL investigates the systematic organization of conversations and interactions composed of speech, gaze, gestures, and body positioning, among others. These highly integrated multimodal behaviour is analysed based on video data aimed at uncovering so called “multimodal gestalts”, patterns of linguistic and embodied conduct that reoccur in specific sequential positions employed for specific purposes. Multimodal analyses (and other disciplines using videos) are so far dependent on time and resource intensive processes of manual transcription of each component from video materials. Automating these tasks requires advanced programming skills, which is often not in the scope of IL. Moreover, the use of different tools makes the integration and analysis of different formats challenging. Consequently, IL research often deals with relatively small samples of annotated data which are suitable for qualitative analysis but not enough for making generalized empirical claims derived quantitatively. VIAN-DH aims to create a workspace where many annotation layers required for the multimodal analysis of videos can be created, processed, and correlated in one platform. VIAN-DH will provide a graphical interface that operates state-of-the-art tools for automating parts of the data processing. The integration of tools that already exist in computational linguistics and computer vision, facilitates data processing for researchers lacking programming skills, speeds up the overall research process, and enables the processing of large amounts of data. The main features to be introduced are automatic speech recognition for the transcription of language, automatic image recognition for extraction of gestures and other visual cues, as well as grammatical annotation for adding morphological and syntactic information to the verbal content. In the ongoing instance of VIAN-DH, we focus on gesture extraction (pointing gestures, in particular), making use of existing models created for sign language and adapting them for this specific purpose. In order to view and search the data, VIAN-DH will provide a unified format and enable the import of the main existing formats of annotated video data and the export to other formats used in the field, while integrating different data source formats in a way that they can be combined in research. VIAN-DH will adapt querying methods from corpus linguistics to enable parallel search of many annotation levels, combining token-level and chronological search for various types of data. VIAN-DH strives to bring crucial and potentially revolutionary innovation to the field of IL, (that can also extend to other fields using video materials). It will allow the processing of large amounts of data automatically and, the implementation of quantitative analyses, combining it with the qualitative approach. It will facilitate the investigation of correlations between linguistic patterns (lexical or grammatical) with conversational aspects (turn-taking or gestures). Users will be able to automatically transcribe and annotate visual, spoken and grammatical information from videos, and to correlate those different levels and perform queries and analyses.

Keywords: multimodal analysis, corpus linguistics, computational linguistics, image recognition, speech recognition

Procedia PDF Downloads 78
4 A Multimodal Discourse Analysis of Gender Representation on Health and Fitness Magazine Cover Pages

Authors: Nashwa Elyamany

Abstract:

In visual cultures, namely that of the United States, media representations are such influential and pervasive reflections of societal norms and expectations to the extent that they impact the manner in which both genders view themselves. Health and fitness magazines fall within the realm of visual culture. Since the main goal of communication is to ensure proper dissemination of information in order for the target audience to grasp the intended messages, it becomes imperative that magazine publishers, editors, advertisers and image producers use different modes of communication within their reach to convey messages to their readers and viewers. A rapid waxing flow of multimodality floods popular discourse, particularly health and fitness magazine cover pages. The use of well-crafted cover lines and visual images is imbued with agendas, consumerist ideologies and properties capable of effectively conveying implicit and explicit meaning to potential readers and viewers. In essence, the primary goal of this thesis is to interrogate the multi-semiotic operations and manifestations of hegemonic masculinity and femininity in male and female body culture, particularly on the cover pages of the twin American magazines Men's Health and Women's Health using corpora that spanned from 2011 to the mid of 2016. The researcher explores the semiotic resources that contribute to shaping and legitimizing a new form of postmodern, consumerist, gendered discourse that positions the reader-viewer ideologically. Methodologically, the researcher carries out analysis on the macro and micro levels. On the macro level, the researcher takes on a critical stance to illuminate the ideological nature of the multimodal ensemble of the cover pages, and, on the micro level, seeks to put forward new theoretical and methodological routes through which the semiotic choices well invested on the media texts can be more objectively scrutinized. On the macro level, a 'themes' analysis is initially conducted to isolate the overarching themes that dominate the fitness discourse on the cover pages under study. It is argued that variation in terms of frequencies of such themes is indicative, broadly speaking, of which facets of hegemonic masculinity and femininity are infused in the fitness discourse on the cover pages. On the micro level, this research work encompasses three sub-levels of analysis. The researcher follows an SF-MMDA approach, drawing on a trio of analytical frameworks: Halliday's SFG for the verbal analysis; Kress & van Leeuween's VG for the visual analysis; and CMT in relation to Sperber & Wilson's RT for the pragma-cognitive analysis of multimodal metaphors and metonymies. The data is presented in terms of detailed descriptions in conjunction with frequency tables, ANOVA with alpha=0.05 and MANOVA in the multiple phases of analysis. Insights and findings from this multi-faceted, social-semiotic analysis are interpreted in light of Cultivation Theory, Self-objectification Theory and the literature to date. Implications for future research include the implementation of a multi-dimensional approach whereby linguistic and visual analytical models are deployed with special regards to cultural variation.

Keywords: gender, hegemony, magazine cover page, multimodal discourse analysis, multimodal metaphor, multimodal metonymy, systemic functional grammar, visual grammar

Procedia PDF Downloads 316
3 Neologisms and Word-Formation Processes in Board Game Rulebook Corpus: Preliminary Results

Authors: Athanasios Karasimos, Vasiliki Makri

Abstract:

This research focuses on the design and development of the first text Corpus based on Board Game Rulebooks (BGRC) with direct application on the morphological analysis of neologisms and tendencies in word-formation processes. Corpus linguistics is a dynamic field that examines language through the lens of vast collections of texts. These corpora consist of diverse written and spoken materials, ranging from literature and newspapers to transcripts of everyday conversations. By morphologically analyzing these extensive datasets, morphologists can gain valuable insights into how language functions and evolves, as these extensive datasets can reflect the byproducts of inflection, derivation, blending, clipping, compounding, and neology. This entails scrutinizing how words are created, modified, and combined to convey meaning in a corpus of challenging, creative, and straightforward texts that include rules, examples, tutorials, and tips. Board games teach players how to strategize, consider alternatives, and think flexibly, which are critical elements in language learning. Their rulebooks reflect not only their weight (complexity) but also the language properties of each genre and subgenre of these games. Board games are a captivating realm where strategy, competition, and creativity converge. Beyond the excitement of gameplay, board games also spark the art of word creation. Word games, like Scrabble, Codenames, Bananagrams, Wordcraft, Alice in the Wordland, Once uUpona Time, challenge players to construct words from a pool of letters, thus encouraging linguistic ingenuity and vocabulary expansion. These games foster a love for language, motivating players to unearth obscure words and devise clever combinations. On the other hand, the designers and creators produce rulebooks, where they include their joy of discovering the hidden potential of language, igniting the imagination, and playing with the beauty of words, making these games a delightful fusion of linguistic exploration and leisurely amusement. In this research, more than 150 rulebooks in English from all types of modern board games, either language-independent or language-dependent, are used to create the BGRC. A representative sample of each genre (family, party, worker placement, deckbuilding, dice, and chance games, strategy, eurogames, thematic, role-playing, among others) was selected based on the score from BoardGameGeek, the size of the texts and the level of complexity (weight) of the game. A morphological model with morphological networks, multi-word expressions, and word-creation mechanics based on the complexity of the textual structure, difficulty, and board game category will be presented. In enabling the identification of patterns, trends, and variations in word formation and other morphological processes, this research aspires to make avail of this creative yet strict text genre so as to (a) give invaluable insight into morphological creativity and innovation that (re)shape the lexicon of the English language and (b) test morphological theories. Overall, it is shown that corpus linguistics empowers us to explore the intricate tapestry of language, and morphology in particular, revealing its richness, flexibility, and adaptability in the ever-evolving landscape of human expression.

Keywords: board game rulebooks, corpus design, morphological innovations, neologisms, word-formation processes

Procedia PDF Downloads 59
2 PsyVBot: Chatbot for Accurate Depression Diagnosis using Long Short-Term Memory and NLP

Authors: Thaveesha Dheerasekera, Dileeka Sandamali Alwis

Abstract:

The escalating prevalence of mental health issues, such as depression and suicidal ideation, is a matter of significant global concern. It is plausible that a variety of factors, such as life events, social isolation, and preexisting physiological or psychological health conditions, could instigate or exacerbate these conditions. Traditional approaches to diagnosing depression entail a considerable amount of time and necessitate the involvement of adept practitioners. This underscores the necessity for automated systems capable of promptly detecting and diagnosing symptoms of depression. The PsyVBot system employs sophisticated natural language processing and machine learning methodologies, including the use of the NLTK toolkit for dataset preprocessing and the utilization of a Long Short-Term Memory (LSTM) model. The PsyVBot exhibits a remarkable ability to diagnose depression with a 94% accuracy rate through the analysis of user input. Consequently, this resource proves to be efficacious for individuals, particularly those enrolled in academic institutions, who may encounter challenges pertaining to their psychological well-being. The PsyVBot employs a Long Short-Term Memory (LSTM) model that comprises a total of three layers, namely an embedding layer, an LSTM layer, and a dense layer. The stratification of these layers facilitates a precise examination of linguistic patterns that are associated with the condition of depression. The PsyVBot has the capability to accurately assess an individual's level of depression through the identification of linguistic and contextual cues. The task is achieved via a rigorous training regimen, which is executed by utilizing a dataset comprising information sourced from the subreddit r/SuicideWatch. The diverse data present in the dataset ensures precise and delicate identification of symptoms linked with depression, thereby guaranteeing accuracy. PsyVBot not only possesses diagnostic capabilities but also enhances the user experience through the utilization of audio outputs. This feature enables users to engage in more captivating and interactive interactions. The PsyVBot platform offers individuals the opportunity to conveniently diagnose mental health challenges through a confidential and user-friendly interface. Regarding the advancement of PsyVBot, maintaining user confidentiality and upholding ethical principles are of paramount significance. It is imperative to note that diligent efforts are undertaken to adhere to ethical standards, thereby safeguarding the confidentiality of user information and ensuring its security. Moreover, the chatbot fosters a conducive atmosphere that is supportive and compassionate, thereby promoting psychological welfare. In brief, PsyVBot is an automated conversational agent that utilizes an LSTM model to assess the level of depression in accordance with the input provided by the user. The demonstrated accuracy rate of 94% serves as a promising indication of the potential efficacy of employing natural language processing and machine learning techniques in tackling challenges associated with mental health. The reliability of PsyVBot is further improved by the fact that it makes use of the Reddit dataset and incorporates Natural Language Toolkit (NLTK) for preprocessing. PsyVBot represents a pioneering and user-centric solution that furnishes an easily accessible and confidential medium for seeking assistance. The present platform is offered as a modality to tackle the pervasive issue of depression and the contemplation of suicide.

Keywords: chatbot, depression diagnosis, LSTM model, natural language process

Procedia PDF Downloads 36
1 The Integration of Digital Humanities into the Sociology of Knowledge Approach to Discourse Analysis

Authors: Gertraud Koch, Teresa Stumpf, Alejandra Tijerina García

Abstract:

Discourse analysis research approaches belong to the central research strategies applied throughout the humanities; they focus on the countless forms and ways digital texts and images shape present-day notions of the world. Despite the constantly growing number of relevant digital, multimodal discourse resources, digital humanities (DH) methods are thus far not systematically developed and accessible for discourse analysis approaches. Specifically, the significance of multimodality and meaning plurality modelling are yet to be sufficiently addressed. In order to address this research gap, the D-WISE project aims to develop a prototypical working environment as digital support for the sociology of knowledge approach to discourse analysis and new IT-analysis approaches for the use of context-oriented embedding representations. Playing an essential role throughout our research endeavor is the constant optimization of hermeneutical methodology in the use of (semi)automated processes and their corresponding epistemological reflection. Among the discourse analyses, the sociology of knowledge approach to discourse analysis is characterised by the reconstructive and accompanying research into the formation of knowledge systems in social negotiation processes. The approach analyses how dominant understandings of a phenomenon develop, i.e., the way they are expressed and consolidated by various actors in specific arenas of discourse until a specific understanding of the phenomenon and its socially accepted structure are established. This article presents insights and initial findings from D-WISE, a joint research project running since 2021 between the Institute of Anthropological Studies in Culture and History and the Language Technology Group of the Department of Informatics at the University of Hamburg. As an interdisciplinary team, we develop central innovations with regard to the availability of relevant DH applications by building up a uniform working environment, which supports the procedure of the sociology of knowledge approach to discourse analysis within open corpora and heterogeneous, multimodal data sources for researchers in the humanities. We are hereby expanding the existing range of DH methods by developing contextualized embeddings for improved modelling of the plurality of meaning and the integrated processing of multimodal data. The alignment of this methodological and technical innovation is based on the epistemological working methods according to grounded theory as a hermeneutic methodology. In order to systematically relate, compare, and reflect the approaches of structural-IT and hermeneutic-interpretative analysis, the discourse analysis is carried out both manually and digitally. Using the example of current discourses on digitization in the healthcare sector and the associated issues regarding data protection, we have manually built an initial data corpus of which the relevant actors and discourse positions are analysed in conventional qualitative discourse analysis. At the same time, we are building an extensive digital corpus on the same topic based on the use and further development of entity-centered research tools such as topic crawlers and automated newsreaders. In addition to the text material, this consists of multimodal sources such as images, video sequences, and apps. In a blended reading process, the data material is filtered, annotated, and finally coded with the help of NLP tools such as dependency parsing, named entity recognition, co-reference resolution, entity linking, sentiment analysis, and other project-specific tools that are being adapted and developed. The coding process is carried out (semi-)automated by programs that propose coding paradigms based on the calculated entities and their relationships. Simultaneously, these can be specifically trained by manual coding in a closed reading process and specified according to the content issues. Overall, this approach enables purely qualitative, fully automated, and semi-automated analyses to be compared and reflected upon.

Keywords: entanglement of structural IT and hermeneutic-interpretative analysis, multimodality, plurality of meaning, sociology of knowledge approach to discourse analysis

Procedia PDF Downloads 203