Search results for: corpus grammar
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 546

306 The Role and Effects of Communication on Occupational Safety: A Review

Authors: Pieter A. Cornelissen, Joris J. Van Hoof

Abstract:

Interest in improving occupational safety arose almost simultaneously with the beginning of the Industrial Revolution. Yet it was not until the late 1970s that the role of communication was considered in scientific research on occupational safety. In recent years, the importance of communication as a means of improving occupational safety has increased, not only because communication might have a direct effect on safety performance and safety outcomes, but also because it can be viewed as a major component of other important safety-related elements (e.g., training, safety meetings, leadership). And while safety communication is an increasingly important topic in research, its operationalization is often vague and differs among studies. This is problematic not only when comparing results, but also when applying these results to practice and the work floor. By means of an in-depth analysis that builds on an existing dataset, this review aims to overcome these problems. The initial database search yielded 25,527 articles, which were reduced to a research corpus of 176 articles. Focusing on the 37 articles of this corpus that addressed communication (related to safety outcomes and safety performance), the current study provides a comprehensive overview of the role and effects of safety communication and outlines the conditions under which communication contributes to a safer work environment. The study shows that the literature commonly distinguishes between safety communication (i.e., the exchange or dissemination of safety-related information) and feedback (i.e., a reactive form of communication). And although there is a consensus among researchers that both communication and feedback positively affect safety performance, there is debate about the directness of this relationship. Whereas some researchers assume a direct relationship between safety communication and safety performance, others state that this relationship is mediated by safety climate. 
One of the key findings is that despite the strongly present view that safety communication is a formal and top-down safety management tool, researchers stress the importance of open communication that encourages and allows employees to express their worries, experiences, views, and share information. This raises questions with regard to other directions (e.g., bottom-up, horizontal) and forms of communication (e.g., informal). The current review proposes a framework to overcome the often vague and different operationalizations of safety communication. The proposed framework can be used to characterize safety communication in terms of stakeholders, direction, and characteristics of communication (e.g., medium usage).

Keywords: communication, feedback, occupational safety, review

Procedia PDF Downloads 267
305 A Corpus-Based Study on the Lexical, Syntactic and Sequential Features across Interpreting Types

Authors: Qianxi Lv, Junying Liang

Abstract:

Among the various modes of interpreting, simultaneous interpreting (SI) is regarded as a ‘complex’ and ‘extreme condition’ of cognitive tasks, whereas consecutive interpreting (CI) does not require interpreters to share processing capacity between tasks. Given that SI exerts great cognitive demand, it makes sense to posit that the output of SI may be more compromised than that of CI in its linguistic features. The bulk of the research has stressed the varying cognitive demands and processes involved in different modes of interpreting; however, related empirical research is sparse. In keeping with our interest in investigating the quantitative linguistic factors that discriminate between SI and CI, the current study examines potential lexical simplification, syntactic complexity and sequential organization mechanisms using a self-built inter-model corpus of transcribed simultaneous and consecutive interpretation, translated speech and original speech texts, totalling 321,960 running words. The lexical features are extracted in terms of lexical density, list head coverage, hapax legomena, and type-token ratio, as well as core vocabulary percentage. Dependency distance, an index of syntactic complexity that reflects processing demand, is employed. The frequency motif, a non-grammatically-bound sequential unit, is also used to visualize the local function distribution of the interpreting output. While SI is generally regarded as multitasking with a high cognitive load, our findings show that CI may impose a heavier cognitive load, or tax cognitive resources differently, and hence yield more lexically and syntactically simplified output. In addition, the sequential features show that SI and CI organize the sequences from the source text into the output in different ways, each minimizing its own cognitive load. We interpret the results within a framework in which cognitive demand is exerted on both the maintenance and the coordination components of working memory. 
On the one hand, the information maintained in CI is inherently larger in volume compared to SI. On the other hand, time constraints directly influence the sentence reformulation process. The temporal pressure from the input in SI makes the interpreters only keep a small chunk of information in the focus of attention. Thus, SI interpreters usually produce the output by largely retaining the source structure so as to relieve the information from the working memory immediately after formulated in the target language. Conversely, CI interpreters receive at least a few sentences before reformulation, when they are more self-paced. CI interpreters may thus tend to retain and generate the information in a way to lessen the demand. In other words, interpreters cope with the high demand in the reformulation phase of CI by generating output with densely distributed function words, more content words of higher frequency values and fewer variations, simpler structures and more frequently used language sequences. We consequently propose a revised effort model based on the result for a better illustration of cognitive demand during both interpreting types.
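
As a rough illustration of how some of the measures named in this abstract can be computed, the sketch below implements type-token ratio, a hapax legomena ratio, and mean dependency distance on a toy sentence. The function names, the choice to normalise the hapax count by the number of tokens, and the head-index encoding of the parse are assumptions made for this example; they are not the authors' actual pipeline.

```python
from collections import Counter


def type_token_ratio(tokens):
    """Number of distinct word forms divided by number of running words."""
    return len(set(tokens)) / len(tokens)


def hapax_ratio(tokens):
    """Share of running words whose form occurs exactly once (assumed normalisation)."""
    counts = Counter(tokens)
    hapaxes = sum(1 for c in counts.values() if c == 1)
    return hapaxes / len(tokens)


def mean_dependency_distance(heads):
    """Average |head position - dependent position| over all non-root words.

    heads[i] is the 1-based position of the head of word i+1; 0 marks the root.
    """
    distances = [abs(h - (i + 1)) for i, h in enumerate(heads) if h != 0]
    return sum(distances) / len(distances)


# Toy sentence with a hand-written (hypothetical) dependency parse.
tokens = "the cat saw the dog".split()
heads = [2, 3, 0, 5, 3]  # the->cat, cat->saw, saw=root, the->dog, dog->saw

print(type_token_ratio(tokens))         # 4 types / 5 tokens
print(hapax_ratio(tokens))              # cat, saw, dog occur once
print(mean_dependency_distance(heads))  # (1 + 1 + 1 + 2) / 4
```

Lower type-token and hapax ratios would point toward lexical simplification, and a smaller mean dependency distance toward syntactically simpler output, which is the direction of comparison the abstract describes.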

Keywords: cognitive demand, corpus-based, dependency distance, frequency motif, interpreting types, lexical simplification, sequential units distribution, syntactic complexity

Procedia PDF Downloads 141
304 Software Architectural Design Ontology

Authors: Muhammad Irfan Marwat, Sadaqat Jan, Syed Zafar Ali Shah

Abstract:

Software architecture plays a key role in software development, but the absence of a formal description of software architecture impedes development in various ways. To cope with these difficulties, ontology has been used as an artifact. This paper proposes an ontology for software architectural design based on the IEEE model for architecture description and the Kruchten 4+1 model for viewpoint classification. For the categorization of styles and views, ISO/IEC 42010 has been used. A corpus method has been used to evaluate the ontology. The main aim of the proposed ontology is to classify and locate software architectural design information.

Keywords: semantic-based software architecture, software architecture, ontology, software engineering

Procedia PDF Downloads 513
303 The Effect of Problem-Based Mobile-Assisted Tasks on Spoken Intelligibility of English as a Foreign Language Learners

Authors: Loghman Ansarian, Teoh Mei Lin

Abstract:

In an attempt to increase the oral proficiency of Iranian EFL learners, the researchers compared the effect of problem-based mobile-assisted language learning with that of the conventional language learning approach in Iran (Communicative Language Teaching). The experimental group (n=37) received problem-based learning (PBL) instruction and the control group (n=33) received conventional instruction. The results of quantitative data analysis after 26 sessions of treatment revealed that PBL could positively affect participants' knowledge of grammar, vocabulary, spoken fluency, and pronunciation; however, in terms of task achievement, no significant effect was found. This study may have pedagogical implications for language teachers and material developers.

Keywords: problem-based learning, spoken intelligibility, Iranian EFL context, cognitive learning

Procedia PDF Downloads 152
302 Topological Language for Classifying Linear Chord Diagrams via Intersection Graphs

Authors: Michela Quadrini

Abstract:

Chord diagrams occur in many areas of mathematics and its applications, from the study of RNA to knot theory. They are widely used in the theory of knots and links for studying finite type invariants, whereas in molecular biology one important motivation for studying chord diagrams is the problem of RNA structure prediction. An RNA molecule is a linear polymer, referred to as the backbone, that consists of four types of nucleotides. Each nucleotide is represented by a point, whereas each chord of the diagram stands for one Watson-Crick base-pair interaction between two nonconsecutive nucleotides. A chord diagram is an oriented circle with a set of n pairs of distinct points, considered up to orientation-preserving diffeomorphisms of the circle. A linear chord diagram (LCD) is a special kind of graph obtained by cutting the oriented circle of a chord diagram. It consists of a line segment, called its backbone, to which a number of chords with distinct endpoints are attached. There is a natural fattening of any linear chord diagram: the backbone lies on the real axis, while all the chords are in the upper half-plane. Each linear chord diagram has a natural genus, that of its associated surface. To each chord diagram and linear chord diagram, it is possible to associate an intersection graph. It is a graph whose vertices correspond to the chords of the diagram, while chord intersections are represented by edges between the vertices. Such an intersection graph carries a lot of information about the diagram. Our goal is to define an LCD equivalence class in terms of the identity of intersection graphs, on which many chord diagram invariants depend. To study these invariants, we introduce a new representation of linear chord diagrams based on a set of appropriate topological operators that makes it possible to model LCDs in terms of the relations among chords. This set is composed of crossing, nesting, and concatenation. 
The crossing operator is able to generate the whole space of linear chord diagrams, and we define a multiple context-free grammar that uniquely generates each LCD, starting from a linear chord diagram and adding a chord for each production of the grammar. In other words, it allows a unique algebraic term to be associated with each linear chord diagram, while the remaining operators allow the term to be rewritten through a set of appropriate rewriting rules. Such rules define an LCD equivalence class in terms of the identity of intersection graphs. Starting from a modelled RNA molecule and the linear chord diagram, some authors have proposed a topological classification and folding. Our LCD equivalence class could contribute to the RNA folding problem, leading to the definition of an algorithm that calculates the free energy of the molecule more accurately than existing ones. Such an LCD equivalence class could also be useful for obtaining a more accurate estimate of the link between the crossing number and the topological genus, and for studying the relations among other invariants.
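
The intersection graph described in this abstract can be sketched in a few lines. The example below assumes a linear chord diagram is given as a list of chords, each a pair of distinct integer positions on the backbone; this encoding and the function names are illustrative assumptions, not the author's formalism.

```python
from itertools import combinations


def chords_cross(c1, c2):
    """Two chords cross when exactly one endpoint of one lies between the other's endpoints."""
    (a, b), (c, d) = sorted(c1), sorted(c2)
    return a < c < b < d or c < a < d < b


def intersection_graph(chords):
    """Return the edge set of the intersection graph.

    Vertices are chord indices; an edge (i, j) with i < j means chords i and j cross.
    """
    edges = set()
    for i, j in combinations(range(len(chords)), 2):
        if chords_cross(chords[i], chords[j]):
            edges.add((i, j))
    return edges


crossing = [(1, 3), (2, 4)]       # interleaved endpoints
nesting = [(1, 4), (2, 3)]        # one chord inside the other
concatenation = [(1, 2), (3, 4)]  # one chord after the other

print(intersection_graph(crossing))
print(intersection_graph(nesting))
print(intersection_graph(concatenation))
```

In this toy setting the nested and the concatenated diagrams both produce an edgeless graph on two vertices, while the crossing diagram produces a single edge, so different diagrams can share an intersection graph. Comparing edge sets directly depends on how the chords are indexed; a full equivalence test would compare graphs up to relabelling of vertices.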

Keywords: chord diagrams, linear chord diagram, equivalence class, topological language

Procedia PDF Downloads 173
301 The Effect of Written Corrective Feedback on the Accurate Use of Grammatical Forms by Japanese Low-Intermediate EFL Learners

Authors: Ayako Hasegawa, Ken Ubukata

Abstract:

The purpose of this study is to investigate whether corrective feedback has any significant effect on Japanese low-intermediate EFL learners’ performance on a specific set of linguistic features. The subjects are Japanese college students majoring in English. They have studied English for about 7 years, but their interlanguage seems to have fossilized, as non-target-like errors are frequently observed under the traditional deductive, teacher-fronted approach. It has been reported that corrective feedback plays an important role in diminishing or overcoming interlanguage fossilization and achieving target language (TL) competency. Therefore, this study examined how corrective feedback (specifically, metalinguistic feedback) and self-correction raised the students’ awareness and helped them notice the gaps between their interlanguage and the TL.

Keywords: written corrective feedback, fossilized error, grammar teaching, language teaching

Procedia PDF Downloads 335
300 Definition of a Computing Independent Model and Rules for Transformation Focused on the Model-View-Controller Architecture

Authors: Vanessa Matias Leite, Jandira Guenka Palma, Flávio Henrique de Oliveira

Abstract:

This paper presents a model-oriented approach to software development in the Model-View-Controller (MVC) architectural standard. The approach aims to expose a process for extracting information from models which, through the rules and syntax defined in this work, assists in the design of the initial model and its future conversions. The paper presents a syntax based on natural language, following the rules of the classic grammar of the Portuguese language, together with conversion rules that generate models conforming to the norms of the Object Management Group (OMG) and the Meta-Object Facility (MOF).

Keywords: BNF Syntax, model driven architecture, model-view-controller, transformation, UML

Procedia PDF Downloads 365
299 Learning and Teaching Strategies in Association with EXE Program for Master Course Students of Yerevan Brusov State University of Languages and Social Sciences

Authors: Susanna Asatryan

Abstract:

The author introduces a single module on English teaching methodology for master's students specializing as “A Foreign Language Teacher of High Schools and Professional Educational Institutions” at Yerevan Brusov State University of Languages and Social Sciences. The overall aim of the presentation is to introduce learning and teaching strategies within the EXE computer program for master's student-teachers of the University. The author will demonstrate the advantages of using this program. The learners interact with the teacher in the classroom and are also given the opportunity to carry out their learning in a virtual domain, in association with assessment and self-assessment. In this way they become integrated into blended learning. As this strategy is in its piloting stage, the author has developed a single module comprising 3 main sections: teaching English vocabulary at high school, teaching English grammar at high school, and teaching English pronunciation at high school. The author will present the above-mentioned topics with their corresponding sections and subsections. The strong point is that the module was prepared with the intention of presenting it in a blended learning setting. For this purpose, working with the EXE program is highly effective, as it allows users to operate several tools for self-learning and self-testing/assessment. The author developed 3 separate EXE files, one for each topic. Each file starts with the section’s subject-specific description (Objectives and Pre-knowledge), followed by the theoretical part. The author supplemented her observations with appropriate samples of charts, drawings, diagrams, recordings, video clips, photos, pictures, etc., to make the learning process more effective and enjoyable. Before or after the article, the author has included a video clip related to the current topic. 
EXE offers a wide range of tools for preparing different activities and exercises for the learners: 'Interactive/non-interactive' and 'Textual/non-textual'. Using these tools, Multi-Select, Multi-Choice, Cloze, Drop-Down, Case Study, Gap-Filling, Matching and various other types of activities have been developed and added to the appropriate sections. The learners' task is to prepare themselves for the coming module or seminar on the teaching methodology of English vocabulary, grammar, and pronunciation. The point is that the teacher has the opportunity for face-to-face communication and can also connect with the learners through Moodle, or offer the module to the learners as a single EXE file for self-study and self-assessment. The EXE environment also makes student feedback available.

Keywords: blended learning, EXE program, learning/teaching strategies, self-study/assessment, virtual domain

Procedia PDF Downloads 448
298 Variation in Italian Specialized Economic Texts

Authors: Abdelmagid Basyouny Sakr

Abstract:

Terminological variation is a reality that is now recognized by terminologists. This paper investigates terminological variation in the context of specialized economic texts in Italian. It aims to determine whether certain patterns or tendencies can be derived from the analysis of these texts. Term variants pose two different kinds of difficulty. The first is recognizing linguistic expressions that denote the same concept in running text. The second lies in knowing which variant should be considered and for what purpose. This would help to differentiate between variants that could be candidates for inclusion in terminological resources and those that are synonyms or contextual variants. New insights into terminological variation in specialized texts could contribute to improving specialized dictionaries, which would better account for the different ways in which a given thought is expressed.

Keywords: corpus linguistics, specialized communication, terms and concepts, terminological variation

Procedia PDF Downloads 122
297 Creation and Evaluation of an Academic Blog of Tools for the Self-Correction of Written Production in English

Authors: Brady, Imelda Katherine, Da Cunha Fanego, Iria

Abstract:

Today's university students are considered digital natives and the use of Information Technologies (ITs) forms a large part of their study and learning. In the context of language studies, applications that help with revisions of grammar or vocabulary are particularly useful, especially if they are open access. There are studies that show the effectiveness of this type of application in the learning of English as a foreign language and that using IT can help learners become more autonomous in foreign language acquisition, given that these applications can enhance awareness of the learning process; this means that learners are less dependent on the teacher for corrective feedback. We also propose that the exploitation of these technologies also enhances the work of the language instructor wishing to incorporate IT into his/her practice. In this context, the aim of this paper is to present the creation of a repository of tools that provide support in the writing and correction of texts in English and the assessment of their usefulness on behalf of university students enrolled in the English Studies Degree. The project seeks to encourage the development of autonomous learning through the acquisition of skills linked to the self-correction of written work in English. To comply with the above, our methodology follows five phases. First of all, a selection of the main open-access online applications available for the correction of written texts in English is made: AutoCrit, Hemingway, Grammarly, LanguageTool, OutWrite, PaperRater, ProWritingAid, Reverso, Slick Write, Spell Check Plus and Virtual Writing Tutor. Secondly, the functionalities of each of these tools (spelling, grammar, style correction, etc.) are analyzed. Thirdly, explanatory materials (texts and video tutorials) are prepared on each tool. Fourth, these materials are uploaded into a repository of our university in the form of an institutional blog, which is made available to students and the general public. 
Finally, a survey was designed to collect students’ feedback. The survey aimed to analyse the usefulness of the blog and the quality of the explanatory materials as well as the degree of usefulness that students assigned to each of the tools offered. In this paper, we present the results of the analysis of data received from 33 students in the 1st semester of the 21-22 academic year. One result we highlight in our paper is that the students have rated this resource very highly, in addition to offering very valuable information on the perceived usefulness of the applications provided for them to review. Our work, carried out within the framework of a teaching innovation project funded by our university, emphasizes that teachers need to design methodological strategies that help their students improve the quality of their productions written in English and, by extension, to improve their linguistic competence.

Keywords: academic blog, open access tools, online self-correction, written production in English, university learning

Procedia PDF Downloads 69
296 Wavelets Contribution on Textual Data Analysis

Authors: Habiba Ben Abdessalem

Abstract:

The emergence of giant sets of textual data has encouraged researchers to invest in this field. The purpose of textual data analysis methods is to facilitate access to this type of data by providing various graphic visualizations. Applying these methods requires a corpus pretreatment step, whose standards are set according to the objective of the problem studied. This step determines the list of forms contained in the contingency table by keeping only those that carry information. This step may, however, lead to noisy contingency tables, hence the use of a wavelet denoising function. The validity of the proposed approach is tested on a text database covering economic and political events in Tunisia over a well-defined period.
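
To make the idea of wavelet denoising a contingency table more concrete, here is a minimal sketch that applies a one-level Haar transform with soft thresholding to each row of a small table. The abstract does not say which wavelet, decomposition level, or thresholding rule was used, so the Haar wavelet, the single level, the soft threshold, and the toy counts below are all assumptions for illustration only.

```python
import math


def haar_denoise_row(row, threshold):
    """One-level Haar transform of a row, soft-threshold the details, then invert.

    The row length must be even because values are processed in adjacent pairs.
    """
    if len(row) % 2 != 0:
        raise ValueError("row length must be even")
    s = math.sqrt(2.0)
    out = []
    for k in range(0, len(row), 2):
        a, b = row[k], row[k + 1]
        approx = (a + b) / s  # smooth (low-pass) coefficient
        detail = (a - b) / s  # difference (high-pass) coefficient
        # Soft thresholding: shrink the detail toward zero by `threshold`.
        shrunk = math.copysign(max(abs(detail) - threshold, 0.0), detail)
        out.append((approx + shrunk) / s)
        out.append((approx - shrunk) / s)
    return out


def denoise_table(table, threshold):
    """Apply the row-wise denoising to every row of a contingency table."""
    return [haar_denoise_row(row, threshold) for row in table]


# Hypothetical form-by-document counts.
table = [[4, 6, 10, 10], [0, 1, 7, 9]]
print(denoise_table(table, threshold=0.0))  # threshold 0 reproduces the input up to rounding
print(denoise_table(table, threshold=5.0))  # large threshold averages each adjacent pair
```

With a threshold of zero the transform is inverted exactly, and with a large threshold every adjacent pair collapses to its mean, so the threshold controls how much small-scale variation is treated as noise.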

Keywords: textual data, wavelet, denoising, contingency table

Procedia PDF Downloads 255
295 Modeling False Statements in Texts

Authors: Francielle A. Vargas, Thiago A. S. Pardo

Abstract:

According to the standard philosophical definition, lying is saying something that you believe to be false with the intent to deceive. For deception detection, the FBI trains its agents in a technique named statement analysis, which attempts to detect deception based on parts of speech (i.e., linguistics style). This method is employed in interrogations, where the suspects are first asked to make a written statement. In this poster, we model false statements using linguistics style. In order to achieve this, we methodically analyze linguistic features in a corpus of fake news in the Portuguese language. The results show that they present substantial lexical, syntactic and semantic variations, as well as punctuation and emotion distinctions.

Keywords: deception detection, linguistics style, computational linguistics, natural language processing

Procedia PDF Downloads 182
294 Converse to the Sherman Inequality with Applications in Information Theory

Authors: Ana Barbir, S. Ivelic Bradanovic, D. Pecaric, J. Pecaric

Abstract:

We proved a converse to Sherman's inequality. Using the concept of f-divergence, we obtained some inequalities for well-known entropies, such as the Shannon entropy, which has many applications in applied sciences, for example in information theory, biology and economics. The Zipf-Mandelbrot law gives an improved account of the low-rank words in a corpus. Applications of the Zipf-Mandelbrot law can be found in linguistics and information sciences, and it is also widely applicable in ecological field studies. We also introduced an entropy by applying the Zipf-Mandelbrot law and derived some related inequalities.
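
For readers unfamiliar with the Zipf-Mandelbrot law, the sketch below uses its standard form, in which the probability of rank k among N ranks is proportional to 1/(k + q)^s, and computes the Shannon entropy of the resulting distribution. The abstract does not give its own definition of the entropy it introduces, so treating it as the Shannon entropy of this distribution is an assumption, as are the function names and parameter values.

```python
import math


def zipf_mandelbrot_pmf(n, q, s):
    """Probabilities for ranks 1..n with weight 1 / (k + q)**s, normalised to sum to 1."""
    weights = [1.0 / (k + q) ** s for k in range(1, n + 1)]
    total = sum(weights)  # generalised harmonic number H_{n,q,s}
    return [w / total for w in weights]


def shannon_entropy(pmf):
    """Shannon entropy in nats; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log(p) for p in pmf if p > 0)


# Hypothetical parameters: 1000 word ranks, shift q = 2.7, exponent s = 1.0.
pmf = zipf_mandelbrot_pmf(1000, 2.7, 1.0)
print(sum(pmf))
print(shannon_entropy(pmf))
```

Setting q = 0 recovers the plain Zipf law, and setting s = 0 gives the uniform distribution, whose entropy is log N; the shift q is what flattens the head of the distribution for the highest-frequency ranks.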

Keywords: f-divergence, majorization inequality, Sherman inequality, Zipf-Mandelbrot entropy

Procedia PDF Downloads 143
293 On the Semantics and Pragmatics of 'Be Able To': Modality and Actualisation

Authors: Benoît Leclercq, Ilse Depraetere

Abstract:

The goal of this presentation is to shed new light on the semantics and pragmatics of be able to. It presents the results of a corpus analysis based on data from the BNC (British National Corpus), and discusses these results in light of a specific stance on the semantics-pragmatics interface that takes recent developments into account. Be able to is often discussed in relation to can and could, all of which can be used to express ability. Such an onomasiological approach often results in the identification of usage constraints for each expression. In the case of be able to, it is the formal properties of the modal expression (unlike can and could, be able to has non-finite forms) that are in the foreground, and the modal expression is described as the verb that conveys future ability. Be able to is also argued to express actualised ability in the past (I was able to/could open the door). This presentation aims to provide a more accurate pragmatic-semantic profile of be able to, one that is based on extensive data analysis and embedded in a very explicit view of the semantics-pragmatics interface. A random sample of 3000 examples (1000 for each modal verb) extracted from the BNC was analysed to address the following issues. First, the challenge is to identify the exact semantic range of be able to. The results show that, contrary to general assumption, be able to does not only express ability but shares most of the root meanings usually associated with the possibility modals can and could. The data reveal that what is called opportunity is, in fact, the most frequent meaning of be able to. Second, attention will be given to the notion of actualisation. It is commonly argued that be able to is the preferred form when the residue actualises: (1) The only reason he was able to do that was because of the restriction (BNC, spoken) (2) It is only through my imaginative shuffling of the aces that we are able to stay ahead of the pack. 
(BNC, written) Although this notion has been studied in detail within formal semantic approaches, empirical data is crucially lacking and it is unclear whether actualisation constitutes a conventional (and distinguishing) property of be able to. The empirical analysis provides solid evidence that actualisation is indeed a conventional feature of the modal. Furthermore, the dataset reveals that be able to expresses actualised 'opportunities' and not actualised 'abilities'. In the final part of this paper, attention will be given to the theoretical implications of the empirical findings, and in particular to the following paradox: how can the same expression encode both modal meaning (non-factual) and actualisation (factual)? It will be argued that this largely depends on one's conception of the semantics-pragmatics interface, and that this need not be an issue when actualisation (unlike modality) is analysed as a generalised conversational implicature and thus is considered part of the conventional pragmatic layer of be able to.

Keywords: Actualisation, Modality, Pragmatics, Semantics

Procedia PDF Downloads 98
292 Part of Speech Tagging Using Statistical Approach for Nepali Text

Authors: Archit Yajnik

Abstract:

Part-of-speech (POS) tagging has always been a challenging task in natural language processing. This article presents POS tagging for Nepali text using a Hidden Markov Model (HMM) and the Viterbi algorithm. The annotated corpus of Nepali text is randomly split into training and testing data sets. Both methods are applied to the data sets. The Viterbi algorithm is found to be computationally faster and more accurate than the HMM. An accuracy of 95.43% is achieved using the Viterbi algorithm. An error analysis of where the mismatches took place is discussed in detail.
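
The following is a minimal sketch of Viterbi decoding over a hidden Markov model, the general technique this abstract names. The two-tag tagset, the English toy words, and every probability below are invented for illustration; they are not taken from the Nepali corpus or from the author's trained model.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state sequence for the observations."""
    # best[t][s]: probability of the best path that ends in state s at position t.
    best = [{s: start_p[s] * emit_p[s].get(observations[0], 0.0) for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            # Pick the predecessor state that maximises the path probability.
            prev, score = max(
                ((p, best[t - 1][p] * trans_p[p][s]) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score * emit_p[s].get(observations[t], 0.0)
            back[t][s] = prev
    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical toy model.
states = ["NOUN", "VERB"]
start_p = {"NOUN": 0.7, "VERB": 0.3}
trans_p = {
    "NOUN": {"NOUN": 0.3, "VERB": 0.7},
    "VERB": {"NOUN": 0.6, "VERB": 0.4},
}
emit_p = {
    "NOUN": {"dogs": 0.8, "bark": 0.2},
    "VERB": {"dogs": 0.1, "bark": 0.9},
}

print(viterbi(["dogs", "bark"], states, start_p, trans_p, emit_p))
```

In a real tagger the start, transition, and emission probabilities would be estimated from the annotated training set, and long sentences are usually decoded with log-probabilities to avoid numerical underflow.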

Keywords: hidden markov model, natural language processing, POS tagging, viterbi algorithm

Procedia PDF Downloads 302
291 The Influence of Screen Translation on Creative Audiovisual Writing: A Corpus-Based Approach

Authors: John D. Sanderson

Abstract:

The popularity of American cinema worldwide has contributed to the development of sociolects related to specific film genres in other cultural contexts by means of screen translation, in many cases eluding norms of usage in the target language, a process whose result has come to be known as 'dubbese'. A consequence for the reception in countries where local audiovisual fiction consumption is far lower than American imported productions is that this linguistic construct is preferred, even though it differs from common everyday speech. The iconography of film genres such as science-fiction, western or sword-and-sandal films, for instance, generates linguistic expectations in international audiences who will accept more easily the sociolects assimilated by the continuous reception of American productions, even if the themes, locations, characters, etc., portrayed on screen may belong in origin to other cultures. And the non-normative language (e.g., calques, semantic loans) used in the preferred mode of linguistic transfer, whether it is translation for dubbing or subtitling, has diachronically evolved in many cases into a status of canonized sociolect, not only accepted but also required, by foreign audiences of American films. However, a remarkable step forward is taken when this typology of artificial linguistic constructs starts being used creatively by nationals of these target cultural contexts. In the case of Spain, the success of American sitcoms such as Friends in the 1990s led Spanish television scriptwriters to include in national productions lexical and syntactical indirect borrowings (Anglicisms not formally identifiable as such because they include elements from their own language) in order to target audiences of the former. However, this commercial strategy had already taken place decades earlier when Spain became a favored location for the shooting of foreign films in the early 1960s. 
The international popularity of the then newly developed sub-genre known as Spaghetti-Western encouraged Spanish investors to produce their own movies, and local scriptwriters made use of the dubbese developed nationally since the advent of sound in film instead of using normative language. As a result, direct Anglicisms, as well as lexical and syntactical borrowings made up the creative writing of these Spanish productions, which also became commercially successful. Interestingly enough, some of these films were even marketed in English-speaking countries as original westerns (some of the names of actors and directors were anglified to that purpose) dubbed into English. The analysis of these 'back translations' will also foreground some semantic distortions that arose in the process. In order to perform the research on these issues, a wide corpus of American films has been used, which chronologically range from Stagecoach (John Ford, 1939) to Django Unchained (Quentin Tarantino, 2012), together with a shorter corpus of Spanish films produced during the golden age of Spaghetti Westerns, from una tumba para el sheriff (Mario Caiano; in English lone and angry man, William Hawkins) to tu fosa será la exacta, amigo (Juan Bosch, 1972; in English my horse, my gun, your widow, John Wood). The methodology of analysis and the conclusions reached could be applied to other genres and other cultural contexts.

Keywords: dubbing, film genre, screen translation, sociolect

Procedia PDF Downloads 134
290 How Is a Machine-Translated Literary Text Organized in Coherence? An Analysis Based upon Theme-Rheme Structure

Authors: Jiang Niu, Yue Jiang

Abstract:

With the ultimate goal to automatically generate translated texts with high quality, machine translation has made tremendous improvements. However, its translations of literary works are still plagued with problems in coherence, esp. the translation between distant language pairs. One of the causes of the problems is probably the lack of linguistic knowledge to be incorporated into the training of machine translation systems. In order to enable readers to better understand the problems of machine translation in coherence, to seek out the potential knowledge to be incorporated, and thus to improve the quality of machine translation products, this study applies Theme-Rheme structure to examine how a machine-translated literary text is organized and developed in terms of coherence. Theme-Rheme structure in Systemic Functional Linguistics is a useful tool for analysis of textual coherence. Theme is the departure point of a clause and Rheme is the rest of the clause. In a text, as Themes and Rhemes may be connected with each other in meaning, they form thematic and rhematic progressions throughout the text. Based on this structure, we can look into how a text is organized and developed in terms of coherence. Methodologically, we chose Chinese and English as the language pair to be studied. Specifically, we built a comparable corpus with two modes of English translations, viz. machine translation (MT) and human translation (HT) of one Chinese literary source text. The translated texts were annotated with Themes, Rhemes and their progressions throughout the texts. The annotated texts were analyzed from two respects, the different types of Themes functioning differently in achieving coherence, and the different types of thematic and rhematic progressions functioning differently in constructing texts. 
By analyzing and contrasting the two modes of translation, it is found that, compared with the HT, 1) the MT features “pseudo-coherence”, with many ill-connected fragments of information joined by “and”; 2) the MT system produces a static and less interconnected text that reads like a list; these two points, in turn, lead to a less coherent organization and development of the MT than of the HT; and 3) a finding novel relative to previous studies, Rhemes do contribute to textual connection and coherence, though less than Themes do, and thus are worthy of notice in further studies. Hence, the findings suggest that Theme-Rheme structure be applied to measuring and assessing the coherence of machine translation and be incorporated into the training of machine translation systems, and that Rheme be taken into account when studying the textual coherence of both MT and HT.
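
As a rough illustration of how annotated Themes and Rhemes can be turned into progression labels, the sketch below classifies each clause by where its Theme picks up material from the preceding clause. It is not the authors' annotation scheme: the labels "constant" (Theme repeats the previous Theme) and "linear" (Theme picks up the previous Rheme) are assumed here from common usage in Systemic Functional Linguistics, and the word-overlap test, the stopword list, and the function names are all hypothetical simplifications.

```python
# Toy stopword list (an assumption for this sketch only).
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "was", "is", "it"}


def content_words(text):
    """Lower-case the words of a text span and drop punctuation and stopwords."""
    words = {w.strip(".,;:!?").lower() for w in text.split()}
    return words - STOPWORDS - {""}


def progression_types(clauses):
    """Label the link between each clause and the one before it.

    `clauses` is a list of (theme, rheme) string pairs, already annotated.
    Returns one label per clause from the second clause onward.
    """
    labels = []
    for (prev_theme, prev_rheme), (theme, _) in zip(clauses, clauses[1:]):
        current = content_words(theme)
        if current & content_words(prev_theme):
            labels.append("constant")  # Theme repeats the previous Theme
        elif current & content_words(prev_rheme):
            labels.append("linear")    # Theme picks up the previous Rheme
        else:
            labels.append("none")      # no lexical link found
    return labels


clauses = [
    ("The old man", "lived in a small village"),
    ("The village", "stood beside a river"),
    ("The river", "flooded every spring"),
    ("The river", "was feared by everyone"),
]
print(progression_types(clauses))
```

Exact word overlap misses pronouns and synonyms, so a text whose links are carried by "he" or "it" would be labelled "none" throughout; a real analysis would rely on the manual annotation the abstract describes.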

Keywords: coherence, corpus-based, literary translation, machine translation, Theme-Rheme structure

Procedia PDF Downloads 176
289 The Code-Mixing of Japanese, English, and Thai in Line Chat

Authors: Premvadee Na Nakornpanom

Abstract:

Language mixing in spontaneous speech has been widely discussed, but not in virtual settings, especially in the context of students learning a third language. Thus, this study was an attempt to explore the characteristics of the mixing of Japanese, English and Thai in a mobile chat room by students with a background in Japanese, English, and Thai. The results showed that the insertion of Thai and English content words was a very common linguistic phenomenon embedded in the utterances. As chatting is ‘relational’ or ‘interactional’, it made the style of lexical choices speech-like, more personal and emotion-related. The Japanese sentence-final question particle “か” (ka) was added to the end of the sentence following Thai grammar rules. Moreover, some unique characteristics were created. Non-verbal cues were represented in personal, Thai styles by inserting textual representations of images or feelings available on websites into streams of conversation.

Keywords: code-mixing, Japanese, English, Thai, line chat

Procedia PDF Downloads 616
288 Off-Topic Text Detection System Using a Hybrid Model

Authors: Usama Shahid

Abstract:

Be it written documents, news columns, or students' essays, verifying the content can be a time-consuming task. Apart from spelling and grammar mistakes, the proofreader is also supposed to verify whether the content included in the essay or document is relevant or not. Irrelevant content in a document or essay is referred to as off-topic text, and in this paper we address the problem of detecting off-topic text in a document using machine learning techniques. Our study aims to identify off-topic content in a document using an Echo State Network model, and we will also compare it with other models. A previous study used Convolutional Neural Networks and TF-IDF to detect off-topic text. We will rearrange the existing datasets, take new classifiers along with new word embeddings, and implement them on existing and new datasets in order to compare the results with the previously existing CNN model.
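
To make the TF-IDF side of the task concrete, the following is a minimal baseline sketch, not the Echo State Network model or the CNN pipeline mentioned above. It assumes that a topic prompt is available and flags a sentence as off-topic when its TF-IDF cosine similarity to the prompt falls below a threshold. The function names, the smoothed IDF formula, and the threshold value are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lower-case word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf_vectors(texts):
    """Return one sparse TF-IDF vector (dict: term -> weight) per text."""
    docs = [Counter(tokenize(t)) for t in texts]
    n = len(docs)
    df = Counter(term for d in docs for term in d)  # document frequency
    # Smoothed IDF, so terms that occur everywhere keep a positive weight.
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in df}
    return [{t: tf * idf[t] for t, tf in d.items()} for d in docs]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def flag_off_topic(prompt, sentences, threshold=0.1):
    """Return True for each sentence whose similarity to the prompt is low."""
    vectors = tfidf_vectors([prompt] + sentences)
    prompt_vec, sent_vecs = vectors[0], vectors[1:]
    return [cosine(prompt_vec, v) < threshold for v in sent_vecs]


prompt = "the effects of climate change on agriculture"
sentences = [
    "climate change reduces crop yields in agriculture",
    "my favourite football team won yesterday",
]
print(flag_off_topic(prompt, sentences))
```

A fixed threshold on lexical overlap is brittle: a sentence that is on topic but uses different vocabulary would be flagged. A learned classifier of the kind the abstract proposes is meant to address that weakness.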

Keywords: off topic, text detection, echo state network, machine learning

Procedia PDF Downloads 53
287 Perceptions of Tunisian EFL Students toward Their Writing Difficulties

Authors: Salwa Enneifer

Abstract:

The research is intended to investigate Tunisian students’ own perception of the difficulties they encounter in the writing task. To achieve this objective, a questionnaire was administered to students enrolled in the ‘Faculty of Letters Arts and Humanities’ in Kairouan, Tunisia. Students were classified into three groups: first-, second-, and third-year students. The researcher used 120 questionnaires filled in by the students as data for this study; moreover, 30 students participated in a semi-structured interview to complete the data. The questionnaire results revealed that Tunisian EFL students faced spelling and grammar difficulties. ANOVA also revealed that the first-year students did not recognise that Arabic and English greatly differ in their respective punctuation systems. The second-year class, however, was fully aware of this difference. Additionally, the interview shed light on other difficulties experienced by students in writing: a severe ‘lack of vocabulary’, Arabic language interference, the organisation of the essay and especially the academic essay, and difficulty with writing an argumentative essay.

Keywords: difficulties, writing, Tunisian, EFL students

Procedia PDF Downloads 211
286 L2 Acquisition of Tense and Aspect by Cantonese and Mandarin ESL Learners of Different Proficiency Levels

Authors: Mable Chan

Abstract:

The present study of the acquisition of tense and aspect by Cantonese and Mandarin ESL learners aims to investigate the relationship between knowledge, the role that classroom input plays in the development of that knowledge, and learners' use of the L2 knowledge they acquire (i.e. their performance). Chinese has been argued to be a tenseless language, and Chinese ESL learners have to acquire the property from scratch. The study of the acquisition of tense and aspect is a very fruitful research area in second language acquisition for a number of reasons. First, tense and aspect are notorious for being difficult for Chinese ESL learners. Second, to our knowledge, no studies have compared Cantonese and Mandarin ESL learners and age effects in a single study. Data are now being collected, and the findings from this comparison study of tense-aspect acquisition will shed light on both theoretical and pedagogical issues in second language acquisition, and contribute to a better understanding of both the theoretical aspects of L2 acquisition of tense and aspect and the pedagogy of tense for Chinese ESL learners.

Keywords: aspect, second language acquisition, tense, universal grammar

Procedia PDF Downloads 317
285 Searching Linguistic Synonyms through Parts of Speech Tagging

Authors: Faiza Hussain, Usman Qamar

Abstract:

Synonym-based searching is recognized to be a complicated problem, as text mining from unstructured web data is challenging. Finding useful information that matches the user's need in a bulk of web pages is a cumbersome task. In this paper, a novel and practical synonym retrieval technique is proposed to address this problem. For replacement of semantics, user intent is taken into consideration to realize the technique. Parts-of-Speech tagging is applied for pattern generation of the query, and a thesaurus was formed and used for this experiment. Compared with Non-Context Based Searching, Context Based Searching proved to be a more efficient approach when dealing with linguistic semantics. This approach is very beneficial for intent-based searching. Finally, results and future dimensions are presented.
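
The general idea of combining Parts-of-Speech tags with a thesaurus can be sketched as follows. This is only an illustration under stated assumptions, not the technique proposed in the paper: the tag lexicon and thesaurus are tiny hypothetical dictionaries, and keying synonyms by (word, tag) is one assumed way of letting the tag restrict which synonyms apply.

```python
# Hypothetical toy resources; a real system would use a trained tagger
# and a full thesaurus.
TOY_TAGS = {"buy": "VERB", "cheap": "ADJ", "car": "NOUN", "fast": "ADJ"}
TOY_THESAURUS = {
    ("buy", "VERB"): ["purchase"],
    ("cheap", "ADJ"): ["inexpensive", "affordable"],
    ("car", "NOUN"): ["automobile"],
}


def tag(query):
    """Tag each query word; unknown words get the placeholder tag 'X'."""
    return [(w, TOY_TAGS.get(w, "X")) for w in query.lower().split()]


def pos_pattern(query):
    """Return the sequence of tags for the query, e.g. VERB ADJ NOUN."""
    return [t for _, t in tag(query)]


def expand(query):
    """For each query word, list the word followed by its tag-matched synonyms."""
    return [[w] + TOY_THESAURUS.get((w, t), []) for w, t in tag(query)]


print(pos_pattern("buy cheap car"))
print(expand("buy cheap car"))
```

Each inner list can then be treated as a set of alternatives when matching documents, so a page containing "purchase affordable automobile" would match the original query.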

Keywords: natural language processing, text mining, information retrieval, parts-of-speech tagging, grammar, semantics

Procedia PDF Downloads 279
284 VIAN-DH: Computational Multimodal Conversation Analysis Software and Infrastructure

Authors: Teodora Vukovic, Christoph Hottiger, Noah Bubenhofer

Abstract:

The development of VIAN-DH aims at bridging two linguistic approaches: conversation analysis/interactional linguistics (IL), so far a dominantly qualitative field, and computational/corpus linguistics and its quantitative and automated methods. Contemporary IL investigates the systematic organization of conversations and interactions composed of speech, gaze, gestures, and body positioning, among others. This highly integrated multimodal behaviour is analysed based on video data, with the aim of uncovering so-called “multimodal gestalts”, patterns of linguistic and embodied conduct that reoccur in specific sequential positions and are employed for specific purposes. Multimodal analyses (and other disciplines using videos) have so far depended on time- and resource-intensive processes of manual transcription of each component from video materials. Automating these tasks requires advanced programming skills, which are often not within the scope of IL. Moreover, the use of different tools makes the integration and analysis of different formats challenging. Consequently, IL research often deals with relatively small samples of annotated data, which are suitable for qualitative analysis but not sufficient for making generalized empirical claims derived quantitatively. VIAN-DH aims to create a workspace where the many annotation layers required for the multimodal analysis of videos can be created, processed, and correlated in one platform. VIAN-DH will provide a graphical interface that operates state-of-the-art tools for automating parts of the data processing. The integration of tools that already exist in computational linguistics and computer vision facilitates data processing for researchers lacking programming skills, speeds up the overall research process, and enables the processing of large amounts of data.
The main features to be introduced are automatic speech recognition for the transcription of language, automatic image recognition for the extraction of gestures and other visual cues, and grammatical annotation for adding morphological and syntactic information to the verbal content. In the ongoing instance of VIAN-DH, we focus on gesture extraction (pointing gestures, in particular), making use of existing models created for sign language and adapting them for this specific purpose. In order to view and search the data, VIAN-DH will provide a unified format and enable the import of the main existing formats of annotated video data and the export to other formats used in the field, while integrating different data source formats in such a way that they can be combined in research. VIAN-DH will adapt querying methods from corpus linguistics to enable parallel search of many annotation levels, combining token-level and chronological search for various types of data. VIAN-DH strives to bring crucial and potentially revolutionary innovation to the field of IL (which can also extend to other fields using video materials). It will allow the automatic processing of large amounts of data and the implementation of quantitative analyses, combined with the qualitative approach. It will facilitate the investigation of correlations between linguistic patterns (lexical or grammatical) and conversational aspects (turn-taking or gestures). Users will be able to automatically transcribe and annotate visual, spoken and grammatical information from videos, to correlate those different levels, and to perform queries and analyses.

Keywords: multimodal analysis, corpus linguistics, computational linguistics, image recognition, speech recognition

Procedia PDF Downloads 73
283 Learning Grammars for Detection of Disaster-Related Micro Events

Authors: Josef Steinberger, Vanni Zavarella, Hristo Tanev

Abstract:

Natural disasters cause tens of thousands of victims and massive material damage. We refer to all those events caused by natural disasters, such as damage to people, infrastructure, vehicles, services and resource supply, as micro events. This paper addresses the problem of micro-event detection in online media sources. We present a natural language grammar learning algorithm and apply it to online news. The algorithm in question is based on distributional clustering and detection of word collocations. We also explore the extraction of micro-events from social media and describe a Twitter mining robot, which uses combinations of keywords to detect tweets that talk about the effects of disasters.
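
Detection by keyword combinations can be sketched as below. This is a hedged illustration rather than the robot described in the paper: the two keyword groups and the rule that a tweet must contain at least one term from each group are assumptions, and no Twitter API is involved; the function operates on plain strings.

```python
import re

# Hypothetical keyword groups, chosen only for illustration.
DISASTER_TERMS = {"flood", "earthquake", "storm"}
EFFECT_TERMS = {"collapsed", "damaged", "injured", "closed"}


def tokens(text):
    """Return the set of lower-case alphabetic tokens in a text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def mentions_disaster_effect(tweet):
    """True if the tweet combines a disaster term with an effect term."""
    words = tokens(tweet)
    return bool(words & DISASTER_TERMS) and bool(words & EFFECT_TERMS)


for tweet in [
    "Bridge collapsed after the flood",
    "Flood of emails today",
    "Road closed for the parade",
]:
    print(tweet, "->", mentions_disaster_effect(tweet))
```

Requiring a term from both groups filters out tweets that use a disaster word figuratively or an effect word in an unrelated context, at the cost of missing inflected forms such as "floods" unless they are added to the lists.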

Keywords: online news, natural language processing, machine learning, event extraction, crisis computing, disaster effects, Twitter

Procedia PDF Downloads 455
282 Difference and Haeccities: On the Religious Foundations of Deleuze’s Philosophy of Difference

Authors: Tony See

Abstract:

Although much attention has been devoted to Deleuze’s ethics of difference, relatively little has been focused on how his political perspective is informed by his appropriation of religious ideas and theological concepts. The bulk of scholarly works have examined his political views on the assumption that they have little or nothing to do with his ideas of religion. This is in spite of the fact that Deleuze drew heavily on religious and theological thinkers such as Duns Scotus, Spinoza and Nietzsche. This dimension can also be traced in Deleuze’s later works, when he collaborated with Felix Guattari in creating an anti-Oedipal philosophy of difference after May 68. This paper seeks to reverse the tendency in contemporary scholarship to ignore Deleuze’s ‘religious’ framework in his understanding of the ethical and the political. Towards this aim, we will refer to key texts in Deleuze’s corpus such as Expressionism in Philosophy, A Thousand Plateaus and others.

Keywords: difference, haeccities, identity, religion, theology

Procedia PDF Downloads 327
281 Dialect as a Means of Identification among Hausa Speakers

Authors: Hassan Sabo

Abstract:

Language is a system of conventional spoken, manual and written symbols by means of which human beings, as members of a social group and participants in its culture, express themselves. Communication, expression of identity and imaginative expression are among the functions of language. A dialect is a form of a language, or a regional variety of a language, that is spoken in a particular geographical setting by a particular group of people. Hausa is one of the major languages in Africa in terms of the large number of people for whom it is the first language. Hausa is one of the western Chadic languages; Chadic constitutes one of the five or six branches of the Afro-Asiatic family. Hausa speakers are predominantly found in Nigeria, and they live in different geographical locations, which has resulted in a variety of dialects within the Hausa language. Apart from standard Hausa, the language has a variety of dialects that are distinguished from one another by such features as phonology, grammar and vocabulary. This study intends to examine such features that serve as means of identification among Hausa speakers who are set off from others, geographically or socially.

Keywords: dialect, features, geographical location, Hausa language

Procedia PDF Downloads 166
280 Extending Image Captioning to Video Captioning Using Encoder-Decoder

Authors: Sikiru Ademola Adewale, Joe Thomas, Bolanle Hafiz Matti, Tosin Ige

Abstract:

This project demonstrates the implementation and use of an encoder-decoder model to perform a many-to-many mapping of video data to text captions. The many-to-many mapping occurs via an input temporal sequence of video frames to an output sequence of words to form a caption sentence. Data preprocessing, model construction, and model training are discussed. Caption correctness is evaluated using 2-gram BLEU scores across the different splits of the dataset. Specific examples of output captions were shown to demonstrate model generality over the video temporal dimension. Predicted captions were shown to generalize over video action, even in instances where the video scene changed dramatically. Model architecture changes are discussed to improve sentence grammar and correctness.
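
For readers unfamiliar with the metric, the sketch below shows one way a 2-gram BLEU score can be computed for a single predicted caption against a single reference. It is a minimal sentence-level illustration and not the project's evaluation code: it assumes uniform weights over unigram and bigram precision, one reference per caption, and no smoothing, and the function names are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision: each candidate n-gram is credited at most
    as many times as it occurs in the reference."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total


def bleu2(candidate, reference):
    """Sentence-level BLEU using unigram and bigram precision."""
    p1 = modified_precision(candidate, reference, 1)
    p2 = modified_precision(candidate, reference, 2)
    if p1 == 0 or p2 == 0:
        return 0.0  # no smoothing: any zero precision gives a zero score
    c, r = len(candidate), len(reference)
    # Brevity penalty discourages captions shorter than the reference.
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(0.5 * math.log(p1) + 0.5 * math.log(p2))


reference = "a man is riding a horse".split()
print(bleu2("a man is riding a horse".split(), reference))
print(bleu2("a man rides a horse".split(), reference))
```

An exact match scores 1.0, while the shorter paraphrase is penalized both for its missing bigrams and for its length. Scores reported over a dataset split are usually computed at corpus level, pooling n-gram counts across captions, rather than by averaging sentence scores like these.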

Keywords: decoder, encoder, many-to-many mapping, video captioning, 2-gram BLEU

Procedia PDF Downloads 66
279 The Folk Influences in the Melody of Romanian and Serbian Church Music

Authors: Eudjen Cinc

Abstract:

The common Byzantine origins of the church music of Serbs and Romanians are certainly not the only reason for the great similarities between the ways of singing of the two nations, especially in the region of Banat. If that were so, the differences between the interpretation of church music in this part of the Orthodox world and the one specific to other parts where Serbs or Romanians live could not be explained. What is it that connects the church singing of the two nations in this peaceful part of Europe to such an extent that it could be considered a comprehensive corpus, different from other 'Serbian' or 'Romanian' regions? This is the main issue dealt with in the text, through examples and comparative processing of material. The main aim of the paper is to present the new and interesting, while its value lies in its potential to encourage the reader or a future researcher to investigate and search further.

Keywords: folk influences, melody, melodic models, ethnomusicology

Procedia PDF Downloads 230
278 L2 Reading in Distance Education: Analysis of Students' Reading Attitude and Interests

Authors: Ma. Junithesmer, D. Rosales

Abstract:

The study is a baseline description of students’ attitude towards and interests in L2 reading in a state university in the Philippines that uses distance education as a delivery mode. Most research conducted in this area has dealt with the analysis of reading in a traditional school set-up. For this reason, this research set out to discover what students’ preferences, interests and attitude reveal about L2 reading in a non-traditional set-up. The corpus of this study comprised the literature and studies about reading, preferred technological devices, titles of books and authors, and reading medium (traditional/print and electronic books), juxtaposed with students’ interests and feelings when reading at home and in school, and their views about their strengths and weaknesses as readers.

Keywords: distance education, L2 reading, reading, reading attitude

Procedia PDF Downloads 316
277 Using Music: An Effective Medium of Teaching Vocabulary in ESL Classroom

Authors: Takwa Jahan

Abstract:

Music can be used in the ESL classroom to create a learning environment. As the literature abounds with positive statements, music can be used as a vehicle for second language acquisition. Music can be applied as an instrument to help second language learners acquire vocabulary, grammar, spelling and the four skills, and to expand cultural knowledge. Vocabulary learning is perceived as boring by learners. As listening to music and singing songs are enjoyable to students, they can be used effectively to acquire vocabulary in a second language. This paper reports a study to find out how music enhances vocabulary acquisition as the learners stay relaxed and learning thus becomes more enjoyable. For this research, two groups of fifty students, a music group and a non-music group, were formed. Data were collected through class observation, tests, questionnaires, and interviews. The findings show that the music group acquired more vocabulary than the non-music group. They enjoyed vocabulary learning activities based on listening to songs.

Keywords: effective instrument, ESL classroom, music, relax environment, vocabulary learning

Procedia PDF Downloads 340