Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 223

Search results for: syntactic priming

13 A Comparative Study of Motion Events Encoding in English and Italian

Abstract:

The aim of this study is to investigate the degree of cross-linguistic and intra-linguistic variation in the encoding of motion events (MEs) in English and Italian, two typologically different languages that both show signs of deviating from their respective types. The traditional typological classification of ME encoding divides languages into two macro-types, based on the preferred locus for the expression of Path, the main ME component (the other components being Figure, Ground and Manner), which is characterized by conceptual and structural prominence. According to this model, Satellite-framed (SF) languages typically express Path information in verb-dependent items called satellites (e.g. preverbs and verb particles), with main verbs encoding Manner of motion, whereas Verb-framed (VF) languages tend to include Path information within the verbal locus, leaving Manner to adjuncts. Although this dichotomy is valid on the whole, languages do not always behave according to their typical classification patterns. English, for example, is usually ascribed to the SF type due to the rich inventory of postverbal particles and phrasal verbs used to express spatial relations (e.g. the cat climbed down the tree); nevertheless, it is not uncommon to find constructions such as the fog descended slowly, which is typical of the VF type. Conversely, Italian is usually described as being VF (cf. Paolo uscì di corsa ‘Paolo went out running’), yet SF constructions like corse via in lacrime ‘She ran away in tears’ are also frequent. This paper aims to demonstrate that such typological overlap arises because the semantic units making up MEs are distributed across several loci of the sentence, not only verbs and satellites, thus giving rise to a number of different constructions stemming from convergent factors. 
Indeed, the linguistic expression of motion events depends not only on the typological nature of languages in a traditional sense, but also on a series of morphological, lexical, and syntactic resources, as well as on inferential, discursive, usage-related, and cultural factors that make semantic information more or less accessible, frequent, and easy to process. Hence, rather than describing English and Italian in dichotomous terms, this study focuses on the investigation of cross-linguistic and intra-linguistic variation in the use of all the strategies made available by each linguistic system to express motion. Evidence for these assumptions is provided by parallel corpora analysis. The sample texts are taken from two contemporary Italian novels and their respective English translations. The 400 motion occurrences selected (200 in English and 200 in Italian) were scanned according to the MODEG (an acronym for Motion Decoding Grid) methodology, which ensures data comparability through the indexation and retrieval of combined morphosyntactic and semantic information at different levels of detail.

Keywords: construction typology, motion event encoding, parallel corpora, satellite-framed vs. verb-framed type


12 Behavioral and EEG Reactions in Native Turkic-Speaking Inhabitants of Siberia and Siberian Russians during Recognition of Syntactic Errors in Sentences in Native and Foreign Languages

Authors: Tatiana N. Astakhova, Alexander E. Saprygin, Tatyana A. Golovko, Alexander N. Savostyanov, Mikhail S. Vlasov, Natalia V. Borisova, Alexandera G. Karpova, Urana N. Kavai-ool, Elena D. Mokur-ool, Nikolay A. Kolchanov, Lubomir I. Aftanas

Abstract:

The aim of the study is to compare behavioral and EEG reactions in Turkic-speaking inhabitants of Siberia (Tuvinians and Yakuts) and Russians during the recognition of syntax errors in native and foreign languages. Sixty-three healthy indigenous inhabitants of the Tyva Republic, 29 inhabitants of the Sakha (Yakutia) Republic, and 55 Russians from Novosibirsk participated in the study. All participants completed a linguistic task in which they had to find a syntax error in written sentences. Russian participants completed the task in Russian and in English. Tuvinian and Yakut participants completed the task in Russian, English, and Tuvinian or Yakut, respectively. EEGs were recorded while participants solved the tasks. For Russian participants, EEGs were recorded using 128 channels. The electrodes were placed according to the extended International 10-10 system, and the signals were amplified using Neuroscan (USA) amplifiers. For Tuvinians and Yakuts, EEGs were recorded using 64 channels and Brain Products (Germany) amplifiers. In all groups, 0.3-100 Hz analog filtering and a sampling rate of 1000 Hz were used. Response speed and the accuracy of error recognition were used as parameters of behavioral reactions. The event-related potential (ERP) responses P300 and P600 were used as indicators of brain activity. In Russians, task accuracy and response speed were higher for Russian than for English. The P300 amplitudes in Russians were higher for English; the P600 amplitudes in the left temporal cortex were higher for the Russian language. Neither Tuvinians nor Yakuts showed a difference in task accuracy between Russian and their respective national languages (Tuvinian and Yakut). However, responses were faster for tasks in Russian than for tasks in their national language. Tuvinians and Yakuts showed low accuracy in English, but the response speed was higher for English than for Russian and the national languages. 
In Tuvinians, there were no differences in the P300 and P600 amplitudes or in cortical topology between Russian and Tuvinian, but there was a difference for English. In Yakuts, the P300 and P600 amplitudes and the ERP topology for Russian were the same as those observed in Russians for Russian. In Yakuts, brain reactions during Yakut and English comprehension did not differ and reflected foreign-language comprehension, while brain reactions during Russian comprehension reflected native-language comprehension. We found that the Tuvinians recognized both Russian and Tuvinian as native languages, and English as a foreign language. The Yakuts recognized both English and Yakut as foreign languages and only Russian as a native language. According to the questionnaire, both Tuvinians and Yakuts use the national language as a spoken language, whereas they do not use it for writing. This may well be why Yakuts perceive written Yakut as a foreign language and written Russian as their native language.
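The ERP measures mentioned in the abstract can be illustrated with a minimal sketch. It assumes single-channel epochs stored as plain lists of voltages sampled at 1000 Hz (the rate reported above) and beginning at stimulus onset. The function names are hypothetical, and any latency windows passed in (for example, roughly 250-500 ms for P300 or 500-800 ms for P600) are common textbook choices, not values taken from the study.

```python
def average_epochs(epochs):
    """Average stimulus-locked epochs sample by sample to obtain an ERP waveform."""
    n_epochs = len(epochs)
    return [sum(samples) / n_epochs for samples in zip(*epochs)]


def mean_amplitude(erp, start_ms, end_ms, sampling_rate_hz=1000, epoch_start_ms=0):
    """Mean voltage of the ERP inside a latency window (window bounds are assumptions)."""
    first = (start_ms - epoch_start_ms) * sampling_rate_hz // 1000
    last = (end_ms - epoch_start_ms) * sampling_rate_hz // 1000
    window = erp[first:last]
    return sum(window) / len(window)
```

Comparing such window means between conditions (for instance, native versus foreign language sentences) is one simple way to express an amplitude difference; the authors' actual analysis pipeline is not described in the abstract.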

Keywords: EEG, language comprehension, native and foreign languages, Siberian inhabitants


11 Cross-Language Variation and the ‘Fused’ Zone in Bilingual Mental Lexicon: An Experimental Research

Authors: Yuliya E. Leshchenko, Tatyana S. Ostapenko

Abstract:

Language variation is a widespread linguistic phenomenon which can affect different levels of a language system: phonological, morphological, lexical, syntactic, etc. The scope of possible standard alternations within a particular language is limited by its norms and regulations, which set more or less clear boundaries for what is and is not possible for the speakers. The possibility of lexical variation (alternate usage of lexical items within the same contexts) is based on the fact that the meanings of words are not clearly and rigidly defined in the consciousness of the speakers. Therefore, lexical variation is usually connected with an unstable relationship between words and their referents: a case when a particular lexical item refers to different types of referents, or when a particular referent can be named by various lexical items. We assume that the scope of lexical variation in bilingual speech is generally wider than that observed in monolingual speech because, besides ‘lexical item – referent’ relations, it involves the possibility of cross-language variation of L1 and L2 lexical items. We use the term ‘cross-language variation’ to denote a case when two equivalent words of different languages are treated by a bilingual speaker as freely interchangeable within a common linguistic context. Unlike code-switching, which is traditionally defined as the conscious use of more than one language within one communicative act, in the case of cross-language lexical variation the speaker does not perceive the alternate lexical items as belonging to different languages and, therefore, does not notice the change of language code. In the paper, the authors present research on lexical variation in adult Komi-Permyak – Russian bilingual speakers. 
The two languages co-exist on the territory of the Komi-Permyak District in Russia (Komi-Permyak as the ethnic language and Russian as the official state language), are usually acquired from birth in natural linguistic environment and, according to the data of sociolinguistic surveys, are both identified by the speakers as coordinate mother tongues. The experimental research demonstrated that alternation of Komi-Permyak and Russian words within one utterance/phrase is highly frequent both in speech perception and production. Moreover, our participants estimated cross-language word combinations like ‘маленькая /Russian/ нывка /Komi-Permyak/’ (‘a little girl’) or ‘мунны /Komi-Permyak/ домой /Russian/’ (‘go home’) as regular/habitual, containing no violation of any linguistic rules and being equally possible in speech as the equivalent intra-language word combinations (‘учöтик нывка’ /Komi-Permyak/ or ‘идти домой’ /Russian/). All the facts considered, we claim that constant concurrent use of the two languages results in the fact that a large number of their words tend to be intuitively interpreted by the speakers as lexical variants not only related to the same referent, but also referring to both languages or, more precisely, to none of them in particular. Consequently, we can suppose that bilingual mental lexicon includes an extensive ‘fused’ zone of lexical representations that provide the basis for cross-language variation in bilingual speech.

Keywords: bilingualism, bilingual mental lexicon, code-switching, lexical variation


10 Assessment of DNA Sequence Encoding Techniques for Machine Learning Algorithms Using a Universal Bacterial Marker

Authors: Diego Santibañez Oyarce, Fernanda Bravo Cornejo, Camilo Cerda Sarabia, Belén Díaz Díaz, Esteban Gómez Terán, Hugo Osses Prado, Raúl Caulier-Cisterna, Jorge Vergara-Quezada, Ana Moya-Beltrán

Abstract:

The advent of high-throughput sequencing technologies has revolutionized genomics, generating vast amounts of genetic data that challenge traditional bioinformatics methods. Machine learning addresses these challenges by leveraging computational power to identify patterns and extract information from large datasets. However, biological sequence data, being symbolic and non-numeric, must be converted into numerical formats for machine learning algorithms to process effectively. So far, some encoding methods, such as one-hot encoding or k-mers, have been explored. This work proposes additional approaches for encoding DNA sequences in order to compare them with existing techniques and determine whether they provide improvements or whether current methods offer superior results. Data from the 16S rRNA gene, a universal marker, were used to analyze eight bacterial groups that are significant in the pulmonary environment and have clinical implications. The bacterial genera included in this analysis are Prevotella, Abiotrophia, Acidovorax, Streptococcus, Neisseria, Veillonella, Mycobacterium, and Megasphaera. These data were downloaded from the NCBI database in GenBank file format, followed by a syntactic analysis to selectively extract relevant information from each file. For data encoding, a sequence normalization process was carried out as the first step. From approximately 22,000 initial data points, a subset was generated for testing purposes. Specifically, 55 sequences from each bacterial group met the length criteria, resulting in an initial sample of approximately 440 sequences. The sequences were encoded using different methods, including one-hot encoding, k-mers, Fourier transform, and Wavelet transform. Various machine learning algorithms, such as support vector machines, random forests, and neural networks, were trained to evaluate these encoding methods. 
The performance of these models was assessed using multiple metrics, including the confusion matrix, ROC curve, and F1 Score, providing a comprehensive evaluation of their classification capabilities. The results show that accuracies between encoding methods vary by up to approximately 15%, with the Fourier transform obtaining the best results for the evaluated machine learning algorithms. These findings, supported by the detailed analysis using the confusion matrix, ROC curve, and F1 Score, provide valuable insights into the effectiveness of different encoding methods and machine learning algorithms for genomic data analysis, potentially improving the accuracy and efficiency of bacterial classification and related genomic studies.
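Three of the encodings named in the abstract can be sketched as follows. This is an illustrative example only: the abstract does not say how the Fourier encoding was built, so the version here assumes a naive discrete Fourier transform applied to each base's indicator signal, and all function names are hypothetical.

```python
import cmath
from itertools import product

ALPHABET = "ACGT"


def one_hot(seq):
    """One 4-element indicator row per base; ambiguous bases (e.g. N) become all zeros."""
    return [[1 if base == letter else 0 for letter in ALPHABET] for base in seq.upper()]


def kmer_counts(seq, k=3):
    """Count overlapping k-mers in a fixed order, giving a vector of length 4**k."""
    seq = seq.upper()
    index = {"".join(p): i for i, p in enumerate(product(ALPHABET, repeat=k))}
    counts = [0] * len(index)
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # skip k-mers that contain ambiguous bases
            counts[j] += 1
    return counts


def fourier_magnitudes(seq):
    """Magnitude spectrum of each base's indicator signal (assumed encoding, naive DFT)."""
    rows = one_hot(seq)
    n = len(rows)
    spectra = []
    for channel in range(len(ALPHABET)):
        signal = [row[channel] for row in rows]
        spectra.append([
            abs(sum(x * cmath.exp(-2j * cmath.pi * f * t / n) for t, x in enumerate(signal)))
            for f in range(n)
        ])
    return spectra
```

The k-mer vector has a fixed length regardless of sequence length, whereas the one-hot and Fourier outputs grow with the sequence, which is one reason a length criterion or normalization step such as the one described above is needed before training.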

Keywords: DNA encoding, machine learning, Fourier transform, Fourier transformation


9 Impact of Elevated Temperature on Spot Blotch Development in Wheat and Induction of Resistance by Plant Growth Promoting Rhizobacteria

Authors: Jayanwita Sarkar, Usha Chakraborty, Bishwanath Chakraborty

Abstract:

Plants are constantly exposed to various abiotic and biotic stresses. Under a changing climate, plants continuously modify their physiological processes to adapt to changing environmental conditions, which profoundly affects plant-pathogen interactions. Spot blotch in wheat is a fast-rising disease in the warmer plains of South Asia, where the rise in minimum average temperature over most of the year is already affecting wheat production. Hence, the study was undertaken to explore the role of elevated temperature in spot blotch disease development and the modulation of antioxidative responses by plant growth promoting rhizobacteria (PGPR) for biocontrol of spot blotch at high temperature. Elevated temperature significantly increased the susceptibility of wheat plants to the spot blotch-causing pathogen Bipolaris sorokiniana. Two PGPR, Bacillus safensis (W10) and Ochrobactrum pseudogrignonense (IP8), isolated from the rhizospheres of wheat (Triticum aestivum L.) and blady grass (Imperata cylindrica L.), respectively, and showing in vitro antagonistic activity against Bipolaris sorokiniana, were tested for growth promotion and induction of resistance against spot blotch in wheat. GC-MS analysis showed that Bacillus safensis (W10) and Ochrobactrum pseudogrignonense (IP8) produced antifungal and antimicrobial compounds in culture. Seed priming with these two bacteria significantly increased growth, modulated antioxidative signaling, induced resistance, and eventually reduced disease incidence in wheat plants at optimum as well as elevated temperature, which was further confirmed by indirect immunofluorescence assay using a polyclonal antibody raised against Bipolaris sorokiniana. Application of the PGPR led to enhanced activities of the plant defense enzymes phenylalanine ammonia lyase, peroxidase, chitinase and β-1,3 glucanase in infected leaves. 
Immunolocalization of chitinase and β-1,3 glucanase in PGPR-primed and pathogen-inoculated leaf tissue was further confirmed by transmission electron microscopy using PAb of chitinase, β-1,3 glucanase and gold-labelled conjugates. The activity of ascorbate-glutathione redox cycle related enzymes such as ascorbate peroxidase, superoxide dismutase and glutathione reductase, along with the accumulation of antioxidants such as carotenoids, glutathione and ascorbate and of osmolytes like proline and glycine betaine, also increased during disease development in PGPR-primed plants in comparison to unprimed plants at high temperature. Real-time PCR analysis revealed enhanced expression of the defense genes chalcone synthase and phenylalanine ammonia lyase. Overexpression of heat shock proteins like HSP 70 and small HSP 26.3 and of the heat shock factor HsfA3 in PGPR-primed plants effectively protected plants against spot blotch infection at elevated temperature as compared with control plants. Our results reveal dynamic biochemical cross talk between elevated temperature and spot blotch disease development. They furthermore highlight a PGPR-mediated array of antioxidative and molecular alterations responsible for the induction of resistance against spot blotch disease at elevated temperature, which seems to be associated with up-regulation of defense genes, heat shock proteins and heat shock factors, less ROS production and membrane damage, increased expression of redox enzymes, and accumulation of osmolytes and antioxidants.

Keywords: antioxidative enzymes, defense enzymes, elevated temperature, heat shock proteins, PGPR, Real-Time PCR, spot blotch, wheat


8 The Origins of Representations: Cognitive and Brain Development

Authors: Athanasios Raftopoulos

Abstract:

In this paper, an attempt is made to explain the evolution or development of humans' representational arsenal from its humble beginnings to its modern abstract symbols. Representations are physical entities that represent something else. To represent a thing (in a general sense of “thing”) means to use in the mind or in an external medium a sign that stands for it. The sign can be used as a proxy of the represented thing when the thing is absent. Representations come in many varieties, from signs that perceptually resemble their representata to abstract symbols that are related to their representata through conventions. Relying on the distinction among indices, icons, and symbols, it is explained how symbolic representations gradually emerged from indices and icons. To understand the development or evolution of our representational arsenal, the development of the cognitive capacities that enabled the gradual emergence of representations of increasing complexity and expressive capability should be examined. The examination of these factors should rely on a careful assessment of the available empirical neuroscientific and paleo-anthropological evidence. These pieces of evidence should be synthesized to produce arguments whose conclusions provide clues concerning the developmental process of our representational capabilities. The analysis of the empirical findings in this paper shows that Homo Erectus was able to use both icons and symbols. Icons were used as external representations, while symbols were used in language. The first step in the emergence of representations is that a purely causal sensory-motor schema involved in indices is decoupled from its normal causal sensory-motor functions and serves as a representation of the object that initially called it into play. Sensory-motor schemas are tied to specific contexts of the organism-environment interactions and are activated only within these contexts. 
For a representation of an object to be possible, this scheme must be de-contextualized so that the same object can be represented in different contexts; a decoupled schema loses its direct ties to reality and becomes mental content. The analysis suggests that symbols emerged due to selection pressures of the social environment. The need to establish and maintain social relationships in ever-enlarging groups that would benefit the group was a sufficient environmental pressure to lead to the appearance of the symbolic capacity. Symbols could serve this need because they can express abstract relationships, such as marriage or monogamy. Icons, by being firmly attached to what can be observed, could not go beyond surface properties to express abstract relations. The cognitive capacities that are required for having iconic and then symbolic representations were present in Homo Erectus, which had a language that started without syntactic rules but was structured so as to mirror the structure of the world. This language became increasingly complex, and grammatical rules started to appear to allow for the construction of more complex expressions required to keep up with the increasing complexity of social niches. This created evolutionary pressures that eventually led to increasing cranial size and restructuring of the brain that allowed more complex representational systems to emerge.

Keywords: mental representations, iconic representations, symbols, human evolution


7 Yu Kwang-Chung vs. Yu Kwang-Chung: Untranslatability as the Touchstone of a Poet

Authors: Min-Hua Wu

Abstract:

The untranslatability of an established poet’s tour de force is thoroughly explored by Matthew Arnold (1822-1888). In his On Translating Homer (1861), Arnold lists the four most striking poetic qualities of Homer, namely his rapidity, plainness and directness of style and diction, plainness and directness of ideas, and nobleness. He concludes that such celebrated English translators as Cowper, Pope, Chapman, and Mr. Newman are all doomed, due to their respective failures to render the totality of the four Homeric poetic qualities. Why does poetic translation always prove to be such a mission impossible for the translator? According to Arnold, it is because there constantly exists a mist interposed between the translator’s own literary self-obsession and the objective artistic qualities that reside in the work of the original author. Foregrounding such a seemingly empowering yet actually detrimental poetic mist, he explains why the aforementioned translators fail in their attempts to bring the Homeric charm to the British reader. Drawing on Arnold’s analytical study of Homeric translation, the research attempts to bring Yu Kwang-chung the poet vis-à-vis Yu Kwang-chung the translator, with an aim not so much to find any similar mist as revealed by Arnold between his Chinese poetry and English translation as to probe into a latent and veiled literary and lingual mist interposed between Chinese and English, if not between Chinese and English literatures. The major works analyzed in this study are Yu’s own Chinese poetry and his own English translations collected in The Night Watchman: Yu Kwang-chung 1958-2004. The research argues that the following critical elements that characterize Yu’s poetics are to a certain extent 'transformed,' if not 'lost,' in his English translation: a. 
the Chinese pictographic and ideographic unit terms which so unfailingly characterize the poet’s incredible creativity, allowing him to habitually and conveniently coin concrete textual images or word-scapes almost at his own will; b. the subtle wordplay and punning which appear at a reasonable frequency; c. the parallel contrastive repetitive syntactic structure within a single poetic line; d. the ambiguous and highly associative diction in the adjective and noun categories; e. the literary allusion that harks back to the old times of Chinese literature; f. the alliteration that adds rhythm and smoothness to the lines; g. the rhyming patterns that bring about impressive sonority and lingering echo to the ears of the reader; h. the grandeur-imposing and sublimity-arousing word-scaping which hinges on the employment of verbs; i. the meandering cultural heritage that embraces such elements as Chinese medicine and kung fu; and j. other similar features. Once we appeal to the Arnoldian tribunal and resort to the strict standards of such a Victorian cultural and literary critic who insists 'to see the object as in itself it really is,' we may serve as a potential judge for the tug of war between Yu Kwang-chung the poet and Yu Kwang-chung the translator, a tug of war that will not merely broaden our understanding of Chinese poetics but deepen our apprehension of Chinese-English translatology.

Keywords: Yu Kwang-chung, The Night Watchman, poetry translation, Chinese-English translation, translation studies, Matthew Arnold


6 Perspective Shifting in the Elicited Language Production Can Defy with Aging

Authors: Tuyuan Cheng

Abstract:

As we age, many things become more difficult. Among the affected abilities are linguistic and cognitive ones. Competing theories hold either that these two functions diminish together or that one is selectively affected by the other. In other words, some propose that aging affects sentence production in the same way it affects sentence comprehension and other cognitive functions, while others argue that it does not. To address this question, the current investigation examines a critical aspect of sentences as well as of cognitive abilities: the syntactic complexity and the number of perspective shifts contained in elicited production. Healthy non-pathological aging is often characterized by cognitive and neural decline in a number of cognitive abilities. Although language is assumed to be one of the more stable domains, a variety of findings in the cognitive aging literature suggest otherwise. Older adults often show deficits in language production and in multiple aspects of comprehension. Nevertheless, while some age differences likely reflect cognitive decline, others might reflect changes in communicative goals, and some even display cognitive advantages. In the domain of language processing, research efforts have relied on tests that probe a variety of communicative abilities. In general, there exists a distinction: comprehension seems to be selectively spared, while production is not. The current study raises a novel question and investigates whether aging affects the production of relative clauses (RCs) under the cognitive factor of perspective shifts. According to the Perspective Hypothesis (MacWhinney, 2000, 2005), our cognitive processes build upon a fundamental system of perspective-taking, and language provides a series of cues to facilitate the construction and shifting of perspectives. These cues include a wide variety of constructions, including RC structures. 
In this regard, linguistic complexity can be determined by the number of perspective shifts, and the processing difficulties of RCs can be interpreted within the theory of perspective shifting. Two experiments were conducted to study language production under controlled conditions. In Experiment 1, older healthy participants were tested on standard measures of cognitive aging, including the MMSE (Mini-Mental State Examination), the ToMI-2 (a simplified Theory of Mind Inventory-2), and a perspective-shifting comprehension task programmed with E-Prime. The results were analyzed to examine whether and how they correlate with the aging participants’ subsequent production data. In Experiment 2, the production profiles of two RC types, subject RCs (SRCs) vs. object RCs (ORCs), were collected from healthy aging participants performing a picture elicitation task. Items containing 0, 1, or 2 perspective shifts were paired with the pictures and presented in counterbalanced order for elicitation. In parallel, a control group of young adults was recruited to examine the linguistic and cognitive abilities in question. The results lead us to discuss whether aging affects RC production in a manner determined by its semantic structure, by the number of perspective shifts it contains, or by the status of participants’ mental understanding. The major findings are: (1) Elders’ production of Chinese RC types did not display an intrinsic difficulty asymmetry. (2) RC types (the linguistic structural features) and the cognitive perspective shifts jointly play important roles in the elders’ RC production. (3) RC production may defy aging when cognitive ability is flexibly preserved.

Keywords: cognition aging, perspective hypothesis, perspective shift, relative clauses, sentence complexity


5 The Istrian Istrovenetian-Croatian Bilingual Corpus

Authors: Nada Poropat Jeletic, Gordana Hrzica

Abstract:

Bilingual conversational corpora represent a meaningful and the most comprehensive data source for investigating genuine contact phenomena in non-monitored bilingual speech production. They can be particularly useful for bilingual research since some features of bilingual interaction can hardly be accessed with more traditional methodologies (e.g., elicitation tasks). The method of language sampling provides the resources for describing language interaction in a bilingual community and/or in bilingual situations (e.g. code-switching, amount of languages used, number of languages used, etc.). To capture these phenomena in genuine communication situations, such sampling should be as close as possible to spontaneous communication. Bilingual spoken corpus design is methodologically demanding. Therefore, this paper aims at describing the methodological challenges that apply to the design of the conversational Istrian Istrovenetian-Croatian Bilingual Corpus. Croatian is the first official language of the officially bilingual (Croatian-Italian) Istria County, while Istrovenetian is a diatopic subvariety of Venetian, a long-lasting lingua franca in the Istrian peninsula, the mother tongue of the members of the Italian National Community in Istria and the primary code of informal everyday communication among the Istrian Italophone population. Within the CLARIN infrastructure, TalkBank is being used, as it provides relevant procedures for designing and analyzing bilingual corpora. Furthermore, its public availability allows for easy replication of studies and cumulative progress as a research community builds up around the corpus, while the tools developed within the field of corpus linguistics enable easy retrieval and analysis of information. The method of language sampling employed is kept at the level of spontaneous communication, in order to maximise the naturalness of the collected conversational data. 
All speakers have provided written informed consent in which they agree to be recorded at a random point within the period of one month after signing the consent. Participants are administered a background questionnaire providing information about their socioeconomic status and about language exposure and usage in the participants' social networks. Recordings are being transcribed, phonologically adapted within a standard-sized orthographic form, coded and segmented (speech streams are being segmented into communication units based on syntactic criteria), and marked following the CHAT transcription system and its associated CLAN suite of programmes within the TalkBank toolkit. The corpus currently consists of transcribed sound recordings of 36 bilingual speakers, while the target is to publish the whole corpus by the end of 2020, by sampling spontaneous conversations among approximately 100 speakers from all the bilingual areas of Istria to ensure representativeness (the participants are being recruited across three generations of native bilingual speakers in all the bilingual areas of the peninsula). Conversational corpora are still rare in TalkBank, so the Corpus will contribute to BilingBank as a highly relevant and scientifically reliable resource for an internationally established and active research community. Research on communities with societal bilingualism will contribute to the growing body of research on bilingualism and multilingualism, especially regarding topics of language dominance, language attrition and loss, interference, code-switching, etc.

Keywords: conversational corpora, bilingual corpora, code-switching, language sampling, corpus design methodology


4 Forming Form, Motivation and Their Biolinguistic Hypothesis: The Case of Consonant Iconicity in Tashelhiyt Amazigh and English

Authors: Noury Bakrim

Abstract:

When dealing with motivation/arbitrariness, forming form (Forma Formans) and morphodynamics are to be grasped as relevant implications of enunciation/enactment, schematization within the specificity of language as sound/meaning articulation. Thus, the fact that a language is a form does not contradict stasis/dynamic enunciation (reflexivity vs double articulation). Moreover, some languages exemplify the role of the forming form, uttering, and schematization (roots in Semitic languages, the Chinese case). Beyond the evolutionary biosemiotic process (form/substance bifurcation, the split between realization/representation), non-isomorphism/asymmetry between linguistic form/norm and linguistic realization (phonetics for instance) opens up a new horizon problematizing the role of Brain – sensorimotor contribution in the continuous forming form. Therefore, we hypothesize biotization as both process/trace co-constructing motivation/forming form. Henceforth, referring to our findings concerning distribution and motivation patterns within Berber written texts (pulse based obstruents and nasal-lateral levels in poetry) and oral storytelling (consonant intensity clustering in quantitative and semantic/prosodic motivation), we understand consonant clustering, motivation and schematization as a complex phenomenon partaking in patterns of oral/written iconic prosody and reflexive metalinguistic representation opening the stable form. We focus our inquiry on both Amazigh and English clusters (/spl/, /spr/) and iconic consonant iteration in [gnunnuy] (to roll/tumble), [smummuy] (to moan sadly or crankily). For instance, the syllabic structures of /splaeʃ/ and /splaet/ imply an anamorphic representation of the state of the world: splash, impact on aquatic surfaces/splat impact on the ground. 
The pair has stridency and distribution as distinctive features which specify its phonetic realization (and a part of its meaning): /ʃ/ is [+ strident] and /t/ is [+ distributed] on the vocal tract. Schematization is then a process relating both physiology/code as an arthron vocal/bodily, vocal/practical shaping of the motor-articulatory system, leading to syntactic/semantic thematization (agent/patient roles in /spl/, /sm/ and other clusters or the tense uvular /qq/ at the initial position in Berber). Furthermore, the productivity of serial syllable sequencing in Berber points to different forms of expressivity. We postulate two components of motivated formalization: i) the process of memory paradigmatization relating to sequence modeling under sensorimotor/verbal specific categories (production/perception), ii) the process of phonotactic selection: prosodic unconscious/subconscious distribution by virtue of iconicity. Based on multiple tests including a questionnaire, phonotactic/visual recognition and oral/written reproduction, we aim at patterning/conceptualizing consonant schematization and motivation among EFL and Amazigh (Berber) learners and speakers, integrating biolinguistic hypotheses.

Keywords: consonant motivation and prosody, language and order of life, anamorphic representation, represented representation, biotization, sensori-motor and brain representation, form, formalization and schematization

Procedia PDF Downloads 143

3 Linguistic Insights Improve Semantic Technology in Medical Research and Patient Self-Management Contexts

Authors: William Michael Short

Abstract:

‘Semantic Web’ technologies such as the Unified Medical Language System Metathesaurus, SNOMED-CT, and MeSH have been touted as transformational for the way users access online medical and health information, enabling both the automated analysis of natural-language data and the integration of heterogeneous health-related resources distributed across the Internet through the use of standardized terminologies that capture concepts and relationships between concepts that are expressed differently across datasets. However, the approaches that have so far characterized ‘semantic bioinformatics’ have not yet fulfilled the promise of the Semantic Web for medical and health information retrieval applications. This paper argues within the perspective of cognitive linguistics and cognitive anthropology that four features of human meaning-making must be taken into account before the potential of semantic technologies can be realized for this domain. First, many semantic technologies operate exclusively at the level of the word. However, texts convey meanings in ways beyond lexical semantics. For example, transitivity patterns (distributions of active or passive voice) and modality patterns (configurations of modal constituents like may, might, could, would, should) convey experiential and epistemic meanings that are not captured by single words. Language users also naturally associate stretches of text with discrete meanings, so that whole sentences can be ascribed senses similar to the senses of words (so-called ‘discourse topics’). Second, natural language processing systems tend to operate according to the principle of ‘one token, one tag’. For instance, occurrences of the word sound must be disambiguated for part of speech: in context, is sound a noun or a verb or an adjective? In syntactic analysis, deterministic annotation methods may be acceptable. 
But because natural language utterances are typically characterized by polyvalency and ambiguities of all kinds (including intentional ambiguities), such methods leave the meanings of texts highly impoverished. Third, ontologies tend to be disconnected from everyday language use and so struggle in cases where single concepts are captured through complex lexicalizations that involve profile shifts or other embodied representations. More problematically, concept graphs tend to capture ‘expert’ technical models rather than ‘folk’ models of knowledge and so may not match users’ common-sense intuitions about the organization of concepts in prototypical structures rather than Aristotelian categories. Fourth, and finally, most ontologies do not recognize the pervasively figurative character of human language. However, since the time of Galen the widespread use of metaphor in the linguistic usage of both medical professionals and lay persons has been recognized. In particular, metaphor is a well-documented linguistic tool for communicating experiences of pain. Because semantic medical knowledge-bases are designed to help capture variations within technical vocabularies – rather than the kinds of conventionalized figurative semantics that practitioners as well as patients actually utilize in clinical description and diagnosis – they fail to capture this dimension of linguistic usage. The failure of semantic technologies in these respects degrades the efficiency and efficacy not only of medical research, where information retrieval inefficiencies can lead to direct financial costs to organizations, but also of care provision, especially in contexts of patients’ self-management of complex medical conditions.

Keywords: ambiguity, bioinformatics, language, meaning, metaphor, ontology, semantic web, semantics

Procedia PDF Downloads 132

2 VIAN-DH: Computational Multimodal Conversation Analysis Software and Infrastructure

Authors: Teodora Vukovic, Christoph Hottiger, Noah Bubenhofer

Abstract:

The development of VIAN-DH aims at bridging two linguistic approaches: conversation analysis/interactional linguistics (IL), so far a dominantly qualitative field, and computational/corpus linguistics with its quantitative and automated methods. Contemporary IL investigates the systematic organization of conversations and interactions composed of speech, gaze, gestures, and body positioning, among others. This highly integrated multimodal behaviour is analysed on the basis of video data, with the aim of uncovering so-called “multimodal gestalts”, patterns of linguistic and embodied conduct that recur in specific sequential positions and are employed for specific purposes. Multimodal analyses (and other disciplines using videos) have so far depended on time- and resource-intensive manual transcription of each component from video materials. Automating these tasks requires advanced programming skills, which are often outside the scope of IL. Moreover, the use of different tools makes the integration and analysis of different formats challenging. Consequently, IL research often deals with relatively small samples of annotated data, which are suitable for qualitative analysis but not sufficient for making generalized empirical claims derived quantitatively. VIAN-DH aims to create a workspace where the many annotation layers required for the multimodal analysis of videos can be created, processed, and correlated in one platform. VIAN-DH will provide a graphical interface that operates state-of-the-art tools for automating parts of the data processing. The integration of tools that already exist in computational linguistics and computer vision facilitates data processing for researchers lacking programming skills, speeds up the overall research process, and enables the processing of large amounts of data. 
The main features to be introduced are automatic speech recognition for the transcription of language, automatic image recognition for the extraction of gestures and other visual cues, and grammatical annotation for adding morphological and syntactic information to the verbal content. In the current instance of VIAN-DH, we focus on gesture extraction (pointing gestures, in particular), making use of existing models created for sign language and adapting them for this specific purpose. In order to view and search the data, VIAN-DH will provide a unified format and enable the import of the main existing formats of annotated video data and the export to other formats used in the field, while integrating different data source formats in such a way that they can be combined in research. VIAN-DH will adapt querying methods from corpus linguistics to enable parallel search of many annotation levels, combining token-level and chronological search for various types of data. VIAN-DH strives to bring crucial and potentially revolutionary innovation to the field of IL (which can also extend to other fields using video materials). It will allow large amounts of data to be processed automatically and quantitative analyses to be implemented and combined with the qualitative approach. It will facilitate the investigation of correlations between linguistic patterns (lexical or grammatical) and conversational aspects (turn-taking or gestures). Users will be able to automatically transcribe and annotate visual, spoken and grammatical information from videos, to correlate those different levels, and to perform queries and analyses.
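As a rough illustration of what combining token-level and chronological search across annotation levels could look like, the following Python sketch stores every annotation as a time-stamped record and returns pairs from two layers that overlap in time. The `Annotation` record, the layer names, and the `co_occurring` function are hypothetical and chosen for illustration only; they are not VIAN-DH's data model or query interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One time-aligned annotation (hypothetical record, not a VIAN-DH type)."""
    layer: str    # e.g. "token", "pos", "gesture"
    start: float  # start time in seconds
    end: float    # end time in seconds
    value: str    # e.g. the word form or a gesture label


def overlaps(a: Annotation, b: Annotation) -> bool:
    """True if the two annotations share a non-empty stretch of time."""
    return a.start < b.end and b.start < a.end


def co_occurring(annotations, layer_a, value_a, layer_b, value_b):
    """Return (a, b) pairs where a matches (layer_a, value_a), b matches
    (layer_b, value_b), and the two overlap in time."""
    left = [x for x in annotations if x.layer == layer_a and x.value == value_a]
    right = [x for x in annotations if x.layer == layer_b and x.value == value_b]
    return [(a, b) for a in left for b in right if overlaps(a, b)]


if __name__ == "__main__":
    # Invented toy data: three tokens and one pointing gesture.
    data = [
        Annotation("token", 0.0, 0.4, "look"),
        Annotation("token", 0.4, 0.9, "there"),
        Annotation("token", 3.0, 3.4, "there"),
        Annotation("gesture", 0.5, 1.2, "pointing"),
    ]
    for token, gesture in co_occurring(data, "token", "there", "gesture", "pointing"):
        print(token.value, token.start, "co-occurs with", gesture.value)
```

The nested comparison is quadratic in the number of matches; a real system working on large corpora would more plausibly index annotations by time, but the simple form keeps the idea visible.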

Keywords: multimodal analysis, corpus linguistics, computational linguistics, image recognition, speech recognition

Procedia PDF Downloads 108

1 A Parallel Cellular Automaton Model of Tumor Growth for Multicore and GPU Programming

Authors: Manuel I. Capel, Antonio Tomeu, Alberto Salguero

Abstract:

Tumor growth from a transformed cancer cell up to a clinically apparent mass spans a range of spatial and temporal magnitudes. Through computer simulations, Cellular Automata (CA) can accurately describe the complexity of the development of tumors. Tumor development prognosis can now be made, without making patients undergo annoying medical examinations or painful invasive procedures, if we develop appropriate CA-based software tools. In silico testing mainly refers to Computational Biology research studies with application to clinical actions in Medicine. Establishing sound computer-based models of cellular behavior certainly reduces costs and saves precious time with respect to carrying out experiments in vitro at labs or in vivo with living cells and organisms. These models aim to produce scientifically relevant results compared to traditional in vitro testing, which is slow, expensive, and does not generally have acceptable reproducibility under the same conditions. For speeding up computer simulations of cellular models, the specific literature shows recent proposals based on the CA approach that include advanced techniques, such as the clever use of efficient supporting data structures when modeling with deterministic stochastic cellular automata. Multiparadigm and multiscale simulation of tumor dynamics is just beginning to be developed by the concerned research community. The use of stochastic cellular automata (SCA), whose parallel programming implementations can yield high computational performance, is of much interest and should be explored up to its computational limits. There have been some approaches based on optimizations to advance multiparadigm models of tumor growth, which mainly seek to improve the performance of these models by guaranteeing efficient memory accesses or by considering the dynamic evolution of the memory space (grids, trees, …) that holds crucial data in simulations. 
In our opinion, the optimizations mentioned above are not decisive enough to achieve the high-performance computing power that cell-behavior simulation programs actually need. The possibility of using multicore and GPU parallelism as a promising multiplatform framework for developing new programming techniques that reduce the computation time of simulations has only started to be explored in the last few years. This paper presents a model that incorporates parallel processing, identifying the synchronization necessary for speeding up tumor growth simulations implemented in Java and C++ programming environments. The speed-up achieved by specific parallel syntactic constructs, such as executors (thread pools) in Java, is studied. The new parallel tumor growth model is tested using implementations in Java and C++ on two different platforms: an Intel Core i-X chipset and an HPC cluster of processors at our university. The parallelization of the Poleszczuk and Enderling model (normally used by researchers in mathematical oncology) proposed here is analyzed with respect to performance gain. We intend to apply the model and overall parallelization technique presented here to solid tumors of specific affiliation such as prostate, breast, or colon. Our final objective is to set up a multiparadigm model capable of modelling angiogenesis, or the growth inhibition induced by chemotaxis, as well as the effect of therapies based on the presence of cytotoxic/cytostatic drugs.
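The kind of synchronization the abstract refers to can be sketched with a toy stochastic cellular automaton. The example below is written in Python rather than the Java and C++ of the paper, with `ThreadPoolExecutor` standing in for Java executors; the growth rule, the function names (`update_rows`, `simulate`), and all parameters are assumptions made for illustration and do not reproduce the authors' model. Each worker reads the old grid and writes only its own strip of rows in a new grid, and waiting on all futures acts as the barrier between generations.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def update_rows(old, new, row_start, row_end, p_divide, seed):
    """Fill rows [row_start, row_end) of `new` from the read-only grid `old`."""
    rng = random.Random(seed)  # per-strip generator keeps runs reproducible
    size = len(old)
    for r in range(row_start, row_end):
        for c in range(size):
            if old[r][c] == 1:
                new[r][c] = 1  # toy rule: tumor cells never die
                continue
            # Count occupied cells in the Moore neighbourhood.
            k = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if (dr or dc) and 0 <= rr < size and 0 <= cc < size:
                        k += old[rr][cc]
            # The site is colonised if at least one neighbour divides into it.
            p_occupied = 1.0 - (1.0 - p_divide) ** k
            new[r][c] = 1 if rng.random() < p_occupied else 0


def simulate(size=21, steps=5, workers=4, p_divide=0.3, seed=0):
    """Grow a tumor from one central cell; returns the final grid."""
    grid = [[0] * size for _ in range(size)]
    grid[size // 2][size // 2] = 1
    bounds = [(i * size // workers, (i + 1) * size // workers)
              for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for step in range(steps):
            new = [[0] * size for _ in range(size)]
            futures = [
                pool.submit(update_rows, grid, new, lo, hi, p_divide,
                            (seed * 1000 + step) * 1000 + i)
                for i, (lo, hi) in enumerate(bounds)
            ]
            for f in futures:
                f.result()  # barrier: every strip must finish before the swap
            grid = new
    return grid


if __name__ == "__main__":
    final = simulate()
    print("tumor cells after 5 steps:", sum(map(sum, final)))
```

Because each empty site "pulls" from its neighbours instead of each tumor cell "pushing" into one, no two workers ever write the same cell and no locks are needed. In CPython the global interpreter lock prevents a real speed-up for this CPU-bound loop, so the sketch shows only the partitioning and barrier structure, not the performance gains studied in the paper.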

Keywords: cellular automaton, tumor growth model, simulation, multicore and manycore programming, parallel programming, high performance computing, speed up

Procedia PDF Downloads 244