Search results for: nouns
7 Information-Controlled Laryngeal Feature Variations in Korean Consonants
Authors: Ponghyung Lee
Abstract:
This study investigates variations in Korean consonants, centering on the laryngeal features of the sounds concerned, to the exclusion of other features. Our fundamental premise is that the weak contrast associated with these segments may account for the instability of the consonants concerned. Moreover, we assume that a set of measures of the communicative efficiency of linguistic units significantly influences the triggering of those variations. To this end, we computed the surprisal, entropic contribution, and relative contrastiveness associated with Korean obstruent consonants. We found that the information-theoretic perspective lends considerable support to our approach. That is, the variant realizations, both chronological and stylistic, prove to be profoundly affected by the information-theoretic factors enumerated above. For the biblical proper names, we use the Georgetown University CQP Web-Bible corpora. From 8 of the 64 texts (4 from the Old Testament and 4 from the New Testament), we extracted 199 samples. We address the issue of laryngeal feature variations associated with Korean obstruent consonants under the presumption that the variations stem from the weak contrast among the three manifestations of laryngeal features. The variants emerge from diverse sources in chronological and stylistic senses: Christian biblical texts, ordinary casual speech, the shift of loanword adaptation over time, and ideophones. To discuss them from the perspective of information theory, it is necessary to look closely at the data. Among them, the massive changes in the loanword adaptation of proper nouns over the century-long history of Korean Christianity draw our special attention.
We identified 199 types of initially capitalized words among 45,528 word tokens, which account for around 5% of the total 901,701 word tokens (12,786 word types) in the Georgetown University CQP Web-Bible corpora. We focus on the shift in the laryngeal features of word-initial consonants, which is observable across two distinct versions of the Korean Bible: one published in the 1960s for Protestants, and the other published in the 1990s for the Catholic Church. Of these proper names, we have closely traced the adaptation of plain obstruents, e.g. /b, d, g, s, ʤ/, in the sources. The results show that as many as 41% of the extracted proper names show variation: 37% in aspiration and 4% in tensing. This study set out to shed light on the question: to what extent can we attribute the variations in the laryngeal features of Korean obstruent consonants to the communicative aspects of linguistic activity? In this vein, the concerted effects of the triad of surprisal, entropic contribution, and relative contrastiveness can be credited with the fluctuations in feature specification, although the role of surprisal remains contentious to some extent.
Keywords: entropic contribution, laryngeal feature variation, relative contrastiveness, surprisal
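The abstract above relies on surprisal and entropy without stating formulas. As a minimal sketch, assuming the standard Shannon definitions (surprisal of a unit is -log2 of its probability; entropy is the expected surprisal), the two quantities could be computed from frequency counts as follows. The counts are hypothetical toy values, not data from the study, and the study's own "entropic contribution" and "relative contrastiveness" measures are not reproduced here.

```python
import math
from collections import Counter


def surprisal_bits(counts):
    """Return -log2(p) for each category, with p estimated from raw counts."""
    total = sum(counts.values())
    return {unit: -math.log2(n / total) for unit, n in counts.items()}


def entropy_bits(counts):
    """Return the Shannon entropy (expected surprisal) of the distribution."""
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical toy counts for a three-way laryngeal contrast (not study data).
toy_counts = Counter({"plain": 4, "aspirated": 2, "tense": 2})

surprisal = surprisal_bits(toy_counts)
entropy = entropy_bits(toy_counts)
```

With these toy counts the most frequent category carries the lowest surprisal, which is the intuition behind treating frequent, predictable segments as weakly informative.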
Procedia PDF Downloads 128
6 Quantitative, Preservative Methodology for Review of Interview Transcripts Using Natural Language Processing
Authors: Rowan P. Martnishn
Abstract:
During the execution of a National Endowment for the Arts grant, approximately 55 interviews were collected from professionals across various fields. These interviews were used to create deliverables: historical connections for creations that began as art and evolved entirely into computing technology. With dozens of hours’ worth of transcripts to be analyzed by qualitative coders, a quantitative methodology was created to sift through the documents. The initial step was to both clean and format all the data. First, a basic spelling and grammar check was applied, as well as a Python script for normalized formatting which used an open-source grammatical formatter to make the data as coherent as possible. Ten documents were randomly selected for manual review, in which words often incorrectly transcribed were recorded and then replaced throughout all other documents. Then, to remove all banter and side comments, the transcripts were spliced into paragraphs (separated by change in speaker) and all paragraphs with fewer than 300 characters were removed. Secondly, a keyword extractor, a form of natural language processing in which significant words in a document are selected, was run on each paragraph for all interviews. Every proper noun was put into a data structure corresponding to its respective interview. From there, a Bidirectional and Auto-Regressive Transformer (B.A.R.T.) summary model was applied to each paragraph that included any of the proper nouns selected from the interview. At this stage the information to review had been reduced from about 60 hours’ worth of data to 20. The data was further processed through light, manual observation: any summaries which proved to fit the criteria of the proposed deliverable were selected, as well as their locations within the document. This narrowed the data down to about 5 hours’ worth of processing.
The qualitative researchers were then able to find 8 more connections in addition to our previous 4, exceeding our minimum quota of 3 to satisfy the grant. The study's major findings and the subsequent curation of this methodology raised a conceptual point crucial to working with qualitative data of this magnitude. In the use of artificial intelligence there is a general trade-off in a model between breadth of knowledge and specificity. If the model has too much knowledge, the user risks leaving out important data (too general). If the tool is too specific, it has not seen enough data to be useful. Thus, this methodology proposes a solution to this trade-off. The data is never altered outside of grammatical and spelling checks. Instead, the important information is marked, creating an indicator of where the significant data is without compromising its purity. Secondly, the data is chunked into smaller paragraphs, giving specificity, and then cross-referenced with the keywords (allowing generalization over the whole document). This way, no data is harmed, and qualitative experts can go over the raw data instead of using highly manipulated results. Given the success in deliverable creation as well as the circumvention of this trade-off, this methodology should stand as a model for synthesizing qualitative data while maintaining its original form.
Keywords: B.A.R.T. model, keyword extractor, natural language processing, qualitative coding
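The filtering steps described in this abstract (split by change in speaker, drop paragraphs under 300 characters, keep paragraphs that mention a selected proper noun) can be sketched as below. This is an illustrative sketch only: the function names, the `(speaker, text)` tuple layout, and the plain substring match are assumptions, and the keyword extractor and B.A.R.T. summarizer used by the author are not reproduced.

```python
def split_by_speaker(turns):
    """Merge consecutive (speaker, text) turns by the same speaker into one paragraph."""
    paragraphs = []
    for speaker, text in turns:
        if paragraphs and paragraphs[-1][0] == speaker:
            paragraphs[-1] = (speaker, paragraphs[-1][1] + " " + text)
        else:
            paragraphs.append((speaker, text))
    return paragraphs


def drop_short(paragraphs, min_chars=300):
    """Remove paragraphs with fewer than min_chars characters (banter, side comments)."""
    return [(s, t) for s, t in paragraphs if len(t) >= min_chars]


def mentioning(paragraphs, proper_nouns):
    """Keep paragraphs whose text contains at least one of the given proper nouns."""
    return [(s, t) for s, t in paragraphs if any(n in t for n in proper_nouns)]


# Hypothetical transcript fragment; the original text is never altered, only selected.
turns = [
    ("A", "x" * 200),
    ("A", "Ada Lovelace " + "y" * 150),
    ("B", "ok"),
    ("A", "z" * 300),
]
paragraphs = drop_short(split_by_speaker(turns))
selected = mentioning(paragraphs, ["Ada Lovelace"])
```

Because each function only selects paragraphs and returns them unchanged, the sketch keeps the property the abstract emphasizes: the raw data is marked and narrowed, not rewritten.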
Procedia PDF Downloads 29
5 Cross Cultural Adaptation and Content Validation of the Assessment Instrument Preschooler Awareness of Stuttering Survey
Authors: Catarina Belchior, Catarina Martins, Sara Mendes, Ana Rita S. Valente, Elsa Marta Soares
Abstract:
Introduction: The negative feelings and attitudes that a person who stutters can develop are extremely relevant when considering assessment and intervention in Speech and Language Therapy. This relates to the fact that the person who stutters can experience feelings such as shame, fear and negative beliefs when communicating. Considering the complexity and importance of integrating diverse aspects in stuttering intervention, it is central to identify those emotions as early as possible. Therefore, this research aimed to translate and adapt to European Portuguese, and to analyze the content validity of, the Preschooler Awareness Stuttering Survey (Abbiati, Guitar & Hutchins, 2015), an instrument that allows the assessment of the impact of stuttering on preschool children who stutter, considering feelings and attitudes. Methodology: Cross-sectional descriptive qualitative research. The following methodological procedures were followed: translation, back-translation, panel of experts and pilot study. This abstract describes the results of the first three phases of this process. The translation was accomplished by two Speech Language Therapists (SLTs). Both professionals have more than five years of experience and are users of the English language. One of them has broad experience in the field of stuttering. Back-translation was conducted by two bilingual individuals without experience in health or any knowledge of the instrument. The panel of experts was composed of three SLTs, experts in the field of stuttering. Results and Discussion: In the translation and back-translation process it was possible to verify differences in semantic and idiomatic equivalences of several concepts and expressions, as well as the need to include new information to enhance the understanding of the application of the instrument. The meeting between the two translators and the researchers allowed the achievement of a consensus version that was used in back-translation.
Considering adaptation and content validation, the main change made by the experts was the conceptual equivalence of the questions and answers of the instrument's sheets. Considering that in the translated consensus version the questions began with various nouns such as 'is' or 'the cow' and that the answers did not contain the adverb 'much' as in the original instrument, the panel agreed that it would be more appropriate if the questions all started with 'how' and that all the answers should present the adverb 'much'. This decision was made to ensure that the translated instrument would be similar to the original and so that the results obtained could be comparable between the original and the translated instrument. One semantic equivalence between concepts was also elaborated. The panel of experts found that all other items and specificities of the instrument were adequate, concluding that the instrument is adequate for its objectives and its intended target population. Conclusion: This research aspires to diversify the existing validated resources in this scope, adding a new instrument that allows the assessment of preschool children who stutter. Consequently, it is hoped that this instrument will provide a real and reliable assessment that can lead to an appropriate therapeutic intervention according to the characteristics and needs of each child.
Keywords: stuttering, assessment, feelings and attitudes, speech language therapy
Procedia PDF Downloads 149
4 Courtyard Evolution in Contemporary Sustainable Living
Authors: Yiorgos Hadjichristou
Abstract:
The paper will focus on the strategic development deriving from the evolution of the traditional courtyard spatial organization towards a new, contemporary sustainable way of living. New sustainable approaches that embrace social issues, the notion of place, and the understanding of weather architecture, blended together with bioclimatic behaviour, will be seen through a series of experimental case studies on the island of Cyprus, inspired by and originating from its traditional wisdom, ranging from the small scale of living to urban interventions. Weather and nature will be seen as co-authors of architecture alongside architects, as argued by Jonathan Hill in his Weather Architecture discourse. Furthermore, following Pallasmaa’s understanding, the building will be seen not as an end in itself, and the elements of an architectural experience as having a verb form rather than being nouns. This will further enhance the notion of merging the subject-human and the object-building as discussed by Julio Bermudez. This will eventually generate discussion of the building as constructed according to the specifics of place and inhabitants, shaped by its physical and human topography, as referred to by Adam Sharr in relation to Heidegger’s thinking. The specificities of the divided island and the handling of sites in the vicinity of the dividing Green Line will further trigger explorations of regeneration issues and social sustainability, offering unprecedented opportunities for innovative sustainable ways of living. The above premises will lead us to develop innovative strategies for a profound sustainability, both technical and social, which yields innovative living built environments that respond to ever-changing environmental and social needs.
As a starting point, a case study in Kaimakli, Nicosia, a refurbishment and extension of a traditional house, already embraces all the traditional/vernacular wisdom of bioclimatic architecture. It aims not only to capture its direct and quite obvious bioclimatic features but also to evolve them by adapting the whole house into a contemporary living environment. To achieve this, evolutions of traditional architectural elements and spatial conditions are integrated in a way that not only responds to certain weather conditions but also integrates and blends the weather within the built environment. A series of innovations aiming at maximum flexibility is proposed. The house can finally be transformed into a winter enclosure, while for most of the year it turns into a ‘camping’ living environment. Parallel to experimental interventions in existing traditional units, we will proceed to examine the implementation of the same methodology in designing living units and complexes. Malleable courtyard organizations that attempt to blend traditional wisdom with contemporary living needs, and the weather and nature with the built environment, will be tested in both horizontal and vertical developments. A new social identity of people, directly involved and interacting with the weather and climatic conditions, will be seen as the result of balancing social with technological sustainability, and the immaterial with the material aspects of the built environment.
Keywords: building as a verb, contemporary living, traditional bioclimatic wisdom, weather architecture
Procedia PDF Downloads 419
3 Mondoc: Informal Lightweight Ontology for Faceted Semantic Classification of Hypernymy
Authors: M. Regina Carreira-Lopez
Abstract:
Lightweight ontologies seek to make concrete the union relationships between a parent node and a secondary node, also called a "child node". This logic relation (L) can be formally defined as a triple ontological relation (LO) equivalent to ⟨LN, LE, LC⟩, where LN represents a finite set of nodes (N); LE is a set of entities (E), each of which represents a relationship between nodes to form a rooted tree of ⟨LN, LE⟩; and LC is a finite set of concepts (C), encoded in a formal language (FL). Mondoc enables more refined searches on semantic and classified facets for retrieving specialized knowledge about Atlantic migrations, from the Declaration of Independence of the United States of America (1776) to the end of the Spanish Civil War (1939). The model aims to increase documentary relevance by applying an inverse frequency of co-occurrent hypernymy phenomena for a concrete dataset of textual corpora, with the RMySQL package. Mondoc profiles archival utilities implementing SQL programming code, and allows data export to XML schemas, for achieving semantic and faceted analysis of speech by analyzing keywords in context (KWIC). The methodology applies random and unrestricted sampling techniques with RMySQL to verify the resonance phenomena of inverse documentary relevance between the number of co-occurrences of the same term (t) in more than two documents of a set of texts (D). Secondly, the research shows that co-associations between (t) and its corresponding synonyms and antonyms (synsets) are also inverse. The results from grouping facets or polysemic words with synsets in more than two textual corpora within their syntagmatic context (nouns, verbs, adjectives, etc.) state how to proceed with semantic indexing of hypernymy phenomena for subject-heading lists and for authority lists for documentary and archival purposes.
Mondoc contributes to the development of web directories and seems to achieve a proper and more selective search of e-documents (classification ontology). It can also foster the production of on-line catalogs for semantic authorities, or concepts, through XML schemas, because its applications could be used for implementing data models, through prior adaptation of the base ontology to structured meta-languages such as OWL and RDF (descriptive ontology). Mondoc serves the classification of concepts and applies a semantic indexing approach to facets. It enables information retrieval, as well as quantitative and qualitative data interpretation. The model reproduces a tuple ⟨LN, LE, LT, LCF L, BKF⟩ where LN is a set of entities that connect with other nodes to form a rooted tree in ⟨LN, LE⟩. LT specifies a set of terms, and LCF acts as a finite set of concepts, encoded in a formal language, L. Mondoc resolves only partial problems of linguistic ambiguity (in the case of synonymy and antonymy); neither the pragmatic dimension of natural language nor the cognitive perspective is addressed. To achieve this goal, forthcoming programming developments should target oriented meta-languages with structured documents in XML.
Keywords: hypernymy, information retrieval, lightweight ontology, resonance
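Two techniques named in this abstract, keywords in context (KWIC) and an inverse frequency over a set of documents, can be illustrated with a short sketch. The abstract reports an implementation with SQL and the RMySQL package; the Python below is a stand-in for illustration, the function names are hypothetical, and the `idf` function uses the textbook inverse document frequency log(N/df), which may differ from Mondoc's own measure. The sample documents are invented.

```python
import math
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def kwic(text, term, window=3):
    """Return (left context, term, right context) for every occurrence of term."""
    tokens = tokenize(text)
    hits = []
    for i, tok in enumerate(tokens):
        if tok == term:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, tok, right))
    return hits


def idf(term, documents):
    """Textbook inverse document frequency: log(N / number of docs containing term)."""
    df = sum(1 for doc in documents if term in tokenize(doc))
    return math.log(len(documents) / df) if df else 0.0


# Invented sample documents, not taken from the Mondoc corpora.
docs = [
    "The ship left the port of Vigo",
    "A ship arrived in Havana",
    "Letters from emigrants",
]
concordance = kwic(docs[0], "ship", window=2)
```

A term that occurs in many documents receives a lower `idf` value than a rare one, which is the inverse relation between co-occurrence and relevance that the abstract describes.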
Procedia PDF Downloads 125
2 Describing Cognitive Decline in Alzheimer's Disease via a Picture Description Writing Task
Authors: Marielle Leijten, Catherine Meulemans, Sven De Maeyer, Luuk Van Waes
Abstract:
For the diagnosis of Alzheimer's disease (AD), a large variety of neuropsychological tests are available. In some of these tests, linguistic processing - both oral and written - is an important factor. Language disturbances might serve as a strong indicator for an underlying neurodegenerative disorder like AD. However, the current diagnostic instruments for language assessment mainly focus on product measures, such as text length or number of errors, ignoring the importance of the process that leads to written or spoken language production. In this study, we aim to describe and test differences between cognitively healthy and cognitively impaired elderly on the basis of a selection of writing process variables (inter- and intrapersonal characteristics). These process variables are mainly related to pause times, because the number, length, and location of pauses have proven to be an important indicator of the cognitive complexity of a process. Method: Participants who were enrolled in our research were chosen on the basis of a number of basic criteria necessary to collect reliable writing process data. Furthermore, we opted to match the thirteen cognitively impaired patients (8 MCI and 5 AD) with thirteen cognitively healthy elderly. At the start of the experiment, participants were each given a number of tests, such as the Mini-Mental State Examination test (MMSE), the Geriatric Depression Scale (GDS), the forward and backward digit span and the Edinburgh Handedness Inventory (EHI). Also, a questionnaire was used to collect socio-demographic information (age, gender, education) of the subjects as well as more details on their level of computer literacy. The tests and questionnaire were followed by two typing tasks and two picture description tasks. For the typing tasks participants had to copy (type) characters, words and sentences from a screen, whereas the picture description tasks each consisted of an image they had to describe in a few sentences.
Both the typing and the picture description tasks were logged with Inputlog, a keystroke logging tool that allows us to log and time stamp keystroke activity to reconstruct and describe text production processes. The main rationale behind keystroke logging is that writing fluency and flow reveal traces of the underlying cognitive processes. This explains the analytical focus on pause (length, number, distribution, location, etc.) and revision (number, type, operation, embeddedness, location, etc.) characteristics. As in speech, pause times are seen as indexical of cognitive effort. Results: Preliminary analysis already showed some promising results concerning pause times before, within and after words. For all variables, mixed effects models were used that included participants as a random effect and MMSE scores, GDS scores and word categories (such as determiners and nouns) as fixed effects. For pause times before and after words, cognitively impaired patients paused longer than healthy elderly. These variables did not show an interaction effect between the group to which participants belonged (cognitively impaired or healthy elderly) and word category. However, pause times within words did show an interaction effect, which indicates that pause times within certain word categories differ significantly between patients and healthy elderly.
Keywords: Alzheimer's disease, keystroke logging, matching, writing process
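The pause measures discussed in this abstract (pauses before, within and after words) can be derived from time-stamped keystrokes. The sketch below is an assumption-laden illustration, not Inputlog's format or algorithm: it takes a hypothetical list of `(timestamp in ms, character)` events, ignores revisions, and classifies each inter-key interval by the character types on either side.

```python
def classify_pauses(events):
    """Group inter-key intervals (ms) into before-word, within-word and after-word pauses.

    events: list of (timestamp_ms, character) tuples in typing order (hypothetical format).
    """
    pauses = {"before_word": [], "within_word": [], "after_word": []}
    for (t_prev, c_prev), (t_cur, c_cur) in zip(events, events[1:]):
        gap = t_cur - t_prev
        if c_prev.isspace() and c_cur.isalnum():
            pauses["before_word"].append(gap)   # space followed by first letter
        elif c_prev.isalnum() and c_cur.isalnum():
            pauses["within_word"].append(gap)   # letter followed by letter
        elif c_prev.isalnum() and c_cur.isspace():
            pauses["after_word"].append(gap)    # last letter followed by space
    return pauses


# Hypothetical log of someone typing "at ho".
log = [(0, "a"), (120, "t"), (400, " "), (1300, "h"), (1450, "o")]
pauses = classify_pauses(log)
```

In an analysis like the one described, each interval would additionally be tagged with the word category of the surrounding word before entering a mixed effects model; that tagging step is outside this sketch.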
Procedia PDF Downloads 366
1 Synthetic Method of Contextual Knowledge Extraction
Authors: Olga Kononova, Sergey Lyapin
Abstract:
The requirements of the global information society are transparency and reliability of data, as well as the ability to manage information resources independently, in particular to search, analyze, and evaluate information, thereby obtaining new expertise. Moreover, satisfying society's information needs increases the efficiency of enterprise management and public administration. The study of structurally organized thematic and semantic contexts of different types, automatically extracted from unstructured data, is one of the important tasks for the application of information technologies in education, science, culture, governance and business. The objectives of this study are the typologization of contextual knowledge and the selection or creation of effective tools for extracting and analyzing contextual knowledge. Explication of various kinds and forms of contextual knowledge involves the development and use of full-text search information systems. For implementation purposes, the authors use the services of the e-library 'Humanitariana', such as contextual search, different types of queries (paragraph-oriented query, frequency-ranked query), and automatic extraction of knowledge from scientific texts. The multifunctional e-library 'Humanitariana' is implemented in an Internet architecture with a WWS configuration (Web-browser / Web-server / SQL-server). The advantage of using 'Humanitariana' lies in the possibility of combining the resources of several organizations. Scholars and research groups may work in a local network mode and in distributed IT environments, with the ability to access resources on the servers of any participating organization. The paper discusses some specific cases of contextual knowledge explication with the use of the e-library services and focuses on the possibilities of new types of contextual knowledge. The experimental research base consists of scientific texts about 'e-government' and 'computer games'.
An analysis of trends in the subject-themed texts made it possible to propose a content analysis methodology that combines a full-text search with automatic construction of a 'terminogramma' and expert analysis of the selected contexts. A 'terminogramma' is laid out as a table that contains a column with a frequency-ranked list of words (nouns), as well as columns indicating the absolute frequency (number) and the relative frequency of occurrence of the word (in %% ppm). The analysis of 'e-government' materials showed that the state takes a dominant position in the processes of electronic interaction between the authorities and society in modern Russia. The media credited the main role in these processes to the government, which provided public services through specialized portals. Factor analysis revealed two factors statistically describing the terms used: human interaction (the user) and the state (government, processes organizer); interaction management (public officer, processes performer) and technology (infrastructure). Isolation of these factors will lead to changes in the model of electronic interaction between government and society. In this study, the dominant social problems and the prevalence of different categories of subjects of computer gaming in science papers from 2005 to 2015 were identified. Several types of contextual knowledge can thus be identified: micro context; macro context; dynamic context; thematic collection of queries (interactive contextual knowledge expanding the composition of e-library information resources); multimodal context (functional integration of iconographic and full-text resources through a hybrid quasi-semantic search algorithm).
Further studies can be pursued both in terms of expanding the resource base on which they are conducted, and in terms of the development of appropriate tools.
Keywords: contextual knowledge, contextual search, e-library services, frequency-ranked query, paragraph-oriented query, technologies of the contextual knowledge extraction
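The 'terminogramma' described in this abstract is a frequency-ranked table of nouns with absolute and relative frequencies. A minimal sketch of such a table follows, assuming the input is an already extracted list of nouns (part-of-speech filtering is not shown) and assuming the relative frequency is expressed in parts per million of the listed tokens; the function name and sample words are hypothetical, and the 'Humanitariana' service itself is not reproduced.

```python
from collections import Counter


def terminogramma(nouns):
    """Return rows (word, absolute frequency, relative frequency in ppm), most frequent first."""
    counts = Counter(nouns)
    total = sum(counts.values())
    # Sort by descending frequency, breaking ties alphabetically for a stable table.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(word, n, n / total * 1_000_000) for word, n in ranked]


# Hypothetical noun list standing in for nouns extracted from 'e-government' texts.
sample_nouns = ["state", "portal", "state", "service", "state", "portal"]
rows = terminogramma(sample_nouns)

for word, absolute, ppm in rows:
    print(f"{word:<10}{absolute:>5}{ppm:>14.1f}")
```

An expert would then read the top of this table to choose which terms to submit as contextual queries, which is the manual step the abstract pairs with the automatic construction.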
Procedia PDF Downloads 359