Search results for: lexical invariant
11 Modern Detection and Description Methods for Natural Plants Recognition
Authors: Masoud Fathi Kazerouni, Jens Schlemper, Klaus-Dieter Kuhnert
Abstract:
Green planet is one of the Earth’s names which is known as a terrestrial planet and also can be named the fifth largest planet of the solar system as another scientific interpretation. Plants do not have a constant and steady distribution all around the world, and even plant species’ variations are not the same in one specific region. Presence of plants is not only limited to one field like botany; they exist in different fields such as literature and mythology and they hold useful and inestimable historical records. No one can imagine the world without oxygen which is produced mostly by plants. Their influences become more manifest since no other live species can exist on earth without plants as they form the basic food staples too. Regulation of water cycle and oxygen production are the other roles of plants. The roles affect environment and climate. Plants are the main components of agricultural activities. Many countries benefit from these activities. Therefore, plants have impacts on political and economic situations and future of countries. Due to importance of plants and their roles, study of plants is essential in various fields. Consideration of their different applications leads to focus on details of them too. Automatic recognition of plants is a novel field to contribute other researches and future of studies. Moreover, plants can survive their life in different places and regions by means of adaptations. Therefore, adaptations are their special factors to help them in hard life situations. Weather condition is one of the parameters which affect plants life and their existence in one area. Recognition of plants in different weather conditions is a new window of research in the field. Only natural images are usable to consider weather conditions as new factors. Thus, it will be a generalized and useful system. In order to have a general system, distance from the camera to plants is considered as another factor. 
The other factor considered is the change of light intensity in the environment, which varies during the day. Adding these factors makes it a considerable challenge to build an accurate and secure system. Development of an efficient plant recognition system is essential. One important component of a plant is the leaf, which can be used to implement automatic plant recognition systems without any human interaction. Because of the nature of the images used, the characteristics of the plants are investigated. Leaves are the first characteristics selected as reliable parts. Four different plant species are chosen, with the goal of classifying them with an accurate system. The current paper is devoted to the principal directions of the proposed methods and the implemented system, the image dataset, and the results. The procedure of the algorithm and the classification is explained in detail. The first steps, feature detection and description of visual information, are performed using the scale-invariant feature transform (SIFT), HARRIS-SIFT, and FAST-SIFT methods. The accuracy of the implemented methods is computed. In addition to the comparison, the robustness and efficiency of the results in different conditions are investigated and explained.
Keywords: SIFT combination, feature extraction, feature detection, natural images, natural plant recognition, HARRIS-SIFT, FAST-SIFT
Procedia PDF Downloads 276
10 Convolutional Neural Network Based on Random Kernels for Analyzing Visual Imagery
Authors: Ja-Keoung Koo, Kensuke Nakamura, Hyohun Kim, Dongwha Shin, Yeonseok Kim, Ji-Su Ahn, Byung-Woo Hong
Abstract:
Machine learning techniques based on convolutional neural networks (CNNs) have been actively developed and successfully applied to a variety of image analysis tasks, including reconstruction, noise reduction, resolution enhancement, segmentation, motion estimation, and object recognition. Classical visual information processing, ranging from low-level to high-level tasks, has been widely developed in the deep learning framework. Deriving visual interpretation from high-dimensional imagery data is generally considered a challenging problem. A CNN is a class of feed-forward artificial neural network that usually consists of deep layers, the connections of which are established by a series of non-linear operations. The CNN architecture is known to be shift invariant due to its shared weights and translation invariance characteristics. However, it is often computationally intractable to optimize the network, in particular with a large number of convolution layers, because of the large number of unknowns to be optimized with respect to a training set that generally must be large enough for the model to generalize effectively. It is also necessary to limit the size of the convolution kernels because of the computational expense, despite the recent development of effective parallel processing machinery, which leads to the use of uniformly small convolution kernels throughout the deep CNN architecture. However, it is often desirable to consider different scales in the analysis of visual features at different layers in the network. Thus, we propose a CNN model where convolution kernels of different sizes are applied at each layer based on random projection. We apply random filters with varying sizes and associate the filter responses with scalar weights that correspond to the standard deviation of the random filters. This allows us to use a large number of random filters at the cost of one scalar unknown for each filter. 
The computational cost of the back-propagation procedure does not increase with larger filter sizes, even though additional computational cost is required to compute the convolution in the feed-forward procedure. The use of random kernels with varying sizes allows image features to be analyzed effectively at multiple scales, leading to better generalization. The robustness and effectiveness of the proposed CNN based on random kernels are demonstrated by numerical experiments in which well-known CNN architectures are compared quantitatively with our models, which simply replace the convolution kernels with the random filters. The experimental results indicate that our model achieves better performance with fewer unknown weights. The proposed algorithm has high potential for application to a variety of visual tasks based on the CNN framework. Acknowledgement: This work was supported by the MISP (Ministry of Science and ICT), Korea, under the National Program for Excellence in SW (20170001000011001) supervised by IITP, and NRF-2014R1A2A1A11051941, NRF2017R1A2B4006023.
Keywords: deep learning, convolutional neural network, random kernel, random projection, dimensionality reduction, object recognition
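The random-kernel idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the kernel sizes (3, 5, 7), the zero padding, and the scale values are all assumptions made for illustration. The kernels are drawn once from a unit-variance Gaussian and stay fixed, and each kernel contributes a single scalar multiplier, which is the only quantity that would be trained.

```python
import random

def random_kernel(size, rng):
    """Fixed (non-trainable) size x size kernel with unit-variance Gaussian entries."""
    return [[rng.gauss(0.0, 1.0) for _ in range(size)] for _ in range(size)]

def correlate_same(image, kernel):
    """Zero-padded 2D cross-correlation; output has the same shape as the input.
    Assumes an odd kernel size so the kernel has a well-defined centre."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for a in range(k):
                for b in range(k):
                    y, x = i + a - r, j + b - r
                    if 0 <= y < h and 0 <= x < w:
                        acc += kernel[a][b] * image[y][x]
            out[i][j] = acc
    return out

def random_kernel_layer(image, kernels, scales):
    """One feature map per kernel: scale * (kernel correlated with image).
    Only `scales` (one scalar per kernel) would be learned."""
    maps = []
    for kernel, scale in zip(kernels, scales):
        response = correlate_same(image, kernel)
        maps.append([[scale * v for v in row] for row in response])
    return maps

rng = random.Random(0)
sizes = [3, 5, 7]                      # hypothetical kernel sizes
kernels = [random_kernel(s, rng) for s in sizes]
scales = [0.5, 0.1, 0.02]              # hypothetical values of the scalar unknowns
image = [[float((i * j) % 5) for j in range(8)] for i in range(8)]
maps = random_kernel_layer(image, kernels, scales)
n_unknowns = len(scales)               # one scalar per filter
n_dense = sum(s * s for s in sizes)    # unknowns if every kernel entry were trained
```

In this toy setting the layer has three unknowns, whereas training every entry of the same three kernels would require 9 + 25 + 49 = 83.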
Procedia PDF Downloads 289
9 Use of Extended Conversation to Boost Vocabulary Knowledge and Soft Skills in English for Employment Classes
Authors: James G. Matthew, Seonmin Huh, Frank X. Bennett
Abstract:
English for Specific Purposes, ESP, aims to equip learners with necessary English language skills. Many ESP programs address language skills for job performance, including reading job related documents and oral proficiency. Within ESP is English for occupational purposes, EOP, which centers around developing communicative competence for the globalized workplace. Many ESP and EOP courses lack the content needed to assist students to progress at work, resulting in the need to create lexical compilation for different professions. It is important to teach communicative competence and soft skills for real job-related problem situations and address the complexities of the real world to help students to be successful in their professions. ESP and EOP research is therefore trying to balance both profession-specific educational contents as well as international multi-disciplinary language skills for the globalized workforce. The current study will build upon the existing discussion by developing pedagogy to assist students in their career through developing a strong practical command of relevant English vocabulary. Our research question focuses on the pedagogy two professors incorporated in their English for employment courses. The current study is a qualitative case study on the modes of teaching delivery for EOP in South Korea. Two foreign professors teaching at two different universities in South Korea volunteered for the study to explore their teaching practices. Both professors’ curriculums included the components of employment-related concept vocabulary, business presentations, CV/resume and cover letter preparation, and job interview preparation. All the pre-made recorded video lectures, live online class sessions with students, teachers’ lesson plans, teachers’ class materials, students’ assignments, and midterm and finals video conferences were collected for data analysis. The study then focused on unpacking representative patterns in their teaching methods. 
The professors used their strengths as native speakers to extend class discussion from narrow and restricted conversations to broader opportunities for students to practice authentic English conversation. The teaching methods used three main steps to extend the conversation. First, students were taught concept vocabulary. Second, the vocabulary was combined in speaking activities in which students had to solve scenarios and were required to expand on the given forms of words and language expressions. Last, the students had conversations in English using the language learnt. The conversations observed in both classes were authentic, expanded English communication, and this way of expanding concept vocabulary lessons into extended conversation is one representative pedagogical approach that both professors took. Extended English conversation, therefore, is crucial for EOP education.
Keywords: concept vocabulary, english as a foreign language, english for employment, extended conversation
Procedia PDF Downloads 92
8 A Case Report on Cognitive-Communication Intervention in Traumatic Brain Injury
Authors: Nikitha Francis, Anjana Hoode, Vinitha George, Jayashree S. Bhat
Abstract:
The interaction between cognition and language, referred to as cognitive-communication, is very intricate, involving several mental processes such as perception, memory, attention, lexical retrieval, decision making, motor planning, self-monitoring and knowledge. Cognitive-communication disorders are difficulties in communicative competencies that result from underlying cognitive impairments of attention, memory, organization, information processing, problem solving, and executive functions. Traumatic brain injury (TBI) is an acquired, non-progressive condition, resulting in distinct deficits of cognitive-communication abilities such as naming, word-finding, self-monitoring, auditory recognition, attention, perception and memory. Cognitive-communication intervention in TBI is individualized in order to enhance the person’s ability to process and interpret information for better functioning in their family and community life. The present case report illustrates the cognitive-communicative behaviors and the intervention outcomes of an adult with TBI, who was brought to the Department of Audiology and Speech Language Pathology with cognitive and communicative disturbances following a road traffic accident. On detailed assessment, she showed naming deficits along with perseverations and had severe difficulty in recalling the details of the accident, her house address, places she had visited earlier, names of people known to her, as well as the activities she did each day, leading to severe breakdowns in her communicative abilities. She had difficulty in initiating, maintaining and following a conversation. She also lacked orientation to time and place. On administration of the Manipal Manual of Cognitive Linguistic Abilities (MMCLA), she exhibited poor performance on tasks related to visual and auditory perception, short-term memory, working memory and executive functions. 
She attended 20 sessions of cognitive-communication intervention, which followed a domain-general, adaptive training paradigm with tasks relevant to everyday cognitive-communication skills. Compensatory strategies, such as maintaining a diary with reminders of her daily routine, names of people, date, time and place, were also recommended. The MMCLA was re-administered and her performance on the tasks showed significant improvements. The occurrence of perseverations and word retrieval difficulties reduced. She developed an interest in initiating her day-to-day activities at home independently, as well as in involving herself in conversations with her family members. Though she lacked awareness of her deficits, she actively involved herself in all the therapy activities. Rehabilitation of patients with moderate to severe head injury can be done effectively through holistic cognitive retraining with a focus on different cognitive-linguistic domains. Selection of goals and activities should be relevant to the functional needs of each individual with TBI, as highlighted in the present case report.
Keywords: cognitive-communication, executive functions, memory, traumatic brain injury
Procedia PDF Downloads 347
7 A Corpus-based Study of Adjuncts in Colombian English as a Second Language (ESL) Argumentative Essays
Authors: E. Velasco
Abstract:
Meeting high standards of writing in a Second Language (L2) is extremely important for many students who wish to undertake studies at universities in both English and non-English speaking countries. University lecturers in English speaking countries continue to express dissatisfaction with the apparent poor quality of essay writing skills displayed by English as a Second Language (ESL) students, whose essays are often criticised for their lack of cohesion and coherence. These critiques have extended to contexts such as Colombia, where many ESL students are criticised for their inability to write high-quality academic texts in L2-English, particularly at the tertiary level. If Colombian ESL students are expected to meet high standards of writing when studying locally and abroad, it makes sense to carry out specific research that can perhaps lead to recommendations to support their quest for improving argumentative strategies. Employing Corpus Linguistics methods within a Learner Corpus Research framework, and a combination of Log-Likelihood and Bayes Factor measures, this paper investigated argumentative essays written by Colombian ESL students. The study specifically aimed to analyse conjunctive adjuncts in argumentative essays to find out how Colombian ESL students connect their ideas in discourse. 
Results suggest that a) Colombian ESL learners need explicit instruction on specific areas of conjunctive adjuncts to counteract overuse, underuse and misuse; b) underuse of endophoric and evidential adjuncts highlights gaps between IELTS-like essays and good-quality tertiary-level essays and published papers, and these gaps are linked to prior knowledge brought into the writing task, rhetorical functions in writing, and research processes before writing takes place; c) both Colombian ESL learners and L1-English writers (in a reference corpus) overuse some adjuncts and underuse endophoric and evidential adjuncts when compared to skilled L1-English and L2-English writers, so differences in frequencies of adjuncts have little to do with the writers’ L1 and are rather linked to the types of essays writers produce (e.g. ESL vs. university essays). The pedagogical recommendations deriving from the study are that: a) Colombian ESL learners need to be shown that overuse is not the only way of giving cohesion to argumentative essays and that there are other alternatives (e.g., implicit adjuncts, lexical chains and collocations); b) syllabi and classroom input need to raise awareness of gaps in writing skills between IELTS-like and tertiary-level argumentative essays, and of how endophoric and evidential adjuncts are used to refer to anaphoric and cataphoric sections of essays, and to other people’s work or ideas; c) syllabi and classroom input need to include essay-writing tasks based on previous research/reading which learners need to incorporate into their arguments, and tasks that raise awareness of referencing systems (e.g., APA); d) classroom input needs to include explicit instruction on the use of punctuation, functions and/or syntax with specific conjunctive adjuncts such as for example, for that reason, although, despite and nevertheless.
Keywords: argumentative essays, colombian english as a second language (esl) learners, conjunctive adjuncts, corpus linguistics
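The abstract above names Log-Likelihood and Bayes Factor measures without giving formulas. As a hedged illustration, the Python sketch below uses the log-likelihood statistic commonly applied in corpus comparison, together with one common BIC-style approximation to the Bayes factor (the statistic minus the log of the combined corpus size, for one degree of freedom). Whether the study used exactly these forms is an assumption, and the frequencies in the example are hypothetical.

```python
import math

def log_likelihood(freq_a, size_a, freq_b, size_b):
    """Log-likelihood (G2) for one item observed freq_a times in a corpus of
    size_a tokens and freq_b times in a corpus of size_b tokens."""
    total = freq_a + freq_b
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    ll = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2.0 * ll

def bic_approx(ll, size_a, size_b):
    """BIC-style approximation to the Bayes factor for 1 degree of freedom
    (an assumed form: LL minus log of the combined corpus size)."""
    return ll - math.log(size_a + size_b)

# Hypothetical counts: an adjunct occurring 200 times in a 10,000-token learner
# corpus and 100 times in a 10,000-token reference corpus.
ll = log_likelihood(200, 10_000, 100, 10_000)
bic = bic_approx(ll, 10_000, 10_000)
```

A positive statistic only signals a difference in relative frequency; the direction (overuse or underuse) is read from comparing the observed count with its expected value.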
Procedia PDF Downloads 85
6 Assessing of Social Comfort of the Russian Population with Big Data
Authors: Marina Shakleina, Konstantin Shaklein, Stanislav Yakiro
Abstract:
The digitalization of modern human life over the last decade has facilitated the acquisition, storage, and processing of data, which are used to detect changes in consumer preferences and to improve the internal efficiency of the production process. This emerging trend has attracted academic interest in the use of big data in research. The study focuses on modeling the social comfort of the Russian population for the period 2010-2021 using big data. Big data provides enormous opportunities for understanding human interactions at the scale of society with plenty of space and time dynamics. One of the most popular big data sources is Google Trends. The methodology for assessing social comfort using big data involves several steps: 1. 574 words were selected based on the Harvard IV-4 Dictionary adjusted to fit the reality of everyday Russian life. The set of keywords was further cleansed by excluding queries consisting of verbs and words with several lexical meanings. 2. Search queries were processed to ensure comparability of results: the transformation of data to a 10-point scale, elimination of popularity peaks, detrending, and deseasoning. The proposed methodology for keyword search and Google Trends processing was implemented in the form of a script in the Python programming language. 3. Block and summary integral indicators of social comfort were constructed using the first modified principal component resulting in weighting coefficients values of block components. According to the study, social comfort is described by 12 blocks: ‘health’, ‘education’, ‘social support’, ‘financial situation’, ‘employment’, ‘housing’, ‘ethical norms’, ‘security’, ‘political stability’, ‘leisure’, ‘environment’, ‘infrastructure’. According to the model, the summary integral indicator increased by 54% and was 4.631 points; the average annual rate was 3.6%, which is higher than the rate of economic growth by 2.7 p.p. 
The value of the indicator describing social comfort in Russia is determined 26% by ‘social support’, 24% by ‘education’, 12% by ‘infrastructure’, 10% by ‘leisure’, and the remaining 28% by the other blocks. Among the 25% most popular searches, 85% are negative in nature and are mainly related to the blocks ‘security’, ‘political stability’, ‘health’, for example, ‘crime rate’, ‘vulnerability’. Among the 25% least popular queries, 99% were positive and mostly related to the blocks ‘ethical norms’, ‘education’, ‘employment’, for example, ‘social package’, ‘recycling’. In conclusion, the introduction of the latent category ‘social comfort’ into the scientific vocabulary deepens the theory of the quality of life of the population by studying the involvement of an individual in society and expanding the subjective aspect of the measurement of various indicators. Integral assessment of social comfort demonstrates the overall picture of the development of the phenomenon over time and space and quantitatively evaluates ongoing socio-economic policy. The application of big data in the assessment of latent categories gives stable results, which opens up possibilities for their practical implementation.
Keywords: big data, Google trends, integral indicator, social comfort
Procedia PDF Downloads 200
5 A Computer-Aided System for Tooth Shade Matching
Authors: Zuhal Kurt, Meral Kurt, Bilge T. Bal, Kemal Ozkan
Abstract:
Shade matching and reproduction is the most important element of success in prosthetic dentistry. Until recently, the shade matching procedure relied on dentists’ visual perception with the help of shade guides. Since many factors influence visual perception, tooth shade matching using visual devices (shade guides) is highly subjective and inconsistent. The subjective nature of this process has led to the development of instrumental devices. Nowadays, colorimeters, spectrophotometers, spectroradiometers and digital image analysing systems are used for instrumental shade selection. Instrumental devices have the advantage that readings are quantifiable and can be obtained more rapidly, simply, objectively and precisely. However, these devices have noticeable drawbacks. For example, the translucent structure and irregular surfaces of teeth lead to measurement defects with these devices. Also, results acquired by devices with different measurement principles may be inconsistent. It is therefore necessary to search for new methods for the dental shade matching process. One computer-aided device, the digital camera, has developed rapidly up to the present. Advances in image processing and computing have resulted in the extensive use of digital cameras for color imaging. This procedure is much cheaper than the use of traditional contact-type color measurement devices. Digital cameras can take the place of contact-type instruments for shade selection and overcome their disadvantages. Images taken of teeth show the morphology and color texture of the teeth. In recent decades, a new method was recommended for comparing the color of shade tabs photographed by a digital camera using color features. This method showed that visual and computer-aided shade matching systems should be used in combination. Recent feature extraction techniques are based on shape description and do not use color information. 
However, color is an essential property in depicting and extracting features from objects in the world around us. When local feature descriptors are extended with color information by concatenating a color descriptor with the shape descriptor, the resulting descriptor is effective for visual object recognition and classification tasks. Because the color descriptor is used in combination with a shape descriptor, it does not need to contain any spatial information, which leads us to use local histograms. This local color histogram method remains reliable under photometric changes, geometrical changes and variation in image quality. So, color local feature extraction methods are used to extract features, and the Scale Invariant Feature Transform (SIFT) descriptor is used for shape description in the proposed method. After the combination of these descriptors, the state-of-the-art descriptor named Color-SIFT will be used in this study. Finally, the image feature vectors obtained from the quantization algorithm are fed to classifiers such as k-Nearest Neighbor (KNN), Naive Bayes or Support Vector Machines (SVM) to determine the label(s) of the visual object category or matching. In this study, SVMs are used as classifiers for color determination and shade matching. Finally, the experimental results of this method will be compared with other recent studies. It is concluded from the study that the proposed method is a remarkable development in computer-aided tooth shade determination systems.
Keywords: classifiers, color determination, computer-aided system, tooth shade matching, feature extraction
Procedia PDF Downloads 444
4 Translation, Cross-Cultural Adaption, and Validation of the Vividness of Movement Imagery Questionnaire 2 (VMIQ-2) to Classical Arabic Language
Authors: Majid Alenezi, Abdelbare Algamode, Amy Hayes, Gavin Lawrence, Nichola Callow
Abstract:
The purpose of this study was to translate and culturally adapt the Vividness of Movement Imagery Questionnaire-2 (VMIQ-2) from English to produce a new Arabic version (VMIQ-2A), and to evaluate the reliability and validity of the translated questionnaire. The questionnaire assesses how vividly and clearly individuals are able to imagine themselves performing everyday actions. Its purpose is to measure individuals’ ability to conduct movement imagery, which can be defined as “the cognitive rehearsal of a task in the absence of overt physical movement.” Movement imagery has been introduced in physiotherapy as a promising intervention technique, especially when physical exercise is not possible (e.g. pain, immobilisation.) Considerable evidence indicates movement imagery interventions improve physical function, but to maximize efficacy it is important to know the imagery abilities of the individuals being treated. Given the increase in the global sharing of knowledge it is desirable to use standard measures of imagery ability across language and cultures, thus motivating this project. The translation procedure followed guidelines from the Translation and Cultural Adaptation group of the International Society for Pharmacoeconomics and Outcomes Research and involved the following phases: Preparation; the original VMIQ-2 was adapted slightly to provide additional information and simplified grammar. Forward translation; three native speakers resident in Saudi Arabia translated the original VMIQ-2 from English to Arabic, following instruction to preserve meaning (not literal translation), and cultural relevance. Reconciliation; the project manager (first author), the primary translator and a physiotherapist reviewed the three independent translations to produce a reconciled first Arabic draft of VMIQ-2A. Backward translation; a fourth translator (native Arabic speaker fluent in English) translated literally the reconciled first Arabic draft to English. 
The project manager and two study authors compared the English back translation to the original VMIQ-2 and produced the second Arabic draft. Cognitive debriefing; to assess participants’ understanding of the second Arabic draft, 7 native Arabic speakers resident in the UK completed the questionnaire, rated the clarity of the questions, specified difficult words or passages, and wrote in their own words their understanding of key terms. Following review of this feedback, a final Arabic version was created. 142 native Arabic speakers completed the questionnaire in community meeting places or at home; a subset of 44 participants completed the questionnaire a second time 1 week later. Results showed the translated questionnaire to be valid and reliable. Correlation coefficients indicated good test-retest reliability. Cronbach’s α indicated high internal consistency. Construct validity was tested in two ways. Imagery ability scores have been found to be invariant across gender; this result was replicated within the current study, assessed by independent-samples t-test. Additionally, experienced sports participants have higher imagery ability than those less experienced; this result was also replicated within the current study, assessed by analysis of variance, supporting construct validity. Results provide preliminary evidence that the VMIQ-2A is reliable and valid for use with a general population of native Arabic speakers. Future research will include validation of the VMIQ-2A in a larger sample, and testing validity in specific patient populations.
Keywords: motor imagery, physiotherapy, translation and validation, imagery ability
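For readers unfamiliar with the internal-consistency measure mentioned above, Cronbach's α for k items is k/(k-1) multiplied by (1 minus the sum of the item variances divided by the variance of the total score). The short Python sketch below computes it with sample variances; the ratings are hypothetical and are not data from the study.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for `scores`, a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical ratings from five respondents on three items (1-5 scale).
ratings = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 4],
]
alpha = cronbach_alpha(ratings)
```

Values close to 1 indicate that the items vary together across respondents, which is what "high internal consistency" refers to.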
Procedia PDF Downloads 334
3 Ensemble Sampler For Infinite-Dimensional Inverse Problems
Authors: Jeremie Coullon, Robert J. Webber
Abstract:
We introduce a Markov chain Monte Carlo (MCMC) sampler for infinite-dimensional inverse problems. Our sampler is based on the affine invariant ensemble sampler, which uses interacting walkers to adapt to the covariance structure of the target distribution. We extend this ensemble sampler for the first time to infinite-dimensional function spaces, yielding a highly efficient gradient-free MCMC algorithm. Because our ensemble sampler does not require gradients or posterior covariance estimates, it is simple to implement and broadly applicable. In many Bayesian inverse problems, Markov chain Monte Carlo (MCMC) methods are needed to approximate distributions on infinite-dimensional function spaces, for example, in groundwater flow, medical imaging, and traffic flow. Yet designing efficient MCMC methods for function spaces has proved challenging. Recent gradient-based MCMC methods, preconditioned MCMC methods, and SMC methods have improved the computational efficiency of functional random walk. However, these samplers require gradients or posterior covariance estimates that may be challenging to obtain. Calculating gradients is difficult or impossible in many high-dimensional inverse problems involving a numerical integrator with a black-box code base. Additionally, accurately estimating posterior covariances can require a lengthy pilot run or adaptation period. These concerns raise the question: is there a functional sampler that outperforms functional random walk without requiring gradients or posterior covariance estimates? To address this question, we consider a gradient-free sampler that avoids explicit covariance estimation yet adapts naturally to the covariance structure of the sampled distribution. This sampler works by considering an ensemble of walkers and interpolating and extrapolating between walkers to make a proposal. This is called the affine invariant ensemble sampler (AIES), which is easy to tune, easy to parallelize, and efficient at sampling spaces of moderate dimensionality (less than 20). The main contribution of this work is to propose a functional ensemble sampler (FES) that combines functional random walk and AIES. To apply this sampler, we first calculate the Karhunen–Loeve (KL) expansion for the Bayesian prior distribution, assumed to be Gaussian and trace-class. Then, we use AIES to sample the posterior distribution on the low-wavenumber KL components and use the functional random walk to sample the posterior distribution on the high-wavenumber KL components. Alternating between AIES and functional random walk updates, we obtain our functional ensemble sampler that is efficient and easy to use without requiring detailed knowledge of the target distribution. In past work, several authors have proposed splitting the Bayesian posterior into low-wavenumber and high-wavenumber components and then applying enhanced sampling to the low-wavenumber components. Yet compared to these other samplers, FES is unique in its simplicity and broad applicability. FES does not require any derivatives, and the need for derivative-free samplers has previously been emphasized. FES also eliminates the requirement for posterior covariance estimates. Lastly, FES is more efficient than other gradient-free samplers in our tests. In two numerical examples, we apply FES to challenging inverse problems that involve estimating a functional parameter and one or more scalar parameters. We compare the performance of functional random walk, FES, and an alternative derivative-free sampler that explicitly estimates the posterior covariance matrix. We conclude that FES is the fastest available gradient-free sampler for these challenging and multimodal test problems.
Keywords: Bayesian inverse problems, Markov chain Monte Carlo, infinite-dimensional inverse problems, dimensionality reduction
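The walker-based proposal described in this abstract (interpolating and extrapolating between walkers) corresponds to the "stretch move" of the affine invariant ensemble sampler. The Python sketch below shows only that finite-dimensional AIES step, not the functional ensemble sampler itself: the function name, the stretch parameter a = 2, the two-dimensional Gaussian target, and the ensemble size are assumptions chosen for illustration.

```python
import math
import random

def stretch_move_sweep(walkers, log_prob, rng, a=2.0):
    """One sweep of stretch moves: each walker in turn is moved along the line
    through itself and a randomly chosen partner walker."""
    n = len(walkers)
    d = len(walkers[0])
    for k in range(n):
        # Pick a partner j != k uniformly from the other walkers.
        j = rng.randrange(n - 1)
        if j >= k:
            j += 1
        # Draw z with density proportional to 1/sqrt(z) on [1/a, a].
        z = ((a - 1.0) * rng.random() + 1.0) ** 2 / a
        proposal = [xj + z * (xk - xj) for xk, xj in zip(walkers[k], walkers[j])]
        # Acceptance probability min(1, z^(d-1) * p(proposal) / p(current)).
        log_accept = (d - 1) * math.log(z) + log_prob(proposal) - log_prob(walkers[k])
        if rng.random() < math.exp(min(0.0, log_accept)):
            walkers[k] = proposal
    return walkers

def log_gaussian(x):
    """Hypothetical target: independent Gaussians with standard deviations 10 and 1."""
    return -0.5 * (x[0] ** 2 / 100.0 + x[1] ** 2)

rng = random.Random(1)
walkers = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(20)]
for _ in range(500):                      # burn-in sweeps
    stretch_move_sweep(walkers, log_gaussian, rng)
samples = []
for _ in range(2000):
    stretch_move_sweep(walkers, log_gaussian, rng)
    samples.extend(list(w) for w in walkers)
```

No gradient of the target and no covariance estimate appears anywhere in the update; the spread of the ensemble itself sets the proposal scale, which is why the strongly anisotropic target needs no tuning.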
Procedia PDF Downloads 154
2 Linguistic Insights Improve Semantic Technology in Medical Research and Patient Self-Management Contexts
Authors: William Michael Short
Abstract:
‘Semantic Web’ technologies such as the Unified Medical Language System Metathesaurus, SNOMED-CT, and MeSH have been touted as transformational for the way users access online medical and health information, enabling both the automated analysis of natural-language data and the integration of heterogeneous health-related resources distributed across the Internet through the use of standardized terminologies that capture concepts and relationships between concepts that are expressed differently across datasets. However, the approaches that have so far characterized ‘semantic bioinformatics’ have not yet fulfilled the promise of the Semantic Web for medical and health information retrieval applications. This paper argues within the perspective of cognitive linguistics and cognitive anthropology that four features of human meaning-making must be taken into account before the potential of semantic technologies can be realized for this domain. First, many semantic technologies operate exclusively at the level of the word. However, texts convey meanings in ways beyond lexical semantics. For example, transitivity patterns (distributions of active or passive voice) and modality patterns (configurations of modal constituents like ‘may’, ‘might’, ‘could’, ‘would’, ‘should’) convey experiential and epistemic meanings that are not captured by single words. Language users also naturally associate stretches of text with discrete meanings, so that whole sentences can be ascribed senses similar to the senses of words (so-called ‘discourse topics’). Second, natural language processing systems tend to operate according to the principle of ‘one token, one tag’. For instance, occurrences of the word ‘sound’ must be disambiguated for part of speech: in context, is ‘sound’ a noun or a verb or an adjective? In syntactic analysis, deterministic annotation methods may be acceptable. 
But because natural language utterances are typically characterized by polyvalency and ambiguities of all kinds (including intentional ambiguities), such methods leave the meanings of texts highly impoverished. Third, ontologies tend to be disconnected from everyday language use and so struggle in cases where single concepts are captured through complex lexicalizations that involve profile shifts or other embodied representations. More problematically, concept graphs tend to capture ‘expert’ technical models rather than ‘folk’ models of knowledge and so may not match users’ common-sense intuitions, which organize concepts in prototypical structures rather than Aristotelian categories. Fourth, and finally, most ontologies do not recognize the pervasively figurative character of human language. However, since the time of Galen, the widespread use of metaphor in the linguistic usage of both medical professionals and lay persons has been recognized. In particular, metaphor is a well-documented linguistic tool for communicating experiences of pain. Because semantic medical knowledge-bases are designed to help capture variations within technical vocabularies – rather than the kinds of conventionalized figurative semantics that practitioners as well as patients actually utilize in clinical description and diagnosis – they fail to capture this dimension of linguistic usage. The failure of semantic technologies in these respects degrades the efficiency and efficacy not only of medical research, where information retrieval inefficiencies can lead to direct financial costs to organizations, but also of care provision, especially in contexts of patients’ self-management of complex medical conditions.
Keywords: ambiguity, bioinformatics, language, meaning, metaphor, ontology, semantic web, semantics
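The contrast between ‘one token, one tag’ annotation and an annotation that preserves ambiguity can be shown with a toy example. This is a sketch only: the lexicon, the tag labels (Universal Dependencies-style names are assumed), and the function names `tag_one` and `tag_all` are hypothetical and do not come from any system named in the abstract.

```python
# Hypothetical toy lexicon: each word form maps to every part of speech it can take.
LEXICON = {
    "sound": ("NOUN", "VERB", "ADJ"),
    "pain": ("NOUN", "VERB"),
}


def tag_one(token):
    """'One token, one tag': keep only the first listed candidate."""
    return LEXICON.get(token.lower(), ("X",))[0]


def tag_all(token):
    """Polyvalent annotation: keep every candidate tag for later interpretation."""
    return set(LEXICON.get(token.lower(), ("X",)))
```

Here `tag_one("sound")` commits to a single reading, while `tag_all("sound")` keeps the noun, verb, and adjective readings available to downstream analysis.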
Procedia PDF Downloads 132
1 VIAN-DH: Computational Multimodal Conversation Analysis Software and Infrastructure
Authors: Teodora Vukovic, Christoph Hottiger, Noah Bubenhofer
Abstract:
The development of VIAN-DH aims at bridging two linguistic approaches: conversation analysis/interactional linguistics (IL), so far a dominantly qualitative field, and computational/corpus linguistics and its quantitative and automated methods. Contemporary IL investigates the systematic organization of conversations and interactions composed of speech, gaze, gestures, and body positioning, among others. This highly integrated multimodal behaviour is analysed on the basis of video data, with the aim of uncovering so-called “multimodal gestalts”: patterns of linguistic and embodied conduct that reoccur in specific sequential positions and are employed for specific purposes. Multimodal analyses (and other disciplines using videos) have so far depended on time- and resource-intensive manual transcription of each component from video materials. Automating these tasks requires advanced programming skills, which are often outside the scope of IL. Moreover, the use of different tools makes the integration and analysis of different formats challenging. Consequently, IL research often deals with relatively small samples of annotated data, which are suitable for qualitative analysis but insufficient for making generalized empirical claims derived quantitatively. VIAN-DH aims to create a workspace where the many annotation layers required for the multimodal analysis of videos can be created, processed, and correlated in one platform. VIAN-DH will provide a graphical interface that operates state-of-the-art tools for automating parts of the data processing. The integration of tools that already exist in computational linguistics and computer vision facilitates data processing for researchers lacking programming skills, speeds up the overall research process, and enables the processing of large amounts of data. 
The main features to be introduced are automatic speech recognition for the transcription of language, automatic image recognition for the extraction of gestures and other visual cues, and grammatical annotation for adding morphological and syntactic information to the verbal content. In the ongoing instance of VIAN-DH, we focus on gesture extraction (pointing gestures, in particular), making use of existing models created for sign language and adapting them for this specific purpose. In order to view and search the data, VIAN-DH will provide a unified format, enable the import of the main existing formats of annotated video data and the export to other formats used in the field, and integrate different data source formats so that they can be combined in research. VIAN-DH will adapt querying methods from corpus linguistics to enable parallel search of many annotation levels, combining token-level and chronological search for various types of data. VIAN-DH strives to bring crucial and potentially revolutionary innovation to the field of IL, and this innovation can also extend to other fields using video materials. It will allow large amounts of data to be processed automatically and quantitative analyses to be implemented and combined with the qualitative approach. It will facilitate the investigation of correlations between linguistic patterns (lexical or grammatical) and conversational aspects (turn-taking or gestures). Users will be able to automatically transcribe and annotate visual, spoken, and grammatical information from videos, to correlate those different levels, and to perform queries and analyses.
Keywords: multimodal analysis, corpus linguistics, computational linguistics, image recognition, speech recognition
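One way to picture a query that combines token-level and chronological search across annotation layers is the sketch below. It is not VIAN-DH's data model or API, which the abstract does not describe: the `Annotation` record, the layer names, and the `co_occurring` helper are all hypothetical, and annotations are assumed to be simple time-stamped intervals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    layer: str    # hypothetical layer name, e.g. "token" or "gesture"
    start: float  # seconds from the start of the video
    end: float
    value: str


def overlaps(a, b):
    """True when the two annotations share some stretch of time."""
    return a.start < b.end and b.start < a.end


def co_occurring(annotations, layer_a, value_a, layer_b, value_b):
    """Pairs from two layers that match the given values and overlap in time."""
    left = [x for x in annotations if x.layer == layer_a and x.value == value_a]
    right = [x for x in annotations if x.layer == layer_b and x.value == value_b]
    return [(a, b) for a in left for b in right if overlaps(a, b)]
```

With such a structure, a question like "which occurrences of the token 'there' coincide with a pointing gesture?" becomes `co_occurring(data, "token", "there", "gesture", "pointing")`, where the value filter is the token-level part and the overlap check is the chronological part.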
Procedia PDF Downloads 108