Search results for: semantic data
25314 Arabic Light Word Analyser: Roles with Deep Learning Approach
Authors: Mohammed Abu Shquier
Abstract:
This paper introduces a word segmentation method using the novel BP-LSTM-CRF architecture for processing semantic output training. The objective of web morphological analysis tools is to link a formal morpho-syntactic description to a lemma, along with morpho-syntactic information, a vocalized form, a vocalized analysis with morpho-syntactic information, and a list of paradigms. A key objective is to continuously enhance the proposed system through an inductive learning approach that considers semantic influences. The system is currently under construction and development based on data-driven learning. To evaluate the tool, an experiment on homograph analysis was conducted. The tool also encompasses the assumption of deep binary segmentation hypotheses, the arbitrary choice of trigram or n-gram continuation probabilities, language limitations, and morphology for both Modern Standard Arabic (MSA) and Dialectal Arabic (DA), which provide justification for updating this system. Most Arabic word analysis systems are based on the phonotactic morpho-syntactic analysis of a word transmitted using lexical rules, which are mainly used in MENA language technology tools, without taking into account contextual or semantic morphological implications. Therefore, it is necessary to have an automatic analysis tool taking into account the word sense and not only the morpho-syntactic category. Moreover, they are also based on statistical/stochastic models. These stochastic models, such as HMMs, have shown their effectiveness in different NLP applications: part-of-speech tagging, machine translation, speech recognition, etc. 
As an extension, we focus on language modeling using a Recurrent Neural Network (RNN). Given that morphological analysis coverage was very low in dialectal Arabic, it is important to investigate in depth how the dialect data influence the accuracy of these approaches by developing dialectal morphological processing tools, to show that dialectal variability can help improve analysis.
Keywords: NLP, DL, ML, analyser, MSA, RNN, CNN
Procedia PDF Downloads 42
25313 A Chinese Nested Named Entity Recognition Model Based on Lexical Features
Abstract:
In the field of named entity recognition, most of the research has been conducted around simple entities. However, for nested named entities, which still contain entities within entities, it has been difficult to identify them accurately due to their boundary ambiguity. In this paper, a hierarchical recognition model is constructed based on the grammatical structure and semantic features of Chinese text for boundary calculation based on lexical features. The analysis is carried out at different levels in terms of granularity, semantics, and lexicality, respectively, avoiding repetitive work to reduce computational effort and using the semantic features of words to calculate the boundaries of entities to improve the accuracy of the recognition work. The results of the experiments carried out on web-based microblogging data show that the model achieves an accuracy of 86.33% and an F1 value of 89.27% in recognizing nested named entities, making up for the shortcomings of some previous recognition models and improving the efficiency of recognition of nested named entities.
Keywords: coarse-grained, nested named entity, Chinese natural language processing, word embedding, T-SNE dimensionality reduction algorithm
Procedia PDF Downloads 128
25312 The Culture of Journal Writing among Manobo Senior High School Students
Authors: Jessevel Montes
Abstract:
This study explored the culture of journal writing among Senior High School Manobo students. The purpose of this qualitative morpho-semantic and syntactic study was to discover the morphological, semantic, and syntactic features of the written output through the morphological, semantic, and syntactic categories present in their journal writings. Also, beliefs and practices embedded in the norms, values, and ideologies were identified. The study was conducted among the Manobo students in the Senior High Schools of Central Mindanao, particularly in the Division of North Cotabato. Findings revealed that morphologically, the features that flourished are the following: subject-verb concordance, tenses, pronouns, prepositions, articles, and the use of adjectives. Semantically, the features are the following: word choice, idiomatic expression, borrowing, and vernacular. Syntactically, the features are the types of sentences according to structure and function, and the dominance of code switching and run-on sentences. Lastly, as to the beliefs and practices embedded in the norms, values, and ideologies of their journal writing, the major themes are: valuing education, family, and friends as treasure, preservation of culture, and emancipation from the bondage of poverty. This study has shed light on the writing capabilities and weaknesses of the Manobo students when it comes to the English language. Further, such an insight into language learning problems is useful to teachers because it provides information on common trouble spots in language learning, which can be used in the preparation of effective teaching materials.
Keywords: applied linguistics, culture, morpho-semantic and syntactic analysis, Manobo Senior High School, Philippines
Procedia PDF Downloads 121
25311 The Cognitive Perspective on Arabic Spatial Preposition ‘Ala
Authors: Zaqiatul Mardiah, Afdol Tharik Wastono, Abdul Muta'ali
Abstract:
In general, the Arabic preposition ‘ala encodes the sense of the UP-DOWN schema. However, the preposition ‘ala can have many extended schemas that are still related to its primary sense. In this paper, we show how the framework of cognitive linguistics (CL) based on image schemas can be applied to analyze the spatial semantics of the preposition ‘ala in the horizontal and vertical axes. The preposition ‘ala is usually used in the locative sense, in which one physical entity is in an UP-DOWN relation to another physical entity. In spite of that, the cognitive analysis of ‘ala justifies the use of this preposition in many situations to encode seemingly non-up-down spatial relations and non-physical relations. This uncovers some of the unsolved issues concerning prepositions in general and Arabic prepositions in particular, with the use of ‘ala as a sample. Using Arabic corpus data, we reveal that in many cases and situations, the use of ‘ala is extended to depict relations other than the ones where the Trajector (TR) is actually in an up-down relation to the Landmark (LM). The instances analyzed in this paper show that ‘ala encodes not only the spatial relations in which the TR and the LM are horizontally or vertically related to each other, but also non-spatial relations.
Keywords: image schema, preposition, spatial semantic, up-down relation
Procedia PDF Downloads 148
25310 Suicide Conceptualization in Adolescents through Semantic Networks
Authors: K. P. Valdés García, E. I. Rodríguez Fonseca, L. G. Juárez Cantú
Abstract:
Suicide is a global, multidimensional and dynamic mental health problem that requires constant study for its understanding and prevention. When this phenomenon is researched, it is necessary to consider the different characteristics it may have because of individual and sociocultural variables; this consideration is important for generating effective treatments and interventions. Adolescents are a vulnerable population due to the characteristics of their developmental stage. The investigation was carried out with the objective of identifying and describing how adolescents conceptualize suicide and, in the process, finding possible differences between men and women. The study was carried out in Saltillo, Coahuila, Mexico. The sample was composed of 418 volunteer students aged between 11 and 18 years. The ethical aspects of the research were reviewed and considered in all the processes of the investigation with the participants, their parents and the schools to which they belonged; psychological attention was offered to the participants, and preventive workshops were carried out in the educational institutions. Natural semantic networks were the instrument used, since this hybrid method makes it possible to find and analyze the social concept of a phenomenon. In this case, the word suicide was used as an evocative stimulus, and participants were asked to evoke at least five and at most 10 words that they thought were related to suicide, and then to rank them according to their closeness to the construct. The subsequent analysis was carried out with Excel, yielding the semantic weights, affective loads and the distances between each of the semantic fields established according to the words reported by the subjects. The results showed similarities in the conceptualization of suicide among adolescent men and women.
Seven semantic fields were generated; the words were related in the discourse analysis: 1) death, 2) possible triggering factors, 3) associated moods, 4) methods used to carry it out, 5) psychological symptomatology that could affect, 6) words associated with a rejection of suicide, and finally, 7) specific objects to carry it out. One of the necessary aspects to consider in investigations of complex issues such as suicide is to have a diversity of instruments and techniques that adjust to the characteristics of the population and that make it possible to understand the phenomena from social constructs and not only from theory. The constant study of suicide is a pressing need: the loss of a life from emotional difficulties that can be solved through psychiatry and psychological methods requires governments and professionals to pay attention and work with the at-risk population.
Keywords: adolescents, psychological construct, semantic networks, suicide
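The abstract mentions semantic weights and distances computed from ranked word lists but does not give the formulas. The sketch below illustrates one scoring convention often used with natural semantic networks, and it is an assumption rather than the authors' procedure: each evoked word earns 10 points when ranked first, 9 when ranked second, and so on, and distances are expressed as a percentage of the highest weight. The function names and the placeholder words are hypothetical.

```python
from collections import defaultdict

def semantic_weights(responses, max_words=10):
    """Sum rank-based points per word (rank 1 -> max_words points).

    Assumed convention; the abstract does not state the scoring rule.
    """
    weights = defaultdict(int)
    for ranked_words in responses:
        for rank, word in enumerate(ranked_words[:max_words], start=1):
            weights[word] += max_words - rank + 1
    return dict(weights)

def relative_distances(weights):
    """Express each weight as a percentage of the largest weight."""
    top = max(weights.values())
    return {word: 100.0 * value / top for word, value in weights.items()}

# Placeholder data: two participants, five ranked words each.
responses = [
    ["word_a", "word_b", "word_c", "word_d", "word_e"],
    ["word_b", "word_a", "word_f", "word_d", "word_g"],
]
weights = semantic_weights(responses)
distances = relative_distances(weights)
print(sorted(weights.items(), key=lambda kv: -kv[1]))
```

With real data, the words with the highest weights would form the core of the network, and grouping them into semantic fields would remain a manual, interpretive step.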
Procedia PDF Downloads 109
25309 Semantic Indexing Improvement for Textual Documents: Contribution of Classification by Fuzzy Association Rules
Authors: Mohsen Maraoui
Abstract:
With the aim of improving natural language processing applications such as information retrieval, machine translation, and lexical disambiguation, we focus on a statistical approach to semantic indexing for multilingual text documents based on the conceptual network formalism. We propose to use this formalism as an indexing language to represent the descriptive concepts and their weighting. These concepts represent the content of the document. Our contribution is based on two steps. In the first step, we propose the extraction of index terms using the multilingual lexical resource Euro WordNet (EWN). In the second step, we pass from the representation of index terms to the representation of index concepts through the conceptual network formalism. This network is generated using the EWN resource and passes through a classification step based on the association rules model (in an attempt to discover the non-taxonomic or contextual relations between the concepts of a document). These relations are latent relations buried in the text and carried by the semantic context of the co-occurrence of concepts in the document. Our proposed indexing approach can be applied to text documents in various languages because it is based on a linguistic method adapted to the language through a multilingual thesaurus. Next, we apply the same statistical process regardless of the language in order to extract the significant concepts and their associated weights. We show that the proposed indexing approach provides encouraging results.
Keywords: concept extraction, conceptual network formalism, fuzzy association rules, multilingual thesaurus, semantic indexing
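To make the association-rule step more concrete, the following sketch computes the two standard rule measures, support and confidence, over sets of co-occurring concepts. It uses crisp (non-fuzzy) sets, so it is a simplification of the fuzzy rules named in the title, and the concept names and the grouping of concepts into contexts are hypothetical.

```python
def support(contexts, itemset):
    """Fraction of contexts that contain every concept in itemset."""
    itemset = set(itemset)
    return sum(itemset <= context for context in contexts) / len(contexts)

def confidence(contexts, antecedent, consequent):
    """Estimate P(consequent | antecedent) from co-occurrence counts."""
    antecedent_support = support(contexts, antecedent)
    if antecedent_support == 0:
        return 0.0
    joint = set(antecedent) | set(consequent)
    return support(contexts, joint) / antecedent_support

# Hypothetical concept sets, e.g. one per sentence or paragraph.
contexts = [
    {"bank", "loan"},
    {"bank", "loan", "rate"},
    {"bank", "river"},
    {"loan", "rate"},
]
print(support(contexts, {"bank", "loan"}))       # 0.5
print(confidence(contexts, {"bank"}, {"loan"}))  # about 0.667
```

A rule such as bank -> loan with high confidence would suggest a contextual (non-taxonomic) relation between the two concepts; a fuzzy variant would replace set membership with membership degrees.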
Procedia PDF Downloads 141
25308 A Spatial Hypergraph Based Semi-Supervised Band Selection Method for Hyperspectral Imagery Semantic Interpretation
Authors: Akrem Sellami, Imed Riadh Farah
Abstract:
Hyperspectral imagery (HSI) typically provides a wealth of information captured in a wide range of the electromagnetic spectrum for each pixel in the image. Hence, a pixel in HSI is a high-dimensional vector of intensities with a large spectral range and a high spectral resolution. Therefore, semantic interpretation is a challenging task in HSI analysis. In this paper, we focus on object classification as HSI semantic interpretation. However, HSI classification still faces some issues, among which are the following: the spatial variability of spectral signatures, the high number of spectral bands, and the high cost of true sample labeling. In particular, the high number of spectral bands combined with the low number of training samples leads to the curse of dimensionality. To address this problem, we propose to introduce dimensionality reduction in order to improve the classification of HSI. The presented approach is a semi-supervised band selection method based on a spatial hypergraph embedding model that represents higher-order relationships with different weights for the spatial neighbors corresponding to the centroid of the pixel. This semi-supervised band selection has been developed to select useful bands for object classification. The presented approach is evaluated on AVIRIS and ROSIS HSIs and compared to other dimensionality reduction methods. The experimental results demonstrate the efficacy of our approach compared to many existing dimensionality reduction methods for HSI classification.
Keywords: dimensionality reduction, hyperspectral image, semantic interpretation, spatial hypergraph
Procedia PDF Downloads 306
25307 Collective Intelligence-Based Early Warning Management for Agriculture
Authors: Jarbas Lopes Cardoso Jr., Frederic Andres, Alexandre Guitton, Asanee Kawtrakul, Silvio E. Barbin
Abstract:
The main objective of the CyberBrain Mass Agriculture Alarm Acquisition and Analysis (CBMa4) project is to minimize the impacts of diseases and disasters on rice cultivation. For example, early detection of insects through the use of the CBMa4 platform will reduce the volume of insecticides applied to the rice fields. In order to reach this goal, two major factors need to be considered: (1) the social network of smart farmers; and (2) the warning data alarm acquisition and analysis component. This paper outlines the process for collecting warnings and improving the decision-making in response to them. It involves two sub-processes: warning collection and understanding enrichment. Human sensors combine basic suitable data processing techniques in order to extract warning-related semantics according to collective intelligence. We identify each warning by a semantic content called 'warncons' with multimedia metaphors and metadata related to these metaphors. It is important to describe the metric for measuring the relation among warncons. With this knowledge, a collective intelligence-based decision-making approach determines the action(s) to be launched regarding one or a set of warncons.
Keywords: agricultural engineering, warning systems, social network services, context awareness
Procedia PDF Downloads 382
25306 Human Action Retrieval System Using Features Weight Updating Based Relevance Feedback Approach
Authors: Munaf Rashid
Abstract:
For content-based human action retrieval systems, search accuracy is often inferior for the following two reasons: 1) global information pertaining to videos is totally ignored, and only low-level motion descriptors are considered as a significant feature to match the similarity between query and database videos, and 2) there is a semantic gap between the high-level user concept and low-level visual features. Hence, in this paper, we propose a method that addresses these two issues, and in doing so, this paper contributes in two ways. Firstly, we introduce a method that uses both global and local information in one framework for an action retrieval task. Secondly, to minimize the semantic gap, the user concept is involved by incorporating a features weight updating (FWU) Relevance Feedback (RF) approach. We use statistical characteristics to dynamically update the weights of the feature descriptors so that after every RF iteration the feature space is modified accordingly. For testing and validation purposes, two human action recognition datasets have been utilized, namely Weizmann and UCF. Results show that even with a number of visual challenges the proposed approach performs well.
Keywords: relevance feedback (RF), action retrieval, semantic gap, feature descriptor, codebook
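The abstract says feature weights are updated from statistical characteristics after each feedback round, without giving the rule. As an illustration only, the sketch below uses a classic heuristic from the relevance feedback literature: a feature dimension that varies little across the items the user marked relevant receives a larger weight (inverse standard deviation, normalized to sum to 1). The function names, the `eps` constant, and the toy vectors are assumptions, not the paper's method.

```python
from statistics import pstdev

def update_weights(relevant_features, eps=1e-6):
    """Weight each feature dimension by the inverse of its spread
    across the items the user marked as relevant."""
    n_dims = len(relevant_features[0])
    raw = [
        1.0 / (pstdev([item[d] for item in relevant_features]) + eps)
        for d in range(n_dims)
    ]
    total = sum(raw)
    return [w / total for w in raw]

def weighted_distance(query, candidate, weights):
    """Weighted Euclidean distance used to re-rank database items."""
    return sum(
        w * (q - c) ** 2 for w, q, c in zip(weights, query, candidate)
    ) ** 0.5

# Toy feedback: dimension 0 is consistent across relevant items,
# dimension 1 is not, so dimension 0 should dominate the distance.
relevant = [[0.90, 0.1], [1.00, 0.8], [0.95, 0.4]]
weights = update_weights(relevant)
print(weights)
print(weighted_distance([0.95, 0.5], [0.50, 0.5], weights))
```

Repeating the update after every feedback round reshapes the distance measure toward the features that the user's relevant examples agree on.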
Procedia PDF Downloads 473
25305 Unlocking the Potential of Short Texts with Semantic Enrichment, Disambiguation Techniques, and Context Fusion
Authors: Mouheb Mehdoui, Amel Fraisse, Mounir Zrigui
Abstract:
This paper explores the potential of short texts through semantic enrichment and disambiguation techniques. By employing context fusion, we aim to enhance the comprehension and utility of concise textual information. The methodologies utilized are grounded in recent advancements in natural language processing, which allow for a deeper understanding of semantics within limited text formats. Specifically, topic classification is employed to understand the context of the sentence and assess the relevance of added expressions. Additionally, word sense disambiguation is used to clarify unclear words, replacing them with more precise terms. The implications of this research extend to various applications, including information retrieval and knowledge representation. Ultimately, this work highlights the importance of refining short text processing techniques to unlock their full potential in real-world applications.
Keywords: information traffic, text summarization, word-sense disambiguation, semantic enrichment, ambiguity resolution, short text enhancement, information retrieval, contextual understanding, natural language processing, ambiguity
Procedia PDF Downloads 8
25304 Reconstruction of Visual Stimuli Using Stable Diffusion with Text Conditioning
Authors: ShyamKrishna Kirithivasan, Shreyas Battula, Aditi Soori, Richa Ramesh, Ramamoorthy Srinath
Abstract:
The human brain, among the most complex and mysterious aspects of the body, harbors vast potential for extensive exploration. Unraveling these enigmas, especially within neural perception and cognition, delves into the realm of neural decoding. Harnessing advancements in generative AI, particularly in Visual Computing, this work seeks to elucidate how the brain comprehends visual stimuli observed by humans. The paper endeavors to reconstruct human-perceived visual stimuli using Functional Magnetic Resonance Imaging (fMRI). This fMRI data is then processed through pre-trained deep-learning models to recreate the stimuli. Introducing a new architecture named LatentNeuroNet, the aim is to achieve the utmost semantic fidelity in stimuli reconstruction. The approach employs a Latent Diffusion Model (LDM), Stable Diffusion v1.5, emphasizing semantic accuracy and generating superior quality outputs. This addresses the limitations of prior methods, such as GANs, known for poor semantic performance and inherent instability. Text conditioning within the LDM's denoising process is handled by extracting text from the brain's ventral visual cortex region. This extracted text undergoes processing through a Bootstrapping Language-Image Pre-training (BLIP) encoder before it is injected into the denoising process. In conclusion, an architecture is developed that successfully reconstructs the perceived visual stimuli, and this research provides enough evidence to identify the most influential regions of the brain responsible for cognition and perception.
Keywords: BLIP, fMRI, latent diffusion model, neural perception.
Procedia PDF Downloads 68
25303 Syllogistic Reasoning with 108 Inference Rules While Case Quantities Change
Authors: Mikhail Zarechnev, Bora I. Kumova
Abstract:
A syllogism is a deductive inference scheme used to derive a conclusion from a set of premises. In a categorical syllogism, there are only two premises, and every premise and conclusion is given in the form of a quantified relationship between two objects. The different orders of objects in the premises give a classification known as figures. We have shown that the ordered combinations of 3 generalized quantifiers with a certain figure yield a total of 108 syllogistic moods, which can be considered as different inference rules. The classical syllogistic system allows human thought to be modeled, and reasoning with syllogistic structures has always attracted the attention of cognitive scientists. Since automated reasoning is considered part of the learning subsystem of AI agents, a syllogistic system can be applied for this purpose. Another application of a syllogistic system is related to inference mechanisms in Semantic Web applications. In this paper, we propose a mathematical model and algorithm for syllogistic reasoning. Also, a model of iterative syllogistic reasoning for continuous flows of incoming data based on case-based reasoning and possible applications of the proposed system are discussed.
Keywords: categorical syllogism, case-based reasoning, cognitive architecture, inference on the semantic web, syllogistic reasoning
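The figure of 108 moods can be checked with a short enumeration. The sketch assumes the four classical figures and treats a mood as a choice of one quantifier for each of the two premises and the conclusion, so the count is 3 × 3 × 3 × 4 = 108. The quantifier labels are placeholders, since the abstract does not name the three generalized quantifiers.

```python
from itertools import product

# Placeholder labels; the abstract does not name the quantifiers.
QUANTIFIERS = ("Q1", "Q2", "Q3")
# Assumption: the four classical syllogistic figures.
FIGURES = (1, 2, 3, 4)

def enumerate_moods(quantifiers=QUANTIFIERS, figures=FIGURES):
    """List every (major, minor, conclusion, figure) combination."""
    return [
        (major, minor, conclusion, figure)
        for major, minor, conclusion in product(quantifiers, repeat=3)
        for figure in figures
    ]

moods = enumerate_moods()
print(len(moods))  # 3**3 * 4 = 108
```

For comparison, the same enumeration with the four traditional quantifiers (A, E, I, O) gives 4³ × 4 = 256 moods.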
Procedia PDF Downloads 411
25302 On the Semantics and Pragmatics of 'Be Able To': Modality and Actualisation
Authors: Benoît Leclercq, Ilse Depraetere
Abstract:
The goal of this presentation is to shed new light on the semantics and pragmatics of be able to. It presents the results of a corpus analysis based on data from the BNC (British National Corpus), and discusses these results in light of a specific stance on the semantics-pragmatics interface taking into account recent developments. Be able to is often discussed in relation to can and could, all of which can be used to express ability. Such an onomasiological approach often results in the identification of usage constraints for each expression. In the case of be able to, it is the formal properties of the modal expression (unlike can and could, be able to has non-finite forms) that are in the foreground, and the modal expression is described as the verb that conveys future ability. Be able to is also argued to express actualised ability in the past (I was able to/could open the door). This presentation aims to provide a more accurate pragmatic-semantic profile of be able to, based on extensive data analysis and one that is embedded in a very explicit view on the semantics-pragmatics interface. A random sample of 3000 examples (1000 for each modal verb) extracted from the BNC was analysed to account for the following issues. First, the challenge is to identify the exact semantic range of be able to. The results show that, contrary to general assumption, be able to does not only express ability but shares most of the root meanings usually associated with the possibility modals can and could. The data reveal that what is called opportunity is, in fact, the most frequent meaning of be able to. Second, attention will be given to the notion of actualisation. It is commonly argued that be able to is the preferred form when the residue actualises: (1) The only reason he was able to do that was because of the restriction (BNC, spoken) (2) It is only through my imaginative shuffling of the aces that we are able to stay ahead of the pack.
(BNC, written) Although this notion has been studied in detail within formal semantic approaches, empirical data is crucially lacking and it is unclear whether actualisation constitutes a conventional (and distinguishing) property of be able to. The empirical analysis provides solid evidence that actualisation is indeed a conventional feature of the modal. Furthermore, the dataset reveals that be able to expresses actualised 'opportunities' and not actualised 'abilities'. In the final part of this paper, attention will be given to the theoretical implications of the empirical findings, and in particular to the following paradox: how can the same expression encode both modal meaning (non-factual) and actualisation (factual)? It will be argued that this largely depends on one's conception of the semantics-pragmatics interface, and that this need not be an issue when actualisation (unlike modality) is analysed as a generalised conversational implicature and thus is considered part of the conventional pragmatic layer of be able to.
Keywords: Actualisation, Modality, Pragmatics, Semantics
Procedia PDF Downloads 131
25301 Designing a Patient Monitoring System Using Cloud and Semantic Web Technologies
Authors: Chryssa Thermolia, Ekaterini S. Bei, Stelios Sotiriadis, Kostas Stravoskoufos, Euripides G. M. Petrakis
Abstract:
Moving into a new era of healthcare, new tools and devices are developed to extend and improve health services, such as remote patient monitoring and risk prevention. In this context, the Internet of Things (IoT) and Cloud Computing present great advantages by providing remote and efficient services, as well as cooperation between patients, clinicians, researchers and other health professionals. This paper focuses on patients suffering from bipolar disorder, a brain disorder that belongs to a group of conditions called affective disorders, which is characterized by great mood swings. We exploit the advantages of Semantic Web and Cloud Technologies to develop a patient monitoring system to support clinicians. Based on intelligent filtering of evidence-knowledge and individual-specific information, we aim to provide treatment notifications and recommended function tests at appropriate times, or to raise alerts for serious mood changes and a patient’s non-response to treatment. We propose an architecture, as the back-end part of a cloud platform for IoT, intertwining intelligent devices with patients’ daily routine and clinicians’ support.
Keywords: bipolar disorder, intelligent systems patient monitoring, semantic web technologies, healthcare
Procedia PDF Downloads 508
25300 An Exploratory Sequential Design: A Mixed Methods Model for the Statistics Learning Assessment with a Bayesian Network Representation
Authors: Zhidong Zhang
Abstract:
This study established a mixed methods model for assessing statistics learning with Bayesian network models. There are three variants of exploratory sequential designs. There are three linked steps in one of the designs: qualitative data collection and analysis, quantitative measure, instrument, intervention, and quantitative data collection analysis. The study used a scoring model of analysis of variance (ANOVA) as a content domain. The study examines students’ learning in both semantic and performance aspects at a fine-grained level. The ANOVA score model, y = α + βx1 + γx1 + ε, served as a cognitive task to collect data during the student learning process. When the learning processes were decomposed into multiple steps in both semantic and performance aspects, a hierarchical Bayesian network was established. This is a theory-driven process. The hierarchical structure was obtained from qualitative cognitive analysis. The data from students’ ANOVA score model learning was used to give evidence to the hierarchical Bayesian network model from the evidential variables. Finally, the assessment results of students’ ANOVA score model learning were reported. Briefly, this was a mixed methods research design applied to statistics learning assessment. The mixed methods designs expanded the possibilities for researchers to establish advanced quantitative models, starting from a theory-driven qualitative mode.
Keywords: exploratory sequential design, ANOVA score model, Bayesian network model, mixed methods research design, cognitive analysis
Procedia PDF Downloads 178
25299 Efficiency of Google Translate and Bing Translator in Translating Persian-to-English Texts
Authors: Samad Sajjadi
Abstract:
Machine translation is a new subject increasingly being used by academic writers, especially students and researchers whose native language is not English. There are numerous studies conducted on machine translation, but few investigations have assessed the accuracy of machine translation from Persian to English at lexical, semantic, and syntactic levels. Using Groves and Mundt’s (2015) Model of error taxonomy, the current study evaluated Persian-to-English translations produced by two well-known online translators, Google Translate and Bing Translator. A total of 240 texts were randomly selected from different academic fields (law, literature, medicine, and mass media), with 60 texts for each domain. All texts were rendered by the two translation systems and then by four human translators. All statistical analyses were carried out using SPSS. The results indicated that Google translations were more accurate than the translations produced by the Bing Translator, especially in the domains of medicine (lexis: 186 vs. 225; semantic: 44 vs. 48; syntactic: 148 vs. 264 errors) and mass media (lexis: 118 vs. 149; semantic: 25 vs. 32; syntactic: 110 vs. 220 errors). Nonetheless, both machines are reasonably accurate in Persian-to-English translation of lexicons and syntactic structures, particularly from mass media and medical texts.
Keywords: machine translations, accuracy, human translation, efficiency
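The error counts quoted in the abstract can be totalled per domain and system for a quick comparison. The sketch below only sums the figures given above (listed as Google Translate vs. Bing Translator); the dictionary layout and function name are illustrative choices, and no figure is added that the abstract does not report.

```python
# Error counts as reported in the abstract.
ERRORS = {
    "medicine": {
        "Google Translate": {"lexis": 186, "semantic": 44, "syntactic": 148},
        "Bing Translator": {"lexis": 225, "semantic": 48, "syntactic": 264},
    },
    "mass media": {
        "Google Translate": {"lexis": 118, "semantic": 25, "syntactic": 110},
        "Bing Translator": {"lexis": 149, "semantic": 32, "syntactic": 220},
    },
}

def total_errors(errors):
    """Sum lexical, semantic, and syntactic errors per domain and system."""
    return {
        domain: {system: sum(counts.values()) for system, counts in systems.items()}
        for domain, systems in errors.items()
    }

totals = total_errors(ERRORS)
for domain, systems in totals.items():
    print(domain, systems)
```

On these figures, the totals are 378 vs. 537 errors for medicine and 253 vs. 401 for mass media, with syntactic errors accounting for most of the gap.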
Procedia PDF Downloads 78
25298 Linguistic Insights Improve Semantic Technology in Medical Research and Patient Self-Management Contexts
Authors: William Michael Short
Abstract:
‘Semantic Web’ technologies such as the Unified Medical Language System Metathesaurus, SNOMED-CT, and MeSH have been touted as transformational for the way users access online medical and health information, enabling both the automated analysis of natural-language data and the integration of heterogeneous health-related resources distributed across the Internet through the use of standardized terminologies that capture concepts and relationships between concepts that are expressed differently across datasets. However, the approaches that have so far characterized ‘semantic bioinformatics’ have not yet fulfilled the promise of the Semantic Web for medical and health information retrieval applications. This paper argues within the perspective of cognitive linguistics and cognitive anthropology that four features of human meaning-making must be taken into account before the potential of semantic technologies can be realized for this domain. First, many semantic technologies operate exclusively at the level of the word. However, texts convey meanings in ways beyond lexical semantics. For example, transitivity patterns (distributions of active or passive voice) and modality patterns (configurations of modal constituents like may, might, could, would, should) convey experiential and epistemic meanings that are not captured by single words. Language users also naturally associate stretches of text with discrete meanings, so that whole sentences can be ascribed senses similar to the senses of words (so-called ‘discourse topics’). Second, natural language processing systems tend to operate according to the principle of ‘one token, one tag’. For instance, occurrences of the word sound must be disambiguated for part of speech: in context, is sound a noun or a verb or an adjective? In syntactic analysis, deterministic annotation methods may be acceptable.
But because natural language utterances are typically characterized by polyvalency and ambiguities of all kinds (including intentional ambiguities), such methods leave the meanings of texts highly impoverished. Third, ontologies tend to be disconnected from everyday language use and so struggle in cases where single concepts are captured through complex lexicalizations that involve profile shifts or other embodied representations. More problematically, concept graphs tend to capture ‘expert’ technical models rather than ‘folk’ models of knowledge and so may not match users’ common-sense intuitions about the organization of concepts in prototypical structures rather than Aristotelian categories. Fourth, and finally, most ontologies do not recognize the pervasively figurative character of human language. However, since the time of Galen the widespread use of metaphor in the linguistic usage of both medical professionals and lay persons has been recognized. In particular, metaphor is a well-documented linguistic tool for communicating experiences of pain. Because semantic medical knowledge-bases are designed to help capture variations within technical vocabularies, rather than the kinds of conventionalized figurative semantics that practitioners as well as patients actually utilize in clinical description and diagnosis, they fail to capture this dimension of linguistic usage. The failure of semantic technologies in these respects degrades the efficiency and efficacy not only of medical research, where information retrieval inefficiencies can lead to direct financial costs to organizations, but also of care provision, especially in contexts of patients’ self-management of complex medical conditions.
Keywords: ambiguity, bioinformatics, language, meaning, metaphor, ontology, semantic web, semantics
Procedia PDF Downloads 132
25297 Spatial Evaluations of Haskoy: The Emperial Village
Authors: Yasemin Filiz-Kuruel, Emine Koseoglu
Abstract:
This study evaluates the Haskoy district of Beyoglu, Istanbul. Haskoy is located in the Halic region, between the Kasimpasa and Kagithane districts. After the conquest of Istanbul, Fatih Sultan Mehmet (the Conqueror) set up his tent here. The area therefore took the name Haskoy, 'imperial village', meaning a village reserved for the Sultan. Today, Haskoy contains shipyard facilities and ateliers of various sizes. In this study, the legibility of Haskoy streets is investigated comparatively. A semantic differential scale is used as the research method. Photos of the streets that meet specific criteria are chosen. The questionnaire is administered to first- and third-year architecture students. The spatial evaluation of Haskoy streets is carried out through the survey.
Keywords: Haskoy, legibility, semantic differential scale, urban streets
Procedia PDF Downloads 566
25296 Effect of the Keyword Strategy on Lexical Semantic Acquisition: Recognition, Retention and Comprehension in an English as Second Language Context
Authors: Fatima Muhammad Shitu
Abstract:
This study investigates the effect of the keyword strategy on lexico-semantic acquisition, recognition, retention and comprehension in an ESL context. The aim is to determine whether the keyword strategy can be used to enhance acquisition. As quasi-experimental research, the study has two objectives: to determine the extent to which the scores obtained by the subjects, who were trained to use the keyword strategy for acquisition, differ between the pre-tests and the post-tests, and to find out the relationship between the scores obtained at these test levels. The sample consists of 300 undergraduate ESL students at the Federal College of Education, Kano. The seventy-five lexical items for acquisition belong to the lexical field category known as register, and they cover the Medical, Agriculture and Photography (MAP) registers, with twenty-five (25) lexical items in each lexical field. Testing was used to collect the data, while descriptive and inferential statistics were employed for data analysis. Two kinds of tests were administered at each test level: the WARRT (Word Acquisition, Recognition, and Retention Test) and the CCPT (Cloze Comprehension Passage Test). The results revealed significant differences between the scores obtained at the pre-tests and the post-tests, and no correlation between those scores. This implies that the keyword strategy effectively enhanced the acquisition of the lexical items studied.
Keywords: keyword, lexical, semantics, strategy
Procedia PDF Downloads 311
25295 Modified Active (MA) Algorithm to Generate Semantic Web Related Clustered Hierarchy for Keyword Search
Authors: G. Leena Giri, Archana Mathur, S. H. Manjula, K. R. Venugopal, L. M. Patnaik
Abstract:
Keyword search in XML documents is based on the notion of lowest common ancestors in the labelled tree model of XML documents and has recently gained a lot of research interest in the database community. In this paper, we propose the Modified Active (MA) algorithm, which improves on the active clustering algorithm by taking into consideration the entity aspect of the nodes to find the level of the node pertaining to a particular keyword input by the user. A portion of the bibliography database is used to experimentally evaluate the modified active algorithm, and the results show that it performs better than the active algorithm. Our modification improves the response time of the system and thereby increases its efficiency.
Keywords: keyword matching patterns, MA algorithm, semantic search, knowledge management
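To make the notion of a lowest common ancestor concrete, the following is a minimal sketch in Python using the standard-library `xml.etree.ElementTree` module. It is not the MA algorithm or the active clustering algorithm from the abstract; the helper names (`keyword_paths`, `lowest_common_ancestor`) and the toy bibliography document are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET


def keyword_paths(root, keyword):
    """Return root-to-node paths for elements whose text contains the keyword."""
    paths = []

    def walk(node, path):
        path = path + [node]
        if node.text and keyword.lower() in node.text.lower():
            paths.append(path)
        for child in node:
            walk(child, path)

    walk(root, [])
    return paths


def lowest_common_ancestor(path_a, path_b):
    """Return the deepest element shared by two root-to-node paths."""
    lca = None
    for a, b in zip(path_a, path_b):
        if a is not b:
            break
        lca = a
    return lca


# Hypothetical toy document, not data from the paper.
DOC = """<bib>
  <book><title>XML Keyword Search</title><author>Giri</author></book>
  <book><title>Semantic Web</title><author>Mathur</author></book>
</bib>"""

root = ET.fromstring(DOC)
title_path = keyword_paths(root, "keyword")[0]
giri_path = keyword_paths(root, "giri")[0]
mathur_path = keyword_paths(root, "mathur")[0]

# Keywords inside the same book meet at <book>; across books they meet at <bib>.
same_book = lowest_common_ancestor(title_path, giri_path).tag
across_books = lowest_common_ancestor(title_path, mathur_path).tag
```

A lower (deeper) common ancestor indicates that the matched keywords are more tightly related, which is why keyword search over XML favours it.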
Procedia PDF Downloads 413
25294 Probing Language Models for Multiple Linguistic Information
Authors: Bowen Ding, Yihao Kuang
Abstract:
In recent years, large-scale pre-trained language models have achieved state-of-the-art performance on a variety of natural language processing tasks. The word vectors produced by these language models can be viewed as dense encoded representations of natural language in text form. However, it is unknown how much linguistic information is encoded, and how. In this paper, we construct several probing tasks for different kinds of linguistic information to clarify the encoding capabilities of different language models, and we visualize the results. We first obtain word representations in vector form from different language models, including BERT, ELMo, RoBERTa and GPT. Classifiers with a small number of parameters and unsupervised tasks are then applied to these word vectors to assess their capability to encode the corresponding linguistic information. The constructed probing tasks cover both semantic and syntactic aspects. The semantic aspect covers the ability of the model to understand semantic entities such as numbers, time, and characters, and the syntactic aspect covers the ability of the language model to understand grammatical structures such as dependency relationships and reference relationships. We also compare the encoding capabilities of different layers in the same language model to infer how linguistic information is encoded in the model.
Keywords: language models, probing task, text presentation, linguistic information
Procedia PDF Downloads 110
25293 A Cognitive Semantic Analysis of the Metaphorical Extensions of Come out and Take Over
Authors: Raquel Rossini, Edelvais Caldeira
Abstract:
The aim of this work is to investigate the motivation for the metaphorical uses of two verb combinations: 'come out' and 'take over'. Drawing on cognitive semantics theories, image schemas and metaphors, the study attempts to demonstrate that: a) the metaphorical senses of both 'come out' and 'take over' extend from the central (spatial) senses of both the verbs and the particles in these verb combinations; and b) the particles 'out' and 'over' also contribute to the whole meaning of the verb combinations. To do so, a random selection of 579 concordance lines for 'come out' and 1,412 for 'take over' was obtained from the Corpus of Contemporary American English (COCA). One of the main procedures adopted was the establishment of verb and particle central senses. The research questions addressed in this study are as follows: a) how does the identification of trajector and landmark help reveal patterns that contribute to the identification of the semantic network of these two verb combinations?; b) what is the relationship between the schematic structures attributed to the particles and the metaphorical uses found in empirical data?; and c) what conceptual metaphors underlie the mappings from the source to the target domains? The results demonstrated that not only the lexical verbs 'come' and 'take', but also the particles 'out' and 'over', play an important role in the different meanings of 'come out' and 'take over'. In addition, image schemas and conceptual metaphors were found to be helpful in establishing the motivations for the metaphorical uses of these linguistic structures.
Keywords: cognitive linguistics, English syntax, multi-word verbs, prepositions
Procedia PDF Downloads 155
25292 Understanding the Semantic Network of Tourism Studies in Taiwan by Using Bibliometrics Analysis
Authors: Chun-Min Lin, Yuh-Jen Wu, Ching-Ting Chung
Abstract:
The formulation of tourism policies requires objective academic research and evidence as support, especially research from local academia. Taiwan is a small island, and its economic growth relies heavily on tourism revenue. The Taiwanese government has devoted itself to promoting the tourism industry over the past few decades. Scientific research outcomes by Taiwanese scholars can help lay the foundations for the government's drafting of future tourism policy. In this study, a total of 120 full journal articles published between 2008 and 2016 in the Journal of Tourism and Leisure Studies (JTSL) were examined to explore the scientific research trend of tourism study in Taiwan. JTSL is one of the most important Taiwanese journals in the tourism discipline; it focuses on tourism-related issues and is published in traditional Chinese. Co-word analysis, a bibliometric method, was employed for semantic analysis in this study. When analyzing Chinese words and phrases, word segmentation is a crucial step. It must be carried out first, and precisely, in order to obtain meaningful words or word chunks for further frequency calculation. A word segmentation system based on an N-gram algorithm was developed in this study to conduct semantic analysis, and the 100 groups of meaningful phrases with the highest recurrence rates were located. Subsequently, co-word analysis was employed for semantic classification. The results showed that the themes of tourism research in Taiwan in recent years cover tourism education, environmental protection, hotel management, information technology, and senior tourism. The results can give insight into the related issues and serve as a reference for tourism-related policy making and follow-up research.
Keywords: bibliometrics, co-word analysis, word segmentation, tourism research, policy
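The two steps named in the abstract, N-gram based phrase extraction followed by co-word counting, can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the segmentation system built in the study: the function names, the fixed n-gram length, the frequency threshold and the three toy titles are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def char_ngrams(text, n):
    """All contiguous character n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def frequent_ngrams(docs, n, min_count):
    """Keep n-grams that recur across the corpus as candidate phrases."""
    counts = Counter(gram for doc in docs for gram in char_ngrams(doc, n))
    return {gram: c for gram, c in counts.items() if c >= min_count}


def coword_counts(docs, terms):
    """Count how often each pair of terms appears in the same document."""
    pairs = Counter()
    for doc in docs:
        present = sorted(term for term in terms if term in doc)
        pairs.update(combinations(present, 2))
    return pairs


# Hypothetical toy titles, not data from the study.
docs = ["觀光教育與環境保護", "觀光教育研究", "環境保護政策"]

phrases = frequent_ngrams(docs, n=4, min_count=2)
pairs = coword_counts(docs, phrases)
```

With these toy titles, only the two four-character phrases that occur in more than one title survive the threshold, and they co-occur in a single title. Raw frequency alone also admits meaningless fragments at shorter n-gram lengths, which is one reason a real segmentation system needs more than this filter.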
Procedia PDF Downloads 229
25291 Methodologies for Deriving Semantic Technical Information Using an Unstructured Patent Text Data
Authors: Jaehyung An, Sungjoo Lee
Abstract:
Patent documents constitute an up-to-date and reliable source of knowledge reflecting technological advances, so patent analysis has been widely used to identify technological trends and formulate technology strategies. However, identifying technological information from patent data has limitations such as high cost, complexity, and inconsistency, because it relies on experts' knowledge. To overcome these limitations, researchers have applied quantitative analysis based on the keyword technique. This method can capture the technological implications of patent documents or extract keywords that indicate their important content. However, it uses only simple counting of keyword frequency, so it cannot take into account the semantic relationships between keywords or semantic information such as how technologies are used in their technology area and how they affect other technologies. To analyze unstructured technological information in patents automatically and extract the semantic information, the text should be transformed into an abstracted form that includes the key technological concepts. The 'SAO' (subject, action, object) sentence structure has recently emerged as a way of representing 'key concepts' and can be extracted with natural language processing (NLP). An SAO structure can be organized in a problem-solution format if the action-object (AO) states the problem and the subject (S) forms the solution. In this paper, we propose a new methodology that extracts SAO structures through rules for extracting technical elements. Although sentences in patent texts have a unique format, prior studies have depended on general NLP tools applied to common documents such as newspapers, research papers, and Twitter mentions, so they cannot take into account the sentence structure types specific to patent documents. To overcome this limitation, we identified the unique forms of patent sentences and defined the SAO structures in patent text data. There are four types of technical elements: technology adoption purpose, application area, tool for technology, and technical components. Each of these four types of sentence structure has its own specific word structure, defined by the location or sequence of the parts of speech in the sentence. Finally, we developed algorithms for extracting SAOs, and the results offer insight into the technology innovation process by providing different perspectives on technology.
Keywords: NLP, patent analysis, SAO, semantic-analysis
Procedia PDF Downloads 262
25290 Investigating Translations of Websites of Pakistani Public Offices
Authors: Sufia Maroof
Abstract:
This empirical study investigated the web translations of five Pakistani public offices (FPSC, FIA, HEC, USB, and Ministry of Finance) that offer an Urdu tab as an option for accessing information on their official websites. A triangulated quantitative and qualitative research design informed the researcher of the semantic, lexical and syntactic caveats in these translations. The study hypothesized that the majority of the Pakistani population is unaware of the Supreme Court’s amendments to language policy concerning the national and official language; hence, the Urdu web translations of the public departments have not been accessed effectively. Firstly, the researcher conducted an online survey comprising two sections: closed-ended questions and short-answer questions. Secondly, the researcher compiled a corpus of the five selected websites in tabular form to compare the data. Thirdly, the administrators of the departments were contacted regarding the methods of translation and the expertise of the personnel involved. The corpus was assessed for TQA after examining the lexical, semantic, syntactic and technical alignment inaccuracies and imperfections. The study suggests that public offices invest in their Urdu websites, either by hiring expert translators or by engaging a translation agency, in order to offer quality translation to the public.
Keywords: machine translations, public offices, Urdu translations, websites
Procedia PDF Downloads 126
25289 An Approach to Specify Software Requirements in Semantic Form
Authors: Deepa Vijay, Chellammal Surianarayanan, Gopinath Ganapathy
Abstract:
The requirements of a software project serve as a guideline for the entire project team and enable it to produce the right outcome. As requirements are key to the success of the project, they should be specified unambiguously. They should also be complete and consistent, and the entire software project team should interpret them in the same way as the customer does. Specifying requirements in textual form is common in software development. This leads to poor understanding of the requirements, which results in more errors and degraded quality. Some of the literature focuses on semantic ways of specifying functional requirements that ensure the consistency and completeness of requirements. As an alternative, this work proposes a method to map the syntactic requirements to corresponding semantics in the form of ontologies. This improves the understanding of requirements, prevents errors and improves quality.
Keywords: functional requirement, ontology, requirements management, semantics
Procedia PDF Downloads 364
25288 Smart Sensor Data to Predict Machine Performance with IoT-Based Machine Learning and Artificial Intelligence
Authors: C. J. Rossouw, T. I. van Niekerk
Abstract:
The global manufacturing industry is utilizing the internet and cloud-based services to further explore the anatomy and optimize manufacturing processes in support of the movement into the Fourth Industrial Revolution (4IR). The 4IR from a third world and African perspective is hindered by the fact that many manufacturing systems that were developed in the third industrial revolution are not inherently equipped to utilize the internet and services of the 4IR, hindering the progression of third world manufacturing industries into the 4IR. This research focuses on the development of a non-invasive and cost-effective cyber-physical IoT system that will exploit a machine’s vibration to expose semantic characteristics in the manufacturing process and utilize these results through a real-time cloud-based machine condition monitoring system with the intention to optimize the system. A microcontroller-based IoT sensor was designed to acquire a machine’s mechanical vibration data, process it in real-time, and transmit it to a cloud-based platform via Wi-Fi and the internet. Time-frequency Fourier analysis was applied to the vibration data to form an image representation of the machine’s behaviour. This data was used to train a Convolutional Neural Network (CNN) to learn semantic characteristics in the machine’s behaviour and relate them to a state of operation. The same data was also used to train a Convolutional Autoencoder (CAE) to detect anomalies in the data. Real-time edge-based artificial intelligence was achieved by deploying the CNN and CAE on the sensor to analyse the vibration. A cloud platform was deployed to visualize the vibration data and the results of the CNN and CAE in real-time. The cyber-physical IoT system was deployed on a semi-automated metal granulation machine with a set of trained machine learning models. Using a single sensor, the system was able to accurately visualize three states of the machine’s operation in real-time. 
The system was also able to detect a variance in the material being granulated. The research demonstrates how non-IoT manufacturing systems can be equipped with edge-based artificial intelligence to establish a remote machine condition monitoring system.
Keywords: IoT, cyber-physical systems, artificial intelligence, manufacturing, vibration analytics, continuous machine condition monitoring
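As an illustration of the time-frequency Fourier analysis step, the sketch below computes a small magnitude spectrogram with a short-time discrete Fourier transform, using only the Python standard library. It is not the authors' sensor firmware: the sampling rate, frame length, synthetic two-tone signal and function names are assumptions, no window function is applied, and a real implementation would normally use an FFT routine.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitudes of the non-negative frequency bins of one frame (naive DFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectrogram(signal, frame_len, hop):
    """One row of DFT magnitudes per frame: a time-frequency image of the signal."""
    return [
        dft_magnitudes(signal[start:start + frame_len])
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


# Hypothetical vibration signal: one second at 8 Hz, then one second at 16 Hz.
RATE = 64
signal = [math.sin(2 * math.pi * 8 * t / RATE) for t in range(RATE)]
signal += [math.sin(2 * math.pi * 16 * t / RATE) for t in range(RATE)]

spec = spectrogram(signal, frame_len=32, hop=32)

# With 32-sample frames at 64 Hz, each bin spans 2 Hz: 8 Hz is bin 4, 16 Hz is bin 8.
peak_bins = [max(range(len(row)), key=row.__getitem__) for row in spec]
```

The change in the dominant bin between the second and third frames is the kind of pattern that an image-based classifier such as a CNN could learn to associate with a change in operating state.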
Procedia PDF Downloads 88
25287 Semantic Differences between Bug Labeling of Different Repositories via Machine Learning
Authors: Pooja Khanal, Huaming Zhang
Abstract:
Labeling of issues/bugs, also known as bug classification, plays a vital role in software engineering. Some known labels/classes of bugs are 'User Interface', 'Security', and 'API'. Most of the time, when a reporter reports a bug, they try to assign some predefined label to it. Those issues are reported for a project, and each project is a repository in GitHub/GitLab, which contains multiple issues. There are many software project repositories, ranging from individual projects to commercial projects. The labels assigned in different repositories may depend on various factors such as human instinct, generalization of labels, and the label assignment policy followed by the reporter. While the reporter of an issue may instinctively give it one label, another person reporting the same issue may label it differently. As a result, it is not known mathematically whether a label in one repository is similar to or different from a label in another repository. Hence, the primary goal of this research is to find the semantic differences between the bug labeling of different repositories via machine learning. Independent optimal classifiers for individual repositories are first built using the text features of the reported issues. The optimal classifiers may include a combination of multiple classifiers stacked together. Those classifiers are then used to cross-test other repositories, so that the result can be deduced mathematically. The output of this ongoing research includes a formalized open-source GitHub issues database that is used to deduce the similarity of the labels across the different repositories.
Keywords: bug classification, bug labels, GitHub issues, semantic differences
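The cross-testing idea, training a classifier on one repository and scoring it against the labels of another, can be sketched as follows. This is a deliberately naive word-count classifier written for illustration, not the stacked optimal classifiers described in the abstract; the function names and the two toy repositories are hypothetical.

```python
from collections import Counter, defaultdict


def train(issues):
    """Build per-label word counts from (text, label) pairs."""
    model = defaultdict(Counter)
    for text, label in issues:
        model[label].update(text.lower().split())
    return model


def predict(model, text):
    """Pick the label whose training vocabulary overlaps the text most."""
    words = text.lower().split()
    return max(model, key=lambda label: sum(model[label][w] for w in words))


def cross_test(model, issues):
    """Fraction of another repository's issues whose label the model reproduces."""
    hits = sum(predict(model, text) == label for text, label in issues)
    return hits / len(issues)


# Hypothetical toy issues, not data from the study.
repo_a = [
    ("button misaligned on login page", "ui"),
    ("token leak in auth header", "security"),
]
repo_b = [
    ("login button overlaps page footer", "ui"),
    ("auth token exposed in logs", "security"),
]

model_a = train(repo_a)
agreement = cross_test(model_a, repo_b)
```

A high agreement score suggests that the two repositories use a label in a similar sense, while a low score points to a semantic difference in how the label is applied.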
Procedia PDF Downloads 200
25286 Visualization-Based Feature Extraction for Classification in Real-Time Interaction
Authors: Ágoston Nagy
Abstract:
This paper introduces a method of using unsupervised machine learning to visualize the feature space of a dataset in 2D, in order to find the most characteristic segments in the set. After dimension reduction, users can select clusters by manual drawing. Selected clusters are recorded into a data model that is used for later predictions based on real-time data. Predictions are made with supervised learning, using the Gesture Recognition Toolkit. The paper introduces two example applications: a semantic audio organizer for analyzing incoming sounds, and a gesture database organizer where gestural data (recorded by a Leap Motion) is visualized for further manipulation.
Keywords: gesture recognition, machine learning, real-time interaction, visualization
Procedia PDF Downloads 353
25285 Story of Per-: The Radial Network of One Lithuanian Prefix
Authors: Samanta Kietytė
Abstract:
The object of this study is the verbal derivatives formed with the Lithuanian prefix per-. The prefix under examination can be classified as prepositional, having descended from the preposition per, and therefore shares the same prototypical meaning, denoting movement OVER. The prefix and the preposition frequently co-occur within sentences (1). The aim of this paper is to conduct a semantic analysis of the prefix per- and to propose a possible radial network of its meanings. In essence, the aim is to identify the interrelationships between its meanings.
1) Jis peršoko per tvorą. / 3SG.NOM.M jump.PST.3 over fence.ACC.SG / ʻHe jumped over the fenceʼ.
The foundation of this work lies in the methodological and theoretical framework of cognitive linguistics. The prototypical meaning of prefixes consistently embodies spatial dimensions that can be described through image schemas. This entails identifying the trajector, the landmark, and the relation between them in the situation described by the prefixed verb. The meanings of linguistic units are not perceived as arbitrary; rather, they are interconnected through semantic motivation. According to this perspective, one meaning of a linguistic unit is considered prototypical, while additional meanings are descended (not necessarily directly) from it. For example, one of the meanings of per-, TRANSFER (2), is derived from the prototypical meaning OVER.
2) Prašau persiųsti vadovo laišką man. / Ask.PRS.1 forward.INF manager.GEN.SG email.ACC.SG 1.SG.DAT / ʻPlease forward the manager’s email to meʼ.
Certain semantic relations are explained by conceptual metaphor and metonymy theory. For instance, when a prefixed verb has the meaning WIN (3), this meaning is related to the prototypical one. In this case, the prefixed verb describes situations of winning in various ways. In the prototypical meaning, the trajector moves higher than the landmark, and winning is metaphorically perceived as being higher. 
3) Sūnus peraugo tėvą. / Son.NOM.SG outgrow.PST.3 father.ACC.SG / ʻThe son has outgrown the fatherʼ.
The data for this study were collected from the 2014 grammatically annotated corpus "Lithuanian Web (LithuanianWaC v2)", consisting of 63,645,700 words. Given that the corpus is lemmatized, a list of 793 items was obtained using the wordlist function, searching for verbs starting with per. The list included not only prefixed verbs but also other verbs whose roots begin with the same letter sequence as the prefix. These verbs, together with misspelled words, words without diacritical marks, and words listed because of lemmatization errors, were rejected, leaving a total of 475 derivatives for further analysis. The semantic analysis revealed that the prefix per- has 12 distinct meanings. The spatial meanings were extracted by determining what the trajector is, what the landmark is, and what the relation between them is. The connection between non-spatial meanings and spatial ones occurs through semantic motivation, established by identifying elements that correspond to the trajector and landmark. The analysis reveals that there are no strict boundaries among these meanings; instead, they form a continuum with a central core and a periphery in their internal structure, i.e., some derivatives are more prototypical of a particular meaning than others.
Keywords: word-formation, cognitive semantics, metaphor, radial networks, prototype theory, prefix
Procedia PDF Downloads 77