Search results for: semantic processing
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3975

Search results for: semantic processing

3885 Cerrado and Vereda: A Survey of Portuguese Lexicon for Brazilian Biomes

Authors: Daniel Marra

Abstract:

This paper analyses from a semantic-diachronic viewpoint the change of meanings that two lexical items of Brazilian-Portuguese language have gone through. Cerrado and Vereda designate currently the second largest Brazilian biome and one of its most important subsystems. Nevertheless, these two words have long individual histories that can be traced back to their Latin etymons. Therefore, the purpose of this work is to highlight the process by which meaning instantiated itself in these words’ formation and to discuss how semantic change installed subsequently in them. As this paper shows, the aforementioned words have been, in different past, synchronizes, created, and undergone changes of meanings by metaphor and metonymy. Besides, it is argued here that semantic change takes place due to external causes, such as generalization and specialization of meaning. It happens when a specialized use of a lexical item, restricted to a particular linguistic group, is adopted by other groups, having its meaning generalized by them. In these processes, the etymological idea of the word is generally lost, which gains, in the new group, less specific meaning in relation to its etymology, sometimes with no relation to the original idea. As a final point, it is claimed that both the creation of a lexical item and its change of meaning involve pragmatic goals, such as the need the language users have to express a new meaning related to a certain reality in the empirical world.

Keywords: Brazilian biomes, metaphor and metonymy, Portuguese lexicon, semantic change

Procedia PDF Downloads 97
3884 Preserving Urban Cultural Heritage with Deep Learning: Color Planning for Japanese Merchant Towns

Authors: Dongqi Li, Yunjia Huang, Tomo Inoue, Kohei Inoue

Abstract:

With urbanization, urban cultural heritage is facing the impact and destruction of modernization and urbanization. Many historical areas are losing their historical information and regional cultural characteristics, so it is necessary to carry out systematic color planning for historical areas in conservation. As an early focus on urban color planning, Japan has a systematic approach to urban color planning. Hence, this paper selects five merchant towns from the category of important traditional building preservation areas in Japan as the subject of this study to explore the color structure and emotion of this type of historic area. First, the image semantic segmentation method identifies the buildings, roads, and landscape environments. Their color data were extracted for color composition and emotion analysis to summarize their common features. Second, the obtained Internet evaluations were extracted by natural language processing for keyword extraction. The correlation analysis of the color structure and keywords provides a valuable reference for conservation decisions for this historic area in the town. This paper also combines the color structure and Internet evaluation results with generative adversarial networks to generate predicted images of color structure improvements and color improvement schemes. The methods and conclusions of this paper can provide new ideas for the digital management of environmental colors in historic districts and provide a valuable reference for the inheritance of local traditional culture.

Keywords: historic districts, color planning, semantic segmentation, natural language processing

Procedia PDF Downloads 57
3883 A Text Classification Approach Based on Natural Language Processing and Machine Learning Techniques

Authors: Rim Messaoudi, Nogaye-Gueye Gning, François Azelart

Abstract:

Automatic text classification applies mostly natural language processing (NLP) and other AI-guided techniques to automatically classify text in a faster and more accurate manner. This paper discusses the subject of using predictive maintenance to manage incident tickets inside the sociality. It focuses on proposing a tool that treats and analyses comments and notes written by administrators after resolving an incident ticket. The goal here is to increase the quality of these comments. Additionally, this tool is based on NLP and machine learning techniques to realize the textual analytics of the extracted data. This approach was tested using real data taken from the French National Railways (SNCF) company and was given a high-quality result.

Keywords: machine learning, text classification, NLP techniques, semantic representation

Procedia PDF Downloads 64
3882 Towards Long-Range Pixels Connection for Context-Aware Semantic Segmentation

Authors: Muhammad Zubair Khan, Yugyung Lee

Abstract:

Deep learning has recently achieved enormous response in semantic image segmentation. The previously developed U-Net inspired architectures operate with continuous stride and pooling operations, leading to spatial data loss. Also, the methods lack establishing long-term pixels connection to preserve context knowledge and reduce spatial loss in prediction. This article developed encoder-decoder architecture with bi-directional LSTM embedded in long skip-connections and densely connected convolution blocks. The network non-linearly combines the feature maps across encoder-decoder paths for finding dependency and correlation between image pixels. Additionally, the densely connected convolutional blocks are kept in the final encoding layer to reuse features and prevent redundant data sharing. The method applied batch-normalization for reducing internal covariate shift in data distributions. The empirical evidence shows a promising response to our method compared with other semantic segmentation techniques.

Keywords: deep learning, semantic segmentation, image analysis, pixels connection, convolution neural network

Procedia PDF Downloads 77
3881 An AI-generated Semantic Communication Platform in HCI Course

Authors: Yi Yang, Jiasong Sun

Abstract:

Almost every aspect of our daily lives is now intertwined with some degree of human-computer interaction (HCI). HCI courses draw on knowledge from disciplines as diverse as computer science, psychology, design principles, anthropology, and more. Our HCI courses, named the Media and Cognition course, are constantly updated to reflect state-of-the-art technological advancements such as virtual reality, augmented reality, and artificial intelligence-based interactions. For more than a decade, our course has used an interest-based approach to teaching, in which students proactively propose some research-based questions and collaborate with teachers, using course knowledge to explore potential solutions. Semantic communication plays a key role in facilitating understanding and interaction between users and computer systems, ultimately enhancing system usability and user experience. The advancements in AI-generated technology, which have gained significant attention from both academia and industry in recent years, are exemplified by language models like GPT-3 that generate human-like dialogues from given prompts. Our latest version of the Human-Computer Interaction course practices a semantic communication platform based on AI-generated techniques. The purpose of this semantic communication is twofold: to extract and transmit task-specific information while ensuring efficient end-to-end communication with minimal latency. An AI-generated semantic communication platform evaluates the retention of signal sources and converts low-retain ability visual signals into textual prompts. These data are transmitted through AI-generated techniques and reconstructed at the receiving end; on the other hand, visual signals with a high retain ability rate are compressed and transmitted according to their respective regions. The platform and associated research are a testament to our students' growing ability to independently investigate state-of-the-art technologies.

Keywords: human-computer interaction, media and cognition course, semantic communication, retainability, prompts

Procedia PDF Downloads 79
3880 Understanding the Interactive Nature in Auditory Recognition of Phonological/Grammatical/Semantic Errors at the Sentence Level: An Investigation Based upon Japanese EFL Learners’ Self-Evaluation and Actual Language Performance

Authors: Hirokatsu Kawashima

Abstract:

One important element of teaching/learning listening is intensive listening such as listening for precise sounds, words, grammatical, and semantic units. Several classroom-based investigations have been conducted to explore the usefulness of auditory recognition of phonological, grammatical and semantic errors in such a context. The current study reports the results of one such investigation, which targeted auditory recognition of phonological, grammatical, and semantic errors at the sentence level. 56 Japanese EFL learners participated in this investigation, in which their recognition performance of phonological, grammatical and semantic errors was measured on a 9-point scale by learners’ self-evaluation from the perspective of 1) two types of similar English sound (vowel and consonant minimal pair words), 2) two types of sentence word order (verb phrase-based and noun phrase-based word orders), and 3) two types of semantic consistency (verb-purpose and verb-place agreements), respectively, and their general listening proficiency was examined using standardized tests. A number of findings have been made about the interactive relationships between the three types of auditory error recognition and general listening proficiency. Analyses based on the OPLS (Orthogonal Projections to Latent Structure) regression model have disclosed, for example, that the three types of auditory error recognition are linked in a non-linear way: the highest explanatory power for general listening proficiency may be attained when quadratic interactions between auditory recognition of errors related to vowel minimal pair words and that of errors related to noun phrase-based word order are embraced (R2=.33, p=.01).

Keywords: auditory error recognition, intensive listening, interaction, investigation

Procedia PDF Downloads 489
3879 Aspects of Semantics of Standard British English and Nigerian English: A Contrastive Study

Authors: Chris Adetuyi, Adeola Adeniran

Abstract:

The concept of meaning is a complex one in language study when cultural features are added. This is mandatory because language cannot be completely separated from the culture in which case language and culture complement each other. When there are two varieties of a language in a society, i.e. two varieties functioning side by side in a speech community, there is a tendency to view one of the varieties with each other. There is, therefore, the need to make a linguistic comparative study of varieties of such languages. In this paper, a semantic contrastive study is made between Standard British English (SBE) and Nigerian English (NB). The semantic study is limited to aspects of semantics: semantic extension (Kinship terms, metaphors), semantic shift (lexical items considered are ‘drop’ ‘befriend’ ‘dowry’ and escort) acronyms (NEPA, JAMB, NTA) linguistic borrowing or loan words (Seriki, Agbada, Eba, Dodo, Iroko) coinages (long leg, bush meat; bottom power and juju). In the study of these aspects of semantics of SBE and NE lexical terms, conservative statements are made, problems areas and hierarchy of difficulties are highlighted with a view to bringing out areas of differences are highlighted in this paper are concerned. The study will also serve as a guide in further contrastive studies in some other area of languages.

Keywords: aspect, British, English, Nigeria, semantics

Procedia PDF Downloads 322
3878 Single-Camera Basketball Tracker through Pose and Semantic Feature Fusion

Authors: Adrià Arbués-Sangüesa, Coloma Ballester, Gloria Haro

Abstract:

Tracking sports players is a widely challenging scenario, specially in single-feed videos recorded in tight courts, where cluttering and occlusions cannot be avoided. This paper presents an analysis of several geometric and semantic visual features to detect and track basketball players. An ablation study is carried out and then used to remark that a robust tracker can be built with Deep Learning features, without the need of extracting contextual ones, such as proximity or color similarity, nor applying camera stabilization techniques. The presented tracker consists of: (1) a detection step, which uses a pretrained deep learning model to estimate the players pose, followed by (2) a tracking step, which leverages pose and semantic information from the output of a convolutional layer in a VGG network. Its performance is analyzed in terms of MOTA over a basketball dataset with more than 10k instances.

Keywords: basketball, deep learning, feature extraction, single-camera, tracking

Procedia PDF Downloads 113
3877 The Neurofunctional Dissociation between Animal and Tool Concepts: A Network-Based Model

Authors: Skiker Kaoutar, Mounir Maouene

Abstract:

Neuroimaging studies have shown that animal and tool concepts rely on distinct networks of brain areas. Animal concepts depend predominantly on temporal areas while tool concepts rely on fronto-temporo-parietal areas. However, the origin of this neurofunctional distinction for processing animal and tool concepts remains still unclear. Here, we address this question from a network perspective suggesting that the neural distinction between animals and tools might reflect the differences in their structural semantic networks. We build semantic networks for animal and tool concepts derived from McRae and colleagues’s behavioral study conducted on a large number of participants. These two networks are thus analyzed through a large number of graph theoretical measures for small-worldness: centrality, clustering coefficient, average shortest path length, as well as resistance to random and targeted attacks. The results indicate that both animal and tool networks have small-world properties. More importantly, the animal network is more vulnerable to targeted attacks compared to the tool network a result that correlates with brain lesions studies.

Keywords: animals, tools, network, semantics, small-worls, resilience to damage

Procedia PDF Downloads 517
3876 A Semantic Registry to Support Brazilian Aeronautical Web Services Operations

Authors: Luís Antonio de Almeida Rodriguez, José Maria Parente de Oliveira, Ednelson Oliveira

Abstract:

In the last two decades, the world’s aviation authorities have made several attempts to create consensus about a global and accepted approach for applying semantics to web services registry descriptions. This problem has led communities to face a fat and disorganized infrastructure to describe aeronautical web services. It is usual for developers to implement ad-hoc connections among consumers and providers and manually create non-standardized service compositions, which need some particular approach to compose and semantically discover a desired web service. Current practices are not precise and tend to focus on lightweight specifications of some parts of the OWL-S and embed them into syntactic descriptions (SOAP artifacts and OWL language). It is necessary to have the ability to manage the use of both technologies. This paper presents an implementation of the ontology OWL-S that describes a Brazilian Aeronautical Web Service Registry, which makes it able to publish, advertise, make multi-criteria semantic discovery aligned with the ideas of the System Wide Information Management (SWIM) Program, and invoke web services within the Air Traffic Management context. The proposal’s best finding is a generic approach to describe semantic web services. The paper also presents a set of functional requirements to guide the ontology development and to compare them to the results to validate the implementation of the OWL-S Ontology.

Keywords: aeronautical web services, OWL-S, semantic web services discovery, ontologies

Procedia PDF Downloads 61
3875 Using Corpora in Semantic Studies of English Adjectives

Authors: Oxana Lukoshus

Abstract:

The methods of corpus linguistics, a well-established field of research, are being increasingly applied in cognitive linguistics. Corpora data are especially useful for different quantitative studies of grammatical and other aspects of language. The main objective of this paper is to demonstrate how present-day corpora can be applied in semantic studies in general and in semantic studies of adjectives in particular. Polysemantic adjectives have been the subject of numerous studies. But most of them have been carried out on dictionaries. Undoubtedly, dictionaries are viewed as one of the basic data sources, but only at the initial steps of a research. The author usually starts with the analysis of the lexicographic data after which s/he comes up with a hypothesis. In the research conducted three polysemantic synonyms true, loyal, faithful have been analyzed in terms of differences and similarities in their semantic structure. A corpus-based approach in the study of the above-mentioned adjectives involves the following. After the analysis of the dictionary data there was the reference to the following corpora to study the distributional patterns of the words under study – the British National Corpus (BNC) and the Corpus of Contemporary American English (COCA). These corpora are continually updated and contain thousands of examples of the words under research which make them a useful and convenient data source. For the purpose of this study there were no special needs regarding genre, mode or time of the texts included in the corpora. Out of the range of possibilities offered by corpus-analysis software (e.g. word lists, statistics of word frequencies, etc.), the most useful tool for the semantic analysis was the extracting a list of co-occurrence for the given search words. Searching by lemmas, e.g. true, true to, and grouping the results by lemmas have proved to be the most efficient corpora feature for the adjectives under the study. Following the search process, the corpora provided a list of co-occurrences, which were then to be analyzed and classified. Not every co-occurrence was relevant for the analysis. For example, the phrases like An enormous sense of responsibility to protect the minds and hearts of the faithful from incursions by the state was perceived to be the basic duty of the church leaders or ‘True,’ said Phoebe, ‘but I'd probably get to be a Union Official immediately were left out as in the first example the faithful is a substantivized adjective and in the second example true is used alone with no other parts of speech. The subsequent analysis of the corpora data gave the grounds for the distribution groups of the adjectives under the study which were then investigated with the help of a semantic experiment. To sum it up, the corpora-based approach has proved to be a powerful, reliable and convenient tool to get the data for the further semantic study.

Keywords: corpora, corpus-based approach, polysemantic adjectives, semantic studies

Procedia PDF Downloads 296
3874 Toward an Understanding of the Neurofunctional Dissociation between Animal and Tool Concepts: A Graph Theoretical Analysis

Authors: Skiker Kaoutar, Mounir Maouene

Abstract:

Neuroimaging studies have shown that animal and tool concepts rely on distinct networks of brain areas. Animal concepts depend predominantly on temporal areas while tool concepts rely on fronto-temporo-parietal areas. However, the origin of this neurofunctional distinction for processing animal and tool concepts remains still unclear. Here, we address this question from a network perspective suggesting that the neural distinction between animals and tools might reflect the differences in their structural semantic networks. We build semantic networks for animal and tool concepts derived from Mc Rae and colleagues’s behavioral study conducted on a large number of participants. These two networks are thus analyzed through a large number of graph theoretical measures for small-worldness: centrality, clustering coefficient, average shortest path length, as well as resistance to random and targeted attacks. The results indicate that both animal and tool networks have small-world properties. More importantly, the animal network is more vulnerable to targeted attacks compared to the tool network a result that correlates with brain lesions studies.

Keywords: animals, tools, network, semantics, small-world, resilience to damage

Procedia PDF Downloads 510
3873 Lexical Semantic Analysis to Support Ontology Modeling of Maintenance Activities– Case Study of Offshore Riser Integrity

Authors: Vahid Ebrahimipour

Abstract:

Word representation and context meaning of text-based documents play an essential role in knowledge modeling. Business procedures written in natural language are meant to store technical and engineering information, management decision and operation experience during the production system life cycle. Context meaning representation is highly dependent upon word sense, lexical relativity, and sematic features of the argument. This paper proposes a method for lexical semantic analysis and context meaning representation of maintenance activity in a mass production system. Our approach constructs a straightforward lexical semantic approach to analyze facilitates semantic and syntactic features of context structure of maintenance report to facilitate translation, interpretation, and conversion of human-readable interpretation into computer-readable representation and understandable with less heterogeneity and ambiguity. The methodology will enable users to obtain a representation format that maximizes shareability and accessibility for multi-purpose usage. It provides a contextualized structure to obtain a generic context model that can be utilized during the system life cycle. At first, it employs a co-occurrence-based clustering framework to recognize a group of highly frequent contextual features that correspond to a maintenance report text. Then the keywords are identified for syntactic and semantic extraction analysis. The analysis exercises causality-driven logic of keywords’ senses to divulge the structural and meaning dependency relationships between the words in a context. The output is a word contextualized representation of maintenance activity accommodating computer-based representation and inference using OWL/RDF.

Keywords: lexical semantic analysis, metadata modeling, contextual meaning extraction, ontology modeling, knowledge representation

Procedia PDF Downloads 81
3872 The Conceptual Relationships in N+N Compounds in Arabic Compared to English

Authors: Abdel Rahman Altakhaineh

Abstract:

This paper has analysed the conceptual relations between the elements of NN compounds in Arabic and compared them to those found in English based on the framework of Conceptual Semantics and a modified version of Parallel Architecture referred to as Relational Morphology. The analysis revealed that the repertoire of possible semantic relations between the two nouns in Arabic NN compounds reproduces that in English NN compounds and that, therefore, the main difference is in headedness (right-headed in English, left-headed in Arabic). Adopting RM allows productive and idiosyncratic elements to interweave with each other naturally. Semantically transparent compounds can be stored in memory or produced and understood online, while compounds with different degrees of semantic idiosyncrasy are stored in memory. Furthermore, the predictable parts of idiosyncratic compounds are captured by general schemas. In compounds, such schemas pick out the range of possible semantic relations between the two nouns. Finally, conducting a cross-linguistic study of the systematic patterns of possible conceptual relationships between compound elements is an area worthy of further exploration. In addition, comparing and contrasting compounding in Arabic and Hebrew, especially as they are both Semitic languages, is another area that needs to be investigated thoroughly. It will help morphologists understand the extent to which Jackendoff’s repertoire of semantic relations in compounds is universal. That is, if a language as distant from English as Arabic displays a similar range of cases, this is evidence for a (relatively) universal set of relations from which individual languages may pick and choose.

Keywords: conceptual semantics, morphology, compounds, arabic, english

Procedia PDF Downloads 77
3871 An Intensional Conceptualization Model for Ontology-Based Semantic Integration

Authors: Fateh Adhnouss, Husam El-Asfour, Kenneth McIsaac, AbdulMutalib Wahaishi, Idris El-Feghia

Abstract:

Conceptualization is an essential component of semantic ontology-based approaches. There have been several approaches that rely on extensional structure and extensional reduction structure in order to construct conceptualization. In this paper, several limitations are highlighted relating to their applicability to the construction of conceptualizations in dynamic and open environments. These limitations arise from a number of strong assumptions that do not apply to such environments. An intensional structure is strongly argued to be a natural and adequate modeling approach. This paper presents a conceptualization structure based on property relations and propositions theory (PRP) to the model ontology that is suitable for open environments. The model extends the First-Order Logic (FOL) notation and defines the formal representation that enables interoperability between software systems and supports semantic integration for software systems in open, dynamic environments.

Keywords: conceptualization, ontology, extensional structure, intensional structure

Procedia PDF Downloads 81
3870 A Deep Learning Based Approach for Dynamically Selecting Pre-processing Technique for Images

Authors: Revoti Prasad Bora, Nikita Katyal, Saurabh Yadav

Abstract:

Pre-processing plays an important role in various image processing applications. Most of the time due to the similar nature of images, a particular pre-processing or a set of pre-processing steps are sufficient to produce the desired results. However, in the education domain, there is a wide variety of images in various aspects like images with line-based diagrams, chemical formulas, mathematical equations, etc. Hence a single pre-processing or a set of pre-processing steps may not yield good results. Therefore, a Deep Learning based approach for dynamically selecting a relevant pre-processing technique for each image is proposed. The proposed method works as a classifier to detect hidden patterns in the images and predicts the relevant pre-processing technique needed for the image. This approach experimented for an image similarity matching problem but it can be adapted to other use cases too. Experimental results showed significant improvement in average similarity ranking with the proposed method as opposed to static pre-processing techniques.

Keywords: deep-learning, classification, pre-processing, computer vision, image processing, educational data mining

Procedia PDF Downloads 121
3869 Genomic Sequence Representation Learning: An Analysis of K-Mer Vector Embedding Dimensionality

Authors: James Jr. Mashiyane, Risuna Nkolele, Stephanie J. Müller, Gciniwe S. Dlamini, Rebone L. Meraba, Darlington S. Mapiye

Abstract:

When performing language tasks in natural language processing (NLP), the dimensionality of word embeddings is chosen either ad-hoc or is calculated by optimizing the Pairwise Inner Product (PIP) loss. The PIP loss is a metric that measures the dissimilarity between word embeddings, and it is obtained through matrix perturbation theory by utilizing the unitary invariance of word embeddings. Unlike in natural language, in genomics, especially in genome sequence processing, unlike in natural language processing, there is no notion of a “word,” but rather, there are sequence substrings of length k called k-mers. K-mers sizes matter, and they vary depending on the goal of the task at hand. The dimensionality of word embeddings in NLP has been studied using the matrix perturbation theory and the PIP loss. In this paper, the sufficiency and reliability of applying word-embedding algorithms to various genomic sequence datasets are investigated to understand the relationship between the k-mer size and their embedding dimension. This is completed by studying the scaling capability of three embedding algorithms, namely Latent Semantic analysis (LSA), Word2Vec, and Global Vectors (GloVe), with respect to the k-mer size. Utilising the PIP loss as a metric to train embeddings on different datasets, we also show that Word2Vec outperforms LSA and GloVe in accurate computing embeddings as both the k-mer size and vocabulary increase. Finally, the shortcomings of natural language processing embedding algorithms in performing genomic tasks are discussed.

Keywords: word embeddings, k-mer embedding, dimensionality reduction

Procedia PDF Downloads 107
3868 Semantic Features of Turkish and Spanish Phraseological Units with a Somatic Component ‘Hand’

Authors: Narmina Mammadova

Abstract:

In modern linguistics, the comparative study of languages is becoming increasingly popular, the typology and comparison of languages that have different structures is expanding and deepening. Of particular interest is the study of phraseological units, which makes it possible to identify the specific features of the compared languages in all their national identity. This paper gives a brief analysis of the comparative study of somatic phraseological units (SFU) of the Spanish and Turkish languages with the component "hand" in the semantic aspect; identification of equivalents, analogs and non-equivalent units, as well as a description of methods of translation of non-equivalent somatic phraseological units. Comparative study of the phraseology of unrelated languages is of particular relevance since it allows us to identify both general, universal features and differential and specific features characteristic of a particular language. Based on the results of the generalization of the study, it can be assumed that phraseological units containing a somatic component have a high interlingual phraseological activity, which contributes to an increase in the degree of interlingual equivalence.

Keywords: Linguoculturology, Turkish, Spanish, language picture of the world, phraseological units, semantic microfield

Procedia PDF Downloads 172
3867 Linguistic Insights Improve Semantic Technology in Medical Research and Patient Self-Management Contexts

Authors: William Michael Short

Abstract:

Semantic Web’ technologies such as the Unified Medical Language System Metathesaurus, SNOMED-CT, and MeSH have been touted as transformational for the way users access online medical and health information, enabling both the automated analysis of natural-language data and the integration of heterogeneous healthrelated resources distributed across the Internet through the use of standardized terminologies that capture concepts and relationships between concepts that are expressed differently across datasets. However, the approaches that have so far characterized ‘semantic bioinformatics’ have not yet fulfilled the promise of the Semantic Web for medical and health information retrieval applications. This paper argues within the perspective of cognitive linguistics and cognitive anthropology that four features of human meaning-making must be taken into account before the potential of semantic technologies can be realized for this domain. First, many semantic technologies operate exclusively at the level of the word. However, texts convey meanings in ways beyond lexical semantics. For example, transitivity patterns (distributions of active or passive voice) and modality patterns (configurations of modal constituents like may, might, could, would, should) convey experiential and epistemic meanings that are not captured by single words. Language users also naturally associate stretches of text with discrete meanings, so that whole sentences can be ascribed senses similar to the senses of words (so-called ‘discourse topics’). Second, natural language processing systems tend to operate according to the principle of ‘one token, one tag’. For instance, occurrences of the word sound must be disambiguated for part of speech: in context, is sound a noun or a verb or an adjective? In syntactic analysis, deterministic annotation methods may be acceptable. But because natural language utterances are typically characterized by polyvalency and ambiguities of all kinds (including intentional ambiguities), such methods leave the meanings of texts highly impoverished. Third, ontologies tend to be disconnected from everyday language use and so struggle in cases where single concepts are captured through complex lexicalizations that involve profile shifts or other embodied representations. More problematically, concept graphs tend to capture ‘expert’ technical models rather than ‘folk’ models of knowledge and so may not match users’ common-sense intuitions about the organization of concepts in prototypical structures rather than Aristotelian categories. Fourth, and finally, most ontologies do not recognize the pervasively figurative character of human language. However, since the time of Galen the widespread use of metaphor in the linguistic usage of both medical professionals and lay persons has been recognized. In particular, metaphor is a well-documented linguistic tool for communicating experiences of pain. Because semantic medical knowledge-bases are designed to help capture variations within technical vocabularies – rather than the kinds of conventionalized figurative semantics that practitioners as well as patients actually utilize in clinical description and diagnosis – they fail to capture this dimension of linguistic usage. The failure of semantic technologies in these respects degrades the efficiency and efficacy not only of medical research, where information retrieval inefficiencies can lead to direct financial costs to organizations, but also of care provision, especially in contexts of patients’ self-management of complex medical conditions.

Keywords: ambiguity, bioinformatics, language, meaning, metaphor, ontology, semantic web, semantics

Procedia PDF Downloads 102
3866 Academic Literacy: Semantic-Discursive Resource and the Relationship with the Constitution of Genre for the Development of Writing

Authors: Lucia Rottava

Abstract:

The present study focuses on academic literacy and addresses the impact of semantic-discursive resources on the constitution of genres that are produced in such context. The research considers the development of writing in the academic context in Portuguese. Researches that address academic literacy and the characteristics of the texts produced in this context are rare, mainly with focus on the development of writing, considering three variables: the constitution of the writer, the perception of the reader/interlocutor and the organization of the informational text flow. The research aims to map the semantic-discursive resources of the written register in texts of several genres and produced by students in the first semester of the undergraduate course in letters. The hypothesis raised is that writing in the academic environment is not a recurrent literacy practice for these learners and can be explained by the ontogenetic and phylogenetic nature of language development. Qualitative in nature, the present research has as empirical data texts produced in a half-yearly course of Reading and Textual Production; these data result from the proposition of four different writing proposals, in a total of 600 texts. The corpus is analyzed based on semantic-discursive resources, seeking to contemplate relevant aspects of language (grammar, discourse and social context) that reveal the choices made in the reader/writer interrelationship and the organizational flow of the text. Among the semantic-discursive resources, the analysis includes three resources, including (a) appraisal and negotiation to understand the attitudes negotiated (roles of the participants of the discourse and their relationship with the other); (b) ideation to explain the construction of the experience (activities performed and participants); and (c) periodicity to outline the flow of information in the organization of the text according to the genre it instantiates. The results indicate the organizational difficulties of the flow of the text information. Cartography contributes to the understanding of the way writers use language in an effort to present themselves, evaluate someone else’s work, and communicate with readers.

Keywords: academic writing, portuguese mother tongue, semantic-discursive resources, sistemic funcional linguistic

Procedia PDF Downloads 103
3865 Factorization of Computations in Bayesian Networks: Interpretation of Factors

Authors: Linda Smail, Zineb Azouz

Abstract:

Given a Bayesian network relative to a set I of discrete random variables, we are interested in computing the probability distribution P(S) where S is a subset of I. The general idea is to write the expression of P(S) in the form of a product of factors where each factor is easy to compute. More importantly, it will be very useful to give an interpretation of each of the factors in terms of conditional probabilities. This paper considers a semantic interpretation of the factors involved in computing marginal probabilities in Bayesian networks. Establishing such a semantic interpretations is indeed interesting and relevant in the case of large Bayesian networks.

Keywords: Bayesian networks, D-Separation, level two Bayesian networks, factorization of computation

Procedia PDF Downloads 495
3864 On Supporting a Meta-Design Approach in Socio-Technical Ontology Engineering

Authors: Mesnan Silalahi, Dana Indra Sensuse, Indra Budi

Abstract:

Many research have revealed the fact of the complexity of ontology building process that there is a need to have a new approach which addresses the socio-technical aspects in the collaboration to reach a consensus. Meta-design approach is considered applicable as a method in the methodological model in a socio-technical ontology engineering. Principles in the meta-design framework is applied in the construction phases on the ontology. A portal is developed to support the meta-design principles requirements. To validate the methodological model semantic web applications were developed and integrated in the portal and also used as a way to show the usefulness of the ontology. The knowledge based system will be filled with data of Indonesian medicinal plants. By showing the usefulness of the developed ontology in a web semantic application, we motivate all stakeholders to participate in the development of knowledge based system of medicinal plants in Indonesia.

Keywords: socio-technical, metadesign, ontology engineering methodology, semantic web application

Procedia PDF Downloads 418
3863 Social Media, Networks and Related Technology: Business and Governance Perspectives

Authors: M. A. T. AlSudairi, T. G. K. Vasista

Abstract:

The concept of social media is becoming the top of the agenda for many business executives and public sector executives today. Decision makers as well as consultants, try to identify ways in which firms and enterprises can make profitable use of social media and network related applications such as Wikipedia, Face book, YouTube, Google+, Twitter. While it is fun and useful to participating in this media and network for achieving the communication effectively and efficiently, semantic and sentiment analysis and interpretation becomes a crucial issue. So, the objective of this paper is to provide literature review on social media, network and related technology related to semantics and sentiment or opinion analysis covering business and governance perspectives. In this regard, a case study on the use and adoption of Social media in Saudi Arabia has been discussed. It is concluded that semantic web technology play a significant role in analyzing the social networks and social media content for extracting the interpretational knowledge towards strategic decision support.

Keywords: CRASP methodology, formative assessment, literature review, semantic web services, social media, social networks

Procedia PDF Downloads 424
3862 The Diminished Online Persona: A Semantic Change of Chinese Classifier Mei on Weibo

Authors: Hui Shi

Abstract:

This study investigates a newly emerged usage of Chinese numeral classifier mei (枚) in the cyberspace. In modern Chinese grammar, mei as a classifier should occupy the pre-nominal position, and its valid accompanying nouns are restricted to small, flat, fragile inanimate objects rather than humans. To examine the semantic change of mei, two types of data from Weibo.com were collected. First, 500 mei-included Weibo posts constructed a corpus for analyzing this classifier's word order distribution (post-nominal or pre-nominal) as well as its accompanying nouns' semantics (inanimate or human). Second, considering that mei accompanies a remarkable number of human nouns in the first corpus, the second corpus is composed of mei-involved Weibo IDs from users located in first and third-tier cities (n=8 respectively). The findings show that in the cyber community, mei frequently classifies human-related neologisms at the archaic post-normal position. Besides, the 23 to 29-year-old females as well as Weibo users from third-tier cities are the major populations who adopt mei in their user IDs for self-description and identity expression. This paper argues that the creative usage of mei gains popularity in the Chinese internet due to a humor effect. The marked word order switch and semantic misapplication combined to trigger incongruity and jocularity. This study has significance for research on Chinese cyber neologism. It may also lay a foundation for further studies on Chinese classifier change and Chinese internet communication.

Keywords: Chinese classifier, humor, neologism, semantic change

Procedia PDF Downloads 230
3861 Collective Intelligence-Based Early Warning Management for Agriculture

Authors: Jarbas Lopes Cardoso Jr., Frederic Andres, Alexandre Guitton, Asanee Kawtrakul, Silvio E. Barbin

Abstract:

The important objective of the CyberBrain Mass Agriculture Alarm Acquisition and Analysis (CBMa4) project is to minimize the impacts of diseases and disasters on rice cultivation. For example, early detection of insects will reduce the volume of insecticides that is applied to the rice fields through the use of CBMa4 platform. In order to reach this goal, two major factors need to be considered: (1) the social network of smart farmers; and (2) the warning data alarm acquisition and analysis component. This paper outlines the process for collecting the warning and improving the decision-making result to the warning. It involves two sub-processes: the warning collection and the understanding enrichment. Human sensors combine basic suitable data processing techniques in order to extract warning related semantic according to collective intelligence. We identify each warning by a semantic content called 'warncons' with multimedia metaphors and metadata related to these metaphors. It is important to describe the metric to measuring the relation among warncons. With this knowledge, a collective intelligence-based decision-making approach determines the action(s) to be launched regarding one or a set of warncons.

Keywords: agricultural engineering, warning systems, social network services, context awareness

Procedia PDF Downloads 350
3860 Topic Modelling Using Latent Dirichlet Allocation and Latent Semantic Indexing on SA Telco Twitter Data

Authors: Phumelele Kubheka, Pius Owolawi, Gbolahan Aiyetoro

Abstract:

Twitter is one of the most popular social media platforms where users can share their opinions on different subjects. As of 2010, The Twitter platform generates more than 12 Terabytes of data daily, ~ 4.3 petabytes in a single year. For this reason, Twitter is a great source for big mining data. Many industries such as Telecommunication companies can leverage the availability of Twitter data to better understand their markets and make an appropriate business decision. This study performs topic modeling on Twitter data using Latent Dirichlet Allocation (LDA). The obtained results are benchmarked with another topic modeling technique, Latent Semantic Indexing (LSI). The study aims to retrieve topics on a Twitter dataset containing user tweets on South African Telcos. Results from this study show that LSI is much faster than LDA. However, LDA yields better results with higher topic coherence by 8% for the best-performing model represented in Table 1. A higher topic coherence score indicates better performance of the model.

Keywords: big data, latent Dirichlet allocation, latent semantic indexing, telco, topic modeling, twitter

Procedia PDF Downloads 124
3859 PaSA: A Dataset for Patent Sentiment Analysis to Highlight Patent Paragraphs

Authors: Renukswamy Chikkamath, Vishvapalsinhji Ramsinh Parmar, Christoph Hewel, Markus Endres

Abstract:

Given a patent document, identifying distinct semantic annotations is an interesting research aspect. Text annotation helps the patent practitioners such as examiners and patent attorneys to quickly identify the key arguments of any invention, successively providing a timely marking of a patent text. In the process of manual patent analysis, to attain better readability, recognising the semantic information by marking paragraphs is in practice. This semantic annotation process is laborious and time-consuming. To alleviate such a problem, we proposed a dataset to train machine learning algorithms to automate the highlighting process. The contributions of this work are: i) we developed a multi-class dataset of size 150k samples by traversing USPTO patents over a decade, ii) articulated statistics and distributions of data using imperative exploratory data analysis, iii) baseline Machine Learning models are developed to utilize the dataset to address patent paragraph highlighting task, and iv) future path to extend this work using Deep Learning and domain-specific pre-trained language models to develop a tool to highlight is provided. This work assists patent practitioners in highlighting semantic information automatically and aids in creating a sustainable and efficient patent analysis using the aptitude of machine learning.

Keywords: machine learning, patents, patent sentiment analysis, patent information retrieval

Procedia PDF Downloads 66
3858 Gender Bias in Natural Language Processing: Machines Reflect Misogyny in Society

Authors: Irene Yi

Abstract:

Machine learning, natural language processing, and neural network models of language are becoming more and more prevalent in the fields of technology and linguistics today. Training data for machines are at best, large corpora of human literature and at worst, a reflection of the ugliness in society. Machines have been trained on millions of human books, only to find that in the course of human history, derogatory and sexist adjectives are used significantly more frequently when describing females in history and literature than when describing males. This is extremely problematic, both as training data, and as the outcome of natural language processing. As machines start to handle more responsibilities, it is crucial to ensure that they do not take with them historical sexist and misogynistic notions. This paper gathers data and algorithms from neural network models of language having to deal with syntax, semantics, sociolinguistics, and text classification. Results are significant in showing the existing intentional and unintentional misogynistic notions used to train machines, as well as in developing better technologies that take into account the semantics and syntax of text to be more mindful and reflect gender equality. Further, this paper deals with the idea of non-binary gender pronouns and how machines can process these pronouns correctly, given its semantic and syntactic context. This paper also delves into the implications of gendered grammar and its effect, cross-linguistically, on natural language processing. Languages such as French or Spanish not only have rigid gendered grammar rules, but also historically patriarchal societies. The progression of society comes hand in hand with not only its language, but how machines process those natural languages. These ideas are all extremely vital to the development of natural language models in technology, and they must be taken into account immediately.

Keywords: gendered grammar, misogynistic language, natural language processing, neural networks

Procedia PDF Downloads 93
3857 Effect of Semantic Relational Cues in Action Memory Performance over School Ages

Authors: Farzaneh Badinlou, Reza Kormi-Nouri, Monika Knopf, Kamal Kharazi

Abstract:

Research into long-term memory has demonstrated that the richness of the knowledge base cues in memory tasks improves retrieval process, which in turn influences learning and memory performance. The present research investigated the idea that adding cues connected to knowledge can affect memory performance in the context of action memory in children. In action memory studies, participants are instructed to learn a series of verb–object phrases as verbal learning and experience-based learning (learning by doing and learning by observation). It is well established that executing action phrases is a more memorable way to learn than verbally repeating the phrases, a finding called enactment effect. In the present study, a total of 410 students from four grade groups—2nd, 4th, 6th, and 8th—participated in this study. During the study, participants listened to verbal action phrases (VTs), performed the phrases (SPTs: subject-performed tasks), and observed the experimenter perform the phrases (EPTs: experimenter-performed tasks). During the test phase, cued recall test was administered. Semantic relational cues (i.e., well-integrated vs. poorly integrated items) were manipulated in the present study. In that, the participants were presented two lists of action phrases with high semantic integration between verb and noun, e.g., “write with the pen” and with low semantic integration between verb and noun, e.g., “pick up the glass”. Results revealed that experience-based learning had a better results than verbal learning for both well-integrated and poorly integrated items, though manipulations of semantic relational cues can moderate the enactment effect. In addition, children of different grade groups outperformed for well- than poorly integrated items, in flavour of older children. The results were discussed in relation to the effect of knowledge-based information in facilitating retrieval process in children.

Keywords: action memory, enactment effect, knowledge-based cues, school-aged children, semantic relational cues

Procedia PDF Downloads 254
3856 Historical Development of Negative Emotive Intensifiers in Hungarian

Authors: Martina Katalin Szabó, Bernadett Lipóczi, Csenge Guba, István Uveges

Abstract:

In this study, an exhaustive analysis was carried out about the historical development of negative emotive intensifiers in the Hungarian language via NLP methods. Intensifiers are linguistic elements which modify or reinforce a variable character in the lexical unit they apply to. Therefore, intensifiers appear with other lexical items, such as adverbs, adjectives, verbs, infrequently with nouns. Due to the complexity of this phenomenon (set of sociolinguistic, semantic, and historical aspects), there are many lexical items which can operate as intensifiers. The group of intensifiers are admittedly one of the most rapidly changing elements in the language. From a linguistic point of view, particularly interesting are a special group of intensifiers, the so-called negative emotive intensifiers, that, on their own, without context, have semantic content that can be associated with negative emotion, but in particular cases, they may function as intensifiers (e.g.borzasztóanjó ’awfully good’, which means ’excellent’). Despite their special semantic features, negative emotive intensifiers are scarcely examined in literature based on large Historical corpora via NLP methods. In order to become better acquainted with trends over time concerning the intensifiers, The exhaustively analysed a specific historical corpus, namely the Magyar TörténetiSzövegtár (Hungarian Historical Corpus). This corpus (containing 3 millions text words) is a collection of texts of various genres and styles, produced between 1772 and 2010. Since the corpus consists of raw texts and does not contain any additional information about the language features of the data (such as stemming or morphological analysis), a large amount of manual work was required to process the data. Thus, based on a lexicon of negative emotive intensifiers compiled in a previous phase of the research, every occurrence of each intensifier was queried, and the results were stored in a separate data frame. Then, basic linguistic processing (POS-tagging, lemmatization etc.) was carried out automatically with the ‘magyarlanc’ NLP-toolkit. Finally, the frequency and collocation features of all the negative emotive words were automatically analyzed in the corpus. Outcomes of the research revealed in detail how these words have proceeded through grammaticalization over time, i.e., they change from lexical elements to grammatical ones, and they slowly go through a delexicalization process (their negative content diminishes over time). What is more, it was also pointed out which negative emotive intensifiers are at the same stage in this process in the same time period. Giving a closer look to the different domains of the analysed corpus, it also became certain that during this process, the pragmatic role’s importance increases: the newer use expresses the speaker's subjective, evaluative opinion at a certain level.

Keywords: historical corpus analysis, historical linguistics, negative emotive intensifiers, semantic changes over time

Procedia PDF Downloads 201