Search results for: semantic data
25344 A Semantic Registry to Support Brazilian Aeronautical Web Services Operations
Authors: Luís Antonio de Almeida Rodriguez, José Maria Parente de Oliveira, Ednelson Oliveira
Abstract:
In the last two decades, the world’s aviation authorities have made several attempts to create consensus about a global and accepted approach for applying semantics to web services registry descriptions. This problem has led communities to face a fat and disorganized infrastructure to describe aeronautical web services. It is usual for developers to implement ad-hoc connections among consumers and providers and manually create non-standardized service compositions, which need some particular approach to compose and semantically discover a desired web service. Current practices are not precise and tend to focus on lightweight specifications of some parts of the OWL-S and embed them into syntactic descriptions (SOAP artifacts and OWL language). It is necessary to have the ability to manage the use of both technologies. This paper presents an implementation of the ontology OWL-S that describes a Brazilian Aeronautical Web Service Registry, which makes it able to publish, advertise, make multi-criteria semantic discovery aligned with the ideas of the System Wide Information Management (SWIM) Program, and invoke web services within the Air Traffic Management context. The proposal’s best finding is a generic approach to describe semantic web services. The paper also presents a set of functional requirements to guide the ontology development and to compare them to the results to validate the implementation of the OWL-S Ontology.Keywords: aeronautical web services, OWL-S, semantic web services discovery, ontologies
Procedia PDF Downloads 8625343 Lexico-semantic and Morphosyntactic Analyses of Student-generated Paraphrased Academic Texts
Authors: Hazel P. Atilano
Abstract:
In this age of AI-assisted teaching and learning, there seems to be a dearth of research literature on the linguistic analysis of English as a Second Language (ESL) student-generated paraphrased academic texts. This study sought to examine the lexico-semantic, morphosyntactic features of paraphrased academic texts generated by ESL students. Employing a descriptive qualitative design, specifically linguistic analysis, the study involved a total of 85 students from senior high school, college, and graduate school enrolled in research courses. Data collection consisted of a 60-minute real-time, on-site paraphrasing practice exercise using excerpts from discipline-specific literature reviews of 150 to 200 words. A focus group discussion (FGD) was conducted to probe into the challenges experienced by the participants. The writing exercise yielded a total of 516 paraphrase pairs. A total of 176 paraphrase units (PUs) and 340 non-paraphrase pairs (NPPs) were detected. Findings from the linguistic analysis of PUs reveal that the modifications made to the original texts are predominantly syntax-based (Diathesis Alterations and Coordination Changes) and a combination of Miscellaneous Changes (Change of Order, Change of Format, and Addition/Deletion). Results of the analysis of paraphrase extremes (PE) show that Identical Structures resulting from the use of synonymous substitutions, with no significant change in the structural features of the original, is the most frequently occurring instance of PE. The analysis of paraphrase errors reveals that synonymous substitutions resulting in identical structures are the most frequently occurring error that leads to PE. Another type of paraphrasing error involves semantic and content loss resulting from the deletion or addition of meaning-altering content. Three major themes emerged from the FGD: (1) The Challenge of Preserving Semantic Content and Fidelity; (2) The Best Words in the Best Order: Grappling with the Lexico-semantic and Morphosyntactic Demands of Paraphrasing; and (3) Contending with Limited Vocabulary, Poor Comprehension, and Lack of Practice. A pedagogical paradigm was designed based on the major findings of the study for a sustainable instructional intervention.Keywords: academic text, lexico-semantic analysis, linguistic analysis, morphosyntactic analysis, paraphrasing
Procedia PDF Downloads 6725342 Study of Syntactic Errors for Deep Parsing at Machine Translation
Authors: Yukiko Sasaki Alam, Shahid Alam
Abstract:
Syntactic parsing is vital for semantic treatment by many applications related to natural language processing (NLP), because form and content coincide in many cases. However, it has not yet reached the levels of reliable performance. By manually examining and analyzing individual machine translation output errors that involve syntax as well as semantics, this study attempts to discover what is required for improving syntactic and semantic parsing.Keywords: syntactic parsing, error analysis, machine translation, deep parsing
Procedia PDF Downloads 56025341 Unraveling the Phonosignological Foundations of Human Language and Semantic Analysis of Linguistic Elements in Cross-Cultural Contexts
Authors: Mahmudjon Kuchkarov, Marufjon Kuchkarov, Mukhayyo Sobirjanova
Abstract:
The origins of human language remain a profound scientific mystery, characterized by speculative theories often lacking empirical support. This study presents findings that may illuminate the genesis of human language, emphasizing its roots in natural, systematic, and repetitive sound patterns. Also, this paper presents the phonosignological and semantic analysis of linguistic elements across various languages and cultures. By utilizing the principles of the "Human Language" theory, we analyze the symbolic, phonetic, and semantic characteristics of elements such as "A", "L", "I", "F", and "四" (pronounced /si/ in Chinese and /shi/ in Japanese). Our findings reveal that natural sounds and their symbolic representations form the foundation of language, with significant implications for understanding religious and secular myths. This paper explores the intricate relationships between these elements and their cultural connotations, particularly focusing on the concept of "descent" in the context of the phonetic sequence "A, L, I, F," and the symbolic associations of the number four with death.Keywords: empirical research, human language, phonosignology, semantics, sound patterns, symbolism, body shape, body language, coding, Latin alphabet, merging method, natural sound, origin of language, pairing, phonetics, sound and shape production, word origin, word semantic
Procedia PDF Downloads 3725340 Lexical-Semantic Processing by Chinese as a Second Language Learners
Authors: Yi-Hsiu Lai
Abstract:
The present study aimed to elucidate the lexical-semantic processing for Chinese as second language (CSL) learners. Twenty L1 speakers of Chinese and twenty CSL learners in Taiwan participated in a picture naming task and a category fluency task. Based on their Chinese proficiency levels, these CSL learners were further divided into two sub-groups: ten CSL learners of elementary Chinese proficiency level and ten CSL learners of intermediate Chinese proficiency level. Instruments for the naming task were sixty black-and-white pictures: thirty-five object pictures and twenty-five action pictures. Object pictures were divided into two categories: living objects and non-living objects. Action pictures were composed of two categories: action verbs and process verbs. As in the naming task, the category fluency task consisted of two semantic categories – objects (i.e., living and non-living objects) and actions (i.e., action and process verbs). Participants were asked to report as many items within a category as possible in one minute. Oral productions were tape-recorded and transcribed for further analysis. Both error types and error frequency were calculated. Statistical analysis was further conducted to examine these error types and frequency made by CSL learners. Additionally, category effects, pictorial effects and L2 proficiency were discussed. Findings in the present study helped characterize the lexical-semantic process of Chinese naming in CSL learners of different Chinese proficiency levels and made contributions to Chinese vocabulary teaching and learning in the future.Keywords: lexical-semantic processing, Mandarin Chinese, naming, category effects
Procedia PDF Downloads 46225339 Classification of Contexts for Mentioning Love in Interviews with Victims of the Holocaust
Authors: Marina Yurievna Aleksandrova
Abstract:
Research of the Holocaust retains value not only for history but also for sociology and psychology. One of the most important fields of study is how people were coping during and after this traumatic event. The aim of this paper is to identify the main contexts of the topic of love and to determine which contexts are more characteristic for different groups of victims of the Holocaust (gender, nationality, age). In this research, transcripts of interviews with Holocaust victims that were collected during 1946 for the "Voices of the Holocaust" project were used as data. Main contexts were analyzed with methods of network analysis and latent semantic analysis and classified by gender, age, and nationality with random forest. The results show that love is articulated and described significantly differently for male and female informants, nationality is shown results with lower values of quality metrics, as well as the age.Keywords: Holocaust, latent semantic analysis, network analysis, text-mining, random forest
Procedia PDF Downloads 18025338 Lexical Semantic Analysis to Support Ontology Modeling of Maintenance Activities– Case Study of Offshore Riser Integrity
Authors: Vahid Ebrahimipour
Abstract:
Word representation and context meaning of text-based documents play an essential role in knowledge modeling. Business procedures written in natural language are meant to store technical and engineering information, management decision and operation experience during the production system life cycle. Context meaning representation is highly dependent upon word sense, lexical relativity, and sematic features of the argument. This paper proposes a method for lexical semantic analysis and context meaning representation of maintenance activity in a mass production system. Our approach constructs a straightforward lexical semantic approach to analyze facilitates semantic and syntactic features of context structure of maintenance report to facilitate translation, interpretation, and conversion of human-readable interpretation into computer-readable representation and understandable with less heterogeneity and ambiguity. The methodology will enable users to obtain a representation format that maximizes shareability and accessibility for multi-purpose usage. It provides a contextualized structure to obtain a generic context model that can be utilized during the system life cycle. At first, it employs a co-occurrence-based clustering framework to recognize a group of highly frequent contextual features that correspond to a maintenance report text. Then the keywords are identified for syntactic and semantic extraction analysis. The analysis exercises causality-driven logic of keywords’ senses to divulge the structural and meaning dependency relationships between the words in a context. The output is a word contextualized representation of maintenance activity accommodating computer-based representation and inference using OWL/RDF.Keywords: lexical semantic analysis, metadata modeling, contextual meaning extraction, ontology modeling, knowledge representation
Procedia PDF Downloads 10525337 The Conceptual Relationships in N+N Compounds in Arabic Compared to English
Authors: Abdel Rahman Altakhaineh
Abstract:
This paper has analysed the conceptual relations between the elements of NN compounds in Arabic and compared them to those found in English based on the framework of Conceptual Semantics and a modified version of Parallel Architecture referred to as Relational Morphology. The analysis revealed that the repertoire of possible semantic relations between the two nouns in Arabic NN compounds reproduces that in English NN compounds and that, therefore, the main difference is in headedness (right-headed in English, left-headed in Arabic). Adopting RM allows productive and idiosyncratic elements to interweave with each other naturally. Semantically transparent compounds can be stored in memory or produced and understood online, while compounds with different degrees of semantic idiosyncrasy are stored in memory. Furthermore, the predictable parts of idiosyncratic compounds are captured by general schemas. In compounds, such schemas pick out the range of possible semantic relations between the two nouns. Finally, conducting a cross-linguistic study of the systematic patterns of possible conceptual relationships between compound elements is an area worthy of further exploration. In addition, comparing and contrasting compounding in Arabic and Hebrew, especially as they are both Semitic languages, is another area that needs to be investigated thoroughly. It will help morphologists understand the extent to which Jackendoff’s repertoire of semantic relations in compounds is universal. That is, if a language as distant from English as Arabic displays a similar range of cases, this is evidence for a (relatively) universal set of relations from which individual languages may pick and choose.Keywords: conceptual semantics, morphology, compounds, arabic, english
Procedia PDF Downloads 10025336 An Intensional Conceptualization Model for Ontology-Based Semantic Integration
Authors: Fateh Adhnouss, Husam El-Asfour, Kenneth McIsaac, AbdulMutalib Wahaishi, Idris El-Feghia
Abstract:
Conceptualization is an essential component of semantic ontology-based approaches. There have been several approaches that rely on extensional structure and extensional reduction structure in order to construct conceptualization. In this paper, several limitations are highlighted relating to their applicability to the construction of conceptualizations in dynamic and open environments. These limitations arise from a number of strong assumptions that do not apply to such environments. An intensional structure is strongly argued to be a natural and adequate modeling approach. This paper presents a conceptualization structure based on property relations and propositions theory (PRP) to the model ontology that is suitable for open environments. The model extends the First-Order Logic (FOL) notation and defines the formal representation that enables interoperability between software systems and supports semantic integration for software systems in open, dynamic environments.Keywords: conceptualization, ontology, extensional structure, intensional structure
Procedia PDF Downloads 11525335 Smart Web Services in the Web of Things
Authors: Sekkal Nawel
Abstract:
The Web of Things (WoT), integration of smart technologies from the Internet or network to Web architecture or application, is becoming more complex, larger, and dynamic. The WoT is associated with various elements such as sensors, devices, networks, protocols, data, functionalities, and architectures to perform services for stakeholders. These services operate in the context of the interaction of stakeholders and the WoT elements. Such context is becoming a key information source from which data are of various nature and uncertain, thus leading to complex situations. In this paper, we take interest in the development of intelligent Web services. The key ingredients of this “intelligent” notion are the context diversity, the necessity of a semantic representation to manage complex situations and the capacity to reason with uncertain data. In this perspective, we introduce a multi-layered architecture based on a generic intelligent Web service model dealing with various contexts, which proactively predict future situations and reactively respond to real-time situations in order to support decision-making. For semantic context data representation, we use PR-OWL, which is a probabilistic ontology based on Multi-Entity Bayesian Networks (MEBN). PR-OWL is flexible enough to represent complex, dynamic, and uncertain contexts, the key requirements of the development for the intelligent Web services. A case study was carried out using the proposed architecture for intelligent plant watering to show the role of proactive and reactive contextual reasoning in terms of WoT.Keywords: smart web service, the web of things, context reasoning, proactive, reactive, multi-entity bayesian networks, PR-OWL
Procedia PDF Downloads 7125334 'Caucasian Mountaineer / Scottish Highlander': Correlation between Semantics and Culture
Authors: Natalia M. Nepomniashchikh
Abstract:
The research focuses on Russian and English linguoculturemes Caucasian mountaineer and Scottish Highlander, the effort of comparative-contrastive analysis was made. In order to reach the aim, the analysis of the vocabulary definitions of the concepts under consideration was taken, which made it possible to build the lexical-semantic fields of both lexical items in Russian and English. This stage of research helped to turn to the linguistic-cultural fields construction. To build these fields, literary pieces containing the concepts under consideration and the items directly related to them were taken from the works about the Caucasus mountains and mountaineers living there by M. Yu. Lermontov and the ones by W. Scott devoted to the Scottish Highlands and their inhabitants. All collected data was systematized in schemes and tables reflecting the differences and intercrossing areas.Keywords: lexemes, lexical items, lexical-semantic field, linguistic-cultural field, linguoculturemes
Procedia PDF Downloads 23125333 A Proposal for a Secure and Interoperable Data Framework for Energy Digitalization
Authors: Hebberly Ahatlan
Abstract:
The process of digitizing energy systems involves transforming traditional energy infrastructure into interconnected, data-driven systems that enhance efficiency, sustainability, and responsiveness. As smart grids become increasingly integral to the efficient distribution and management of electricity from both fossil and renewable energy sources, the energy industry faces strategic challenges associated with digitalization and interoperability — particularly in the context of modern energy business models, such as virtual power plants (VPPs). The critical challenge in modern smart grids is to seamlessly integrate diverse technologies and systems, including virtualization, grid computing and service-oriented architecture (SOA), across the entire energy ecosystem. Achieving this requires addressing issues like semantic interoperability, IT/OT convergence, and digital asset scalability, all while ensuring security and risk management. This paper proposes a four-layer digitalization framework to tackle these challenges, encompassing persistent data protection, trusted key management, secure messaging, and authentication of IoT resources. Data assets generated through this framework enable AI systems to derive insights for improving smart grid operations, security, and revenue generation. Furthermore, this paper also proposes a Trusted Energy Interoperability Alliance as a universal guiding standard in the development of this digitalization framework to support more dynamic and interoperable energy markets.Keywords: digitalization, IT/OT convergence, semantic interoperability, VPP, energy blockchain
Procedia PDF Downloads 18325332 Semantic Features of Turkish and Spanish Phraseological Units with a Somatic Component ‘Hand’
Authors: Narmina Mammadova
Abstract:
In modern linguistics, the comparative study of languages is becoming increasingly popular, the typology and comparison of languages that have different structures is expanding and deepening. Of particular interest is the study of phraseological units, which makes it possible to identify the specific features of the compared languages in all their national identity. This paper gives a brief analysis of the comparative study of somatic phraseological units (SFU) of the Spanish and Turkish languages with the component "hand" in the semantic aspect; identification of equivalents, analogs and non-equivalent units, as well as a description of methods of translation of non-equivalent somatic phraseological units. Comparative study of the phraseology of unrelated languages is of particular relevance since it allows us to identify both general, universal features and differential and specific features characteristic of a particular language. Based on the results of the generalization of the study, it can be assumed that phraseological units containing a somatic component have a high interlingual phraseological activity, which contributes to an increase in the degree of interlingual equivalence.Keywords: Linguoculturology, Turkish, Spanish, language picture of the world, phraseological units, semantic microfield
Procedia PDF Downloads 19625331 Factorization of Computations in Bayesian Networks: Interpretation of Factors
Authors: Linda Smail, Zineb Azouz
Abstract:
Given a Bayesian network relative to a set I of discrete random variables, we are interested in computing the probability distribution P(S) where S is a subset of I. The general idea is to write the expression of P(S) in the form of a product of factors where each factor is easy to compute. More importantly, it will be very useful to give an interpretation of each of the factors in terms of conditional probabilities. This paper considers a semantic interpretation of the factors involved in computing marginal probabilities in Bayesian networks. Establishing such a semantic interpretations is indeed interesting and relevant in the case of large Bayesian networks.Keywords: Bayesian networks, D-Separation, level two Bayesian networks, factorization of computation
Procedia PDF Downloads 52925330 Social Media, Networks and Related Technology: Business and Governance Perspectives
Authors: M. A. T. AlSudairi, T. G. K. Vasista
Abstract:
The concept of social media is becoming the top of the agenda for many business executives and public sector executives today. Decision makers as well as consultants, try to identify ways in which firms and enterprises can make profitable use of social media and network related applications such as Wikipedia, Face book, YouTube, Google+, Twitter. While it is fun and useful to participating in this media and network for achieving the communication effectively and efficiently, semantic and sentiment analysis and interpretation becomes a crucial issue. So, the objective of this paper is to provide literature review on social media, network and related technology related to semantics and sentiment or opinion analysis covering business and governance perspectives. In this regard, a case study on the use and adoption of Social media in Saudi Arabia has been discussed. It is concluded that semantic web technology play a significant role in analyzing the social networks and social media content for extracting the interpretational knowledge towards strategic decision support.Keywords: CRASP methodology, formative assessment, literature review, semantic web services, social media, social networks
Procedia PDF Downloads 45125329 A Supervised Face Parts Labeling Framework
Authors: Khalil Khan, Ikram Syed, Muhammad Ehsan Mazhar, Iran Uddin, Nasir Ahmad
Abstract:
Face parts labeling is the process of assigning class labels to each face part. A face parts labeling method (FPL) which divides a given image into its constitutes parts is proposed in this paper. A database FaceD consisting of 564 images is labeled with hand and make publically available. A supervised learning model is built through extraction of features from the training data. The testing phase is performed with two semantic segmentation methods, i.e., pixel and super-pixel based segmentation. In pixel-based segmentation class label is provided to each pixel individually. In super-pixel based method class label is assigned to super-pixel only – as a result, the same class label is given to all pixels inside a super-pixel. Pixel labeling accuracy reported with pixel and super-pixel based methods is 97.68 % and 93.45% respectively.Keywords: face labeling, semantic segmentation, classification, face segmentation
Procedia PDF Downloads 25525328 Semantic Textual Similarity on Contracts: Exploring Multiple Negative Ranking Losses for Sentence Transformers
Authors: Yogendra Sisodia
Abstract:
Researchers are becoming more interested in extracting useful information from legal documents thanks to the development of large-scale language models in natural language processing (NLP), and deep learning has accelerated the creation of powerful text mining models. Legal fields like contracts benefit greatly from semantic text search since it makes it quick and easy to find related clauses. After collecting sentence embeddings, it is relatively simple to locate sentences with a comparable meaning throughout the entire legal corpus. The author of this research investigated two pre-trained language models for this task: MiniLM and Roberta, and further fine-tuned them on Legal Contracts. The author used Multiple Negative Ranking Loss for the creation of sentence transformers. The fine-tuned language models and sentence transformers showed promising results.Keywords: legal contracts, multiple negative ranking loss, natural language inference, sentence transformers, semantic textual similarity
Procedia PDF Downloads 10725327 Text Similarity in Vector Space Models: A Comparative Study
Authors: Omid Shahmirzadi, Adam Lugowski, Kenneth Younge
Abstract:
Automatic measurement of semantic text similarity is an important task in natural language processing. In this paper, we evaluate the performance of different vector space models to perform this task. We address the real-world problem of modeling patent-to-patent similarity and compare TFIDF (and related extensions), topic models (e.g., latent semantic indexing), and neural models (e.g., paragraph vectors). Contrary to expectations, the added computational cost of text embedding methods is justified only when: 1) the target text is condensed; and 2) the similarity comparison is trivial. Otherwise, TFIDF performs surprisingly well in other cases: in particular for longer and more technical texts or for making finer-grained distinctions between nearest neighbors. Unexpectedly, extensions to the TFIDF method, such as adding noun phrases or calculating term weights incrementally, were not helpful in our context.Keywords: big data, patent, text embedding, text similarity, vector space model
Procedia PDF Downloads 17525326 On the Framework of Contemporary Intelligent Mathematics Underpinning Intelligent Science, Autonomous AI, and Cognitive Computers
Authors: Yingxu Wang, Jianhua Lu, Jun Peng, Jiawei Zhang
Abstract:
The fundamental demand in contemporary intelligent science towards Autonomous AI (AI*) is the creation of unprecedented formal means of Intelligent Mathematics (IM). It is discovered that natural intelligence is inductively created rather than exhaustively trained. Therefore, IM is a family of algebraic and denotational mathematics encompassing Inference Algebra, Real-Time Process Algebra, Concept Algebra, Semantic Algebra, Visual Frame Algebra, etc., developed in our labs. IM plays indispensable roles in training-free AI* theories and systems beyond traditional empirical data-driven technologies. A set of applications of IM-driven AI* systems will be demonstrated in contemporary intelligence science, AI*, and cognitive computers.Keywords: intelligence mathematics, foundations of intelligent science, autonomous AI, cognitive computers, inference algebra, real-time process algebra, concept algebra, semantic algebra, applications
Procedia PDF Downloads 6125325 Deep Vision: A Robust Dominant Colour Extraction Framework for T-Shirts Based on Semantic Segmentation
Authors: Kishore Kumar R., Kaustav Sengupta, Shalini Sood Sehgal, Poornima Santhanam
Abstract:
Fashion is a human expression that is constantly changing. One of the prime factors that consistently influences fashion is the change in colour preferences. The role of colour in our everyday lives is very significant. It subconsciously explains a lot about one’s mindset and mood. Analyzing the colours by extracting them from the outfit images is a critical study to examine the individual’s/consumer behaviour. Several research works have been carried out on extracting colours from images, but to the best of our knowledge, there were no studies that extract colours to specific apparel and identify colour patterns geographically. This paper proposes a framework for accurately extracting colours from T-shirt images and predicting dominant colours geographically. The proposed method consists of two stages: first, a U-Net deep learning model is adopted to segment the T-shirts from the images. Second, the colours are extracted only from the T-shirt segments. The proposed method employs the iMaterialist (Fashion) 2019 dataset for the semantic segmentation task. The proposed framework also includes a mechanism for gathering data and analyzing India’s general colour preferences. From this research, it was observed that black and grey are the dominant colour in different regions of India. The proposed method can be adapted to study fashion’s evolving colour preferences.Keywords: colour analysis in t-shirts, convolutional neural network, encoder-decoder, k-means clustering, semantic segmentation, U-Net model
Procedia PDF Downloads 11125324 Automatic Multi-Label Image Annotation System Guided by Firefly Algorithm and Bayesian Method
Authors: Saad M. Darwish, Mohamed A. El-Iskandarani, Guitar M. Shawkat
Abstract:
Nowadays, the amount of available multimedia data is continuously on the rise. The need to find a required image for an ordinary user is a challenging task. Content based image retrieval (CBIR) computes relevance based on the visual similarity of low-level image features such as color, textures, etc. However, there is a gap between low-level visual features and semantic meanings required by applications. The typical method of bridging the semantic gap is through the automatic image annotation (AIA) that extracts semantic features using machine learning techniques. In this paper, a multi-label image annotation system guided by Firefly and Bayesian method is proposed. Firstly, images are segmented using the maximum variance intra cluster and Firefly algorithm, which is a swarm-based approach with high convergence speed, less computation rate and search for the optimal multiple threshold. Feature extraction techniques based on color features and region properties are applied to obtain the representative features. After that, the images are annotated using translation model based on the Net Bayes system, which is efficient for multi-label learning with high precision and less complexity. Experiments are performed using Corel Database. The results show that the proposed system is better than traditional ones for automatic image annotation and retrieval.Keywords: feature extraction, feature selection, image annotation, classification
Procedia PDF Downloads 58625323 A Concept of Data Mining with XML Document
Authors: Akshay Agrawal, Anand K. Srivastava
Abstract:
The increasing amount of XML datasets available to casual users increases the necessity of investigating techniques to extract knowledge from these data. Data mining is widely applied in the database research area in order to extract frequent correlations of values from both structured and semi-structured datasets. The increasing availability of heterogeneous XML sources has raised a number of issues concerning how to represent and manage these semi structured data. In recent years due to the importance of managing these resources and extracting knowledge from them, lots of methods have been proposed in order to represent and cluster them in different ways.Keywords: XML, similarity measure, clustering, cluster quality, semantic clustering
Procedia PDF Downloads 37925322 Historical Development of Negative Emotive Intensifiers in Hungarian
Authors: Martina Katalin Szabó, Bernadett Lipóczi, Csenge Guba, István Uveges
Abstract:
In this study, an exhaustive analysis was carried out about the historical development of negative emotive intensifiers in the Hungarian language via NLP methods. Intensifiers are linguistic elements which modify or reinforce a variable character in the lexical unit they apply to. Therefore, intensifiers appear with other lexical items, such as adverbs, adjectives, verbs, infrequently with nouns. Due to the complexity of this phenomenon (set of sociolinguistic, semantic, and historical aspects), there are many lexical items which can operate as intensifiers. The group of intensifiers are admittedly one of the most rapidly changing elements in the language. From a linguistic point of view, particularly interesting are a special group of intensifiers, the so-called negative emotive intensifiers, that, on their own, without context, have semantic content that can be associated with negative emotion, but in particular cases, they may function as intensifiers (e.g.borzasztóanjó ’awfully good’, which means ’excellent’). Despite their special semantic features, negative emotive intensifiers are scarcely examined in literature based on large Historical corpora via NLP methods. In order to become better acquainted with trends over time concerning the intensifiers, The exhaustively analysed a specific historical corpus, namely the Magyar TörténetiSzövegtár (Hungarian Historical Corpus). This corpus (containing 3 millions text words) is a collection of texts of various genres and styles, produced between 1772 and 2010. Since the corpus consists of raw texts and does not contain any additional information about the language features of the data (such as stemming or morphological analysis), a large amount of manual work was required to process the data. Thus, based on a lexicon of negative emotive intensifiers compiled in a previous phase of the research, every occurrence of each intensifier was queried, and the results were stored in a separate data frame. Then, basic linguistic processing (POS-tagging, lemmatization etc.) was carried out automatically with the ‘magyarlanc’ NLP-toolkit. Finally, the frequency and collocation features of all the negative emotive words were automatically analyzed in the corpus. Outcomes of the research revealed in detail how these words have proceeded through grammaticalization over time, i.e., they change from lexical elements to grammatical ones, and they slowly go through a delexicalization process (their negative content diminishes over time). What is more, it was also pointed out which negative emotive intensifiers are at the same stage in this process in the same time period. Giving a closer look to the different domains of the analysed corpus, it also became certain that during this process, the pragmatic role’s importance increases: the newer use expresses the speaker's subjective, evaluative opinion at a certain level.Keywords: historical corpus analysis, historical linguistics, negative emotive intensifiers, semantic changes over time
Procedia PDF Downloads 23325321 Mondoc: Informal Lightweight Ontology for Faceted Semantic Classification of Hypernymy
Authors: M. Regina Carreira-Lopez
Abstract:
Lightweight ontologies seek to concrete union relationships between a parent node, and a secondary node, also called "child node". This logic relation (L) can be formally defined as a triple ontological relation (LO) equivalent to LO in ⟨LN, LE, LC⟩, and where LN represents a finite set of nodes (N); LE is a set of entities (E), each of which represents a relationship between nodes to form a rooted tree of ⟨LN, LE⟩; and LC is a finite set of concepts (C), encoded in a formal language (FL). Mondoc enables more refined searches on semantic and classified facets for retrieving specialized knowledge about Atlantic migrations, from the Declaration of Independence of the United States of America (1776) and to the end of the Spanish Civil War (1939). The model looks forward to increasing documentary relevance by applying an inverse frequency of co-ocurrent hypernymy phenomena for a concrete dataset of textual corpora, with RMySQL package. Mondoc profiles archival utilities implementing SQL programming code, and allows data export to XML schemas, for achieving semantic and faceted analysis of speech by analyzing keywords in context (KWIC). The methodology applies random and unrestricted sampling techniques with RMySQL to verify the resonance phenomena of inverse documentary relevance between the number of co-occurrences of the same term (t) in more than two documents of a set of texts (D). Secondly, the research also evidences co-associations between (t) and their corresponding synonyms and antonyms (synsets) are also inverse. The results from grouping facets or polysemic words with synsets in more than two textual corpora within their syntagmatic context (nouns, verbs, adjectives, etc.) state how to proceed with semantic indexing of hypernymy phenomena for subject-heading lists and for authority lists for documentary and archival purposes. Mondoc contributes to the development of web directories and seems to achieve a proper and more selective search of e-documents (classification ontology). It can also foster on-line catalogs production for semantic authorities, or concepts, through XML schemas, because its applications could be used for implementing data models, by a prior adaptation of the based-ontology to structured meta-languages, such as OWL, RDF (descriptive ontology). Mondoc serves to the classification of concepts and applies a semantic indexing approach of facets. It enables information retrieval, as well as quantitative and qualitative data interpretation. The model reproduces a triple tuple ⟨LN, LE, LT, LCF L, BKF⟩ where LN is a set of entities that connect with other nodes to concrete a rooted tree in ⟨LN, LE⟩. LT specifies a set of terms, and LCF acts as a finite set of concepts, encoded in a formal language, L. Mondoc only resolves partial problems of linguistic ambiguity (in case of synonymy and antonymy), but neither the pragmatic dimension of natural language nor the cognitive perspective is addressed. To achieve this goal, forthcoming programming developments should target at oriented meta-languages with structured documents in XML.Keywords: hypernymy, information retrieval, lightweight ontology, resonance
Procedia PDF Downloads 12525320 Treating Voxels as Words: Word-to-Vector Methods for fMRI Meta-Analyses
Authors: Matthew Baucum
Abstract:
With the increasing popularity of fMRI as an experimental method, psychology and neuroscience can greatly benefit from advanced techniques for summarizing and synthesizing large amounts of data from brain imaging studies. One promising avenue is automated meta-analyses, in which natural language processing methods are used to identify the brain regions consistently associated with certain semantic concepts (e.g. “social”, “reward’) across large corpora of studies. This study builds on this approach by demonstrating how, in fMRI meta-analyses, individual voxels can be treated as vectors in a semantic space and evaluated for their “proximity” to terms of interest. In this technique, a low-dimensional semantic space is built from brain imaging study texts, allowing words in each text to be represented as vectors (where words that frequently appear together are near each other in the semantic space). Consequently, each voxel in a brain mask can be represented as a normalized vector sum of all of the words in the studies that showed activation in that voxel. The entire brain mask can then be visualized in terms of each voxel’s proximity to a given term of interest (e.g., “vision”, “decision making”) or collection of terms (e.g., “theory of mind”, “social”, “agent”), as measured by the cosine similarity between the voxel’s vector and the term vector (or the average of multiple term vectors). Analysis can also proceed in the opposite direction, allowing word cloud visualizations of the nearest semantic neighbors for a given brain region. This approach allows for continuous, fine-grained metrics of voxel-term associations, and relies on state-of-the-art “open vocabulary” methods that go beyond mere word-counts. An analysis of over 11,000 neuroimaging studies from an existing meta-analytic fMRI database demonstrates that this technique can be used to recover known neural bases for multiple psychological functions, suggesting this method’s utility for efficient, high-level meta-analyses of localized brain function. While automated text analytic methods are no replacement for deliberate, manual meta-analyses, they seem to show promise for the efficient aggregation of large bodies of scientific knowledge, at least on a relatively general level.Keywords: FMRI, machine learning, meta-analysis, text analysis
Procedia PDF Downloads 44825319 Arabic Quran Search Tool Based on Ontology
Authors: Mohammad Alqahtani, Eric Atwell
Abstract:
This paper reviews and classifies most of the important types of search techniques that have been applied on the holy Quran. Then, it addresses the limitations in these techniques. Additionally, this paper surveys most existing Quranic ontologies and what are their deficiencies. Finally, it explains a new search tool called: A semantic search tool for Al Quran based on Qur’anic ontologies. This tool will overcome all limitations in the existing Quranic search applications.Keywords: holy Quran, natural language processing (NLP), semantic search, information retrieval (IR), ontology
Procedia PDF Downloads 57225318 Effect of Semantic Relational Cues in Action Memory Performance over School Ages
Authors: Farzaneh Badinlou, Reza Kormi-Nouri, Monika Knopf, Kamal Kharazi
Abstract:
Research into long-term memory has demonstrated that the richness of the knowledge base cues in memory tasks improves retrieval process, which in turn influences learning and memory performance. The present research investigated the idea that adding cues connected to knowledge can affect memory performance in the context of action memory in children. In action memory studies, participants are instructed to learn a series of verb–object phrases as verbal learning and experience-based learning (learning by doing and learning by observation). It is well established that executing action phrases is a more memorable way to learn than verbally repeating the phrases, a finding called enactment effect. In the present study, a total of 410 students from four grade groups—2nd, 4th, 6th, and 8th—participated in this study. During the study, participants listened to verbal action phrases (VTs), performed the phrases (SPTs: subject-performed tasks), and observed the experimenter perform the phrases (EPTs: experimenter-performed tasks). During the test phase, cued recall test was administered. Semantic relational cues (i.e., well-integrated vs. poorly integrated items) were manipulated in the present study. In that, the participants were presented two lists of action phrases with high semantic integration between verb and noun, e.g., “write with the pen” and with low semantic integration between verb and noun, e.g., “pick up the glass”. Results revealed that experience-based learning had a better results than verbal learning for both well-integrated and poorly integrated items, though manipulations of semantic relational cues can moderate the enactment effect. In addition, children of different grade groups outperformed for well- than poorly integrated items, in flavour of older children. The results were discussed in relation to the effect of knowledge-based information in facilitating retrieval process in children.Keywords: action memory, enactment effect, knowledge-based cues, school-aged children, semantic relational cues
Procedia PDF Downloads 27525317 Automated Adaptions of Semantic User- and Service Profile Representations by Learning the User Context
Authors: Nicole Merkle, Stefan Zander
Abstract:
Ambient Assisted Living (AAL) describes a technological and methodological stack of (e.g. formal model-theoretic semantics, rule-based reasoning and machine learning), different aspects regarding the behavior, activities and characteristics of humans. Hence, a semantic representation of the user environment and its relevant elements are required in order to allow assistive agents to recognize situations and deduce appropriate actions. Furthermore, the user and his/her characteristics (e.g. physical, cognitive, preferences) need to be represented with a high degree of expressiveness in order to allow software agents a precise evaluation of the users’ context models. The correct interpretation of these context models highly depends on temporal, spatial circumstances as well as individual user preferences. In most AAL approaches, model representations of real world situations represent the current state of a universe of discourse at a given point in time by neglecting transitions between a set of states. However, the AAL domain currently lacks sufficient approaches that contemplate on the dynamic adaptions of context-related representations. Semantic representations of relevant real-world excerpts (e.g. user activities) help cognitive, rule-based agents to reason and make decisions in order to help users in appropriate tasks and situations. Furthermore, rules and reasoning on semantic models are not sufficient for handling uncertainty and fuzzy situations. A certain situation can require different (re-)actions in order to achieve the best results with respect to the user and his/her needs. But what is the best result? To answer this question, we need to consider that every smart agent requires to achieve an objective, but this objective is mostly defined by domain experts who can also fail in their estimation of what is desired by the user and what not. Hence, a smart agent has to be able to learn from context history data and estimate or predict what is most likely in certain contexts. Furthermore, different agents with contrary objectives can cause collisions as their actions influence the user’s context and constituting conditions in unintended or uncontrolled ways. We present an approach for dynamically updating a semantic model with respect to the current user context that allows flexibility of the software agents and enhances their conformance in order to improve the user experience. The presented approach adapts rules by learning sensor evidence and user actions using probabilistic reasoning approaches, based on given expert knowledge. The semantic domain model consists basically of device-, service- and user profile representations. In this paper, we present how this semantic domain model can be used in order to compute the probability of matching rules and actions. We apply this probability estimation to compare the current domain model representation with the computed one in order to adapt the formal semantic representation. Our approach aims at minimizing the likelihood of unintended interferences in order to eliminate conflicts and unpredictable side-effects by updating pre-defined expert knowledge according to the most probable context representation. This enables agents to adapt to dynamic changes in the environment which enhances the provision of adequate assistance and affects positively the user satisfaction.Keywords: ambient intelligence, machine learning, semantic web, software agents
Procedia PDF Downloads 28125316 Syntactic, Semantic, and Pragmatic Rationalization of Modal Auxiliary Verbs in Akan
Authors: Joana Portia Sakyi
Abstract:
The uniqueness of auxiliary verbs and their contribution to grammar as constituents, which act as preverbs to supply additional grammatical or functional meanings to clauses, are well established. Functionally, they relate clauses to tense, aspect, mood, voice, emphasis, and modality, along with the main verbs conveying the appropriate lexical content. There has been an issue in Akan grammar vis-à-vis the status of auxiliary verbs, in terms of whether Akan has auxiliaries or not and even which forms are to be regarded as auxiliaries. We investigate the syntactic, semantic, and pragmatic components of expressions and claim that Akan has auxiliary verbs that contribute the functional or grammatical meaning of modality, tense/aspect, etc., to clauses they occur in. Essentially, we use a self-created corpus data to consider the affix bέ- ‘may’, ‘must’, ‘should’; the form tùmí ‘can’, ‘be able to’; mà ‘to let’, ‘to allow’, ‘to permit’, ‘to make’, or ‘to cause’ someone to do something; the multi-word forms ὲsὲ sέ ‘must’, ‘should’ or ‘have to’ and ètwà sέ ‘must’, ‘should’ or ‘have to’, and assert that they are legitimate modal auxiliaries conveying epistemic, deontic, and dynamic modalities, as well as other meanings in the language.Keywords: Akan, modality, modal auxiliaries, semantics
Procedia PDF Downloads 7725315 Replication of Meaningful Gesture Study for N400 Detection Using a Commercial Brain-Computer Interface
Authors: Thomas Ousterhout
Abstract:
In an effort to test the ability of a commercial grade EEG headset to effectively measure the N400 ERP, a replication study was conducted to see if similar results could be produced as that which used a medical grade EEG. Pictures of meaningful and meaningless hand postures were borrowed from the original author and subjects were required to perform a semantic discrimination task. The N400 was detected indicating semantic processing of the meaningfulness of the hand postures. The results corroborate those of the original author and support the use of some commercial grade EEG headsets for non-critical research applications.Keywords: EEG, ERP, N400, semantics, congruency, gestures, emotiv
Procedia PDF Downloads 263