Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 588

Search results for: semantic annotation

168 Synthetic Method of Contextual Knowledge Extraction

Authors: Olga Kononova, Sergey Lyapin

Abstract:

Global information society requirements are transparency and reliability of data, as well as ability to manage information resources independently; particularly to search, to analyze, to evaluate information, thereby obtaining new expertise. Moreover, it is satisfying the society information needs that increases the efficiency of the enterprise management and public administration. The study of structurally organized thematic and semantic contexts of different types, automatically extracted from unstructured data, is one of the important tasks for the application of information technologies in education, science, culture, governance and business. The objectives of this study are the contextual knowledge typologization, selection or creation of effective tools for extracting and analyzing contextual knowledge. Explication of various kinds and forms of the contextual knowledge involves the development and use full-text search information systems. For the implementation purposes, the authors use an e-library 'Humanitariana' services such as the contextual search, different types of queries (paragraph-oriented query, frequency-ranked query), automatic extraction of knowledge from the scientific texts. The multifunctional e-library «Humanitariana» is realized in the Internet-architecture in WWS-configuration (Web-browser / Web-server / SQL-server). Advantage of use 'Humanitariana' is in the possibility of combining the resources of several organizations. Scholars and research groups may work in a local network mode and in distributed IT environments with ability to appeal to resources of any participating organizations servers. Paper discusses some specific cases of the contextual knowledge explication with the use of the e-library services and focuses on possibilities of new types of the contextual knowledge. Experimental research base are science texts about 'e-government' and 'computer games'. An analysis of the subject-themed texts trends allowed to propose the content analysis methodology, that combines a full-text search with automatic construction of 'terminogramma' and expert analysis of the selected contexts. 'Terminogramma' is made out as a table that contains a column with a frequency-ranked list of words (nouns), as well as columns with an indication of the absolute frequency (number) and the relative frequency of occurrence of the word (in %% ppm). The analysis of 'e-government' materials showed, that the state takes a dominant position in the processes of the electronic interaction between the authorities and society in modern Russia. The media credited the main role in these processes to the government, which provided public services through specialized portals. Factor analysis revealed two factors statistically describing the used terms: human interaction (the user) and the state (government, processes organizer); interaction management (public officer, processes performer) and technology (infrastructure). Isolation of these factors will lead to changes in the model of electronic interaction between government and society. In this study, the dominant social problems and the prevalence of different categories of subjects of computer gaming in science papers from 2005 to 2015 were identified. Therefore, there is an evident identification of several types of contextual knowledge: micro context; macro context; dynamic context; thematic collection of queries (interactive contextual knowledge expanding a composition of e-library information resources); multimodal context (functional integration of iconographic and full-text resources through hybrid quasi-semantic algorithm of search). Further studies can be pursued both in terms of expanding the resource base on which they are held, and in terms of the development of appropriate tools.

Keywords: contextual knowledge, contextual search, e-library services, frequency-ranked query, paragraph-oriented query, technologies of the contextual knowledge extraction

Procedia PDF Downloads 329

167 Original and the Translated: A Comparative Evaluation of Native and Non-Native English Translations of Faiz

Authors: Anam Nawaz

Abstract:

The present study is an attempt to compare the translations of Faiz’s poetry made by native and non-native translators, to determine the role of the translator in terms of preserving the cultural ethos of the original text. Peter Newmark and Katharine Reiss’s approaches to translation criticism have been used to provide a theoretical framework for the study. This study also emphasizes those cultural and semantic aspects of the original which are translated more convincingly by a native translator, and contrasting those features which the non-natives can tackle more ably. The research also highlights the linguistic sockets, ignored by the interpreters in the translation process. The analysis showed that both native and non-native translators have made an admirable effort to stay as close to the original as possible. The natives with their advantage of belonging to the same culture have excelled in preserving the original subject matter, whereas the non-native renderings have been presented in a much rhythmic and poetic manner with an excellent choice of words. Though none of the four translators has been successfully able to recreate Faiz’s magic, however V. G. Kiernan and Sarvat Rahman’s translations can be regarded as the closest to the original. Whereas V. G. Kiernan with his outstanding command over English mesmerizes the readers, Sarvat Rahman’s profound understanding of cultural ties helps establish her translations as a brilliant example of faithful re-renderings.

Keywords: comparative translations, linguistic and cultural constraints, native translators, non-native translators, poetry and translation, Faiz Ahmad Faiz

Procedia PDF Downloads 231

166 Quantitative Analysis of the Quality of Housing and Land Use in the Built-up area of Croatian Coastal City of Zadar

Authors: Silvija Šiljeg, Ante Šiljeg, Branko Cavrić

Abstract:

Housing is considered as a basic human need and important component of the quality of life (QoL) in urban areas worldwide. In contemporary housing studies, the concept of the quality of housing (QoH) is considered as a multi-dimensional and multi-disciplinary field. It emphasizes connection between various aspects of the QoL which could be measured by quantitative and qualitative indicators at different spatial levels (e.g. local, city, metropolitan, regional). The main goal of this paper is to examine the QoH and compare results of quantitative analysis with the clutter land use categories derived for selected local communities in Croatian Coastal City of Zadar. The qualitative housing analysis based on the four housing indicators (out of total 24 QoL indicators) has provided identification of the three Zadar’s local communities with the highest estimated QoH ranking. Furthermore, by using GIS overlay techniques, the QoH was merged with the urban environment analysis and introduction of spatial metrics based on the three categories: the element, class and environment as a whole. In terms of semantic-content analysis, the research has also generated a set of indexes suitable for evaluation of “housing state of affairs” and future decision making aiming at improvement of the QoH in selected local communities.

Keywords: housing, quality, indicators, indexes, urban environment, GIS, element, class

Procedia PDF Downloads 386

165 Development of a French to Yorùbá Machine Translation System

Authors: Benjamen Nathaniel, Eludiora Safiriyu Ijiyemi, Egume Oneme Lucky

Abstract:

A review on machine translation systems shows that a lot of computational artefacts has been carried out to translate written or spoken texts from a source language to Yorùbá language through Machine Translation systems. However, there are no work on French to Yorùbá language machine translation system; hence, the study investigated the process involved in the translation of French-to-Yorùbá language equivalent with the view to adopting a rule- based MT approach to build a Machine Translation framework from simple sentences administered through questionnaire. Articles and relevant textbooks were reviewed with key speakers of both languages interviewed to find out the processes involved in the translation of French language and their equivalent in Yorùbálanguage simple sentences using home domain terminologies. Achieving this, a model was formulated using phrase grammar structure, re-write rule, parse tree, automata theory- based techniques, designed and implemented respectively with unified modeling language (UML) and python programming language. Analysing the result, it was observed when carrying out the result that, the Machine Translation system performed 18.45% above Experimental Subject Respondent and 2.7% below Linguistics Expert when analysed with word orthography, sentence syntax and semantic correctness of the sentences. And, when compared with Google Machine Translation system, it was noticed that the developed system performed better on lexicons of the target language.

Keywords: machine translation (MT), rule-based, French language, Yoru`ba´ language

Procedia PDF Downloads 25

164 Analysis Model for the Relationship of Users, Products, and Stores on Online Marketplace Based on Distributed Representation

Authors: Ke He, Wumaier Parezhati, Haruka Yamashita

Abstract:

Recently, online marketplaces in the e-commerce industry, such as Rakuten and Alibaba, have become some of the most popular online marketplaces in Asia. In these shopping websites, consumers can select purchase products from a large number of stores. Additionally, consumers of the e-commerce site have to register their name, age, gender, and other information in advance, to access their registered account. Therefore, establishing a method for analyzing consumer preferences from both the store and the product side is required. This study uses the Doc2Vec method, which has been studied in the field of natural language processing. Doc2Vec has been used in many cases to analyze the extraction of semantic relationships between documents (represented as consumers) and words (represented as products) in the field of document classification. This concept is applicable to represent the relationship between users and items; however, the problem is that one more factor (i.e., shops) needs to be considered in Doc2Vec. More precisely, a method for analyzing the relationship between consumers, stores, and products is required. The purpose of our study is to combine the analysis of the Doc2vec model for users and shops, and for users and items in the same feature space. This method enables the calculation of similar shops and items for each user. In this study, we derive the real data analysis accumulated in the online marketplace and demonstrate the efficiency of the proposal.

Keywords: Doc2Vec, online marketplace, marketing, recommendation systems

Procedia PDF Downloads 91

163 MAFB Expression in LPS-Induced Exosomes: Revealing the Connection to sepsis-trigerred Hepatic Injury

Authors: Gizaw Mamo Gebeyehu, Marianna Pap, Geza Makkai, Tibor Z. Janosi, Shima Rashidian, Tibor A. Rauch

Abstract:

Sepsis poses a significant global health threat, necessitating extensive exploration of indicators tied to its pathological mechanisms and multi-organ dysfunction. While murine studies have shed light on sepsis, the intricate cellular and molecular landscape in human sepsis remains enigmatic. Exploring the influence of activated monocyte-derived exosomes in sepsis sheds light on a promising pathway for understanding the intricate cellular and molecular mechanisms involved in this condition in humans. In sepsis, exosome-borne mRNA and miRNA orchestrate immune response gene expression in recipient cells. Yet, the specifics of exosome-mediated cell-to-cell communication, especially how mRNA cargoes modulate gene expression in recipient cells, remain poorly understood. This study aims to elucidate the precise molecular pathways through which exosomal mRNA cargo, particularly MAFB, contributes to the developing sepsis-induced molecular aberrations in liver tissues, employing rigorously defined cell culture conditions. THP-1 cells were treated with LPS to induce changes in exosomal RNA profiles. Exosomes were isolated and characterized using microscopy and mass spectrometry. RNA was extracted from exosomes and sequenced. The most abundant exosomal mRNAs were subjected to GO analysis for functional annotation analysis and KEGG database analysis to identify the involved enriched pathways. PCR (Polymerase Chain Reaction), RNA sequencing, and Western blotting were involved to analyze changes in gene expression, protein levels, and signaling pathways within the liver cells( HepG2) after exposure to exosomal MAFB. This study pinpoints exosomal MAFB as a potential key regulator linked to liver cell damage during sepsis, along with associated genes (miR155HG, H3F3A, and possibly JARD2) forming a crucial molecular pathway contributing to liver cell injury, Together, these elements indicate a vital molecular pathway that plays a significant role in the emergence of liver cell injury during sepsis.. These findings suggest the importance of further research on these components for potential therapeutic interventions in managing acute liver damage in sepsis.

Keywords: sepsis, exososome, exosomal MAFB, LPS-induced THP-1 cells, RNA profiles, sepsis-triggered liver injury

Procedia PDF Downloads 34

162 Classification of Land Cover Usage from Satellite Images Using Deep Learning Algorithms

Authors: Shaik Ayesha Fathima, Shaik Noor Jahan, Duvvada Rajeswara Rao

Abstract:

Earth's environment and its evolution can be seen through satellite images in near real-time. Through satellite imagery, remote sensing data provide crucial information that can be used for a variety of applications, including image fusion, change detection, land cover classification, agriculture, mining, disaster mitigation, and monitoring climate change. The objective of this project is to propose a method for classifying satellite images according to multiple predefined land cover classes. The proposed approach involves collecting data in image format. The data is then pre-processed using data pre-processing techniques. The processed data is fed into the proposed algorithm and the obtained result is analyzed. Some of the algorithms used in satellite imagery classification are U-Net, Random Forest, Deep Labv3, CNN, ANN, Resnet etc. In this project, we are using the DeepLabv3 (Atrous convolution) algorithm for land cover classification. The dataset used is the deep globe land cover classification dataset. DeepLabv3 is a semantic segmentation system that uses atrous convolution to capture multi-scale context by adopting multiple atrous rates in cascade or in parallel to determine the scale of segments.

Keywords: area calculation, atrous convolution, deep globe land cover classification, deepLabv3, land cover classification, resnet 50

Procedia PDF Downloads 117

161 The Involvement of Visual and Verbal Representations Within a Quantitative and Qualitative Visual Change Detection Paradigm

Authors: Laura Jenkins, Tim Eschle, Joanne Ciafone, Colin Hamilton

Abstract:

An original working memory model suggested the separation of visual and verbal systems in working memory architecture, in which only visual working memory components were used during visual working memory tasks. It was later suggested that the visuo spatial sketch pad was the only memory component at use during visual working memory tasks, and components such as the phonological loop were not considered. In more recent years, a contrasting approach has been developed with the use of an executive resource to incorporate both visual and verbal representations in visual working memory paradigms. This was supported using research demonstrating the use of verbal representations and an executive resource in a visual matrix patterns task. The aim of the current research is to investigate the working memory architecture during both a quantitative and a qualitative visual working memory task. A dual task method will be used. Three secondary tasks will be used which are designed to hit specific components within the working memory architecture – Dynamic Visual Noise (visual components), Visual Attention (spatial components) and Verbal Attention (verbal components). A comparison of the visual working memory tasks will be made to discover if verbal representations are at use, as the previous literature suggested. This direct comparison has not been made so far in the literature. Considerations will be made as to whether a domain specific approach should be employed when discussing visual working memory tasks, or whether a more domain general approach could be used instead.

Keywords: semantic organisation, visual memory, change detection

Procedia PDF Downloads 557

160 Multi-Objective Evolutionary Computation Based Feature Selection Applied to Behaviour Assessment of Children

Authors: F. Jiménez, R. Jódar, M. Martín, G. Sánchez, G. Sciavicco

Abstract:

Abstract—Attribute or feature selection is one of the basic strategies to improve the performances of data classification tasks, and, at the same time, to reduce the complexity of classifiers, and it is a particularly fundamental one when the number of attributes is relatively high. Its application to unsupervised classification is restricted to a limited number of experiments in the literature. Evolutionary computation has already proven itself to be a very effective choice to consistently reduce the number of attributes towards a better classification rate and a simpler semantic interpretation of the inferred classifiers. We present a feature selection wrapper model composed by a multi-objective evolutionary algorithm, the clustering method Expectation-Maximization (EM), and the classifier C4.5 for the unsupervised classification of data extracted from a psychological test named BASC-II (Behavior Assessment System for Children - II ed.) with two objectives: Maximizing the likelihood of the clustering model and maximizing the accuracy of the obtained classifier. We present a methodology to integrate feature selection for unsupervised classification, model evaluation, decision making (to choose the most satisfactory model according to a a posteriori process in a multi-objective context), and testing. We compare the performance of the classifier obtained by the multi-objective evolutionary algorithms ENORA and NSGA-II, and the best solution is then validated by the psychologists that collected the data.

Keywords: evolutionary computation, feature selection, classification, clustering

Procedia PDF Downloads 341

159 Comparing Deep Architectures for Selecting Optimal Machine Translation

Authors: Despoina Mouratidis, Katia Lida Kermanidis

Abstract:

Machine translation (MT) is a very important task in Natural Language Processing (NLP). MT evaluation is crucial in MT development, as it constitutes the means to assess the success of an MT system, and also helps improve its performance. Several methods have been proposed for the evaluation of (MT) systems. Some of the most popular ones in automatic MT evaluation are score-based, such as the BLEU score, and others are based on lexical similarity or syntactic similarity between the MT outputs and the reference involving higher-level information like part of speech tagging (POS). This paper presents a language-independent machine learning framework for classifying pairwise translations. This framework uses vector representations of two machine-produced translations, one from a statistical machine translation model (SMT) and one from a neural machine translation model (NMT). The vector representations consist of automatically extracted word embeddings and string-like language-independent features. These vector representations used as an input to a multi-layer neural network (NN) that models the similarity between each MT output and the reference, as well as between the two MT outputs. To evaluate the proposed approach, a professional translation and a "ground-truth" annotation are used. The parallel corpora used are English-Greek (EN-GR) and English-Italian (EN-IT), in the educational domain and of informal genres (video lecture subtitles, course forum text, etc.) that are difficult to be reliably translated. They have tested three basic deep learning (DL) architectures to this schema: (i) fully-connected dense, (ii) Convolutional Neural Network (CNN), and (iii) Long Short-Term Memory (LSTM). Experiments show that all tested architectures achieved better results when compared against those of some of the well-known basic approaches, such as Random Forest (RF) and Support Vector Machine (SVM). Better accuracy results are obtained when LSTM layers are used in our schema. In terms of a balance between the results, better accuracy results are obtained when dense layers are used. The reason for this is that the model correctly classifies more sentences of the minority class (SMT). For a more integrated analysis of the accuracy results, a qualitative linguistic analysis is carried out. In this context, problems have been identified about some figures of speech, as the metaphors, or about certain linguistic phenomena, such as per etymology: paronyms. It is quite interesting to find out why all the classifiers led to worse accuracy results in Italian as compared to Greek, taking into account that the linguistic features employed are language independent.

Keywords: machine learning, machine translation evaluation, neural network architecture, pairwise classification

Procedia PDF Downloads 105

158 Unraveling the Evolution of Mycoplasma Hominis Through Its Genome Sequence

Authors: Boutheina Ben Abdelmoumen Mardassi, Salim Chibani, Safa Boujemaa, Amaury Vaysse, Julien Guglielmini, Elhem Yacoub

Abstract:

Background and aim: Mycoplasma hominis (MH) is a pathogenic bacterium belonging to the Mollicutes class. It causes a wide range of gynecological infections and infertility among adults. Recently, we have explored for the first time the phylodistribution of Tunisian M. hominis clinical strains using an expanded MLST. We have demonstrated their distinction into two pure lineages, which each corresponding to a specific pathotype: genital infections and infertility. The aim of this project is to gain further insight into the evolutionary dynamics and the specific genetic factors that distinguish MH pathotypes Methods: Whole genome sequencing of Mycoplasma hominis clinical strains was performed using illumina Miseq. Denovo assembly was performed using a publicly available in-house pipeline. We used prokka to annotate the genomes, panaroo to generate the gene presence matrix and Jolytree to establish the phylogenetic tree. We used treeWAS to identify genetic loci associated with the pathothype of interest from the presence matrix and phylogenetic tree. Results: Our results revealed a clear categorization of the 62 MH clinical strains into two distinct genetic lineages, with each corresponding to a specific pathotype.; gynecological infections and infertility[AV1] . Genome annotation showed that GC content is ranging between 26 and 27%, which is a known characteristic of Mycoplasma genome. Housekeeping genes belonging to the core genome are highly conserved among our strains. TreeWas identified 4 virulence genes associated with the pathotype gynecological infection. encoding for asparagine--tRNA ligase, restriction endonuclease subunit S, Eco47II restriction endonuclease, and transcription regulator XRE (involved in tolerance to oxidative stress). Five genes have been identified that have a statistical association with infertility, tow lipoprotein, one hypothetical protein, a glycosyl transferase involved in capsule synthesis, and pyruvate kinase involved in biofilm formation. All strains harbored an efflux pomp that belongs to the family of multidrug resistance ABC transporter, which confers resistance to a wide range of antibiotics. Indeed many adhesion factors and lipoproteins (p120, p120', p60, p80, Vaa) have been checked and confirmed in our strains with a relatively 99 % to 96 % conserved domain and hypervariable domain that represent 1 to 4 % of the reference sequence extracted from gene bank. Conclusion: In summary, this study led to the identification of specific genetic loci associated with distinct pathotypes in M hominis.

Keywords: mycoplasma hominis, infertility, gynecological infections, virulence genes, antibiotic resistance

Procedia PDF Downloads 57

157 Transfer of Constraints or Constraints on Transfer? Syntactic Islands in Danish L2 English

Authors: Anne Mette Nyvad, Ken Ramshøj Christensen

Abstract:

In the syntax literature, it has standardly been assumed that relative clauses and complement wh-clauses are islands for extraction in English, and that constraints on extraction from syntactic islands are universal. However, the Mainland Scandinavian languages has been known to provide counterexamples. Previous research on Danish has shown that neither relative clauses nor embedded questions are strong islands in Danish. Instead, extraction from this type of syntactic environment is degraded due to structural complexity and it interacts with nonstructural factors such as the frequency of occurrence of the matrix verb, the possibility of temporary misanalysis leading to semantic incongruity and exposure over time. We argue that these facts can be accounted for with parametric variation in the availability of CP-recursion, resulting in the patterns observed, as Danish would then “suspend” the ban on movement out of relative clauses and embedded questions. Given that Danish does not seem to adhere to allegedly universal syntactic constraints, such as the Complex NP Constraint and the Wh-Island Constraint, what happens in L2 English? We present results from a study investigating how native Danish speakers judge extractions from island structures in L2 English. Our findings suggest that Danes transfer their native language parameter setting when asked to judge island constructions in English. This is compatible with the Full Transfer Full Access Hypothesis, as the latter predicts that Danish would have difficulties resetting their [+/- CP-recursion] parameter in English because they are not exposed to negative evidence.

Keywords: syntax, islands, second language acquisition, danish

Procedia PDF Downloads 92

156 Improving Topic Quality of Scripts by Using Scene Similarity Based Word Co-Occurrence

Authors: Yunseok Noh, Chang-Uk Kwak, Sun-Joong Kim, Seong-Bae Park

Abstract:

Scripts are one of the basic text resources to understand broadcasting contents. Since broadcast media wields lots of influence over the public, tools for understanding broadcasting contents are more required. Topic modeling is the method to get the summary of the broadcasting contents from its scripts. Generally, scripts represent contents descriptively with directions and speeches. Scripts also provide scene segments that can be seen as semantic units. Therefore, a script can be topic modeled by treating a scene segment as a document. Because scripts consist of speeches mainly, however, relatively small co-occurrences among words in the scene segments are observed. This causes inevitably the bad quality of topics based on statistical learning method. To tackle this problem, we propose a method of learning with additional word co-occurrence information obtained using scene similarities. The main idea of improving topic quality is that the information that two or more texts are topically related can be useful to learn high quality of topics. In addition, by using high quality of topics, we can get information more accurate whether two texts are related or not. In this paper, we regard two scene segments are related if their topical similarity is high enough. We also consider that words are co-occurred if they are in topically related scene segments together. In the experiments, we showed the proposed method generates a higher quality of topics from Korean drama scripts than the baselines.

Keywords: broadcasting contents, scripts, text similarity, topic model

Procedia PDF Downloads 293

155 Weighted-Distance Sliding Windows and Cooccurrence Graphs for Supporting Entity-Relationship Discovery in Unstructured Text

Authors: Paolo Fantozzi, Luigi Laura, Umberto Nanni

Abstract:

The problem of Entity relation discovery in structured data, a well covered topic in literature, consists in searching within unstructured sources (typically, text) in order to find connections among entities. These can be a whole dictionary, or a specific collection of named items. In many cases machine learning and/or text mining techniques are used for this goal. These approaches might be unfeasible in computationally challenging problems, such as processing massive data streams. A faster approach consists in collecting the cooccurrences of any two words (entities) in order to create a graph of relations - a cooccurrence graph. Indeed each cooccurrence highlights some grade of semantic correlation between the words because it is more common to have related words close each other than having them in the opposite sides of the text. Some authors have used sliding windows for such problem: they count all the occurrences within a sliding windows running over the whole text. In this paper we generalise such technique, coming up to a Weighted-Distance Sliding Window, where each occurrence of two named items within the window is accounted with a weight depending on the distance between items: a closer distance implies a stronger evidence of a relationship. We develop an experiment in order to support this intuition, by applying this technique to a data set consisting in the text of the Bible, split into verses.

Keywords: cooccurrence graph, entity relation graph, unstructured text, weighted distance

Procedia PDF Downloads 123

154 Cognitive Semantics Study of Conceptual and Metonymical Expressions in Johnson's Speeches about COVID-19

Authors: Hussain Hameed Mayuuf

Abstract:

The study is an attempt to investigate the conceptual metonymies is used in political discourse about COVID-19. Thus, this study tries to analyze and investigate how the conceptual metonymies in Johnson's speech about coronavirus are constructed. This study aims at: Identifying how are metonymies relevant to understand the messages in Boris Johnson speeches and to find out how can conceptual blending theory help people to understand the messages in the political speech about COVID-19. Lastly, it tries to Point out the kinds of integration networks are common in political speech. The study is based on the hypotheses that conceptual blending theory is a powerful tool for investigating the intended messages in Johnson's speech and there are different processes of blending networks and conceptual mapping that enable the listeners to identify the messages in political speech. This study presents a qualitative and quantitative analysis of four speeches about COVID-19; they are said by Boris Johnson. The selected data have been tackled from the cognitive-semantic perspective by adopting Conceptual Blending Theory as a model for the analysis. It concludes that CBT is applicable to the analysis of metonymies in political discourse. Its mechanisms enable listeners to analyze and understand these speeches. Also the listener can identify and understand the hidden messages in Biden and Johnson's discourse about COVID-19 by using different conceptual networks. Finally, it is concluded that the double scope networks are the most common types of blending of metonymies in the political speech.

Keywords: cognitive, semantics, conceptual, metonymical, Covid-19

Procedia PDF Downloads 86

153 Corpus Linguistics as a Tool for Translation Studies Analysis: A Bilingual Parallel Corpus of Students’ Translations

Authors: Juan-Pedro Rica-Peromingo

Abstract:

Nowadays, corpus linguistics has become a key research methodology for Translation Studies, which broadens the scope of cross-linguistic studies. In the case of the study presented here, the approach used focuses on learners with little or no experience to study, at an early stage, general mistakes and errors, the correct or incorrect use of translation strategies, and to improve the translational competence of the students. Led by Sylviane Granger and Marie-Aude Lefer of the Centre for English Corpus Linguistics of the University of Louvain, the MUST corpus (MUltilingual Student Translation Corpus) is an international project which brings together partners from Europe and worldwide universities and connects Learner Corpus Research (LCR) and Translation Studies (TS). It aims to build a corpus of translations carried out by students including both direct (L2 > L1) an indirect (L1 > L2) translations, from a great variety of text types, genres, and registers in a wide variety of languages: audiovisual translations (including dubbing, subtitling for hearing population and for deaf population), scientific, humanistic, literary, economic and legal translation texts. This paper focuses on the work carried out by the Spanish team from the Complutense University (UCMA), which is part of the MUST project, and it describes the specific features of the corpus built by its members. All the texts used by UCMA are either direct or indirect translations between English and Spanish. Students’ profiles comprise translation trainees, foreign language students with a major in English, engineers studying EFL and MA students, all of them with different English levels (from B1 to C1); for some of the students, this would be their first experience with translation. The MUST corpus is searchable via Hypal4MUST, a web-based interface developed by Adam Obrusnik from Masaryk University (Czech Republic), which includes a translation-oriented annotation system (TAS). A distinctive feature of the interface is that it allows source texts and target texts to be aligned, so we can be able to observe and compare in detail both language structures and study translation strategies used by students. The initial data obtained point out the kind of difficulties encountered by the students and reveal the most frequent strategies implemented by the learners according to their level of English, their translation experience and the text genres. We have also found common errors in the graduate and postgraduate university students’ translations: transfer errors, lexical errors, grammatical errors, text-specific translation errors, and cultural-related errors have been identified. Analyzing all these parameters will provide more material to bring better solutions to improve the quality of teaching and the translations produced by the students.

Keywords: corpus studies, students’ corpus, the MUST corpus, translation studies

Procedia PDF Downloads 120

152 Advancing Trustworthy Human-robot Collaboration: Challenges and Opportunities in Diverse European Industrial Settings

Authors: Margarida Porfírio Tomás, Paula Pereira, José Manuel Palma Oliveira

Abstract:

The decline in employment rates across sectors like industry and construction is exacerbated by an aging workforce. This has far-reaching implications for the economy, including skills gaps, labour shortages, productivity challenges due to physical limitations, and workplace safety concerns. To sustain the workforce and pension systems, technology plays a pivotal role. Robots provide valuable support to human workers, and effective human-robot interaction is essential. FORTIS, a Horizon project, aims to address these challenges by creating a comprehensive Human-Robot Interaction (HRI) solution. This solution focuses on multi-modal communication and multi-aspect interaction, with a primary goal of maintaining a human-centric approach. By meeting the needs of both human workers and robots, FORTIS aims to facilitate efficient and safe collaboration. The project encompasses three key activities: 1) A Human-Centric Approach involving data collection, annotation, understanding human behavioural cognition, and contextual human-robot information exchange. 2) A Robotic-Centric Focus addressing the unique requirements of robots during the perception and evaluation of human behaviour. 3) Ensuring Human-Robot Trustworthiness through measures such as human-robot digital twins, safety protocols, and resource allocation. Factor Social, a project partner, will analyse psycho-physiological signals that influence human factors, particularly in hazardous working conditions. The analysis will be conducted using a combination of case studies, structured interviews, questionnaires, and a comprehensive literature review. However, the adoption of novel technologies, particularly those involving human-robot interaction, often faces hurdles related to acceptance. To address this challenge, FORTIS will draw upon insights from Social Sciences and Humanities (SSH), including risk perception and technology acceptance models. Throughout its lifecycle, FORTIS will uphold a human-centric approach, leveraging SSH methodologies to inform the design and development of solutions. This project received funding from European Union’s Horizon 2020/Horizon Europe research and innovation program under grant agreement No 101135707 (FORTIS).

Keywords: skills gaps, productivity challenges, workplace safety, human-robot interaction, human-centric approach, social sciences and humanities, risk perception

Procedia PDF Downloads 20

151 Machine Learning and Deep Learning Approach for People Recognition and Tracking in Crowd for Safety Monitoring

Authors: A. Degale Desta, Cheng Jian

Abstract:

Deep learning application in computer vision is rapidly advancing, giving it the ability to monitor the public and quickly identify potentially anomalous behaviour from crowd scenes. Therefore, the purpose of the current work is to improve the performance of safety of people in crowd events from panic behaviour through introducing the innovative idea of Aggregation of Ensembles (AOE), which makes use of the pre-trained ConvNets and a pool of classifiers to find anomalies in video data with packed scenes. According to the theory of algorithms that applied K-means, KNN, CNN, SVD, and Faster-CNN, YOLOv5 architectures learn different levels of semantic representation from crowd videos; the proposed approach leverages an ensemble of various fine-tuned convolutional neural networks (CNN), allowing for the extraction of enriched feature sets. In addition to the above algorithms, a long short-term memory neural network to forecast future feature values and a handmade feature that takes into consideration the peculiarities of the crowd to understand human behavior. On well-known datasets of panic situations, experiments are run to assess the effectiveness and precision of the suggested method. Results reveal that, compared to state-of-the-art methodologies, the system produces better and more promising results in terms of accuracy and processing speed.

Keywords: action recognition, computer vision, crowd detecting and tracking, deep learning

Procedia PDF Downloads 124

150 English Pashto Contact: Morphological Adaptation of Bilingual Compound Words in Pashto

Authors: Imran Ullah Imran

Abstract:

Language contact is a familiar concept in the present global world. Across the globe, languages get mixed up at different levels. Borrowing, code-switching are some of the means through which languages interact. This study examines Pashto-English contact at word and syllable levels. By recording the speech of 30 Pashto native speakers, selected via 'social network' sampling, the study located a number of Pashto-English compound words, which is a unique contact of its kind. In data analysis, tokens were categorized on the basis of their pattern and morphological structure. The study shows that Pashto-English Bilingual Compound words (BCWs) are very prevalent in the Pashto language. The study also found that the BCWs in Pashto are completely productive and have their own meanings. It also shows that the dominant pattern of hybrid words in Pashto is the conjugation of an independent English root word followed by a Pashto inflectional morpheme, which contributes to the core semantic content of the construction. The BCWs construction shows that how both the languages are closer to each other. Pashto-English contact results into bilingual compound and hybrid words, which forms a considerable number of tokens in the present-day spoken Pashto. On the basis of these findings, the study assumes that the same phenomenon may increase with the passage of time that would, in turn, result in the formation of more bilingual compound or hybrid words.

Keywords: code-mixing, bilingual compound words, pashto-english contact, hybrid words, inflectional lexical morpheme

Procedia PDF Downloads 220

149 Factor Analysis Based on Semantic Differential of the Public Perception of Public Art: A Case Study of the Malaysia National Monument

Authors: Yuhanis Ibrahim, Sung-Pil Lee

Abstract:

This study attempts to address factors that contribute to outline public art factors assessment, memorial monument specifically. Memorial monuments hold significant and rich message whether the intention of the art is to mark and commemorate important event or to inform younger generation about the past. Public monument should relate to the public and raise awareness about the significant issue. Therefore, by investigating the impact of the existing public memorial art will hopefully shed some lights to the upcoming public art projects’ stakeholders to ensure the lucid memorial message is delivered to the public directly. Public is the main actor as public is the fundamental purpose that the art was created. Perception is framed as one of the reliable evaluation tools to assess the public art impact factors. The Malaysia National Monument was selected to be the case study for the investigation. The public’s perceptions were gathered using a questionnaire that involved (n-115) participants to attain keywords, and next Semantical Differential Methodology (SDM) was adopted to evaluate the perceptions about the memorial monument. These perceptions were then measured with Reliability Factor and then were factorised using Factor Analysis of Principal Component Analysis (PCA) method to acquire concise factors for the monument assessment. The result revealed that there are four factors that influence public’s perception on the monument which are aesthetic, audience, topology, and public reception. The study concludes by proposing the factors for public memorial art assessment for the next future public memorial projects especially in Malaysia.

Keywords: factor analysis, public art, public perception, semantical differential methodology

Procedia PDF Downloads 478

148 Text as Reader Device Improving Subjectivity on the Role of Attestation between Interpretative Semiotics and Discursive Linguistics

Authors: Marco Castagna

Abstract:

Proposed paper is aimed to inquire about the relation between text and reader, focusing on the concept of ‘attestation’. Indeed, despite being widely accepted in semiotic research, even today the concept of text remains uncertainly defined. So, it seems to be undeniable that what is called ‘text’ offers an image of internal cohesion and coherence, that makes it possible to analyze it as an object. Nevertheless, this same object remains problematic when it is pragmatically activated by the act of reading. In fact, as for the T.A.R:D.I.S., that is the unique space-temporal vehicle used by the well-known BBC character Doctor Who in his adventures, every text appears to its own readers not only “bigger inside than outside”, but also offering spaces that change according to the different traveller standing in it. In a few words, as everyone knows, this singular condition raises the questions about the gnosiological relation between text and reader. How can a text be considered the ‘same’, even if it can be read in different ways by different subjects? How can readers can be previously provided with knowledge required for ‘understanding’ a text, but at the same time learning something more from it? In order to explain this singular condition it seems useful to start thinking about text as a device more than an object. In other words, this unique status is more clearly understandable when ‘text’ ceases to be considered as a box designed to move meaning from a sender to a recipient (marking the semiotic priority of the “code”) and it starts to be recognized as performative meaning hypothesis, that is discursively configured by one or more forms and empirically perceivable by means of one or more substances. Thus, a text appears as a “semantic hanger”, potentially offered to the “unending deferral of interpretant", and from time to time fixed as “instance of Discourse”. In this perspective, every reading can be considered as an answer to the continuous request for confirming or denying the meaning configuration (the meaning hypothesis) expressed by text. Finally, ‘attestation’ is exactly what regulates this dynamic of request and answer, through which the reader is able to confirm his previous hypothesis on reality or maybe acquire some new ones.Proposed paper is aimed to inquire about the relation between text and reader, focusing on the concept of ‘attestation’. Indeed, despite being widely accepted in semiotic research, even today the concept of text remains uncertainly defined. So, it seems to be undeniable that what is called ‘text’ offers an image of internal cohesion and coherence, that makes it possible to analyze it as an object. Nevertheless, this same object remains problematic when it is pragmatically activated by the act of reading. In fact, as for the T.A.R:D.I.S., that is the unique space-temporal vehicle used by the well-known BBC character Doctor Who in his adventures, every text appears to its own readers not only “bigger inside than outside”, but also offering spaces that change according to the different traveller standing in it. In a few words, as everyone knows, this singular condition raises the questions about the gnosiological relation between text and reader. How can a text be considered the ‘same’, even if it can be read in different ways by different subjects? How can readers can be previously provided with knowledge required for ‘understanding’ a text, but at the same time learning something more from it? In order to explain this singular condition it seems useful to start thinking about text as a device more than an object. In other words, this unique status is more clearly understandable when ‘text’ ceases to be considered as a box designed to move meaning from a sender to a recipient (marking the semiotic priority of the “code”) and it starts to be recognized as performative meaning hypothesis, that is discursively configured by one or more forms and empirically perceivable by means of one or more substances. Thus, a text appears as a “semantic hanger”, potentially offered to the “unending deferral of interpretant", and from time to time fixed as “instance of Discourse”. In this perspective, every reading can be considered as an answer to the continuous request for confirming or denying the meaning configuration (the meaning hypothesis) expressed by text. Finally, ‘attestation’ is exactly what regulates this dynamic of request and answer, through which the reader is able to confirm his previous hypothesis on reality or maybe acquire some new ones.

Keywords: attestation, meaning, reader, text

Procedia PDF Downloads 218

147 Perceiving Casual Speech: A Gating Experiment with French Listeners of L2 English

Authors: Naouel Zoghlami

Abstract:

Spoken-word recognition involves the simultaneous activation of potential word candidates which compete with each other for final correct recognition. In continuous speech, the activation-competition process gets more complicated due to speech reductions existing at word boundaries. Lexical processing is more difficult in L2 than in L1 because L2 listeners often lack phonetic, lexico-semantic, syntactic, and prosodic knowledge in the target language. In this study, we investigate the on-line lexical segmentation hypotheses that French listeners of L2 English form and then revise as subsequent perceptual evidence is revealed. Our purpose is to shed further light on the processes of L2 spoken-word recognition in context and better understand L2 listening difficulties through a comparison of skilled and unskilled reactions at the point where their working hypothesis is rejected. We use a variant of the gating experiment in which subjects transcribe an English sentence presented in increments of progressively greater duration. The spoken sentence was “And this amazing athlete has just broken another world record”, chosen mainly because it included common reductions and phonetic features in English, such as elision and assimilation. Our preliminary results show that there is an important difference in the manner in which proficient and less-proficient L2 listeners handle connected speech. Less-proficient listeners delay recognition of words as they wait for lexical and syntactic evidence to appear in the gates. Further statistical results are currently being undertaken.

Keywords: gating paradigm, spoken word recognition, online lexical segmentation, L2 listening

Procedia PDF Downloads 442

146 Characteristic Sentence Stems in Academic English Texts: Definition, Identification, and Extraction

Authors: Jingjie Li, Wenjie Hu

Abstract:

Phraseological units in academic English texts have been a central focus in recent corpus linguistic research. A wide variety of phraseological units have been explored, including collocations, chunks, lexical bundles, patterns, semantic sequences, etc. This paper describes a special category of clause-level phraseological units, namely, Characteristic Sentence Stems (CSSs), with a view to describing their defining criteria and extraction method. CSSs are contiguous lexico-grammatical sequences which contain a subject-predicate structure and which are frame expressions characteristic of academic writing. The extraction of CSSs consists of six steps: Part-of-speech tagging, n-gram segmentation, structure identification, significance of occurrence calculation, text range calculation, and overlapping sequence reduction. Significance of occurrence calculation is the crux of this study. It includes the computing of both the internal association and the boundary independence of a CSS and tests the occurring significance of the CSS from both inside and outside perspectives. A new normalization algorithm is also introduced into the calculation of LocalMaxs for reducing overlapping sequences. It is argued that many sentence stems are so recurrent in academic texts that the most typical of them have become the habitual ways of making meaning in academic writing. Therefore, studies of CSSs could have potential implications and reference value for academic discourse analysis, English for Academic Purposes (EAP) teaching and writing.

Keywords: characteristic sentence stem, extraction method, phraseological unit, the statistical measure

Procedia PDF Downloads 140

145 Interaction Design In Home Appliance: An Integrated Approach InKanseiAnd Hedonomic “Cases: Rice Cooker, Juicer, Mixer”

Authors: Sara Mostowfi, Hassan Sadeghinaeini, Sana Behnamasl, Leila Ensaniat, Maryam Mostafaee

Abstract:

Nowadays, most of product producers, e.g. home appliance, electronic machines and vehicles focus on quality and comfort, and promise consumers ease of use and pleasurable experiences during product using. Consumers make their purchase decisions according to two needs: functional and emotional needs. Functional needs are fulfilled by product functionality, besides emotional needs are related to psychologists’ aspects of production. Emotions are distinctive elements which should be added to products and services to lead them up. In this case, the authors’ survey conducted pleasurable and hedonomic aspects in products of a home appliance company in Iran. In this regard, three samples of home appliance were selected: mixer, rice cooker, iron. Fifteen women (20-60) participated in this study. Every user evaluated each product by questionnaire based on 7 point semantic differential scale. After analyzing the results with statistical methods, results showed that 90% of users aren’t satisfied with hedonic and pleasurable criteria in interaction with these products. They notified that regarding hedonomics and pleasurable criteria’s they will have better ease of use and functionality. Our findings show a significant association between products’ features and user satisfaction. It seems that industrial design has a significant impression on the company’s products and with regard the pleasurable criteria the company sales will be more successful.

Keywords: home appliance, interaction, pleasure, hedonomy, ergonomy

Procedia PDF Downloads 355

144 Saudi Arabia Border Security Informatics: Challenges of a Harsh Environment

Authors: Syed Ahsan, Saleh Alshomrani, Ishtiaq Rasool, Ali Hassan

Abstract:

In this oral presentation, we will provide an overview of the technical and semantic architecture of a desert border security and critical infrastructure protection security system. Modern border security systems are designed to reduce the dependability and intrusion of human operators. To achieve this, different types of sensors are use along with video surveillance technologies. Application of these technologies in a harsh desert environment of Saudi Arabia poses unique challenges. Environmental and geographical factors including high temperatures, desert storms, temperature variations and remoteness adversely affect the reliability of surveillance systems. To successfully implement a reliable, effective system in a harsh desert environment, the following must be achieved: i) Selection of technology including sensors, video cameras, and communication infrastructure that suit desert environments. ii) Reduced power consumption and efficient usage of equipment to increase the battery life of the equipment. iii) A reliable and robust communication network with efficient usage of bandwidth. Also, to reduce the expert bottleneck, an ontology-based intelligent information systems needs to be developed. Domain knowledge unique and peculiar to Saudi Arabia needs to be formalized to develop an expert system that can detect abnormal activities and any intrusion.

Keywords: border security, sensors, abnormal activity detection, ontologies

Procedia PDF Downloads 457

143 Cognitive and Behavioral Disorders in Patients with Precuneal Infarcts

Authors: F. Ece Cetin, H. Nezih Ozdemir, Emre Kumral

Abstract:

Ischemic stroke of the precuneal cortex (PC) alone is extremely rare. This study aims to evaluate the clinical, neurocognitive, and behavioural characteristics of isolated PC infarcts. We assessed neuropsychological and behavioral findings in 12 patients with isolated PC infarct among 3800 patients with ischemic stroke. To determine the most frequently affected brain locus in patients, we first overlapped the ischemic area of patients with specific cognitive disorders and patients without specific cognitive disorders. Secondly, we compared both overlap maps using the 'subtraction plot' function of MRIcroGL. Patients showed various types of cognitive disorders. All patients experienced more than one category of cognitive disorder, except for two patients with only one cognitive disorder. Lesion topographical analysis showed that damage within the anterior precuneal region might lead to consciousness disorders (25%), self-processing impairment (42%), visuospatial disorders (58%), and lesions in the posterior precuneal region caused episodic and semantic memory impairment (33%). The whole precuneus is involved in at least one body awareness disorder. The cause of the stroke was cardioembolism in 5 patients (42%), large artery disease in 3 (25%), and unknown in 4 (33%). This study showed a wide variety of neuropsychological and behavioural disorders in patients with precuneal infarct. Future studies are needed to achieve a proper definition of the function of the precuneus in relation to the extended cortical areas. Precuneal cortex region infarcts have been found to predict a source of embolism from the large arteries or heart.

Keywords: cognition, pericallosal artery, precuneal cortex, ischemic stroke

Procedia PDF Downloads 105

142 Genomic Sequence Representation Learning: An Analysis of K-Mer Vector Embedding Dimensionality

Authors: James Jr. Mashiyane, Risuna Nkolele, Stephanie J. Müller, Gciniwe S. Dlamini, Rebone L. Meraba, Darlington S. Mapiye

Abstract:

When performing language tasks in natural language processing (NLP), the dimensionality of word embeddings is chosen either ad-hoc or is calculated by optimizing the Pairwise Inner Product (PIP) loss. The PIP loss is a metric that measures the dissimilarity between word embeddings, and it is obtained through matrix perturbation theory by utilizing the unitary invariance of word embeddings. Unlike in natural language, in genomics, especially in genome sequence processing, unlike in natural language processing, there is no notion of a “word,” but rather, there are sequence substrings of length k called k-mers. K-mers sizes matter, and they vary depending on the goal of the task at hand. The dimensionality of word embeddings in NLP has been studied using the matrix perturbation theory and the PIP loss. In this paper, the sufficiency and reliability of applying word-embedding algorithms to various genomic sequence datasets are investigated to understand the relationship between the k-mer size and their embedding dimension. This is completed by studying the scaling capability of three embedding algorithms, namely Latent Semantic analysis (LSA), Word2Vec, and Global Vectors (GloVe), with respect to the k-mer size. Utilising the PIP loss as a metric to train embeddings on different datasets, we also show that Word2Vec outperforms LSA and GloVe in accurate computing embeddings as both the k-mer size and vocabulary increase. Finally, the shortcomings of natural language processing embedding algorithms in performing genomic tasks are discussed.

Keywords: word embeddings, k-mer embedding, dimensionality reduction

Procedia PDF Downloads 107

141 A Supervised Learning Data Mining Approach for Object Recognition and Classification in High Resolution Satellite Data

Authors: Mais Nijim, Rama Devi Chennuboyina, Waseem Al Aqqad

Abstract:

Advances in spatial and spectral resolution of satellite images have led to tremendous growth in large image databases. The data we acquire through satellites, radars and sensors consists of important geographical information that can be used for remote sensing applications such as region planning, disaster management. Spatial data classification and object recognition are important tasks for many applications. However, classifying objects and identifying them manually from images is a difficult task. Object recognition is often considered as a classification problem, this task can be performed using machine-learning techniques. Despite of many machine-learning algorithms, the classification is done using supervised classifiers such as Support Vector Machines (SVM) as the area of interest is known. We proposed a classification method, which considers neighboring pixels in a region for feature extraction and it evaluates classifications precisely according to neighboring classes for semantic interpretation of region of interest (ROI). A dataset has been created for training and testing purpose; we generated the attributes by considering pixel intensity values and mean values of reflectance. We demonstrated the benefits of using knowledge discovery and data-mining techniques, which can be on image data for accurate information extraction and classification from high spatial resolution remote sensing imagery.

Keywords: remote sensing, object recognition, classification, data mining, waterbody identification, feature extraction

Procedia PDF Downloads 315

140 Progress in Combining Image Captioning and Visual Question Answering Tasks

Authors: Prathiksha Kamath, Pratibha Jamkhandi, Prateek Ghanti, Priyanshu Gupta, M. Lakshmi Neelima

Abstract:

Combining Image Captioning and Visual Question Answering (VQA) tasks have emerged as a new and exciting research area. The image captioning task involves generating a textual description that summarizes the content of the image. VQA aims to answer a natural language question about the image. Both these tasks include computer vision and natural language processing (NLP) and require a deep understanding of the content of the image and semantic relationship within the image and the ability to generate a response in natural language. There has been remarkable growth in both these tasks with rapid advancement in deep learning. In this paper, we present a comprehensive review of recent progress in combining image captioning and visual question-answering (VQA) tasks. We first discuss both image captioning and VQA tasks individually and then the various ways in which both these tasks can be integrated. We also analyze the challenges associated with these tasks and ways to overcome them. We finally discuss the various datasets and evaluation metrics used in these tasks. This paper concludes with the need for generating captions based on the context and captions that are able to answer the most likely asked questions about the image so as to aid the VQA task. Overall, this review highlights the significant progress made in combining image captioning and VQA, as well as the ongoing challenges and opportunities for further research in this exciting and rapidly evolving field, which has the potential to improve the performance of real-world applications such as autonomous vehicles, robotics, and image search.

Keywords: image captioning, visual question answering, deep learning, natural language processing

Procedia PDF Downloads 51

139 Text Localization in Fixed-Layout Documents Using Convolutional Networks in a Coarse-to-Fine Manner

Authors: Beier Zhu, Rui Zhang, Qi Song

Abstract:

Text contained within fixed-layout documents can be of great semantic value and so requires a high localization accuracy, such as ID cards, invoices, cheques, and passports. Recently, algorithms based on deep convolutional networks achieve high performance on text detection tasks. However, for text localization in fixed-layout documents, such algorithms detect word bounding boxes individually, which ignores the layout information. This paper presents a novel architecture built on convolutional neural networks (CNNs). A global text localization network and a regional bounding-box regression network are introduced to tackle the problem in a coarse-to-fine manner. The text localization network simultaneously locates word bounding points, which takes the layout information into account. The bounding-box regression network inputs the features pooled from arbitrarily sized RoIs and refine the localizations. These two networks share their convolutional features and are trained jointly. A typical type of fixed-layout documents: ID cards, is selected to evaluate the effectiveness of the proposed system. These networks are trained on data cropped from nature scene images, and synthetic data produced by a synthetic text generation engine. Experiments show that our approach locates high accuracy word bounding boxes and achieves state-of-the-art performance.

Keywords: bounding box regression, convolutional networks, fixed-layout documents, text localization

Procedia PDF Downloads 167