Search results for: semantic relatedness
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 573

183 Corpus-Based Description of Core English Nouns of Pakistani English, an EFL Learner Perspective at Secondary Level

Authors: Abrar Hussain Qureshi

Abstract:

Vocabulary is a key indicator in any foreign language learning program, especially in English as a foreign language (EFL). It is often considered an important tool in the foreign language curriculum, and a deficient vocabulary impedes successful communication in the target language. Knowledge of the lexicon is essential to communicative competence and performance. Nouns make up a considerable share of English vocabulary; they are the backbone of the language and the main semantic carriers in spoken and written discourse, which makes their role all the more important. This research is a systematic effort to compile a list of highly frequent Pakistani English nouns for EFL learners at the secondary level. Such a list should encourage learner autonomy and save learners time. The corpus used for the research was developed locally from leading English-language newspapers of Pakistan. WordSmith Tools was used to process the data and to retrieve a word list of frequent Pakistani English nouns. The retrieved list of core Pakistani English nouns is expected to be useful for English language learners at the secondary level, as it covers a wide range of speech events.

Keywords: corpus, EFL, frequency list, nouns

182 Salmonella Emerging Serotypes in Northwestern Italy: Genetic Characterization by Pulsed-Field Gel Electrophoresis

Authors: Clara Tramuta, Floris Irene, Daniela Manila Bianchi, Monica Pitti, Giulia Federica Cazzaniga, Lucia Decastelli

Abstract:

This work presents the results obtained by the Regional Reference Centre for Salmonella Typing (CeRTiS) in a retrospective study that used Pulsed-Field Gel Electrophoresis (PFGE) analysis to investigate the genetic relatedness of emerging Salmonella serotypes of human origin circulating in north-western Italy. A further goal was to create a regional database to facilitate the investigation of foodborne outbreaks and to monitor them at an earlier stage. A total of 112 strains, isolated from 2016 to 2018 in hospital laboratories, were included in this study. The isolates had previously been identified as Salmonella according to standard microbiological techniques, and serotyping was performed according to ISO 6579-3 and the Kaufmann-White scheme using O and H antisera (Statens Serum Institut®). All strains were characterized by PFGE, following a standardized PulseNet protocol. The restriction enzyme XbaI was used to generate several distinguishable genomic fragments on the agarose gel. PFGE was performed on a CHEF Mapper system, separating large fragments and generating comparable genetic patterns. The agarose gel was then stained with GelRed® and photographed under ultraviolet transillumination. The PFGE patterns obtained from the 112 strains were compared using Bionumerics version 7.6 software with the Dice coefficient, with 2% band tolerance and 2% optimization. For each serotype, the PFGE data were compared according to geographical origin and year of isolation. The Salmonella strains were identified as follows: S. Derby n. 34; S. Infantis n. 38; S. Napoli n. 40. All the isolates had appreciable restriction digestion patterns ranging from approximately 40 to 1100 kb. In general, a fairly heterogeneous distribution of pulsotypes emerged in the different provinces. Cluster analysis indicated high genetic similarity (≥ 83%) among strains of S. Derby (n. 30; 88%), S. Infantis (n. 36; 95%) and S. Napoli (n. 38; 95%) circulating in north-western Italy. The study underlines the genomic similarities shared by the emerging Salmonella strains in north-western Italy and allowed the creation of a database to detect outbreaks at an early stage. The results confirm that PFGE is a powerful and discriminatory tool for investigating the genetic relationships among strains in order to monitor and control the spread of salmonellosis outbreaks. PFGE still represents one of the most suitable approaches to strain characterization, in particular for laboratories in which NGS techniques are not available.
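To make the band-pattern comparison concrete, the following is a minimal sketch of a Dice coefficient between two PFGE patterns. It is not the Bionumerics implementation: the function name `dice_similarity`, the representation of a pattern as a list of fragment sizes in kb, the greedy one-to-one matching, and the interpretation of the tolerance as a relative size difference are all assumptions made for illustration.

```python
def dice_similarity(bands_a, bands_b, tolerance=0.02):
    """Dice coefficient 2*m / (len(a) + len(b)) for two band patterns.

    Assumption: two bands "match" when their sizes differ by at most
    `tolerance` relative to the larger band. Real gel-analysis software
    defines tolerance on gel positions, which this sketch does not model.
    """
    unmatched_b = sorted(bands_b)
    matches = 0
    for size_a in sorted(bands_a):
        # Greedy matching: take the first still-unmatched band within tolerance.
        for i, size_b in enumerate(unmatched_b):
            if abs(size_a - size_b) <= tolerance * max(size_a, size_b):
                matches += 1
                del unmatched_b[i]
                break
    total = len(bands_a) + len(bands_b)
    return 2 * matches / total if total else 0.0


# Hypothetical fragment sizes in kb, not data from the study.
pattern_1 = [40, 100, 250, 600, 1100]
pattern_2 = [40.5, 100, 260, 600]
similarity = dice_similarity(pattern_1, pattern_2)
```

Under these assumptions, three of the bands pair up, so the similarity is 2·3 / (5 + 4). A clustering step (for example UPGMA) would then operate on the matrix of pairwise similarities.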

Keywords: emerging Salmonella serotypes, genetic characterization, human strains, PFGE

181 An Event Relationship Extraction Method Incorporating Deep Feedback Recurrent Neural Network and Bidirectional Long Short-Term Memory

Authors: Yin Yuanling

Abstract:

This paper proposes an event relationship extraction method that combines a Deep Feedback Recurrent Neural Network (DFRNN) with Bidirectional Long Short-Term Memory (BiLSTM) to address the low accuracy of traditional relationship extraction models. The method uses DFRNN to extract local features of the text through a deep feedback recurrent mechanism, BiLSTM to extract global features, and self-attention to extract semantic information. Experiments show that the method achieves an F1 value of 76.69% on the CEC dataset, 0.0652 (6.52 percentage points) higher than the BiLSTM+Self-ATT model, improving the performance of deep learning on the event relationship extraction task.

Keywords: event relations, deep learning, DFRNN models, bi-directional long and short-term memory networks

180 The Impact of Financial News and Press Freedom on Abnormal Returns around Earnings Announcements in Greater China

Authors: Yu-Chen Wei, Yang-Cheng Lu, I-Chi Lin

Abstract:

This study examines the impact of news sentiment and press freedom on abnormal returns around earnings announcements in Greater China, covering the Shanghai, Shenzhen and Taiwan stock markets. The news sentiment ratio is calculated through content analysis of semantic orientation. The empirical results show that news released prior to the event date may decrease the cumulative abnormal returns prior to the earnings announcement, regardless of whether it is released in China or Taiwan. By contrast, optimistic financial news may increase a company's cumulative abnormal returns on the announcement date. Furthermore, differences in press freedom within Greater China are used to compare the impact of press freedom on abnormal returns. The findings show that the freer the press is, the more significantly negative the impact of news on abnormal returns, which means that press freedom may reduce the ability of news to affect abnormal returns. The intuition is that, in a market with greater press freedom, investors may receive alternative news about each company, which indicates market efficiency and reduces possible excess returns.

Keywords: news, press freedom, Greater China, earnings announcement, abnormal returns

179 How to Teach Italian Intransitive Verbs: Focusing on Unaccusatives and Unergatives

Authors: Joung Hyoun Lee

Abstract:

Intransitive verbs consist of two subclasses, unergatives and unaccusatives. Traditionally, however, Italian intransitive verbs have been taught without regard to their semantic distinctions and without any mention of grammatical terms such as unaccusative and unergative, even though the two subclasses differ greatly. This paper explores the teaching of Italian intransitive verbs categorized into unaccusatives and unergatives, and compares it with research on the teaching of English unaccusative and unergative verbs. For this purpose, the study first analyses various aspects of English and Italian unergatives and unaccusatives and the properties of their constructions. Next, it highlights the research trend on Korean students' learning errors, which leans toward causal analyses of the overpassivization of English unaccusative verbs. To investigate these issues, 53 students at the Busan University of Foreign Studies who are studying Italian as a second language were surveyed through a grammaticality judgment test divided into 9 sections. As expected, the results for Italian unaccusatives and unergatives showed both similarities to and differences from those for English. Moreover, there was strong demand for a more careful way of teaching, one that takes both syntax and semantics into account according to the grammatical items. The research provides a framework for a more effective and systematic method of teaching Italian intransitive verbs, as a basis for further research.

Keywords: unaccusative verbs, unergative verbs, agent, patient, theme, overpassivization

178 Definite Article Errors and Effect of L1 Transfer

Authors: Bimrisha Mali

Abstract:

The present study investigates the types of errors that English as a second language (ESL) learners produce when using the definite article ‘the’. The participants were given a learner ability test in the form of a questionnaire consisting of three cloze tests and two free composition tests. Each participant's response was collected as written data. A total of 78 participants from three government schools took part in the study. The participants are high-school students from rural Assam, a north-eastern state of India. Their ages ranged from 14 to 15. Instruction and communication among the students take place in the local language, Assamese. Pit Corder’s steps for conducting error analysis were followed in the analysis. Four types of errors were found: (1) deletion of the definite article, (2) use of the definite article as an adjective-like modifier, (3) incorrect use of the definite article with singular proper nouns, and (4) substitution of the indefinite article ‘a’ for the definite article. In Assamese, classifiers that express definiteness are used with nouns, adjectives, and numerals. It is found that native language (L1) transfer plays a pivotal role in the learners’ errors. The analysis reveals the learners' inability to acquire the semantic connotation of definiteness in English due to L1 interference.

Keywords: definite article error, l1 transfer, error analysis, ESL

177 Laying the Proto-Ontological Conditions for Floating Architecture as a Climate Adaptation Solution for Rising Sea Levels: Conceptual Framework and Definition of a Performance Based Design

Authors: L. Calcagni, A. Battisti, M. Hensel, D. S. Hensel

Abstract:

Since the beginning of the 21st century, we have seen a dynamic growth of water-based (WB) architecture, mainly due to the increasing threat of floods caused by sea level rise and heavy rains, all correlated with climate change. At the same time, the shortage of land available for urban development also led architects, engineers, and policymakers to reclaim the seabed or to build floating structures. Furthermore, the drive to produce energy from renewable resources has expanded the sector of offshore research, mining, and energy industry which seeks new types of WB structures. In light of these considerations, the time is ripe to consider floating architecture as a full-fledged building typology. Currently, there is no universally recognized academic definition of a floating building. Research on floating architecture lacks a proper, commonly shared vocabulary and typology distinction. Moreover, there is no global international legal framework for urban development on water, and there is no structured performance based building design (PBBD) approach for floating architecture in most countries, let alone national regulatory systems. Thus, first of all, the research intends to overcome the semantic and typological issues through the conceptualization of floating architecture, laying the proto-ontological conditions for floating development, and secondly to identify the parameters to be considered in the definition of a specific PBBD framework, setting the scene for national planning strategies. The theoretical overview and re-semanticization process involve the attribution of a new meaning to the term floating architecture. This terminological work of semantic redetermination is carried out through a systematic literature review and involves quantitative and historical research as well as logical argumentation methods. 
As floating urban development is most likely to take place as an extension of coastal areas, its needs and design criteria are more similar to those of the urban environment than to those of the offshore industry. Therefore, the identification and categorization of parameters, looking towards the potential formation of a PBBD framework for floating development, take urban and architectural guidelines and regulations as the starting point and draw the missing aspects, such as hydrodynamics (i.e., stability and buoyancy), from the offshore and shipping regulatory frameworks. This study is carried out through an evidence-based assessment of regulatory systems in force in different countries around the world, addressing on-land and on-water architecture as well as the offshore and shipping industries. It involves evidence-based research and logical argumentation methods. Overall, inhabiting water is proposed not only as a viable response to the problem of rising sea levels, and thus as a resilient frontier for urban development, but also as a response to energy insecurity, clean water and food shortages, environmental concerns, and urbanization, in line with Blue Economy principles and the 2030 Agenda. This review shows how floating architecture is, to all intents and purposes, an urban adaptation measure and a solution towards self-sufficiency and energy-saving objectives. Moreover, the adopted methodology is open to further improvements and integrations, and is thus neither rigid nor fully determined. Along with new designs and functions that will come into play in practice, life on water will eventually seem no more unusual than life on land, especially by virtue of the multiple advantages it provides not only to users but also to the environment.

Keywords: adaptation measures, building typology, floating architecture, performance based building design, rising sea levels

176 Selecting Answers for Questions with Multiple Answer Choices in Arabic Question Answering Based on Textual Entailment Recognition

Authors: Anes Enakoa, Yawei Liang

Abstract:

Question answering (QA) is one of the most important and demanding tasks in the field of Natural Language Processing (NLP). In QA systems, the answer generation task produces a list of candidate answers to the user's question, only one of which is correct. Answer selection, one of the main components of a QA system, is concerned with selecting the best answer from the candidates suggested by the system. However, the selection process can be very challenging, especially in Arabic, due to the particularities of the language. To address this challenge, an approach is proposed for answering questions with multiple answer choices in Arabic QA systems based on Textual Entailment (TE) recognition. The approach employs a Support Vector Machine that considers lexical, semantic and syntactic features in order to recognize entailment between the generated hypotheses (H) and the text (T). A set of experiments was conducted for performance evaluation; the proposed method reached an overall accuracy of 67.5% with a C@1 score of 80.46%. The results are promising and demonstrate that the proposed method is effective for the TE recognition task.
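The two evaluation measures reported above can be sketched as follows. The c@1 formula used here is the commonly cited definition, c@1 = (n_R + n_U · n_R / n) / n, which rewards a system for leaving a question unanswered rather than answering it wrongly; whether the paper computes it exactly this way is an assumption, and the function names and counts are hypothetical.

```python
def accuracy(n_correct, n_total):
    """Fraction of all questions answered correctly."""
    return n_correct / n_total


def c_at_1(n_correct, n_unanswered, n_total):
    """c@1: unanswered questions earn credit at the system's accuracy rate."""
    return (n_correct + n_unanswered * n_correct / n_total) / n_total


# Hypothetical counts, not taken from the paper.
n_total, n_correct, n_unanswered = 100, 60, 20
plain_accuracy = accuracy(n_correct, n_total)
c1_score = c_at_1(n_correct, n_unanswered, n_total)
```

When no question is left unanswered the two measures coincide; otherwise c@1 is the higher of the two whenever at least one answer is correct.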

Keywords: information retrieval, machine learning, natural language processing, question answering, textual entailment

175 Francophone University Students' Attitudes Towards English Accents in Cameroon

Authors: Eric Agrie Ambele

Abstract:

The norms and models to adopt in the teaching and learning of English pronunciation are key issues in English Language Teaching in ESL contexts today. This paper discusses these issues based on a study of the attitudes of some Francophone university students in Cameroon towards three English accents spoken in Cameroon: Cameroon Francophone English (CamFE), Cameroon English (CamE), and Hyperlectal Cameroon English (near-standard British English). To learn more about how these accents are regarded among these students, an aspect that had hitherto received little attention in the literature, a language attitude questionnaire and the matched-guise technique were used. Two methods of data analysis were employed: (1) the percentage count procedure, and (2) the semantic differential scale. The findings reveal that the participants’ attitudes towards the selected accents vary in degree. Though Hyperlectal CamE ranked first, CamE second and CamFE third, no accent, on average, received a negative evaluation. Two things can be deduced from these findings: first, CamE is gaining more and more recognition and can stand as an autonomous accent; second, the fact that the participants all rated Hyperlectal CamE higher than CamE implies that they would be less motivated in a context where CamE is the learning model. By implication, Hyperlectal Cameroon English should be the model in the teaching of English pronunciation to Francophone learners in Cameroon.

Keywords: teaching pronunciation, English accents, Francophone learners, attitudes

174 Genetic Diversity of Termite (Isoptera) Fauna of Western Ghats of India

Authors: A. S. Vidyashree, C. M. Kalleshwaraswamy, R. Asokan, H. M. Mahadevaswamy

Abstract:

Termites are vital ecological actors in tropical ecosystems and have been designated “ecosystem engineers” because of their significant role in providing soil ecosystem services. Despite their importance, our understanding of a number of basic biological processes in termites is extremely limited. Developing a better understanding of termite biology depends closely on consistent species identification. At present, identification of termites relies on the soldier caste. For many species, however, the soldier caste has not been reported, which creates confusion in identification. Molecular markers may help in estimating phylogenetic relatedness between termite species and genetic differentiation among local populations within each species. To investigate this, termite samples were collected from various places in the Western Ghats covering four states, namely Karnataka, Kerala, Tamil Nadu and Maharashtra, during 2013-15. Termite samples were identified based on their morphological characteristics, molecular characteristics, or both. The survey of the termite fauna in Karnataka, Kerala, Maharashtra and Tamil Nadu indicated the presence of 16 species belonging to 4 subfamilies under two families, viz., Rhinotermitidae and Termitidae. Termitidae was the dominant family, represented by 4 genera and four subfamilies, viz., Macrotermitinae, Amitermitinae, Nasutitermitinae and Termitinae. Amitermitinae had three species, namely Microcerotermes fletcheri, M. pakistanicus and Speculitermes sinhalensis. Macrotermitinae had the highest number of species, belonging to two genera, namely Microtermes and Odontotermes. The genus Microtermes had only one species, Microtermes obesi. The genus Odontotermes was represented by the highest number of species (seven): O. obesus was the dominant (41 per cent) and most widely distributed species in Karnataka, Kerala, Maharashtra and Tamil Nadu, followed by O. feae (19 per cent), O. assmuthi (11 per cent) and others such as O. bellahunisensis, O. horni, O. redemanni and O. yadevi. Nasutitermitinae was represented by two genera, with the species Nasutitermes anamalaiensis and Trinervitermes biformis. The subfamily Termitinae was represented by Labiocapritermes distortus. Rhinotermitidae was represented by a single subfamily, Heterotermitinae, in which two species, Heterotermes balwanthi and H. malabaricus, were recorded. Genetic relationships among termites collected from various locations in the Western Ghats of India were characterized based on mitochondrial DNA sequences (12S, 16S, and COII). Sequence analysis was performed and divergence among the species was assessed. The results suggest that the use of both molecular and morphological approaches is crucial in ensuring accurate species identification. Efforts were made to understand the evolution of these termites and to address the ambiguities in morphological taxonomy. The implications of the study for revising the taxonomy of Indian termites, their characterization, and molecular comparisons between the sequences are discussed.

Keywords: isoptera, mitochondrial DNA sequences, rhinotermitidae, termitidae, Western ghats

173 Deep Supervision Based-Unet to Detect Buildings Changes from VHR Aerial Imagery

Authors: Shimaa Holail, Tamer Saleh, Xiongwu Xiao

Abstract:

Building change detection (BCD) from satellite imagery is an essential topic in urbanization monitoring, agricultural land management, and the updating of geospatial databases. Recently, deep learning-based change detection methods have made significant progress and achieved impressive results. However, they remain insensitive to changes in buildings with complex spectral differences, and the extracted features are not discriminative enough, resulting in incomplete buildings and irregular boundaries. To overcome these problems, this paper proposes a dual Siamese network based on the Unet model with the addition of a deep supervision (DS) strategy. The network consists of a backbone (encoder) pre-trained on ImageNet, a fusion block, and feature pyramid networks (FPN) that enhance the step-by-step information of the changing regions and yield a more accurate BCD map. To train the proposed method, we created a new dataset (EGY-BCD) of high-resolution, multi-temporal aerial images captured over New Cairo, Egypt, for detecting building changes. The experimental results showed that the proposed method is effective and performs well on the EGY-BCD dataset, with an overall accuracy, F1-score, and mIoU of 91.6%, 80.1%, and 73.5%, respectively.
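The three reported measures can be illustrated with a small sketch that computes them from flattened binary change masks. The definitions below are the usual ones (F1 on the "change" class, mIoU averaged over the "change" and "no change" classes); whether the paper uses exactly these conventions is an assumption, and the function name and sample masks are hypothetical.

```python
def change_detection_metrics(pred, truth):
    """Overall accuracy, F1 (change class) and mean IoU for binary masks.

    `pred` and `truth` are equal-length sequences of 0/1 pixel labels,
    where 1 marks a changed pixel.
    """
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif not p and t:
            fn += 1
        else:
            tn += 1
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    iou_change = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    iou_no_change = tn / (tn + fp + fn) if tn + fp + fn else 0.0
    return {
        "overall_accuracy": (tp + tn) / total if total else 0.0,
        "f1": f1,
        "miou": (iou_change + iou_no_change) / 2,
    }


# Hypothetical eight-pixel masks, not data from the paper.
predicted = [1, 1, 0, 0, 1, 0, 0, 0]
reference = [1, 0, 0, 0, 1, 1, 0, 0]
metrics = change_detection_metrics(predicted, reference)
```

Because changed pixels are usually a small minority, overall accuracy tends to look high even for weak models, which is one reason F1 and mIoU are reported alongside it.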

Keywords: building change detection, deep supervision, semantic segmentation, EGY-BCD dataset

172 A Novel Framework for User-Friendly Ontology-Mediated Access to Relational Databases

Authors: Efthymios Chondrogiannis, Vassiliki Andronikou, Efstathios Karanastasis, Theodora Varvarigou

Abstract:

A large amount of data is typically stored in relational databases (DB), which can efficiently handle user queries intended to elicit the appropriate information from data sources. However, direct access to and use of this data requires end users to have an adequate technical background and to cope with the internal data structure and the values stored. Consequently, information retrieval is quite a difficult process even for IT or DB experts, given how little relational databases offer from the conceptual point of view. Ontologies enable users to formally describe a domain of knowledge in terms of concepts and the relations among them, and hence can be used to specify unambiguously the information captured by a relational database. However, accessing information residing in a database using ontologies is feasible only if the users are comfortable using semantic web technologies. To enable users from different disciplines to retrieve the appropriate data, a Graphical User Interface is necessary. In this work, we present an interactive, ontology-based, semantically enabled web tool that can be used for information retrieval purposes. The tool is based entirely on the ontological representation of the underlying database schema, and it provides a user-friendly environment in which users can graphically form and execute their queries.

Keywords: ontologies, relational databases, SPARQL, web interface

171 Using Closed Frequent Itemsets for Hierarchical Document Clustering

Authors: Cheng-Jhe Lee, Chiun-Chieh Hsu

Abstract:

With the rapid development of the Internet and the increased availability of digital documents, the excess of information on the Internet has led to an information overload problem. To address this problem and support effective information retrieval, document clustering has become a popular research topic in text mining. Clustering is the unsupervised classification of data items into groups without the need for training data. Many conventional document clustering methods perform inefficiently on large document collections because they were originally designed for relational databases. They are therefore impractical for real-world document clustering, which requires special handling of high dimensionality and high volume. We propose the FIHC (Frequent Itemset-based Hierarchical Clustering) method, a hierarchical clustering method developed for document clustering, whose intuition is that each cluster shares some common words. FIHC uses such words to cluster documents and builds a hierarchical topic tree. In this paper, we combine the FIHC algorithm with an ontology to address the semantic problem and mine the meaning behind the words in documents. Furthermore, we use closed frequent itemsets instead of all frequent itemsets, which increases efficiency and scalability. The experimental results show that our method is more accurate than well-known document clustering algorithms.
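The difference between frequent and closed frequent itemsets can be shown with a brute-force sketch. An itemset is closed when no proper superset has the same support, so the closed sets are a smaller collection that still carries every support count. This is an illustration on toy data only, not the FIHC algorithm or an efficient miner; the function names and the sample documents are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(docs, min_support):
    """Map each word set found in at least `min_support` documents to its count."""
    doc_sets = [frozenset(d) for d in docs]
    vocabulary = sorted(set().union(*doc_sets))
    result = {}
    for size in range(1, len(vocabulary) + 1):
        found_any = False
        for combo in combinations(vocabulary, size):
            items = frozenset(combo)
            support = sum(1 for d in doc_sets if items <= d)
            if support >= min_support:
                result[items] = support
                found_any = True
        if not found_any:
            # No frequent set of this size means none of any larger size.
            break
    return result


def closed_itemsets(frequent):
    """Keep only itemsets with no proper superset of equal support."""
    return {
        items: support
        for items, support in frequent.items()
        if not any(items < other and support == other_support
                   for other, other_support in frequent.items())
    }


# Hypothetical documents, each reduced to its set of words.
docs = [
    ["apple", "fruit", "juice"],
    ["apple", "fruit"],
    ["apple", "fruit", "pie"],
    ["stock", "market"],
    ["stock", "market", "apple"],
]
frequent = frequent_itemsets(docs, min_support=2)
closed = closed_itemsets(frequent)
```

On this toy input, six itemsets are frequent but only three are closed, so a clustering step that builds candidate clusters from itemsets has half as many candidates to consider.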

Keywords: FIHC, documents clustering, ontology, closed frequent itemset

170 Conceptual Model for Massive Open Online Blended Courses Based on Disciplines’ Concepts Capitalization and Obstacles’ Detection

Authors: N. Hammid, F. Bouarab-Dahmani, T. Berkane

Abstract:

Since its appearance, the MOOC (massive open online course) has been gaining more and more attention from educational communities around the world. Apart from current MOOC designs and purposes, the creators of the MOOC focused on the importance of connection and knowledge exchange between individuals in learning. In this paper, we present a conceptual model for massive open online blended courses in which teachers around the world can collaborate and exchange their experience to produce common, efficient content, designed as a MOOC open to their students so that they enjoy a better learning experience. The model is based on the capitalization of disciplines’ concepts and the detection of the obstacles met by students when faced with problem situations (exercises, projects, case studies, etc.). This detection is made possible by analyzing the frequency of semantic errors committed by the students. The participation of teachers in the design of the course and the attendance of their students can guarantee efficient and extensive participation in the course (a large number of participants), sustain learners’ motivation, and address evaluation issues, since the teachers who design the course also assess their students. Thus, the teachers’ review, together with their knowledge, offers better assessment and efficient connections to their students.

Keywords: massive open online course, MOOC, online learning, e-learning

169 English Grammatical Errors of Arabic Sentence Translations Done by Machine Translations

Authors: Muhammad Fathurridho

Abstract:

Grammar, the set of rules every language uses in order to be understood, is always related to syntax and morphology. Arabic grammar differs from the grammars of other languages: it has more rules and more difficulties. This paper aims to investigate and describe the English grammatical errors made by machine translation systems in translating Arabic sentences, including declarative, exclamatory, imperative, and interrogative sentences, specifically in 2018, when such systems can be supported by artificial intelligence. The Arabic sample sentences, divided into verbal and nominal sentences and drawn from several published Arabic texts, are examined as the source language samples. The translations produced by several popular online machine translation systems, including Google Translate, Microsoft Bing, Babylon, Facebook, Hellotalk, Worldlingo, Yandex Translate, and Tradukka Translate, are the material objects of this research. A descriptive method is used to show the grammatical errors in the English target language and to classify them. The paper concludes that the grammatical errors in machine translation output are varied and can generally be classified into morphological, syntactic, and semantic errors across all types of Arabic words (noun, verb, and particle), and that these findings can serve machine translation providers as an evaluation for correcting such errors and making their output more understandable.

Keywords: Arabic, Arabic-English translation, machine translation, grammatical errors

168 How Unicode Glyphs Revolutionized the Way We Communicate

Authors: Levi Corallo

Abstract:

Language typed by humans on computers and cell phones differs significantly from previous modes of written language exchange. While acronyms remain one of the most predominant markers of typed language, another and perhaps more recent revolution in the way humans communicate has been the use of symbols or glyphs, primarily Emojis, introduced globally on the iPhone keyboard by Apple in 2008. This paper analyzes the use of symbols in typed communication from both a linguistic and a machine learning perspective. The Unicode system is explored, and methods of encoding are juxtaposed with current machine and human perception. The paper examines how typed symbols are used in conversation and surveys current research methods dealing with Emojis, such as sentiment analysis and predictive text models. This study proposes that sequential analysis is a significant feature for analyzing Unicode characters in a corpus with machine learning. Current models that try to learn or translate the meaning of Emojis should begin by learning from bi- and tri-grams of Emoji, as well as by observing the relationships between combinations of different Emoji in tandem. The sociolinguistics of an entirely new vernacular, referred to here as ‘typed language’, is also delineated throughout my analysis of Unicode glyphs from both a semantic and a technical perspective.
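The bi- and tri-gram idea can be sketched with a short counter over the emoji found in each message. This is only an illustration under simplifying assumptions: emoji are detected by two hard-coded code point ranges (U+2600 to U+27BF and U+1F300 to U+1FAFF), which treats skin-tone modifiers as separate symbols and splits zero-width-joiner sequences; text between emoji is ignored, so emoji separated by words count as adjacent. The function names and sample messages are hypothetical.

```python
from collections import Counter


def is_emoji(ch):
    """Crude emoji test by code point range (an assumption, see lead-in)."""
    cp = ord(ch)
    return 0x1F300 <= cp <= 0x1FAFF or 0x2600 <= cp <= 0x27BF


def emoji_sequence(message):
    """Return the emoji of a message in order, dropping all other characters."""
    return [ch for ch in message if is_emoji(ch)]


def emoji_ngrams(messages, n):
    """Count emoji n-grams within each message (never across messages)."""
    counts = Counter()
    for message in messages:
        seq = emoji_sequence(message)
        for i in range(len(seq) - n + 1):
            counts[tuple(seq[i:i + n])] += 1
    return counts


# Hypothetical messages: thumbs up, party popper, face with tears of joy.
messages = [
    "great job \U0001F44D\U0001F389",
    "\U0001F44D\U0001F389\U0001F389 see you",
    "lol \U0001F602",
]
bigrams = emoji_ngrams(messages, 2)
trigrams = emoji_ngrams(messages, 3)
```

A production system would more likely segment text into grapheme clusters and consult the Unicode emoji data files rather than rely on fixed ranges.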

Keywords: unicode, text symbols, emojis, glyphs, communication

167 An Experiential Learning of Ontology-Based Multi-document Summarization by Removal Summarization Techniques

Authors: Pranjali Avinash Yadav-Deshmukh

Abstract:

The remarkable development of the Internet, along with technological innovations such as high-speed systems and affordable large-scale storage, has led to a tremendous increase in the amount and accessibility of digital records. Studying all of this data is extremely time-consuming for any one person, so there is a great need for effective multi-document summarization (MDS) systems that can condense the information found in several documents into a short, understandable summary. Our system provides a structure, as a theoretical design, for the semantic representation of textual information in the ontology domain. The suitability of using an ontology to solve multi-document summarization problems in the domain of disaster management is examined through the proposed design. A saliency score is assigned to each sentence, sentences are ranked according to their scores, and the top-ranked sentences are chosen as the summary. Summary quality is evaluated in extensive tests on a collection of media announcements related to the “Jammu Kashmir Overflow in 2014” records. Ontology-based multi-document summarization methods using “NLP centered extraction” outperform the other baselines. Our contribution to the proposed component is to apply NLP-based information extraction methods to enhance the results.
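The ranking step described above (score each sentence, sort, keep the top ones) can be sketched as follows. The abstract does not say how saliency is computed, so the score used here, the average corpus frequency of a sentence's words, is a stand-in chosen purely for illustration; it uses no ontology and no stop-word removal. The function name and sample texts are hypothetical.

```python
import re
from collections import Counter


def rank_sentences(documents, top_k=2):
    """Return the `top_k` sentences with the highest stand-in saliency score."""
    sentences = [
        s.strip()
        for doc in documents
        for s in re.split(r"(?<=[.!?])\s+", doc)
        if s.strip()
    ]
    tokenized = [re.findall(r"[a-z]+", s.lower()) for s in sentences]
    term_counts = Counter(w for words in tokenized for w in words)

    def score(words):
        # Assumed saliency: mean frequency of the sentence's words across all input.
        return sum(term_counts[w] for w in words) / len(words) if words else 0.0

    scored = sorted(
        zip(sentences, (score(words) for words in tokenized)),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return [sentence for sentence, _ in scored[:top_k]]


# Hypothetical input documents.
documents = [
    "Flood water rose quickly. Rescue teams reached the flood area.",
    "The flood damaged roads. Officials met on Tuesday.",
]
summary = rank_sentences(documents, top_k=1)
```

An ontology-based variant would presumably replace the word-frequency score with one derived from domain concepts matched in each sentence.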

Keywords: disaster management, extraction technique, k-means, multi-document summarization, NLP, ontology, sentence extraction

Procedia PDF Downloads 384
166 Information Disclosure and Financial Sentiment Index Using a Machine Learning Approach

Authors: Alev Atak

Abstract:

In this paper, we aim to create a financial sentiment index by investigating companies’ voluntary information disclosures. We retrieve structured content from BIST 100 companies’ financial reports for the period 1998-2018 and extract relevant financial information for sentiment analysis through Natural Language Processing. We measure strategy-related disclosures and their cross-sectional variation and classify report content into generic sections using synonym lists divided into four main categories according to their liquidity risk profile, risk positions, intra-annual information, and exposure to risk. We use Word Error Rate and Cosine Similarity for comparing and measuring text similarity and derivation in sets of texts. In addition to performing text extraction, we will provide a range of text analysis options, such as readability metrics, word counts using pre-determined lists (e.g., forward-looking, uncertainty, tone, etc.), and comparison with a reference corpus (word, parts of speech and semantic level). Therefore, we create an adequate analytical tool and a financial dictionary to depict the importance of granular financial disclosure for investors to correctly identify risk-taking behavior and hence make the aggregated effects traceable.
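The two text-comparison measures named in this abstract have standard definitions, sketched below in plain Python. Word Error Rate is the word-level edit distance (substitutions, insertions, deletions) divided by the reference length, and cosine similarity is computed here over raw word-count vectors. Whitespace tokenization and the function names are assumptions made for illustration; the abstract does not say how the authors tokenize or weight terms.

```python
import math
from collections import Counter


def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hyp) + 1))  # distances against an empty reference
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1] / len(ref)


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between two bag-of-words count vectors."""
    a, b = Counter(text_a.split()), Counter(text_b.split())
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


wer = word_error_rate("risk is high", "risk is low")
same = cosine_similarity("liquidity risk", "liquidity risk")
disjoint = cosine_similarity("liquidity", "exposure")
```

The two measures are complementary: Word Error Rate is sensitive to word order, whereas bag-of-words cosine similarity ignores it.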

Keywords: financial sentiment, machine learning, information disclosure, risk

Procedia PDF Downloads 92
165 Number Variation of the Personal Pronoun we Used by Chinese English Learners

Authors: Qiong Hu, Ming Yue

Abstract:

Language variation signals the newest usage of a language community, which might become the developmental trend of that language. However, language textbooks cannot keep up with these emergent usages. Most Chinese English learners nowadays are still exposed to the traditional grammar prescribed in textbooks, so some variational usages cannot be acquired. The personal pronoun we is prescribed as a plural pronoun in textbook grammar, but its number value is more flexible in actual use. Based on the Chinese Learner English Corpus (CLEC), and with the homemade Friends corpus as reference, the present research explores the number value of the first person pronoun we used by Chinese English learners. With consideration of the subjectivity of we, this paper annotated the number value of all instances of we in “we + PCU (perception-cognition-utterance) verbs” collocations. Results show that, though learners are exposed to traditional textbooks which prescribe the plural reference of we, some unconventional usage (singular or vague in reference) still exists in their writing, although it is less frequent than in native speech. Corpus data and results from manual semantic annotation show that this could be due to the impact of formulaic sequences on the learners and positive transfer from their native language. An improved SLA model of native language, target language and interlanguage is put forward to recognize the existence of variation in second language acquisition, which should be given more attention during teaching.

Keywords: Chinese English learners, number, PCU verbs, Personal pronoun we

Procedia PDF Downloads 354
164 Synthetic Method of Contextual Knowledge Extraction

Authors: Olga Kononova, Sergey Lyapin

Abstract:

Global information society requirements are transparency and reliability of data, as well as the ability to manage information resources independently, particularly to search, analyze, and evaluate information, thereby obtaining new expertise. Moreover, it is satisfying society's information needs that increases the efficiency of enterprise management and public administration. The study of structurally organized thematic and semantic contexts of different types, automatically extracted from unstructured data, is one of the important tasks for the application of information technologies in education, science, culture, governance and business. The objectives of this study are the typologization of contextual knowledge and the selection or creation of effective tools for extracting and analyzing it. Explication of various kinds and forms of contextual knowledge involves the development and use of full-text search information systems. For implementation purposes, the authors use services of the e-library 'Humanitariana', such as contextual search, different types of queries (paragraph-oriented query, frequency-ranked query), and automatic extraction of knowledge from scientific texts. The multifunctional e-library 'Humanitariana' is realized in the Internet architecture in a WWS configuration (Web-browser / Web-server / SQL-server). The advantage of using 'Humanitariana' lies in the possibility of combining the resources of several organizations. Scholars and research groups may work in a local network mode and in distributed IT environments, with the ability to access the resources of any participating organization's servers. The paper discusses some specific cases of contextual knowledge explication using the e-library services and focuses on the possibilities of new types of contextual knowledge. The experimental research base consists of scientific texts about 'e-government' and 'computer games'.
An analysis of trends in the subject-themed texts made it possible to propose a content analysis methodology that combines a full-text search with automatic construction of a 'terminogramma' and expert analysis of the selected contexts. A 'terminogramma' is laid out as a table that contains a column with a frequency-ranked list of words (nouns), as well as columns indicating the absolute frequency (number) and the relative frequency of occurrence of the word (in %% ppm). The analysis of 'e-government' materials showed that the state takes a dominant position in the processes of electronic interaction between the authorities and society in modern Russia. The media credited the main role in these processes to the government, which provided public services through specialized portals. Factor analysis revealed two factors statistically describing the used terms: human interaction (the user) and the state (government, processes organizer); interaction management (public officer, processes performer) and technology (infrastructure). Isolation of these factors will lead to changes in the model of electronic interaction between government and society. In this study, the dominant social problems and the prevalence of different categories of subjects of computer gaming in science papers from 2005 to 2015 were identified. Accordingly, several types of contextual knowledge are identified: micro context; macro context; dynamic context; thematic collection of queries (interactive contextual knowledge expanding a composition of e-library information resources); multimodal context (functional integration of iconographic and full-text resources through a hybrid quasi-semantic search algorithm). Further studies can be pursued both in terms of expanding the resource base on which they are held, and in terms of the development of appropriate tools.
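The 'terminogramma' table described above (a frequency-ranked word list with absolute and relative frequencies) can be sketched in a few lines of Python. This is an illustration, not the 'Humanitariana' implementation: it assumes the input has already been reduced to nouns, and it reports relative frequency both as a percentage and in parts per million, which is one reading of the abstract's unit notation. The function name and sample words are hypothetical.

```python
from collections import Counter


def terminogramma(nouns):
    """Build rows of (word, absolute count, percent, parts per million).

    Rows are ordered from most to least frequent. The input is assumed
    to be a list of nouns already extracted from the search results.
    """
    counts = Counter(nouns)
    total = sum(counts.values())
    return [
        (word, n, 100.0 * n / total, 1_000_000 * n / total)
        for word, n in counts.most_common()
    ]


# Hypothetical sample of nouns taken from retrieved contexts.
rows = terminogramma(["state", "portal", "state", "service"])
for word, count, percent, ppm in rows:
    print(f"{word:<10}{count:>5}{percent:>8.1f}%{ppm:>12.0f} ppm")
```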

Keywords: contextual knowledge, contextual search, e-library services, frequency-ranked query, paragraph-oriented query, technologies of the contextual knowledge extraction

Procedia PDF Downloads 357
163 Original and the Translated: A Comparative Evaluation of Native and Non-Native English Translations of Faiz

Authors: Anam Nawaz

Abstract:

The present study is an attempt to compare the translations of Faiz’s poetry made by native and non-native translators, to determine the role of the translator in terms of preserving the cultural ethos of the original text. Peter Newmark and Katharine Reiss’s approaches to translation criticism have been used to provide a theoretical framework for the study. This study also emphasizes those cultural and semantic aspects of the original which are translated more convincingly by a native translator, and contrasts them with those features which the non-natives can tackle more ably. The research also highlights the linguistic sockets ignored by the interpreters in the translation process. The analysis showed that both native and non-native translators have made an admirable effort to stay as close to the original as possible. The natives, with their advantage of belonging to the same culture, have excelled in preserving the original subject matter, whereas the non-native renderings have been presented in a much more rhythmic and poetic manner with an excellent choice of words. Though none of the four translators has been able to successfully recreate Faiz’s magic, V. G. Kiernan and Sarvat Rahman’s translations can be regarded as the closest to the original. Whereas V. G. Kiernan with his outstanding command over English mesmerizes the readers, Sarvat Rahman’s profound understanding of cultural ties helps establish her translations as a brilliant example of faithful re-renderings.

Keywords: comparative translations, linguistic and cultural constraints, native translators, non-native translators, poetry and translation, Faiz Ahmad Faiz

Procedia PDF Downloads 259
162 Quantitative Analysis of the Quality of Housing and Land Use in the Built-Up Area of the Croatian Coastal City of Zadar

Authors: Silvija Šiljeg, Ante Šiljeg, Branko Cavrić

Abstract:

Housing is considered as a basic human need and important component of the quality of life (QoL) in urban areas worldwide. In contemporary housing studies, the concept of the quality of housing (QoH) is considered as a multi-dimensional and multi-disciplinary field. It emphasizes connection between various aspects of the QoL which could be measured by quantitative and qualitative indicators at different spatial levels (e.g. local, city, metropolitan, regional). The main goal of this paper is to examine the QoH and compare results of quantitative analysis with the clutter land use categories derived for selected local communities in Croatian Coastal City of Zadar. The qualitative housing analysis based on the four housing indicators (out of total 24 QoL indicators) has provided identification of the three Zadar’s local communities with the highest estimated QoH ranking. Furthermore, by using GIS overlay techniques, the QoH was merged with the urban environment analysis and introduction of spatial metrics based on the three categories: the element, class and environment as a whole. In terms of semantic-content analysis, the research has also generated a set of indexes suitable for evaluation of “housing state of affairs” and future decision making aiming at improvement of the QoH in selected local communities.

Keywords: housing, quality, indicators, indexes, urban environment, GIS, element, class

Procedia PDF Downloads 406
161 Development of a French to Yorùbá Machine Translation System

Authors: Benjamen Nathaniel, Eludiora Safiriyu Ijiyemi, Egume Oneme Lucky

Abstract:

A review of machine translation systems shows that many computational artefacts have been developed to translate written or spoken texts from a source language into Yorùbá. However, there is no work on a French-to-Yorùbá machine translation system; hence, this study investigated the process involved in translating French into its Yorùbá equivalent, with a view to adopting a rule-based MT approach to build a machine translation framework from simple sentences administered through a questionnaire. Articles and relevant textbooks were reviewed, and key speakers of both languages were interviewed, to find out the processes involved in translating French simple sentences into their Yorùbá equivalents using home domain terminologies. To achieve this, a model was formulated using phrase grammar structure, re-write rules, parse trees, and automata theory-based techniques, and was designed and implemented with the unified modeling language (UML) and the Python programming language, respectively. The evaluation showed that the machine translation system performed 18.45% above the Experimental Subject Respondent and 2.7% below the Linguistics Expert when analysed for word orthography, sentence syntax and semantic correctness of the sentences. When compared with the Google machine translation system, the developed system performed better on lexicons of the target language.

Keywords: machine translation (MT), rule-based, French language, Yorùbá language

Procedia PDF Downloads 76
160 Analysis Model for the Relationship of Users, Products, and Stores on Online Marketplace Based on Distributed Representation

Authors: Ke He, Wumaier Parezhati, Haruka Yamashita

Abstract:

Recently, online marketplaces such as Rakuten and Alibaba have become some of the most popular e-commerce sites in Asia. On these shopping websites, consumers can select and purchase products from a large number of stores. Additionally, consumers of the e-commerce site have to register their name, age, gender, and other information in advance to access their registered account. Therefore, establishing a method for analyzing consumer preferences from both the store and the product side is required. This study uses the Doc2Vec method, which has been studied in the field of natural language processing. Doc2Vec has been used in many cases to extract semantic relationships between documents (represented as consumers) and words (represented as products) in the field of document classification. This concept is applicable to representing the relationship between users and items; however, the problem is that one more factor (i.e., shops) needs to be considered in Doc2Vec. More precisely, a method for analyzing the relationship between consumers, stores, and products is required. The purpose of our study is to combine the Doc2Vec analyses for users and shops, and for users and items, in the same feature space. This method enables the calculation of similar shops and items for each user. In this study, we apply the analysis to real data accumulated in the online marketplace and demonstrate the efficiency of the proposal.
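Once users, shops, and items are embedded in one feature space, finding similar shops and items for a user reduces to a nearest-neighbour lookup. The Python sketch below shows only that final step, under stated assumptions: the two-dimensional vectors are made-up toy values rather than Doc2Vec output, and the names `cosine`, `most_similar`, `shop_a`, and `item_x` are hypothetical. How the authors train the joint space is not shown here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def most_similar(query, candidates, top_n=1):
    """Return the names of the top_n candidates closest to the query."""
    ranked = sorted(candidates,
                    key=lambda name: cosine(query, candidates[name]),
                    reverse=True)
    return ranked[:top_n]


# Toy embeddings (assumed values) living in the same 2-D space.
user_vec = [1.0, 0.0]
shops = {"shop_a": [0.9, 0.1], "shop_b": [0.0, 1.0]}
items = {"item_x": [0.2, 0.8], "item_y": [1.0, 0.2]}

nearest_shop = most_similar(user_vec, shops)
nearest_item = most_similar(user_vec, items)
```

Because all three entity types share one space, the same function serves for shops and items alike, which is the practical benefit the abstract attributes to the combined model.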

Keywords: Doc2Vec, online marketplace, marketing, recommendation systems

Procedia PDF Downloads 111
159 Classification of Land Cover Usage from Satellite Images Using Deep Learning Algorithms

Authors: Shaik Ayesha Fathima, Shaik Noor Jahan, Duvvada Rajeswara Rao

Abstract:

Earth's environment and its evolution can be seen through satellite images in near real-time. Through satellite imagery, remote sensing data provide crucial information that can be used for a variety of applications, including image fusion, change detection, land cover classification, agriculture, mining, disaster mitigation, and monitoring climate change. The objective of this project is to propose a method for classifying satellite images according to multiple predefined land cover classes. The proposed approach involves collecting data in image format. The data is then pre-processed using data pre-processing techniques. The processed data is fed into the proposed algorithm and the obtained result is analyzed. Some of the algorithms used in satellite imagery classification are U-Net, Random Forest, DeepLabv3, CNN, ANN, ResNet, etc. In this project, we are using the DeepLabv3 (atrous convolution) algorithm for land cover classification. The dataset used is the DeepGlobe land cover classification dataset. DeepLabv3 is a semantic segmentation system that uses atrous convolution to capture multi-scale context by adopting multiple atrous rates in cascade or in parallel to determine the scale of segments.
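Atrous (dilated) convolution, the operation this abstract relies on, spaces the kernel taps `rate` samples apart so that the same number of weights covers a wider context. The sketch below is a one-dimensional, pure-Python illustration with no padding; DeepLabv3 itself applies the idea to two-dimensional feature maps inside a deep network, which is not reproduced here. The function name and the sample signal are hypothetical.

```python
def atrous_conv1d(signal, kernel, rate=1):
    """1-D atrous convolution without padding ("valid" mode).

    Kernel taps are spaced `rate` samples apart, so the receptive field
    is (len(kernel) - 1) * rate + 1 samples wide. With rate=1 this is an
    ordinary sliding-window correlation.
    """
    if rate < 1:
        raise ValueError("rate must be a positive integer")
    span = (len(kernel) - 1) * rate
    return [
        sum(k * signal[start + j * rate] for j, k in enumerate(kernel))
        for start in range(len(signal) - span)
    ]


signal = [1, 2, 3, 4, 5, 6]
kernel = [1, 1, 1]

dense = atrous_conv1d(signal, kernel, rate=1)    # taps at offsets 0, 1, 2
dilated = atrous_conv1d(signal, kernel, rate=2)  # taps at offsets 0, 2, 4
```

Running the same kernel at several rates and combining the outputs is the multi-scale idea the abstract describes as adopting multiple atrous rates in cascade or in parallel.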

Keywords: area calculation, atrous convolution, deep globe land cover classification, deepLabv3, land cover classification, resnet 50

Procedia PDF Downloads 137
158 The Involvement of Visual and Verbal Representations Within a Quantitative and Qualitative Visual Change Detection Paradigm

Authors: Laura Jenkins, Tim Eschle, Joanne Ciafone, Colin Hamilton

Abstract:

An original working memory model suggested the separation of visual and verbal systems in working memory architecture, in which only visual working memory components were used during visual working memory tasks. It was later suggested that the visuospatial sketchpad was the only memory component in use during visual working memory tasks, and components such as the phonological loop were not considered. In more recent years, a contrasting approach has been developed with the use of an executive resource to incorporate both visual and verbal representations in visual working memory paradigms. This was supported by research demonstrating the use of verbal representations and an executive resource in a visual matrix patterns task. The aim of the current research is to investigate the working memory architecture during both a quantitative and a qualitative visual working memory task. A dual task method will be used. Three secondary tasks will be used which are designed to hit specific components within the working memory architecture: Dynamic Visual Noise (visual components), Visual Attention (spatial components) and Verbal Attention (verbal components). A comparison of the visual working memory tasks will be made to discover whether verbal representations are in use, as the previous literature suggested. This direct comparison has not been made so far in the literature. Considerations will be made as to whether a domain specific approach should be employed when discussing visual working memory tasks, or whether a more domain general approach could be used instead.

Keywords: semantic organisation, visual memory, change detection

Procedia PDF Downloads 594
157 Multi-Objective Evolutionary Computation Based Feature Selection Applied to Behaviour Assessment of Children

Authors: F. Jiménez, R. Jódar, M. Martín, G. Sánchez, G. Sciavicco

Abstract:

Attribute or feature selection is one of the basic strategies to improve the performance of data classification tasks and, at the same time, to reduce the complexity of classifiers, and it is a particularly fundamental one when the number of attributes is relatively high. Its application to unsupervised classification is restricted to a limited number of experiments in the literature. Evolutionary computation has already proven itself to be a very effective choice to consistently reduce the number of attributes towards a better classification rate and a simpler semantic interpretation of the inferred classifiers. We present a feature selection wrapper model composed of a multi-objective evolutionary algorithm, the clustering method Expectation-Maximization (EM), and the classifier C4.5 for the unsupervised classification of data extracted from a psychological test named BASC-II (Behavior Assessment System for Children - II ed.) with two objectives: maximizing the likelihood of the clustering model and maximizing the accuracy of the obtained classifier. We present a methodology to integrate feature selection for unsupervised classification, model evaluation, decision making (to choose the most satisfactory model according to an a posteriori process in a multi-objective context), and testing. We compare the performance of the classifier obtained by the multi-objective evolutionary algorithms ENORA and NSGA-II, and the best solution is then validated by the psychologists that collected the data.

Keywords: evolutionary computation, feature selection, classification, clustering

Procedia PDF Downloads 369
156 Symbiotic Functioning, Photosynthetic Induction and Characterisation of Rhizobia Associated with Groundnut, Jack Bean and Soybean from Eswatini

Authors: Zanele D. Ngwenya, Mustapha Mohammed, Felix D. Dakora

Abstract:

Legumes are a major source of biological nitrogen, and therefore play a crucial role in maintaining soil productivity in smallholder agriculture in southern Africa. Through their ability to fix atmospheric nitrogen in root nodules, legumes are a better option for sustainable nitrogen supply in cropping systems than chemical fertilisers. For decades, farmers have been highly receptive to the use of rhizobial inoculants as a source of nitrogen due mainly to the availability of elite rhizobial strains at a much lower cost compared to chemical fertilisers. Improving the efficiency of the legume-rhizobia symbiosis in African soils would require the use of highly effective rhizobia capable of nodulating a wide range of host plants. This study assessed the morphogenetic diversity, photosynthetic functioning and relative symbiotic effectiveness (RSE) of groundnut, jack bean and soybean microsymbionts in Eswatini soils as a first step to identifying superior isolates for inoculant production. Rhizobial isolates were cultured in yeast-mannitol (YM) broth until the late log phase, and the bacterial genomic DNA was extracted using the GenElute bacterial genomic DNA kit according to the manufacturer's instructions. The extracted DNA was subjected to enterobacterial repetitive intergenic consensus-PCR (ERIC-PCR) and a dendrogram was constructed from the band patterns to assess rhizobial diversity. To assess the N2-fixing efficiency of the authenticated rhizobia, photosynthetic rates (A), stomatal conductance (gs), and transpiration rates (E) were measured at flowering for plants inoculated with the test isolates. The plants were then harvested for nodulation assessment and measurement of plant growth as shoot biomass. The results of ERIC-PCR fingerprinting revealed the presence of high genetic diversity among the microsymbionts nodulating each of the three test legumes, with many of them showing less than 70% ERIC-PCR relatedness.
The dendrogram generated from ERIC-PCR profiles grouped the groundnut isolates into 5 major clusters, while the jack bean and soybean isolates were grouped into 6 and 7 major clusters, respectively. Furthermore, the isolates also elicited variable nodule number per plant, nodule dry matter, shoot biomass and photosynthetic rates in their respective host plants under glasshouse conditions. Of the groundnut isolates tested, 38% recorded high relative symbiotic effectiveness (RSE >80), while 55% of the jack bean isolates and 93% of the soybean isolates recorded high RSE (>80) compared to the commercial Bradyrhizobium strains. About 13%, 27% and 83% of the top N₂-fixing groundnut, jack bean and soybean isolates, respectively, elicited much higher relative symbiotic effectiveness (RSE) than the commercial strain, suggesting their potential for use in inoculant production after field testing. There was a tendency for both low and high N₂-fixing isolates to group together in the dendrogram from ERIC-PCR profiles, which suggests that RSE can differ significantly among closely related microsymbionts.

Keywords: genetic diversity, relative symbiotic effectiveness, inoculant, N₂-fixing

Procedia PDF Downloads 219
155 Transfer of Constraints or Constraints on Transfer? Syntactic Islands in Danish L2 English

Authors: Anne Mette Nyvad, Ken Ramshøj Christensen

Abstract:

In the syntax literature, it has standardly been assumed that relative clauses and complement wh-clauses are islands for extraction in English, and that constraints on extraction from syntactic islands are universal. However, the Mainland Scandinavian languages have been known to provide counterexamples. Previous research on Danish has shown that neither relative clauses nor embedded questions are strong islands in Danish. Instead, extraction from this type of syntactic environment is degraded due to structural complexity, and it interacts with nonstructural factors such as the frequency of occurrence of the matrix verb, the possibility of temporary misanalysis leading to semantic incongruity, and exposure over time. We argue that these facts can be accounted for with parametric variation in the availability of CP-recursion, resulting in the patterns observed, as Danish would then “suspend” the ban on movement out of relative clauses and embedded questions. Given that Danish does not seem to adhere to allegedly universal syntactic constraints, such as the Complex NP Constraint and the Wh-Island Constraint, what happens in L2 English? We present results from a study investigating how native Danish speakers judge extractions from island structures in L2 English. Our findings suggest that Danes transfer their native language parameter setting when asked to judge island constructions in English. This is compatible with the Full Transfer Full Access Hypothesis, which predicts that Danes would have difficulty resetting their [+/- CP-recursion] parameter in English because they are not exposed to negative evidence.

Keywords: syntax, islands, second language acquisition, Danish

Procedia PDF Downloads 122
154 Improving Topic Quality of Scripts by Using Scene Similarity Based Word Co-Occurrence

Authors: Yunseok Noh, Chang-Uk Kwak, Sun-Joong Kim, Seong-Bae Park

Abstract:

Scripts are one of the basic text resources for understanding broadcasting contents. Since broadcast media wield considerable influence over the public, tools for understanding broadcasting contents are increasingly required. Topic modeling is a method for obtaining a summary of broadcasting contents from their scripts. Generally, scripts represent contents descriptively with directions and speeches. Scripts also provide scene segments that can be seen as semantic units. Therefore, a script can be topic modeled by treating a scene segment as a document. However, because scripts consist mainly of speeches, relatively few co-occurrences among words are observed in the scene segments. This inevitably degrades the quality of topics learned by statistical methods. To tackle this problem, we propose a method of learning with additional word co-occurrence information obtained using scene similarities. The main idea for improving topic quality is that the information that two or more texts are topically related can be useful for learning high-quality topics. In addition, by using high-quality topics, we can determine more accurately whether two texts are related or not. In this paper, we regard two scene segments as related if their topical similarity is high enough. We also consider that words co-occur if they appear in topically related scene segments. In the experiments, we showed that the proposed method generates higher-quality topics from Korean drama scripts than the baselines.
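The core idea above, counting two words as co-occurring when they appear in related scenes and not only in the same scene, can be sketched as follows. This is a simplified illustration rather than the authors' method: the paper measures topical similarity between scenes, whereas the sketch substitutes Jaccard overlap of word sets, and the `threshold` value, function names, and toy scenes are all assumptions.

```python
from collections import Counter
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two word collections (stand-in for topical similarity)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def augmented_cooccurrence(scenes, threshold=0.3):
    """Count word pairs within each scene and across similar scenes."""
    counts = Counter()
    # Ordinary co-occurrence: pairs of distinct words inside one scene.
    for words in scenes:
        for pair in combinations(sorted(set(words)), 2):
            counts[pair] += 1
    # Additional co-occurrence: words unique to each of two related scenes.
    for i, j in combinations(range(len(scenes)), 2):
        if jaccard(scenes[i], scenes[j]) >= threshold:
            only_i = set(scenes[i]) - set(scenes[j])
            only_j = set(scenes[j]) - set(scenes[i])
            for w in only_i:
                for v in only_j:
                    counts[tuple(sorted((w, v)))] += 1
    return counts


# Toy scene segments, each already tokenized into content words.
scenes = [
    ["love", "letter", "rain"],
    ["love", "letter", "train"],
    ["court", "judge"],
]
pairs = augmented_cooccurrence(scenes)
```

In this toy input, "rain" and "train" never share a scene, yet they receive a co-occurrence count because their scenes overlap strongly; that extra signal is what compensates for the sparse counts in speech-heavy scripts.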

Keywords: broadcasting contents, scripts, text similarity, topic model

Procedia PDF Downloads 318