Search results for: text information retrieval
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 11759

11219 Regularizing Software for Aerosol Particles

Authors: Christine Böckmann, Julia Rosemann

Abstract:

We present an inversion algorithm that is used in the European Aerosol Lidar Network for the inversion of data collected with multi-wavelength Raman lidar. These instruments measure backscatter coefficients at 355, 532, and 1064 nm, and extinction coefficients at 355 and 532 nm. The algorithm is based on manually controlled inversion of optical data, which allows for detailed sensitivity studies and thus provides comparably high quality of the derived data products. The algorithm allows us to derive particle effective radius, volume concentration, and surface-area concentration with comparably high confidence. The retrieval of the real and imaginary parts of the complex refractive index is still a challenge in view of the accuracy required for these parameters in climate change studies, in which light absorption needs to be known with high accuracy. Single-scattering albedo (SSA) can be computed from the retrieved microphysical parameters and allows us to categorize aerosols into high- and low-absorbing aerosols. From a mathematical point of view, the algorithm is based on the concept of using truncated singular value decomposition as the regularization method. This method was adapted to work for the retrieval of the particle size distribution function (PSD) and is called a hybrid regularization technique since it uses a triple of regularization parameters. The inversion of an ill-posed problem, such as the retrieval of the PSD, is always a challenging task because very small measurement errors are usually amplified enormously during the solution process unless an appropriate regularization method is used. Even using a regularization method is difficult since appropriate regularization parameters have to be determined. Therefore, in the next stage of our work, we decided to use two regularization techniques in parallel for comparison purposes. The second method is an iterative regularization method based on Padé iteration. Here, the number of iteration steps serves as the regularization parameter. We successfully developed semi-automated software for spherical particles which is able to run even on a parallel processor machine. From a mathematical point of view, it is also very important (as a selection criterion for an appropriate regularization method) to investigate the degree of ill-posedness of the problem, which we found to be moderate. We computed the optical data from mono-modal logarithmic PSD and investigated particles of spherical shape in our simulations. We considered particle radii as large as 6 µm, which not only covers the size range of particles in the fine-mode fraction of naturally occurring PSD but also covers a part of the coarse-mode fraction of PSD. We considered errors of 15% in the simulation studies. For the SSA, 100% of all cases achieve relative errors below 12%. In more detail, 87% of all cases for 355 nm and 88% of all cases for 532 nm are well below 6%. With respect to the absolute error, for non- and weak-absorbing particles with real parts of 1.5 and 1.6, the accuracy limit of +/-0.03 is achieved in all modes. In sum, 70% of all cases stay below +/-0.03, which is sufficient for climate change studies.
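
The idea that the number of iteration steps can itself act as the regularization parameter can be illustrated with a small sketch. The code below uses plain Landweber iteration as a stand-in; it is not the Padé iteration or the hybrid truncated-SVD technique described in the abstract. The matrix `A`, the data `b`, and the function names are hypothetical and chosen only for illustration.

```python
def landweber(A, b, steps):
    """Run `steps` Landweber iterations for A x ~ b, starting from x = 0."""
    rows, cols = len(A), len(A[0])
    # Step size 1 / ||A||_F^2 is at most 1 / sigma_max^2, which keeps the iteration stable.
    omega = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * cols
    for _ in range(steps):
        residual = [b[i] - sum(A[i][j] * x[j] for j in range(cols)) for i in range(rows)]
        # Gradient step: x <- x + omega * A^T (b - A x)
        x = [x[j] + omega * sum(A[i][j] * residual[i] for i in range(rows))
             for j in range(cols)]
    return x


def residual_norm(A, b, x):
    """Euclidean norm of b - A x."""
    return sum((b[i] - sum(A[i][j] * x[j] for j in range(len(x)))) ** 2
               for i in range(len(A))) ** 0.5


# Hypothetical, nearly singular 2x2 system with slightly inconsistent (noisy) data.
A = [[1.0, 1.0], [1.0, 1.001]]
b = [2.01, 1.99]

for steps in (5, 50, 500):
    x = landweber(A, b, steps)
    print(steps, x, residual_norm(A, b, x))
```

Stopping early keeps the solution close to the smooth starting guess, while running longer fits the noisy data more closely; choosing where to stop plays the role that the truncation level plays in a truncated SVD.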

Keywords: aerosol particles, inverse problem, microphysical particle properties, regularization

Procedia PDF Downloads 339
11218 Exploring Syntactic and Semantic Features for Text-Based Authorship Attribution

Authors: Haiyan Wu, Ying Liu, Shaoyun Shi

Abstract:

Authorship attribution aims to extract features that identify the authors of anonymous documents. Many previous works on authorship attribution focus on statistical style features (e.g., sentence/word length) and content features (e.g., frequent words, n-grams). Modeling these features by regression or other transparent machine learning methods gives a portrait of the authors' writing style. But these methods do not capture syntactic (e.g., dependency relationship) or semantic (e.g., topics) information. In recent years, some researchers have modeled syntactic trees or latent semantic information with neural networks. However, few works consider them together. Besides, predictions by neural networks are difficult to explain, and explainability is vital in authorship attribution tasks. In this paper, we not only utilize the statistical style and content features but also take advantage of both syntactic and semantic features. Unlike in an end-to-end neural model, feature selection and prediction are two separate steps in our method. An attentive n-gram network is utilized to select useful features, and logistic regression is applied to give a prediction and an understandable representation of writing style. Experiments show that our extracted features can improve the state-of-the-art methods on three benchmark datasets.

Keywords: authorship attribution, attention mechanism, syntactic feature, feature extraction

Procedia PDF Downloads 133
11217 A Teaching Method for Improving Sentence Fluency in Writing

Authors: Manssour Habbash, Srinivasa Rao Idapalapati

Abstract:

Although writing is a multifaceted task, teaching writing is demanding for essentially two reasons: grammar and syntax. This article provides a method of teaching writing that was found to be effective in improving students’ academic writing composition skills. The article explains the concepts of ‘guided-discovery’ and ‘guided-construction’ upon which the method is grounded and developed. After a brief commentary on what the core of a text could mean, the article explains how to understand and identify the core and build upon it, demonstrating how a teacher can use these concepts to improve students’ writing skills. The method is an adaptation of the grammar translation method, modified to suit a student-centered classroom environment. An intervention using this method was tried out with positive outcomes in a formal classroom research setup. Because the content relates closely to classroom practice and is intended to be useful to practicing teachers, the process and the findings are presented in narrative form, with the results in tabular form.

Keywords: core of a text, guided construction, guided discovery, theme of a text

Procedia PDF Downloads 376
11216 Towards A New Maturity Model for Information System

Authors: Ossama Matrane

Abstract:

Information systems have become a strategic lever for enterprises. They contribute effectively to aligning business processes with enterprise strategies and are regarded as a means of increasing productivity and effectiveness. Many organizations are therefore currently involved in implementing sustainable information systems, and a large number of studies have been conducted over the last decade to define the success factors of information systems. Many studies on maturity models have also been carried out, some of which refer to maturity models for information systems. In this article, we report on the development of a maturity model specifically designed for information systems. The model is built on three components derived from the Maturity Model for Information Security Management, OPM3 for project management maturity, and the processes of COBIT for IT governance. Our proposed model defines three maturity stages for establishing a strong information system that supports the objectives of organizations. It provides a very practical structure with which to assess and improve information system implementation.

Keywords: information system, maturity models, information security management, OPM3, IT governance

Procedia PDF Downloads 443
11215 'Wandering Uterus': An Analogy of Perception of Women in Hippocratic Corpus and Post-Modern Times

Authors: Ankita Sharma

Abstract:

The study proposes to review the perception of women in the Classical Age (500-336 BC), when Greek philosophy was in bloom. It was observed that women had very few rights and were still under the control of men. One of the possible reasons for this exclusion was woman’s biology, which had a huge influence on her being seen as inferior to men. The ‘Hippocratic Corpus’ focuses on the biological construct of the female body in classical Greek science, which perpetuated the idea of women as second-class citizens who were considered inherently weaker than men. The research highlights the significance of the text, which was used to encourage women of that time to get married and produce children, and shows how the perception remains the same today. The Greek belief in the need to confine and control the 'wandering uterus' led to men being understood as superior. The pivotal emphasis of this research is on women and their bodies, which are depicted in a misogynistic way that allowed the Hippocratic writers to influence society’s attitude towards women through their writings. It is intended to draw attention to the prevailing cultural assumptions and preconceived notions about female anatomy, rooted in ancient science, that had a pervasive influence in the following centuries.

Keywords: classical Greek theory, women, wandering womb, modern ideology

Procedia PDF Downloads 192
11214 High Secure Data Hiding Using Cropping Image and Least Significant Bit Steganography

Authors: Khalid A. Al-Afandy, El-Sayyed El-Rabaie, Osama Salah, Ahmed El-Mhalaway

Abstract:

This paper presents a highly secure data hiding technique using image cropping and Least Significant Bit (LSB) steganography. Crops at predefined secret coordinates are extracted from the cover image. The secret text message is divided into sections, and the number of sections equals the number of image crops. Each section of the secret text message is embedded into an image crop in a secret sequence using the LSB technique. The embedding is done using the color channels of the cover image. The stego image is obtained by reassembling the image and the stego crops. The results of the technique are compared with other state-of-the-art techniques. Evaluation is based on visual inspection to detect any degradation of the stego image, the difficulty of extracting the embedded data for any unauthorized viewer, the Peak Signal-to-Noise Ratio (PSNR) of the stego image, and the CPU time of the embedding algorithm. Experimental results show that the proposed technique is more secure than the other traditional techniques.
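
The embedding step described above can be sketched in a few lines. This is a minimal illustration of generic LSB embedding, assuming each crop is available as a flat list of 8-bit color-channel values; the secret crop coordinates and the secret embedding sequence from the abstract are not modeled, and all function names and data are hypothetical.

```python
def embed_lsb(channel_values, payload):
    """Return a copy of channel_values with the payload bits written into the LSBs."""
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(channel_values):
        raise ValueError("payload does not fit into this crop")
    stego = list(channel_values)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit  # clear the LSB, then set it to the payload bit
    return stego


def extract_lsb(channel_values, n_bytes):
    """Read n_bytes back from the LSBs, most significant bit first."""
    out = bytearray()
    for k in range(n_bytes):
        byte = 0
        for value in channel_values[8 * k: 8 * k + 8]:
            byte = (byte << 1) | (value & 1)
        out.append(byte)
    return bytes(out)


def split_message(message, n_sections):
    """Split the message into n_sections consecutive pieces of nearly equal size."""
    size = -(-len(message) // n_sections)  # ceiling division
    return [message[i * size:(i + 1) * size] for i in range(n_sections)]


# Hypothetical data: three "crops", each a flat list of 64 channel values.
crops = [[(37 * i + 11 * c) % 256 for i in range(64)] for c in range(3)]
sections = split_message(b"secret text", len(crops))
stego_crops = [embed_lsb(crop, part) for crop, part in zip(crops, sections)]
recovered = b"".join(extract_lsb(crop, len(part))
                     for crop, part in zip(stego_crops, sections))
```

Because only the lowest bit of each channel value is touched, every value changes by at most 1, which is why LSB embedding tends to be visually imperceptible.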

Keywords: steganography, stego, LSB, crop

Procedia PDF Downloads 266
11213 Classification of Political Affiliations by Reduced Number of Features

Authors: Vesile Evrim, Aliyu Awwal

Abstract:

As technology has evolved, the expression of opinions has shifted to the digital world. Politics, one of the hottest topics in opinion mining research, is combined here with behavior analysis to determine affiliation from text, which is the subject of this paper. This study aims to classify text in news articles and blogs as either Republican or Democrat with the minimum number of features. As an initial set, 68 features, 64 of which are Linguistic Inquiry and Word Count (LIWC) features, are tested against 14 benchmark classification algorithms. In later experiments, the dimensionality of the feature vector is reduced using 7 feature selection algorithms. The results show that the Decision Tree, Rule Induction and M5 Rule classifiers, when used with the SVM and IGR feature selection algorithms, performed best, with up to 82.5% accuracy on the given dataset. Further tests on a single feature and on the linguistic-based feature sets showed similar results. The feature “function”, an aggregate feature of the linguistic category, proved to be the most differentiating of the 68 features, achieving 81% accuracy by itself in classifying articles as either Republican or Democrat.

Keywords: feature selection, LIWC, machine learning, politics

Procedia PDF Downloads 380
11212 Visualisation in Health Communication: Taking Weibo Interaction in COVD19 as the Example

Authors: Zicheng Zhang, Linli Zhang

Abstract:

As China's biggest social media platform, Weibo has taken on essential health communication responsibilities during the pandemic. This research takes 105 posters from 15 health-related official Weibo accounts as the objects of analysis to explore COVID19 health information communication and visualisation. First, the interaction between the audiences and Weibo, including forwarding, comments, and likes, is statistically analysed. The comments about the information design are extracted manually, and sentiment analysis is then carried out to gauge audiences' views about the posters' design. Forwarding and comments are quantified as an attention index that serves as a reference for the degree of likes. In addition, this study designed an evaluation scale based on the standards of the Health Literacy Resource by the Centers for Medicare & Medicaid Services (US). Designers then scored all selected posters one by one. Finally, combining the data of the two parts, we concluded that: 1. To a certain extent, people think that the posters do not deliver substantive and practical information; 2. Non-knowledge posters (i.e., cartoon posters), such as the Go, Wuhan poster, gained more forwarding and likes; 3. The COVID posters analysed are still mainly picture-oriented and mainly encourage people to overcome difficulties; 4. Posters for pandemic prevention usually contain more text and fewer illustrations and do not clearly show cultural differences. In conclusion, health communication usually involves a lot of professional knowledge, so visualising that knowledge in an accessible way for the general public is challenging. The relevant posters still suffer from a lack of effective communication, superficial design, and insufficient content accessibility.

Keywords: weibo, visualisation, covid posters, poster design

Procedia PDF Downloads 124
11211 Comics Scanlation and Publishing Houses Translation

Authors: Sharifa Alshahrani

Abstract:

Comics is a multimodal text wherein meaning is created by taking in all modes of expression at once. It uses two different semiotic modes, the verbal and the visual, together to make meaning, and these modes can be socially and culturally shaped to give meaning. Therefore, comics translation cannot treat comics as a monomodal text by translating only the verbal mode inside or outside the speech balloons, as cultural differences are encoded in the visual mode as well. Due to the development of the internet and editing software, comics translation is no longer confined to publishing houses and official translation: scanlation, or fan translation, has taken the initiative in translating comics, driven by fans' emotional attachment to the culture and genre. Scanlation is carried out by volunteer fans who translate out of passion. However, quality is one of the debatable issues relating to scanlation and fan translation. This study will investigate how the dynamic multimodal relationship in comics is exploited and interpreted in translation by exploring the translation strategies and procedures adopted by publishing houses and scanlators in rendering comics into Arabic, using three analytical frameworks: a cultural references model, a multimodal relation model, and translation strategies and procedures models.

Keywords: comics, multimodality, translation, scanlation

Procedia PDF Downloads 208
11210 Linguistic Analysis of Argumentation Structures in Georgian Political Speeches

Authors: Mariam Matiashvili

Abstract:

Argumentation is an integral part of our daily communications - formal or informal. Argumentative reasoning, techniques, and language tools are used both in personal conversations and in the business environment. Verbalization of the opinions requires the use of extraordinary syntactic-pragmatic structural quantities - arguments that add credibility to the statement. The study of argumentative structures allows us to identify the linguistic features that make the text argumentative. Knowing what elements make up an argumentative text in a particular language helps the users of that language improve their skills. Also, natural language processing (NLP) has become especially relevant recently. In this context, one of the main emphases is on the computational processing of argumentative texts, which will enable the automatic recognition and analysis of large volumes of textual data. The research deals with the linguistic analysis of the argumentative structures of Georgian political speeches - particularly the linguistic structure, characteristics, and functions of the parts of the argumentative text - claims, support, and attack statements. The research aims to describe the linguistic cues that give the sentence a judgmental/controversial character and helps to identify reasoning parts of the argumentative text. The empirical data comes from the Georgian Political Corpus, particularly TV debates. Consequently, the texts are of a dialogical nature, representing a discussion between two or more people (most often between a journalist and a politician). 
The research uses the following approaches to identify and analyze the argumentative structures: (1) Lexical Classification & Analysis - identifying lexical items that are relevant in the process of creating argumentative texts and creating the lexicon of argumentation (groups of words gathered from a semantic point of view); (2) Grammatical Analysis and Classification - grammatical analysis of the words and phrases identified on the basis of the argumentation lexicon; (3) Argumentation Schemas - describing and identifying the argumentation schemes that are most likely used in Georgian political speeches. As a final step, we analyzed the relations between the above-mentioned components. For example, if an identified argument scheme is “Argument from Analogy”, the identified lexical items semantically express analogy too, and they are most likely adverbs in Georgian. As a result, we created a lexicon of the words that play a significant role in creating Georgian argumentative structures. Linguistic analysis has shown that verbs play a crucial role in creating argumentative structures.

Keywords: georgian, argumentation schemas, argumentation structures, argumentation lexicon

Procedia PDF Downloads 67
11209 The Processing of Implicit Stereotypes in Contexts of Reading, Using Eye-Tracking and Self-Paced Reading Tasks

Authors: Magali Mari, Misha Muller

Abstract:

The present study’s objectives were to determine how diverse implicit stereotypes affect the processing of written information and linguistic inferential processes, such as presupposition accommodation. When reading a text, one constructs a representation of the described situation, which is then updated, according to new outputs and based on stereotypes inscribed within society. If the new output contradicts stereotypical expectations, the representation must be corrected, resulting in longer reading times. A similar process occurs in cases of linguistic inferential processes like presupposition accommodation. Presupposition accommodation is traditionally regarded as fast, automatic processing of background information (e.g., ‘Mary stopped eating meat’ is quickly processed as Mary used to eat meat). However, very few accounts have investigated whether this process is likely to be influenced by domains of social cognition, such as implicit stereotypes. To study the effects of implicit stereotypes on presupposition accommodation, adults were recorded while they read sentences in French, combining two methods, an eye-tracking task and a classic self-paced reading task (where participants read sentence segments at their own pace by pressing a computer key). In one condition, presuppositions were activated with the French definite articles ‘le/la/les,’ whereas in the other condition, the French indefinite articles ‘un/une/des’ were used, triggering no presupposition. Using a definite article presupposes that the object has already been uttered and is thus part of background information, whereas using an indefinite article is understood as the introduction of new information. Two types of stereotypes were examined in order to enlarge the scope of stereotypes traditionally analyzed. Study 1 investigated gender stereotypes linked to professional occupations to replicate previous findings. Study 2 focused on nationality-related stereotypes (e.g., ‘the French are seducers’ versus ‘the Japanese are seducers’) to determine if the effects of implicit stereotypes on reading are generalizable to other types of implicit stereotypes. The results show that reading is influenced by the two types of implicit stereotypes; in the two studies, the reading pace slowed down when a counter-stereotype was presented. However, presupposition accommodation did not affect participants’ processing of information. Altogether, these results show that (a) implicit stereotypes affect the processing of written information, regardless of the type of stereotypes presented, and (b) implicit stereotypes prevail over the superficial linguistic treatment of presuppositions, which suggests faster processing of social information compared to linguistic information.

Keywords: eye-tracking, implicit stereotypes, reading, social cognition

Procedia PDF Downloads 194
11208 Death of the Author and Birth of the Adapter in a Literary Work

Authors: Slwa Al-Hammad

Abstract:

Adaptation studies have been closely aligned to translation studies as both deal with the process of rendering the meaning from one culture to another. These two disciplines are related to each other, but the theories are still being developed. This research aims to fill this gap and provide a contribution to the growing discipline of adaptation studies through a theoretical perspective while investigating how different cultural interpretations of adaptation influence the final literary product. This research focuses on the theoretical concepts of Barthes’s death of the author and Benjamin’s afterlife of the text in translation, which is believed to lead to the birth of the adapter in a literary work. That is, in adaptation, the ‘death’ of the author allows for the ‘birth’ of the adapter, offering them all the creative possibilities of authorship. It also explores the differences between the meanings of adaptation in the West and the Arab world through the analysis of adapted texts in Arabic initially deriving from the European and American literature of the 19th and 20th centuries. The methodology of this thesis is based upon qualitative literary analysis, in which original and adapted works are compared and contrasted, with the additional insights of literary and adaptation theories and prior scholarship. The main works discussed are the Arabic adaptations of William Faulkner’s novels. The analysis is guided by theories of adaptation studies to help in explaining the concepts of relocating, recreating, and rewriting in the process of adaptation. It draws on scholarship on adaptations to inquire into the status of the adapted texts in relation to the original texts. Also, these theories prove that adaptation is the process that is used to transfer text from source to adapted text, not some other analytical practice. 
Through the textual analysis, concepts of the death of the author and the birth of the adapter will be illustrated, as will the roles of the adapter and the task of rendering works for a different culture, and the understanding of adaptation and Arabization in Arabic literature.

Keywords: adaptation, Arabization, authorship, recreating, relocating

Procedia PDF Downloads 133
11207 Anaphora and Cataphora on the Selected State of the City Addresses of the Mayor of Dapitan

Authors: Mark Herman Sumagang Potoy

Abstract:

The State of the City Address (SOCA) is a speech, modelled after the State of the Nation Address, that is given not as mandated by law but usually as a matter of practice or tradition, and is delivered before the chief executive’s constituents. Through it, the general public is informed of the performance of the local government unit and its agenda for the coming year. It is therefore imperative for SOCAs to convey their message clearly and to carry out their function of enlightening their readers, which can be achieved through the proper use of reference. Anaphora and cataphora are the two major types of reference; the former refers back to something that has already been mentioned, while the latter points forward to something which is yet to be said. This paper seeks to identify the types of reference employed in the SOCAs from 2014 to 2016 of Hon. Rosalina Garcia Jalosjos, Mayor of Dapitan City, and to look into how the references contribute to the clarity of the message of the text. The qualitative method of research is used in this study through an in-depth analysis of the corpus. Once copies of the SOCAs were secured from the Office of the City Mayor, they were analyzed using the documentary technique: categorizing the types of reference as anaphora or cataphora, counting each of these types, and describing the implications of the dominant types used in the addresses. The analysis found that both reference types, anaphora and cataphora, are employed in the three SOCAs, the former being used more frequently than the latter, accounting for 80% and 20% of actual usage, respectively. Moreover, the use of anaphora and cataphora in the three addresses helps convey the message clearly because they primarily serve to avoid repeating the same element in the text, especially when there is no need to emphasize a point. 
Finally, it is recommended that writers of State of the City Addresses should have a vast knowledge on how reference should be used and the functions they take in the text since this is a vital tool to clearly transmit a message. Moreover, English teachers should explicitly teach the proper usage of anaphora and cataphora, as instruments to develop cohesion in written discourse, to enable students to write not only with sense but also with fluidity in tying utterances together.

Keywords: anaphora, cataphora, reference, State of the City Address

Procedia PDF Downloads 190
11206 3D Modeling Approach for Cultural Heritage Structures: The Case of Virgin of Loreto Chapel in Cusco, Peru

Authors: Rony Reátegui, Cesar Chácara, Benjamin Castañeda, Rafael Aguilar

Abstract:

Nowadays, heritage building information modeling (HBIM) is considered an efficient tool to represent and manage information of cultural heritage (CH). The basis of this tool relies on a 3D model generally obtained from a cloud-to-BIM procedure. There are different methods to create an HBIM model, ranging from manual modeling based on the point cloud to the automatic detection of shapes and the creation of objects. The selection of these methods depends on the desired level of development (LOD), level of information (LOI), and grade of generation (GOG), as well as on the availability of commercial software. This paper presents the 3D modeling of a stone masonry chapel using Recap Pro, Revit, and the Dynamo interface following a three-step methodology. The first step consists of the manual modeling of simple structural (e.g., regular walls, columns, floors, wall openings, etc.) and architectural (e.g., cornices, moldings, and other minor details) elements using the point cloud as reference. Then, Dynamo is used for generative modeling of complex structural elements such as vaults, infills, and domes. Finally, semantic information (e.g., materials, typology, state of conservation, etc.) and pathologies are added within the HBIM model as text parameters and generic model families, respectively. The application of this methodology allows the documentation of CH following a relatively simple process that ensures adequate LOD, LOI, and GOG levels. In addition, the easy implementation of the method, together with the use of only one BIM software with its respective plugin for the scan-to-BIM modeling process, means that this methodology can be adopted by a larger number of users with intermediate knowledge and limited resources, since the BIM software used has a free student license.

Keywords: cloud-to-BIM, cultural heritage, generative modeling, HBIM, parametric modeling, Revit

Procedia PDF Downloads 140
11205 Recurrent Neural Networks with Deep Hierarchical Mixed Structures for Chinese Document Classification

Authors: Zhaoxin Luo, Michael Zhu

Abstract:

In natural languages, there are always complex semantic hierarchies. Obtaining a feature representation based on these complex semantic hierarchies is key to the success of a model. Several RNN models have recently been proposed that use latent indicators to obtain the hierarchical structure of documents. However, a model that uses only a single-layer latent indicator cannot capture the true hierarchical structure of a language, especially a complex language like Chinese. In this paper, we propose a deep layered model that stacks arbitrarily many RNN layers equipped with latent indicators. By using EM and training hierarchically, our model solves the computational problem of stacking RNN layers and makes it possible to stack arbitrarily many RNN layers. Our deep hierarchical model not only achieves results comparable to large pre-trained models on the Chinese short text classification problem but also achieves state-of-the-art results on the Chinese long text classification problem.

Keywords: natural language processing, recurrent neural network, hierarchical structure, document classification, Chinese

Procedia PDF Downloads 62
11204 Prosperous Digital Image Watermarking Approach by Using DCT-DWT

Authors: Prabhakar C. Dhavale, Meenakshi M. Pawar

Abstract:

Every day, vast amounts of data are embedded in digital media or distributed over the internet. Such data can easily be replicated without error, putting the rights of its owners at risk. Even when encrypted for distribution, data can easily be decrypted and copied. One way to discourage illegal duplication is to insert information, known as a watermark, into potentially valuable data in such a way that it is impossible to separate the watermark from the data. These challenges have motivated researchers to carry out intense research in the field of watermarking. A watermark is a form, image or text that is impressed onto paper and provides evidence of its authenticity. Digital watermarking is an extension of the same concept. There are two types of watermarks: visible and invisible. In this project, we have concentrated on implementing a watermark in an image. The main consideration for any watermarking scheme is its robustness to various attacks.

Keywords: watermarking, digital, DCT-DWT, security

Procedia PDF Downloads 419
11203 Semantic Based Analysis in Complaint Management System with Analytics

Authors: Francis Alterado, Jennifer Enriquez

Abstract:

Semantic Based Analysis in Complaint Management System with Analytics is an enhanced tool through which clients can submit complaints, as well as a mechanism for Palawan Polytechnic College to gather, process, and monitor the status of these complaints. The study includes a mobile application that serves as a remote communication facility between the students and the school management regarding the issues encountered by students and the resolution of every complaint received. In processing the complaints, text mining and clustering algorithms were utilized. Every module of the system was tested, and based on the results, the modules were 100% free from error before integration. System testing was also done by checking the expected functionality of the system, which was 100% functional. The system was tested by 10 students, who forwarded complaints to 10 departments. Based on the results, the students were able to submit complaints, the system was able to process them by identifying the department for which each complaint was intended, and the concerned department was able to give feedback to the student on the complaint received. The system received a rating of 4.7, which corresponds to Excellent.

Keywords: technology adoption, emerging technology, issues challenges, algorithm, text mining, mobile technology

Procedia PDF Downloads 196
11202 Compilation and Statistical Analysis of an Arabic-English Legal Corpus in Sketch Engine

Authors: C. Brierley, H. El-Farahaty, A. Farhan

Abstract:

The Leeds Parallel Corpus of Arabic-English Constitutions is a parallel corpus for the Arabic legal domain. Analysis of legal language via Corpus Linguistics techniques is an important development. In legal proceedings, a corpus-based approach to disambiguating meaning is set to replace the dictionary as an interpretative tool, and legal scholarship in the States is now attuned to the potential for Text Analytics over vast quantities of text-based legal material, following the business and medical industries. This trend is reflected in Europe: the interdisciplinary research group in Computer Assisted Legal Linguistics mines big data collections of legal and non-legal texts to analyse: legal interpretations; legal discourse; the comprehensibility of legal texts; conflict resolution; and linguistic human rights. This paper focuses on ‘dignity’ as an important aspect of the overarching concept of human rights in current constitutions across the Arab world. We have compiled a parallel, Arabic-English raw text corpus (169,861 Arabic words and 205,893 English words) from reputable websites such as the World Intellectual Property Organisation and CONSTITUTE, and uploaded and queried our corpus in Sketch Engine. Our most challenging task was sentence-level alignment of Arabic-English data. This entailed manual intervention to ensure correspondence on a one-to-many basis since Arabic sentences differ from English in length and punctuation. We have searched for morphological variants of ‘dignity’ (كرامة, karāma) in the Arabic data and inspected their English translation equivalents. The term occurs most frequently in the Sudanese constitution (10 instances), and not at all in the constitution of Palestine. Its most frequent collocate, determined via the logDice statistic in Sketch Engine, is ‘human’ as in ‘human dignity’.
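
For readers unfamiliar with the logDice statistic mentioned above, the following is a minimal sketch of how such a collocation score can be computed, assuming the commonly documented definition logDice = 14 + log2(2·f_xy / (f_x + f_y)). The function name and the frequency counts are hypothetical and are not taken from the corpus described in the abstract.

```python
import math

def log_dice(f_xy: int, f_x: int, f_y: int) -> float:
    """Collocation score from co-occurrence and individual frequencies.

    f_xy: how often node word and collocate occur together
    f_x:  frequency of the node word
    f_y:  frequency of the collocate
    """
    if f_xy <= 0 or f_x <= 0 or f_y <= 0:
        raise ValueError("frequencies must be positive")
    return 14 + math.log2(2 * f_xy / (f_x + f_y))

# Hypothetical counts, for illustration only.
score = log_dice(f_xy=30, f_x=60, f_y=60)
print(score)
```

Because the score depends only on relative frequencies, it can be compared across corpora of different sizes, which is one reason it suits a small specialised corpus.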

Keywords: Arabic constitution, corpus-based legal linguistics, human rights, parallel Arabic-English legal corpora

Procedia PDF Downloads 177
11201 A Physical Theory of Information vs. a Mathematical Theory of Communication

Authors: Manouchehr Amiri

Abstract:

This article introduces a general notion of physical bit information that is compatible with the basics of quantum mechanics and incorporates the Shannon entropy as a special case. This notion of physical information leads to the Binary data matrix model (BDM), which predicts the basic results of quantum mechanics, general relativity, and black hole thermodynamics. The compatibility of the model with the holographic, information conservation, and Landauer’s principles is investigated. After deriving the “Bit Information principle” as a consequence of BDM, the fundamental equations of Planck, de Broglie, Bekenstein, and mass-energy equivalence are derived.
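
As background, the standard textbook forms of some of the relations named above are given here. This is only a reference sketch: the abstract does not state the paper's own derivations or notation, so the symbols below are the conventional ones and are an assumption.

```latex
% Shannon entropy of a discrete distribution p_1, ..., p_n (in bits)
H(X) = -\sum_{i=1}^{n} p_i \log_2 p_i

% Landauer's principle: minimum energy dissipated when erasing one bit
% at temperature T
E \ge k_B T \ln 2

% Planck relation, de Broglie relation, mass-energy equivalence
E = h\nu, \qquad \lambda = \frac{h}{p}, \qquad E = mc^2
```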

Keywords: physical theory of information, binary data matrix model, Shannon information theory, bit information principle

Procedia PDF Downloads 168
11200 Chinese Event Detection Technique Based on Dependency Parsing and Rule Matching

Authors: Weitao Lin

Abstract:

To quickly extract adequate information from large-scale unstructured text data, this paper studies the representation of events in Chinese scenarios and performs a regularized abstraction. It proposes a Chinese event detection technique based on dependency parsing and rule matching. The method first performs dependency parsing on the original utterance, then performs pattern matching at the word or phrase granularity based on the results of the dependency syntactic analysis, filters out the utterances with prominent non-event characteristics, and obtains the final results. The experimental results show the effectiveness of the method.
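
The pipeline described above (parse, match rules over the dependency structure, filter non-event utterances) can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the parses are written by hand instead of produced by a parser, the relation labels are assigned in a Universal Dependencies style for illustration, and the `Token` type, the `TRIGGERS` lexicon, and the single subject-verb-object rule are all hypothetical.

```python
from typing import NamedTuple

class Token(NamedTuple):
    idx: int    # 1-based position in the utterance
    text: str
    pos: str    # coarse part-of-speech tag
    head: int   # idx of the syntactic head, 0 for the root
    dep: str    # dependency relation to the head

# Hypothetical trigger lexicon: verb -> event type.
TRIGGERS = {"收购": "Acquisition"}

def match_events(tokens):
    """Return (event_type, subject, trigger, object) tuples found by one rule."""
    events = []
    for tok in tokens:
        if tok.pos != "VERB" or tok.text not in TRIGGERS:
            continue
        # Collect the direct dependents of the trigger verb by relation.
        children = {t.dep: t.text for t in tokens if t.head == tok.idx}
        if "nsubj" in children and "obj" in children:
            events.append(
                (TRIGGERS[tok.text], children["nsubj"], tok.text, children["obj"])
            )
    return events

# Hand-written parse of "公司 收购 了 竞争对手" ("the company acquired a competitor").
event_utterance = [
    Token(1, "公司", "NOUN", 2, "nsubj"),
    Token(2, "收购", "VERB", 0, "root"),
    Token(3, "了", "PART", 2, "aux"),
    Token(4, "竞争对手", "NOUN", 2, "obj"),
]
# Hand-written parse of "天气 很 好" ("the weather is nice"): no trigger verb.
non_event_utterance = [
    Token(1, "天气", "NOUN", 3, "nsubj"),
    Token(2, "很", "ADV", 3, "advmod"),
    Token(3, "好", "ADJ", 0, "root"),
]

# Utterances that match no rule contribute nothing, i.e. they are filtered out.
detected = [
    ev
    for utt in (event_utterance, non_event_utterance)
    for ev in match_events(utt)
]
print(detected)
```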

Keywords: natural language processing, Chinese event detection, rules matching, dependency parsing

Procedia PDF Downloads 135
11199 Lexical Semantic Analysis to Support Ontology Modeling of Maintenance Activities– Case Study of Offshore Riser Integrity

Authors: Vahid Ebrahimipour

Abstract:

Word representation and context meaning of text-based documents play an essential role in knowledge modeling. Business procedures written in natural language are meant to store technical and engineering information, management decisions, and operational experience during the production system life cycle. Context meaning representation is highly dependent upon word sense, lexical relativity, and semantic features of the argument. This paper proposes a method for lexical semantic analysis and context meaning representation of maintenance activity in a mass production system. Our approach applies a straightforward lexical semantic analysis to the semantic and syntactic features of the context structure of a maintenance report, in order to facilitate the translation, interpretation, and conversion of human-readable text into a computer-readable and understandable representation with less heterogeneity and ambiguity. The methodology will enable users to obtain a representation format that maximizes shareability and accessibility for multi-purpose usage. It provides a contextualized structure to obtain a generic context model that can be utilized during the system life cycle. At first, it employs a co-occurrence-based clustering framework to recognize a group of highly frequent contextual features that correspond to a maintenance report text. Then the keywords are identified for syntactic and semantic extraction analysis. The analysis exercises causality-driven logic of keywords’ senses to divulge the structural and meaning dependency relationships between the words in a context. The output is a word contextualized representation of maintenance activity accommodating computer-based representation and inference using OWL/RDF.

Keywords: lexical semantic analysis, metadata modeling, contextual meaning extraction, ontology modeling, knowledge representation

Procedia PDF Downloads 102
11198 The Development of Congeneric Elicited Writing Tasks to Capture Language Decline in Alzheimer Patients

Authors: Lise Paesen, Marielle Leijten

Abstract:

People diagnosed with probable Alzheimer's disease suffer from an impairment of their language capacities; a gradual impairment which affects both their spoken and written communication. Our study aims at characterising the language decline in patients with dementia of the Alzheimer type (DAT) with the use of congeneric elicited writing tasks. Within these tasks, a descriptive text has to be written based upon images with which the participants are confronted. A randomised set of images allows us to present the participants with a different task on every encounter, thus allowing us to avoid a recognition effect in this iterative study. This method is a revision of previous studies, in which participants were presented with a larger picture depicting an entire scene. In order to create the randomised set of images, existing pictures were adapted following strict criteria (e.g. frequency, age of acquisition (AoA), colour, ...). The resulting data set contained 50 images, belonging to several categories (vehicles, animals, humans, and objects). A pre-test was constructed to validate the created picture set; most images had been used before in spoken picture naming tasks. Hence the same reaction times ought to be triggered in the typed picture naming task. Once validated, the effectiveness of the descriptive tasks was assessed. First, the participants (n=60 students, n=40 healthy elderly) performed a typing task, which provided information about the typing speed of each individual. Secondly, two descriptive writing tasks were carried out, one simple and one complex. The simple task contains 4 images (1 animal, 2 objects, 1 vehicle) and only contains elements with high frequency, a young AoA (<6 years), and fast reaction times. Slow reaction times, a later AoA (≥ 6 years) and low frequency were criteria for the complex task. This task uses 6 images (2 animals, 1 human, 2 objects and 1 vehicle). The data were collected with the keystroke logging programme Inputlog.
Keystroke logging tools log and time stamp keystroke activity to reconstruct and describe text production processes. The data were analysed using a selection of writing process and product variables, such as general writing process measures, detailed pause analysis, linguistic analysis, and text length. As a covariate, the intrapersonal interkey transition times from the typing task were taken into account. The pre-test indicated that the new images lead to similar or even faster reaction times compared to the original images. All the images were therefore used in the main study. The produced texts of the description tasks were significantly longer compared to previous studies, providing sufficient text and process data for analyses. Preliminary analysis shows that the number of words produced differed significantly between the healthy elderly and the students, as did the mean length of production bursts, even though both groups needed the same time to produce their texts. However, the elderly took significantly more time to produce the complex task than the simple task. Nevertheless, the number of words per minute remained comparable between simple and complex. The pauses within and before words varied, even when taking personal typing abilities (obtained by the typing task) into account.
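
To make the pause analysis and the typing-speed covariate concrete, here is a minimal sketch of how interkey intervals could be derived from time-stamped keystrokes and related to a personal baseline. It does not use Inputlog or its file format; the timestamps, the function names, and the 2000 ms pause threshold are assumptions chosen for illustration.

```python
from statistics import mean

def interkey_intervals(timestamps_ms):
    """Time between consecutive keystrokes, in milliseconds."""
    return [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]

def pauses(intervals, threshold_ms=2000):
    """Intervals at or above the threshold count as pauses (threshold is an assumption)."""
    return [iv for iv in intervals if iv >= threshold_ms]

# Hypothetical keystroke timestamps from a copy-typing task (personal baseline).
typing_task = [0, 150, 300, 450]
baseline = mean(interkey_intervals(typing_task))

# Hypothetical keystroke timestamps from a descriptive writing task.
writing_task = [0, 200, 2600, 2750]
writing_intervals = interkey_intervals(writing_task)
long_pauses = pauses(writing_intervals)

# Express each pause relative to the writer's own typing speed.
relative_pauses = [p / baseline for p in long_pauses]
print(baseline, writing_intervals, long_pauses, relative_pauses)
```

Dividing by the personal baseline is one simple way to account for individual typing ability; a statistical model with the baseline as a covariate, as the abstract describes, would be the more rigorous route.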

Keywords: Alzheimer's disease, experimental design, language decline, writing process

Procedia PDF Downloads 272
11197 Gastric Foreign Bodies in Dogs

Authors: Naglaa A. Abd Elkader, Haithem A. Farghali

Abstract:

The present study was carried out on fifteen clinical cases of dogs of different breeds that were admitted to the surgical clinic of veterinary medicine with different symptoms (acute vomiting, hematemesis, and anorexia). The diagnostic work-up included plain radiography and endoscopic examination. Treatment included surgical intervention and endoscopic retrieval, followed by medical treatment. This study aimed to detect different foreign bodies by the most suitable method according to the type of foreign body.

Keywords: stomach, endoscopy, foreign bodies, dogs

Procedia PDF Downloads 409
11196 The Effect of Supply Chain Integration on Information Sharing

Authors: Khlif Hamadi

Abstract:

Supply chain integration has become a potentially valuable way of securing shared information and improving supply chain performance since competition is no longer between organizations but among supply chains. This research conceptualizes and develops three dimensions of supply chain integration (integration with customers, integration with suppliers, and interorganizational integration) and tests the relationships between supply chain integration, information sharing, and supply chain performance. Furthermore, it considers four types of information sharing (information sharing with customers, information sharing with suppliers, inter-functional information sharing, and intra-organizational information sharing) and four constructs of supply chain performance (costs, asset utilization, supply chain reliability, and supply chain flexibility and responsiveness). The theoretical and practical implications of the study, as well as directions for future research, are discussed.

Keywords: supply chain integration, supply chain management, information sharing, supply chain performance

Procedia PDF Downloads 258
11195 1/Sigma Term Weighting Scheme for Sentiment Analysis

Authors: Hanan Alshaher, Jinsheng Xu

Abstract:

Large amounts of data on the web can provide valuable information. For example, product reviews help business owners measure customer satisfaction. Sentiment analysis classifies texts into two polarities: positive and negative. This paper examines movie reviews and tweets using a new term weighting scheme, called one-over-sigma (1/sigma), on benchmark datasets for sentiment classification. The proposed method aims to improve the performance of sentiment classification. The results show that 1/sigma is more accurate than the popular term weighting schemes. In order to verify if the entropy reflects the discriminating power of terms, we report a comparison of entropy values for different term weighting schemes.
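
The abstract does not define the 1/sigma weight itself, so the sketch below illustrates only the entropy comparison it mentions: a term whose occurrences are spread evenly over the positive and negative classes has high entropy and little discriminating power, while a term concentrated in one class has low entropy. The function name and the counts are hypothetical.

```python
import math

def term_entropy(class_counts):
    """Shannon entropy (bits) of a term's occurrence distribution over classes."""
    total = sum(class_counts)
    if total == 0:
        raise ValueError("term does not occur in any class")
    probs = [c / total for c in class_counts if c > 0]
    return sum(-p * math.log2(p) for p in probs)

# Hypothetical occurrence counts as (positive reviews, negative reviews).
neutral_term = term_entropy([50, 50])    # evenly spread: uninformative
polar_term = term_entropy([100, 0])      # one class only: highly discriminating
print(neutral_term, polar_term)
```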

Keywords: 1/sigma, natural language processing, sentiment analysis, term weighting scheme, text classification

Procedia PDF Downloads 199
11194 Integrating Critical Stylistics and Visual Grammar: A Multimodal Stylistic Approach to the Analysis of Non-Literary Texts

Authors: Shatha Khuzaee

Abstract:

The study develops a multimodal stylistic approach to analyse a number of BBC online news articles reporting some key events from the so-called ‘Arab Uprisings’. Critical stylistics (CS) and visual grammar (VG) provide insightful arguments about the ways ideology is projected through different verbal and visual modes, yet they are mode specific because they examine how each mode projects its meaning separately and do not attempt to clarify what happens intersemiotically when the two modes co-occur. Therefore, the task undertaken in this research is to propose a multimodal stylistic approach that addresses the issue of ideology construction when the two modes co-occur. Informed by functional grammar and social semiotics, the analysis attempts to integrate three linguistic models developed in critical stylistics, namely transitivity choices, prioritizing, and hypothesizing, along with their visual equivalents adopted from visual grammar, to investigate the way ideology is constructed in multimodal text when text and image participate and interrelate in the process of meaning making on the textual level of analysis. The analysis provides comprehensive theoretical and analytical elaborations on the different points of integration between CS linguistic models and VG equivalents which operate on the textual level of analysis to better account for ideology construction in news as non-literary multimodal texts. It is argued that the analysis offers a well-thought-out plan that marks the first step towards integrating the well-established linguistic models of critical stylistics with those of visual analysis to analyse multimodal texts on the textual level. Both approaches are compatible and can produce a multimodal stylistic approach because they intend to analyse text and image depending on whatever textual evidence is available. This helps the analysis maintain the rigor and replicability needed for a stylistic analysis like the one undertaken in this study.

Keywords: multimodality, stylistics, visual grammar, social semiotics, functional grammar

Procedia PDF Downloads 219
11193 Developing Students’ Academic Writing Skills through Scientific Reading: Using Questions and Answer Activities

Authors: Makhim Artikova, Shavkat Duschanov

Abstract:

So far, there has been a plethora of attempts to improve learners’ academic writing skills. However, this issue remains a real concern among the majority of students, especially those who are standing on the threshold of their academic life. The purpose of this research is to improve students’ academic writing skills through 'Questions and Answer Reading' activities. Using well-prepared and well-chosen reading materials (from textbooks, scientific journals, or magazines) and applying questions and answer activities in the classroom helps learners become strong critical readers. Furthermore, it boosts their writing skills, which are the most crucial part of students’ personal and academic development. In this activity, the class is divided into small groups of four. Then, the instructor gives students either one section of the text or the full text, asking them to read it and to find unfamiliar words within the group. After discovering the meaning of unknown words, each group has to share their findings with the class. In the next stage of the activity, students are asked to create questions in a group based on the given reading material. Each group then asks the other groups its questions, which is an excellent opportunity for challenge that leads to improved critical thinking skills. In the last part, the students are asked to write a summary of the text or article, which is the core of the activity and leads to the improvement of writing skills. This engaging activity highlights the effectiveness of incorporating reading materials into the classroom when it comes to improving students’ composition writing. Structured writing after every reading activity resulted in improved coherence and cohesion in students’ writing of well-organized essays. Having been tested with 9th- and 11th-grade high school students, implementing reading activities in the classroom has proved to be a productive tool for enhancing academic writing skills.
In the future, this method is planned to be implemented among university students.

Keywords: academic writing, coherence and cohesion, questions and answer activities, scientific reading

Procedia PDF Downloads 108
11192 Ontology for a Voice Transcription of OpenStreetMap Data: The Case of Space Apprehension by Visually Impaired Persons

Authors: Said Boularouk, Didier Josselin, Eitan Altman

Abstract:

In this paper, we present a vocal ontology of OpenStreetMap data for the apprehension of space by visually impaired people. The platform, based on produsage, gives data producers the freedom to choose the descriptors of geocoded locations. Unfortunately, this freedom, also called folksonomy, complicates subsequent searches of the data. We try to solve this issue with a simple but usable method to extract data from OSM databases in order to send them to visually impaired people using Text To Speech technology. We focus on how to help people with visual disabilities plan their itinerary and comprehend a map by querying a computer and getting information about the surrounding environment in a mono-modal human-computer dialogue.

Keywords: TTS, ontology, open street map, visually impaired

Procedia PDF Downloads 294
11191 Literature as a Strategic Tool to Conscientise Africans: An Attempt by Postcolonial Writers and Critics to Reverse the Socio-Economics Imbalances of Colonialism

Authors: Lutendo Nendauni

Abstract:

Colonialism breaks things: colonisers exploded native cultural solidarity, producing the spiritual confusion, psychic wounding, and economic exploitation of a new and dominated ‘other’. Colonialism as cultural and economic exploitation began when the West defended its seizure of foreign territories for the exploitation of their natural resources; this resulted in brutal socio-economic imbalances. The West profited to the detriment of a weak Africa. Colonialism has since passed, but its effects are still evident culturally, socially, and economically. This paper explores how postcolonial writers and critics attempt to reverse the socio-economic imbalances resulting from the fragmentation of colonialism, with a focus on the play 'I will Marry When I Want' by Ngugi wa Thiong’o and Ngugi wa Mirii as a primary text. Using qualitative discourse-textual analysis as the research methodology, the researcher purposively extracts discourse segments from the text for analysis and interpretation. The findings reveal that postcolonial critics and writers attempt to reverse the socio-economic effects of colonialism through various counter discourses; their literature is concerned with the destruction of colonised identity, the search for this identity, and its assertion. It is manifest in the text that writers offer corrective views about Africans; they stress that they write their literary texts to conscientise their fellow Africans. Postcolonial writers and critics argue that language is a carrier of culture and that the only way to break free from colonial influence is by not adopting a foreign language. Furthermore, through their poems, novels, plays, and music, they strategically shine the spotlight on the previously nameless and destitute people so that they can develop the human spirit’s desire to overcome defeat, socio-political deprivation, and isolation.

Keywords: colonialism, postcoloniality, critics, socio-economic imbalances

Procedia PDF Downloads 153
11190 Cloud-Based Multiresolution Geodata Cube for Efficient Raster Data Visualization and Analysis

Authors: Lassi Lehto, Jaakko Kahkonen, Juha Oksanen, Tapani Sarjakoski

Abstract:

The use of raster-formatted data sets in geospatial analysis is increasing rapidly. At the same time, geographic data are being introduced into disciplines outside the traditional domain of geoinformatics, like climate change, intelligent transport, and immigration studies. These developments call for better methods to deliver raster geodata in an efficient and easy-to-use manner. Data cube technologies have traditionally been used in the geospatial domain for managing Earth Observation data sets that have strict requirements for effective handling of time series. The same approach and methodologies can also be applied in managing other types of geospatial data sets. A cloud service-based geodata cube, called GeoCubes Finland, has been developed to support online delivery and analysis of the most important geospatial data sets with national coverage. The main target group of the service is the academic research institutes in the country. The most significant aspects of the GeoCubes data repository include the use of multiple resolution levels, cloud-optimized file structure, and a customized, flexible content access API. Input data sets are pre-processed while being ingested into the repository to bring them into a harmonized form in aspects like georeferencing, sampling resolutions, spatial subdivision, and value encoding. All the resolution levels are created using an appropriate generalization method, selected depending on the nature of the source data set. Multiple pre-processed resolutions enable new kinds of online analysis approaches to be introduced. Analysis processes based on interactive visual exploration can be effectively carried out, as the level of resolution closest to the visual scale can always be used. In the same way, statistical analysis can be carried out on resolution levels that best reflect the scale of the phenomenon being studied. Access times remain close to constant, independent of the scale applied in the application.
The cloud service-based approach, applied in the GeoCubes Finland repository, enables analysis operations to be performed on the server platform, thus making high-performance computing facilities easily accessible. The developed GeoCubes API supports this kind of approach for online analysis. The use of cloud-optimized file structures in data storage enables the fast extraction of subareas. The access API allows for the use of vector-formatted administrative areas and user-defined polygons as definitions of subareas for data retrieval. Administrative areas of the country in four levels are readily available from the GeoCubes platform. In addition to direct delivery of raster data, the service also supports the so-called virtual file format, in which only a small text file is first downloaded. The text file contains links to the raster content on the service platform. The actual raster data is downloaded on demand, from the spatial area and resolution level required in each stage of the application. With the geodata cube approach, pre-harmonized geospatial data sets are made accessible to new categories of inexperienced users in an easy-to-use manner. At the same time, the multiresolution nature of the GeoCubes repository helps expert users introduce new kinds of interactive online analysis operations.
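
The idea of always reading the resolution level closest to the visual scale can be sketched as follows. This is an illustration only and does not use the GeoCubes API: the list of resolution levels, the function name, and the choice to measure closeness on a logarithmic scale are assumptions.

```python
import math

# Hypothetical pre-computed resolution levels, in metres per pixel.
LEVELS_M = [1, 2, 5, 10, 20, 50, 100, 200, 500, 1000]

def pick_resolution(levels_m, target_m):
    """Return the stored level closest to the requested ground resolution.

    Closeness is measured as a ratio (log scale), so 20 m and 50 m are
    compared with a 30 m target by how many times coarser or finer they are.
    """
    if target_m <= 0:
        raise ValueError("target resolution must be positive")
    return min(levels_m, key=lambda r: abs(math.log(r) - math.log(target_m)))

# A map view where one screen pixel covers about 30 m on the ground.
print(pick_resolution(LEVELS_M, 30))
# A country-wide overview at about 800 m per pixel.
print(pick_resolution(LEVELS_M, 800))
```

Since every level is pre-computed, the amount of data read for a given screen size stays roughly the same at any zoom, which is consistent with the near-constant access times reported in the abstract.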

Keywords: cloud service, geodata cube, multiresolution, raster geodata

Procedia PDF Downloads 133