Search results for: sentence analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 27986

27986 Sentence Variation in Academic Writing: A Contrastive Study of the Variation of Sentence Types between Male and Female ESL Writers

Authors: Fatima Muhammad Shitu

Abstract:

This paper focuses on the variation of sentence types in English academic writing. The major focus is on whether variation in sentence types can be attributed to linguistic factors and, above all, to the gender of the writers. The objective of this paper is to analyze the sentence types produced by male and female ESL writers and to determine whether writers vary the frequency and use of sentence types across the text depending on the rhetorical choices they make to construct identity. This study is hinged on the functionalist approach to analyzing academic writing in use. For the purpose of this study, a corpus of 20 academic papers was created and the use of sentence types was analyzed. The data for the study were collated using percentages: the occurrences of the different sentence types were counted and then converted to percentages for each group, i.e., male and female ESL writers. The results from these analyses were compared and contrasted in order to determine whether male and female ESL writers vary their sentence types and whether they employ the same or different sentence types in their texts. The conclusion is that male and female ESL writers not only vary their use of sentence types in academic writing but also differ from each other in doing so.
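
The percentage step described in this abstract can be illustrated with a minimal sketch. The function name `type_percentages` and the sample labels are hypothetical and invented for illustration; they are not the study's data or code.

```python
from collections import Counter


def type_percentages(labels):
    """Return {sentence_type: percentage of all sentences} for one group."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}  # no sentences, so there are no percentages to report
    return {t: round(100 * n / total, 2) for t, n in counts.items()}


# Hypothetical sentence-type labels for two groups (illustration only).
male = ["simple", "complex", "complex", "compound"]
female = ["simple", "simple", "complex", "compound-complex"]

for group, labels in (("male", male), ("female", female)):
    print(group, type_percentages(labels))
```

Comparing the two resulting dictionaries side by side is one simple way to contrast the groups; the abstract does not say which comparison procedure was actually used.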

Keywords: sentence variation, ESL, gender, academic writing

Procedia PDF Downloads 328
27985 The Controversy of the English Sentence and Its Teaching Implication

Authors: Franklin Uakhemen Ajogbor

Abstract:

The issue of the English sentence has remained controversial from Traditional Grammar to modern linguistics. The English sentence occupies the highest rank in the hierarchy of grammatical units. Its consideration is therefore necessary in learning English as a second language. Unfortunately, divergent views by grammarians on the concept of the English sentence have generated much controversy. There seems to be no unanimous agreement on what actually constitutes a sentence. Some schools of thought believe that a sentence must have a subject and a predicate, while others believe that it should not. The types of sentence according to structure are also not devoid of controversy, as the views of several linguists have not been properly harmonized. Findings have shown that previous linguists have not paid serious effort and attention to clearing up these ambiguities, which has negative implications for the learning and teaching of the English language. The variations in the concept of the English sentence have become particularly worrisome as a result of the widening patronage of English as a global language. The paper therefore investigates this controversy and suggests a solution to the problem. To this end, data were collected from students and scholars that show a lack of uniformity in what a sentence is. Using the Systemic Functional Model as its theoretical framework, the paper examines the views held by these various schools of thought with the aim of reconciling them and of opening up further research on what actually constitutes a sentence.

Keywords: traditional grammar, linguistics, controversy, sentence, grammatical units

Procedia PDF Downloads 295
27984 Linguistic Features for Sentence Difficulty Prediction in Aspect-Based Sentiment Analysis

Authors: Adrian-Gabriel Chifu, Sebastien Fournier

Abstract:

One of the challenges of natural language understanding is to deal with the subjectivity of sentences, which may express opinions and emotions that add layers of complexity and nuance. Sentiment analysis is a field that aims to extract and analyze these subjective elements from text, and it can be applied at different levels of granularity, such as document, paragraph, sentence, or aspect. Aspect-based sentiment analysis is a well-studied topic with many available datasets and models. However, there is no clear definition of what makes a sentence difficult for aspect-based sentiment analysis. In this paper, we explore this question by conducting an experiment with three datasets: “Laptops”, “Restaurants”, and “MTSC” (Multi-Target-dependent Sentiment Classification), and a merged version of these three datasets. We study the impact of domain diversity and syntactic diversity on difficulty. We use a combination of classifiers to identify the most difficult sentences and analyze their characteristics. We employ two ways of defining sentence difficulty. The first one is binary and labels a sentence as difficult if the classifiers fail to correctly predict the sentiment polarity. The second one is a six-level scale based on how many of the top five best-performing classifiers can correctly predict the sentiment polarity. We also define nine linguistic features that, combined, aim at estimating the difficulty at sentence level.
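
The two difficulty definitions above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, "the classifiers fail" is read here as "no classifier predicts the gold polarity", and the six-level scale is oriented so that a higher value means a harder sentence.

```python
def binary_difficulty(gold, predictions):
    """True (difficult) if no classifier predicts the gold polarity.

    Assumption: "the classifiers fail" means that all of them are wrong.
    """
    return all(p != gold for p in predictions)


def graded_difficulty(gold, top5_predictions):
    """Six-level difficulty, 0 to 5: how many of the five classifiers are wrong.

    Assumption: 0 means all five are correct (easiest) and 5 means none
    is correct (hardest); the abstract does not fix the orientation.
    """
    if len(top5_predictions) != 5:
        raise ValueError("expected predictions from exactly five classifiers")
    correct = sum(p == gold for p in top5_predictions)
    return 5 - correct


# Hypothetical predictions for one sentence whose gold polarity is "positive".
preds = ["positive", "positive", "negative", "neutral", "positive"]
print(binary_difficulty("positive", preds))   # at least one classifier is right
print(graded_difficulty("positive", preds))   # two of five classifiers are wrong
```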

Keywords: sentiment analysis, difficulty, classification, machine learning

Procedia PDF Downloads 89
27983 Correlation and Correspondence between Clause and Sentence: An In-Class Observation in Jazan University English Department Context

Authors: Mohammad Mozammel Haque

Abstract:

A clause is a sentence or a part of a sentence that has a subject and a principal verb; it may or may not express a complete thought. A sentence, by contrast, is an ordered group of words that expresses a complete thought. Clauses and sentences are interrelated. It is practically impossible to decide whether a sentence is simple, complex or compound without having an idea about clauses. Correspondingly, it is equally difficult to know whether a clause is main or subordinate without having an idea about sentences. It is even somewhat difficult for a teacher to teach sentences and clauses in a classroom separately or independently. When discussing types of sentences, the teacher must talk about clauses. Likewise, he/she must discuss sentences when teaching clauses in a classroom. This paper aims at discussing types of clauses and sentences in detail and showing their interrelationship. It also shows that it is necessary to discuss clauses when teaching sentences in the same class, and that students have trouble understanding the one without having at least a little idea about the other. Practical examples from the books selected for various skill courses in the English Department of Jazan University are also discussed in this paper.

Keywords: clause, correlation, dependent, independent, interrelationship, sentence

Procedia PDF Downloads 234
27982 A Sentence-to-Sentence Relation Network for Recognizing Textual Entailment

Authors: Isaac K. E. Ampomah, Seong-Bae Park, Sang-Jo Lee

Abstract:

Over the past decade, there have been promising developments in Natural Language Processing (NLP), with several investigations of approaches focusing on Recognizing Textual Entailment (RTE). These approaches include models based on lexical similarities, models based on formal reasoning, and, most recently, deep neural models. In this paper, we present a sentence encoding model that exploits sentence-to-sentence relation information for RTE. In terms of sentence modeling, convolutional neural networks (CNNs) and recurrent neural networks (RNNs) adopt different approaches. RNNs are known to be well suited for sequence modeling, whilst CNNs are suited for the extraction of n-gram features through their filters and can learn ranges of relations via the pooling mechanism. We combine these strengths of RNNs and CNNs to present a unified model for the RTE task. Our model combines relation vectors computed from the phrasal representation of each sentence with the final encoded sentence representations. First, we pass each sentence through a convolutional layer to extract a sequence of higher-level phrase representations, from which the first relation vector is computed. Second, the phrasal representation of each sentence from the convolutional layer is fed into a Bidirectional Long Short-Term Memory (Bi-LSTM) network to obtain the final sentence representations, from which a second relation vector is computed. The relation vectors are combined and then used in the same fashion as an attention mechanism over the Bi-LSTM outputs to yield the final sentence representations for classification. Experiments on the Stanford Natural Language Inference (SNLI) corpus suggest that this is a promising technique for RTE.

Keywords: deep neural models, natural language inference, recognizing textual entailment (RTE), sentence-to-sentence relation

Procedia PDF Downloads 348
27981 Sentence vs. Keyword Content Analysis in Intellectual Capital Disclosures Study

Authors: Martin Surya Mulyadi, Yunita Anwar, Rosinta Ria Panggabean

Abstract:

Major transformations in economic activity from an agricultural economy to a knowledge economy have led to an increasing focus on intellectual capital (IC), which has been characterized by continuous innovation, the spread of digital and communication technologies, and intangible and human factors. IC is defined as the possession of knowledge and experience, professional knowledge and skill, proper relationships and technological capacities, which, when applied, will give organizations a competitive advantage. All IC reporting and disclosure can be captured from the corporate annual report, as it is a communication device that allows a corporation to connect with various external and internal stakeholders. This study was conducted using sentence-content analysis of IC disclosure in the annual report. This research aims to analyze whether keyword-content analysis is a reliable research methodology for IC disclosure-related research.

Keywords: intellectual capital, intellectual capital disclosure, content analysis, annual report, sentence analysis, keyword analysis

Procedia PDF Downloads 367
27980 Easily Memorable Strong Password Generation and Retrieval

Authors: Shatadru Das, Natarajan Vijayarangan

Abstract:

In this paper, a system and method for generating and recovering an authorization code has been designed and analyzed. The system creates an authorization code by accepting a base-sentence from a user. Based on the characters present in this base-sentence, the system computes a base-sentence matrix. The system also generates a plurality of patterns. The user can either select the pattern from the multiple patterns suggested by the system or can create his/her own pattern. The system then performs multiplications between the base-sentence matrix and the selected pattern matrix at different stages in the path forward, for obtaining a strong authorization code. In case the user forgets the base sentence, the system has a provision to manage and retrieve 'forgotten authorization code'. This is done by fragmenting the base sentence into different matrices and storing the fragmented matrices into a repository after computing matrix multiplication with a security question-answer approach and with a secret key provided by the user.

Keywords: easy authentication, key retrieval, memorable passwords, strong password generation

Procedia PDF Downloads 400
27979 Syntactic Analyzer for Tamil Language

Authors: Franklin Thambi Jose.S

Abstract:

Computational linguistics is a branch of linguistics that brings together the computer and the levels of linguistic analysis. It is also described as a branch of language studies that applies computer techniques to the field of linguistics. In computational linguistics, Natural Language Processing plays an important role. It came into existence with the advent of information technology. In computational syntax, the syntactic analyser breaks a sentence into phrases and clauses and annotates the sentence with syntactic information. Tamil is one of the major Dravidian languages and has a written history of more than 2000 years. It is mainly spoken in Tamil Nadu (in India), Sri Lanka, Malaysia and Singapore, where it is also an official language. In Malaysia, Tamil-speaking people are considered an ethnic group. For this research, Tamil sentences are classified into four types, namely: 1. Main Sentence 2. Interrogative Sentence 3. Equational Sentence 4. Elliptical Sentence. In computational syntax, the first step is to provide the required information regarding the head and its constituents for each sentence. This information is incorporated into the system using programming languages. The system can then analyse a given sentence using the criteria or mechanisms given to it. The major objective of this paper is to provide the computer with the criteria or mechanisms needed to identify the basic types of sentences using a syntactic parser for the Tamil language.

Keywords: tamil, syntax, criteria, sentences, parser

Procedia PDF Downloads 517
27978 Hierarchical Tree Long Short-Term Memory for Sentence Representations

Authors: Xiuying Wang, Changliang Li, Bo Xu

Abstract:

A fixed-length feature vector is required for many machine learning algorithms in the NLP field. Word embeddings have been very successful at learning lexical information. However, they cannot capture the compositional meaning of sentences, which prevents them from supporting a deeper understanding of language. In this paper, we introduce a novel hierarchical tree long short-term memory (HTLSTM) model that learns vector representations for sentences of arbitrary syntactic type and length. We propose to split one sentence into three hierarchies: short phrase, long phrase and full sentence level. The HTLSTM model gives our algorithm the potential to fully consider the hierarchical information and long-term dependencies of language. We design experiments on both English and Chinese corpora to evaluate our model on the sentiment analysis task. The results show that our model significantly outperforms several existing state-of-the-art approaches.

Keywords: deep learning, hierarchical tree long short-term memory, sentence representation, sentiment analysis

Procedia PDF Downloads 349
27977 Semantic Textual Similarity on Contracts: Exploring Multiple Negative Ranking Losses for Sentence Transformers

Authors: Yogendra Sisodia

Abstract:

Researchers are becoming more interested in extracting useful information from legal documents thanks to the development of large-scale language models in natural language processing (NLP), and deep learning has accelerated the creation of powerful text mining models. Legal fields like contracts benefit greatly from semantic text search since it makes it quick and easy to find related clauses. After collecting sentence embeddings, it is relatively simple to locate sentences with a comparable meaning throughout the entire legal corpus. The author of this research investigated two pre-trained language models for this task: MiniLM and Roberta, and further fine-tuned them on Legal Contracts. The author used Multiple Negative Ranking Loss for the creation of sentence transformers. The fine-tuned language models and sentence transformers showed promising results.
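
For readers unfamiliar with the loss named in this abstract, the following is a minimal pure-Python sketch of the ranking loss with in-batch negatives as it is commonly formulated: for each anchor sentence, its paired sentence is the target and the other pairs' sentences in the batch act as negatives. The function names, the toy 2-dimensional embeddings, and the scale value of 20 are illustrative assumptions; the abstract does not give the author's settings.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy over a batch of (anchor, positive) embedding pairs.

    For anchor i, positives[i] is the correct match and every other
    positives[j] in the batch serves as a negative example.
    """
    if not anchors or len(anchors) != len(positives):
        raise ValueError("need equally sized, non-empty batches")
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


# Toy embeddings (hypothetical): correctly aligned pairs versus swapped pairs.
anchors = [[1.0, 0.0], [0.0, 1.0]]
print(multiple_negatives_ranking_loss(anchors, [[1.0, 0.0], [0.0, 1.0]]))
print(multiple_negatives_ranking_loss(anchors, [[0.0, 1.0], [1.0, 0.0]]))
```

The aligned batch yields a loss close to zero and the swapped batch a large one, which is the behaviour that fine-tuning relies on to pull matching clauses together in embedding space.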

Keywords: legal contracts, multiple negative ranking loss, natural language inference, sentence transformers, semantic textual similarity

Procedia PDF Downloads 107
27976 Generativism in Language Design and Their Effects on String of Constructions

Authors: Christian Uchechukwu Gilbert

Abstract:

Generativism in language design investigates the framework on which varying sentence structures are built in the English language. Propounded by Noam Chomsky in 1965, the theory transforms sentences from an active structure to a passive one by the application of its established rules. Resident in the body of syntax, the rules include movement, insertion, substitution, and deletion rules. Using the movement rule, the analysis adopts a qualitative research method, in which the works of scholars were consulted for further insight, in line with academic practice in research. The investigation showed that the rules of competent grammar explain the formulation of sentences in a language and how transformation takes place among sentences from a deep structure to a surface structure with accurate results. The transformations examined include the structural differences obtained through dative movement and deletion of the preposition; passivisation, derived from an active sentence by inserting the preposition “by”, a “be” verb, and the aspect marker “–en”, and held to reflect the creative aspect of language; and subject-auxiliary inversion, which exchanges the auxiliary of a sentence with its subject, thereby transforming a kernel sentence into a polar question, viewed as an external argument under θ-theory. Generativism in language design, therefore, changes available types of sentences and relates one form of linguistic category with others in language design.

Keywords: language, generate, transformation, structure, design

Procedia PDF Downloads 68
27975 Psychiatric/Psychological Issues in the Criminal Courts In Australia

Authors: Judge Paul Smith

Abstract:

This paper addresses the use and admissibility of psychiatric/psychological evidence in Australian courts. There have been different approaches in the courts to the acceptance of such expert evidence. The paper details how such expert evidence is admissible at trial and sentence. The methodology used is an examination of the decided cases and relevant legislative provisions which relate to the admission of such evidence. The major findings are that the evidence can be admissible if it is relevant to issues in a trial or sentence. The paper concludes that psychiatric/psychological evidence can be very useful and indeed may be essential at sentence or trial.

Keywords: criminal, law, psychological, evidence

Procedia PDF Downloads 53
27974 An Event-Related Potentials Study on the Processing of English Subjunctive Mood by Chinese ESL Learners

Authors: Yan Huang

Abstract:

The event-related potentials (ERPs) technique helps researchers to make continuous measures of the whole process of language comprehension, with an excellent temporal resolution at the level of milliseconds. Research on sentence processing has developed from the behavioral level to the neuropsychological level, which has brought about a variety of sentence processing theories and models. However, the applicability of these models to L2 learners is still under debate. Therefore, the present study aims to investigate the neural mechanisms underlying English subjunctive mood processing by Chinese ESL learners. To this end, English subject clauses in the subjunctive mood are used as the stimuli, all of which follow the same syntactic structure, “It is + adjective + that … + (should) do + …” In addition, in order to examine the role that language proficiency plays in L2 processing, this research deals with two groups of Chinese ESL learners (18 males and 22 females, mean age = 21.68), namely, a high proficiency group (Group H) and a low proficiency group (Group L). The behavioral and neurophysiological data analysis reveals the following findings: 1) Syntax and semantics interact with each other in the second phase (300-500 ms) of sentence processing, which is partially in line with the Three-phase Sentence Model; 2) Language proficiency does affect L2 processing. Specifically, for Group H, it is syntactic processing that plays the dominant role in sentence processing, while for Group L, semantic processing also affects syntactic parsing during the third phase of sentence processing (500-700 ms). Moreover, Group H, compared to Group L, demonstrates a richer native-like ERP pattern, which further demonstrates the role of language proficiency in L2 processing. Based on the research findings, this paper also offers some implications for L2 pedagogy as well as L2 proficiency assessment.

Keywords: Chinese ESL learners, English subjunctive mood, ERPs, L2 processing

Procedia PDF Downloads 131
27973 Multi-Level Attentional Network for Aspect-Based Sentiment Analysis

Authors: Xinyuan Liu, Xiaojun Jing, Yuan He, Junsheng Mu

Abstract:

Aspect-based Sentiment Analysis (ABSA) has attracted much attention due to its capacity to determine the sentiment polarity of a given aspect in a sentence. Previous works have shown the great significance of the interaction between aspect and sentence in ABSA. Consequently, a Multi-Level Attentional Network (MLAN) is proposed. MLAN consists of four parts: Embedding Layer, Encoding Layer, Multi-Level Attentional (MLA) Layers and Final Prediction Layer. Among these parts, the MLA Layers, which include the Aspect Level Attentional (ALA) Layer and the Interactive Attentional (ILA) Layer, are the innovation of MLAN; their function is to focus on the important information and obtain attention-weighted representations of aspect and sentence at multiple levels. In the experiments, MLAN is compared with the classical TD-LSTM, MemNet, RAM, ATAE-LSTM, IAN, AOA, LCR-Rot and AEN-GloVe models on the SemEval 2014 dataset. The experimental results show that MLAN greatly outperforms those state-of-the-art models. In a case study, the ALA Layer and the ILA Layer are shown to be effective and interpretable.

Keywords: deep learning, aspect-based sentiment analysis, attention, natural language processing

Procedia PDF Downloads 138
27972 Consequences of Sentence on Children's Socialization: Exploratory Study of Criminal Women of Punjab, Pakistan

Authors: Muhammad Shabbir

Abstract:

This paper examines the effects of women's criminal sentences on the socialization of their children in the Pakistani context. The objectives of the study are to find out the socio-psychological and cultural effects of the jail environment on the children, to examine the behavior of sentenced women towards their children, and to analyze the facilities provided by the jail authorities for the socialization of the women. Quantitative variables and qualitative thematic variables, derived from opinions gathered through an open-ended questionnaire, were collected and analyzed using the Statistical Package for the Social Sciences (SPSS). It was found that the sentencing of women shatters the socialization process of their children, which commonly leads them to criminality. The government should review the current sentencing policies with a view to improving them. For this purpose, the idea of socialization centers would be a healthy initiative.

Keywords: socialization, criminal women, sentence, socio-psychological and cultural

Procedia PDF Downloads 219
27971 The Patterns of Cross-Sentence: An Event-Related Potential Study of Mathematical Word Problem

Authors: Tien-Ching Yao, Ching-Ching Lu

Abstract:

Understanding human language processing is one of the main challenges of current cognitive neuroscience. The aim of the present study was to use a sentence decision task combined with event-related potentials (ERPs) to investigate the psychological reality of "cross-sentence patterns." To this end, we take math word problems as the experimental materials and use the ERP P600 component for verification. The experimental material consisted of 200 math word problems in three conditions (multiplication word problems, division word problems type 1, and division word problems type 2). Eighteen native Mandarin speakers participated in the ERP study (14 of whom were female). The grand average waveforms suggest a late posterior positivity at around 500-900 ms. These findings were tested statistically using repeated measures ANOVAs on the component elicited by the different question types. The results suggest that the three conditions differ significantly (p < 0.05) in mean amplitude, latency, and peak amplitude. The result showed the characteristic timing and posterior scalp distribution of a P600 effect. We interpreted these characteristic responses as the psychological reality of "cross-sentence patterns." These results provide insights into sentence processing issues in linguistic theory and psycholinguistic models of language processing and advance our understanding of how people make sense of information during language comprehension.

Keywords: language processing, sentence comprehension, event-related potentials, cross-sentence patterns

Procedia PDF Downloads 148
27970 Sentence Structure for Free Word Order Languages in Context with Anaphora Resolution: A Case Study of Hindi

Authors: Pardeep Singh, Kamlesh Dutta

Abstract:

Some languages have a fixed sentence structure, while others have free word order. The accuracy of syntax-based anaphora resolution algorithms depends on the structure of the sentence, so it is important to analyze the structure of a language before implementing these algorithms. In this study, we analyzed sentence structure by exploiting the case markers in Hindi as well as some special tags for subject and object. We also investigated word order in Hindi. Word order typology refers to the study of the order of the syntactic constituents of a language. We analyzed 165 news items of Ranchi Express from the EMILLE corpus of plain text, consisting of 1745 sentences. Eight dialogue-based files from the same corpus, containing 1521 sentences, were also analyzed. The percentages of subject-object-verb (SOV) and object-subject-verb (OSV) structures are 66.90 and 33.10, respectively.

Keywords: anaphora resolution, free word order languages, SOV, OSV

Procedia PDF Downloads 472
27969 There Is No Meaningful Opportunity in Meaningless Data: Why It Is Unconstitutional to Use Life Expectancy Tables in Post-Graham Sentences

Authors: Stacie Nelson Colling, Adele Cummings

Abstract:

The United States Supreme Court recently announced that it is unconstitutional to sentence a child to life without parole for non-homicide offenses, and that each child so situated must be afforded a meaningful opportunity for release from prison in his lifetime. The Court also declared that it is unconstitutional to impose a mandatory sentence of life without parole on a child for homicide offenses. Across the United States, attorneys and advocates continue to litigate issues surrounding the implementation of these legal principles. Some states have held that any sentence to a finite term of years, no matter how long, is not the same as ‘life’ and therefore does not violate the constitution. Other states have held that a sentence to a term of years that is less than the expected life of that particular child is not unconstitutional. In Colorado, the courts have routinely looked to life expectancy estimates from governmental organizations to determine how long a particular child is expected to live. They then compare that to the date on which the child is expected to be eligible for parole, and if the child is expected to still be living when he is eligible for parole, the sentence is deemed constitutional. This paper argues that it is inappropriate, reckless, unconstitutional and not scientifically sound to use such estimates in determining whether a child will have a meaningful opportunity for release from prison and life outside of prison before he dies. This paper argues that the opportunity for release must mean more than a probability that a child will be released before his death, and that it must include an opportunity for a meaningful life outside of prison (not just the opportunity to be released and then die on the outside). The paper further argues that life expectancy estimates cannot guide a court or a legislature in determining whether a sentence is or is not constitutional.

Keywords: life without parole, life expectancy, juvenile sentencing, meaningful opportunity for release from prison

Procedia PDF Downloads 394
27968 Differences in the Processing of Sentences with Lexical Ambiguity and Structural Ambiguity: An Experimental Study

Authors: Mariana T. Teixeira, Joana P. Luz

Abstract:

This paper is based on assumptions of psycholinguistics and investigates the processing of ambiguous sentences in Brazilian Portuguese. Specifically, it aims to verify if there is a difference in processing time between sentences with lexical ambiguity and sentences with structural (or syntactic) ambiguity. We hypothesize, based on the Garden Path Theory, that the two types of ambiguity entail different cognitive efforts, since sentences with structural ambiguity require that two structures be processed, whereas ambiguous phrases whose root of ambiguity is in a word require the processing of a single structure, which admits a variation of punctual meaning, within the scope of only one lexical item. In order to test this hypothesis, 25 undergraduate students, whose average age was 27.66 years, native speakers of Brazilian Portuguese, performed a self-monitoring reading task of ambiguous sentences, which had lexical and structural ambiguity. The results suggest that unambiguous sentence processing is faster than ambiguous sentence processing, whether it has lexical or structural ambiguity. In addition, participants presented a mean reading time greater for sentences with syntactic ambiguity than for sentences with lexical ambiguity, evidencing a greater cognitive effort in sentence processing with structural ambiguity.

Keywords: Brazilian portuguese, lexical ambiguity, sentence processing, syntactic ambiguity

Procedia PDF Downloads 228
27967 Verb Bias in Mandarin: The Corpus Based Study of Children

Authors: Jou-An Chung

Abstract:

The purpose of this study is to investigate the verb bias of Mandarin verbs in children’s reading materials and to provide criteria for categorization. Verb bias varies cross-linguistically. As Mandarin and English are typologically different, this study hopes to shed light on Mandarin verb bias through the use of a corpus and to provide thorough and detailed criteria for analysis. Moreover, this study focuses on children’s reading materials, since they are significant for understanding children’s sentence processing. Investigating the verb bias of Mandarin verbs in children’s reading materials can therefore provide further insights into children’s sentence processing. A small corpus was built for this study. The corpus consists of a collection of school textbooks and Mandarin Daily News for children. The files were then segmented and POS-tagged with JiebaR (Chinese segmentation with R). For ease of analysis, single-character verbs and intransitive verbs were excluded beforehand. A total of 20 high-frequency verbs were hand-coded and further categorized into one of three types, namely DO type, SC type and Other type. If the frequency with which a verb takes the Other type exceeds the threshold of 25%, the verb is excluded from the study. The results show that 10 verbs are direct object (DO) bias verbs and six verbs are sentential complement (SC) bias verbs. A paired t-test was done to confirm statistical significance (p = 0.0001062 for DO bias verbs, p = 0.001149 for SC bias verbs). The results show that in children’s reading materials, DO bias verbs are used more than SC bias verbs, since the simplest sentence structure is easier for children’s sentence comprehension or processing. In sum, this study not only discussed verb bias in children’s reading materials but also provided basic coding criteria for verb bias analysis in Mandarin and underscored the role of context.
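
The coding scheme described in this abstract can be sketched as a small classification function. The 25% exclusion threshold for the Other type comes from the abstract; the rule that assigns DO or SC bias by simple majority is an assumption made for illustration, as are the function name and the sample counts.

```python
def classify_verb_bias(do_count, sc_count, other_count, other_threshold=0.25):
    """Classify one verb from its hand-coded complement counts.

    The exclusion rule (Other type above 25%) follows the abstract.
    Assumption: among the remaining verbs, the more frequent of the DO
    and SC types decides the bias; the abstract does not state this rule.
    """
    total = do_count + sc_count + other_count
    if total == 0:
        raise ValueError("verb has no coded occurrences")
    if other_count / total > other_threshold:
        return "excluded"
    if do_count > sc_count:
        return "DO-bias"
    if sc_count > do_count:
        return "SC-bias"
    return "no-bias"


# Hypothetical counts (direct object, sentential complement, other).
print(classify_verb_bias(60, 30, 10))  # mostly direct objects
print(classify_verb_bias(20, 70, 10))  # mostly sentential complements
print(classify_verb_bias(10, 50, 40))  # Other type is 40% of occurrences
```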

Keywords: corpus linguistics, verb bias, child language, psycholinguistics

Procedia PDF Downloads 291
27966 Characteristic Sentence Stems in Academic English Texts: Definition, Identification, and Extraction

Authors: Jingjie Li, Wenjie Hu

Abstract:

Phraseological units in academic English texts have been a central focus in recent corpus linguistic research. A wide variety of phraseological units have been explored, including collocations, chunks, lexical bundles, patterns, semantic sequences, etc. This paper describes a special category of clause-level phraseological units, namely, Characteristic Sentence Stems (CSSs), with a view to describing their defining criteria and extraction method. CSSs are contiguous lexico-grammatical sequences which contain a subject-predicate structure and which are frame expressions characteristic of academic writing. The extraction of CSSs consists of six steps: Part-of-speech tagging, n-gram segmentation, structure identification, significance of occurrence calculation, text range calculation, and overlapping sequence reduction. Significance of occurrence calculation is the crux of this study. It includes the computing of both the internal association and the boundary independence of a CSS and tests the occurring significance of the CSS from both inside and outside perspectives. A new normalization algorithm is also introduced into the calculation of LocalMaxs for reducing overlapping sequences. It is argued that many sentence stems are so recurrent in academic texts that the most typical of them have become the habitual ways of making meaning in academic writing. Therefore, studies of CSSs could have potential implications and reference value for academic discourse analysis, English for Academic Purposes (EAP) teaching and writing.
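Two of the six steps listed above, n-gram segmentation and text range calculation, can be illustrated with a short sketch. It assumes the texts are already tokenized, and the function names are hypothetical. It does not implement the association measures, the boundary independence test, or the LocalMaxs normalization mentioned in the abstract.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def frequency_and_range(texts, n):
    """Count how often each n-gram occurs and in how many texts it occurs.

    `texts` is a list of token lists, one per text.
    Returns (frequency, text_range), both Counter objects.
    """
    frequency = Counter()
    text_range = Counter()
    for tokens in texts:
        grams = ngrams(tokens, n)
        frequency.update(grams)        # every occurrence counts
        text_range.update(set(grams))  # each text counts at most once
    return frequency, text_range
```

Candidate stems could then be filtered by a minimum text range so that sequences confined to a single text are dropped; any such cutoff would be a design choice, as the abstract gives none.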

Keywords: characteristic sentence stem, extraction method, phraseological unit, the statistical measure

Procedia PDF Downloads 166
27965 A Corpus-Based Study on the Styles of Three Translators

Authors: Wang Yunhong

Abstract:

The present paper is concerned with the different styles of three translators in translating the classical Chinese novel Shuihu Zhuan. Based on a parallel corpus, it adopts a target-oriented approach to examine whether the three translations reveal stylistic differences and shifts, and what those differences are. The findings show that the three translators demonstrate different styles in their word choices and sentence preferences, which implies that identifying recurrent textual patterns may be a basic step in investigating the style of a translator.

Keywords: corpus, lexical choices, sentence characteristics, style

Procedia PDF Downloads 268
27964 Memory Retrieval and Implicit Prosody during Reading: Anaphora Resolution by L1 and L2 Speakers of English

Authors: Duong Thuy Nguyen, Giulia Bencini

Abstract:

The present study examined the effects of structural and prosodic factors on the computation of antecedent-reflexive relationships and sentence comprehension in native English speakers (L1) and Vietnamese-English bilinguals (L2). Participants read sentences presented on the computer screen in one of three presentation formats aimed at manipulating prosodic parsing: word-by-word (RSVP), phrase-segment (self-paced), or whole-sentence (self-paced), then completed a grammaticality rating and a comprehension task (following Pratt & Fernandez, 2016). The design crossed three factors: syntactic structure (simple; complex), grammaticality (target-match; target-mismatch) and presentation format. An example item is provided in (1): (1) The actress that (Mary/John) interviewed at the awards ceremony (about two years ago/organized outside the theater) described (herself/himself) as an extreme workaholic. Results showed that overall, both L1 and L2 speakers made use of a good-enough processing strategy at the expense of more detailed syntactic analyses. L1 and L2 speakers’ comprehension and grammaticality judgements were negatively affected by the most prosodically disrupting condition (word-by-word). However, the two groups demonstrated differences in their performance in the other two reading conditions. For L1 speakers, the whole-sentence and the phrase-segment formats were both facilitative in the grammaticality rating and comprehension tasks; for L2 speakers, compared with the whole-sentence condition, the phrase-segment paradigm did not significantly improve accuracy or comprehension. These findings are consistent with those of Pratt & Fernandez (2016), who found a similar pattern of results in the processing of subject-verb agreement relations using the same experimental paradigm and prosodic manipulation with English L1 and L2 English-Spanish speakers. 
The results provide further support for a Good-Enough cue model of sentence processing that integrates cue-based retrieval and implicit prosodic parsing (Pratt & Fernandez, 2016) and highlight similarities and differences between L1 and L2 sentence processing and comprehension.

Keywords: anaphora resolution, bilingualism, implicit prosody, sentence processing

Procedia PDF Downloads 152
27963 Methodologies for Deriving Semantic Technical Information Using an Unstructured Patent Text Data

Authors: Jaehyung An, Sungjoo Lee

Abstract:

Patent documents constitute an up-to-date and reliable source of knowledge reflecting technological advances, so patent analysis has been widely used to identify technological trends and formulate technology strategies. However, identifying technological information from patent data entails limitations such as high cost, complexity, and inconsistency, because it relies on experts’ knowledge. To overcome these limitations, researchers have applied quantitative analysis based on the keyword technique. This method can capture the technological implications of patent documents or extract keywords that indicate their important contents. However, it uses only simple counting of keyword frequency, so it cannot take into account the semantic relationships among keywords or semantic information such as how technologies are used in their technology area and how they affect other technologies. To automatically analyze unstructured technological information in patents and extract semantic information, the text should be transformed into an abstracted form that includes the key technological concepts. The ‘SAO’ (subject, action, object) sentence structure has recently emerged as a way of representing such ‘key concepts’ and can be extracted by NLP (natural language processing). An SAO structure can be organized in a problem-solution format if the action-object (AO) states the problem and the subject (S) forms the solution. In this paper, we propose a new methodology that extracts SAO structures through rules for extracting technical elements. Although sentence structures in patent text have a unique format, prior studies have depended on general NLP tools applied to common documents such as newspapers, research papers, and Twitter posts, so they cannot take into account the sentence structure types specific to patent documents. 
To overcome this limitation, we identified a unique form of patent sentences and defined the SAO structures in the patent text data. There are four types of technical elements: technology adoption purpose, application area, tool for technology, and technical components. Each of these four types has its own specific word structure, defined by the location or sequence of parts of speech in the sentence. Finally, we developed algorithms for extracting SAOs, and the results offer insight into the technology innovation process by providing different perspectives on technology.
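As a rough illustration of rule-based SAO extraction from part-of-speech sequences, the sketch below takes a sentence that has already been POS-tagged and returns one (subject, action, object) triple. The single rule used here (nearest noun before a verb, the verb, first noun after it) and the tag names "NOUN" and "VERB" are assumptions chosen for the example; the patent-specific rules of the paper are not given in the abstract and are not reproduced.

```python
def extract_sao(tagged):
    """Return the first (subject, action, object) triple, or None.

    `tagged` is a list of (word, tag) pairs. Hypothetical rule:
    subject = nearest NOUN before a VERB, object = first NOUN after it.
    """
    for i, (word, tag) in enumerate(tagged):
        if tag != "VERB":
            continue
        # Search backwards for the closest preceding noun.
        subject = next(
            (w for w, t in reversed(tagged[:i]) if t == "NOUN"), None
        )
        # Search forwards for the first following noun.
        obj = next((w for w, t in tagged[i + 1:] if t == "NOUN"), None)
        if subject is not None and obj is not None:
            return (subject, word, obj)
    return None
```

For a tagged sentence such as "the sensor detects the vibration", this rule yields ("sensor", "detects", "vibration"), which in the problem-solution reading treats "detects vibration" as the problem and "sensor" as the solution.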

Keywords: NLP, patent analysis, SAO, semantic-analysis

Procedia PDF Downloads 262
27962 Network Word Discovery Framework Based on Sentence Semantic Vector Similarity

Authors: Ganfeng Yu, Yuefeng Ma, Shanliang Yang

Abstract:

Word discovery is a key problem in text information retrieval technology. Methods for new word discovery tend to be closely tied to words because they generally obtain new word results by analyzing words. With the popularity of social networks, individual netizens and online self-media have generated various network texts for the convenience of online life, including network words that are far from standard Chinese expression. How to detect network words is one of the important goals in the field of text information retrieval today. In this paper, we integrate the word embedding model and clustering methods to propose a network word discovery framework based on sentence semantic similarity (S³-NWD) to detect network words effectively from the corpus. This framework constructs sentence semantic vectors through a distributed representation model, uses the similarity of sentence semantic vectors to determine the semantic relationship between sentences, and finally realizes network word discovery by means of semantic replacement between sentences. The experiment verifies that the framework not only discovers network words rapidly but also recovers the standard-word meaning of the discovered network words, which reflects the effectiveness of our work.
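The sentence-similarity step can be sketched as follows, under the assumption that a sentence vector is the mean of its word vectors and that similarity is measured by cosine. The abstract says only that a distributed representation model is used, so both choices, along with the function names and the `embeddings` lookup table, are illustrative rather than a description of S³-NWD.

```python
import math


def sentence_vector(tokens, embeddings):
    """Average the vectors of the tokens found in `embeddings`.

    `embeddings` maps a word to a list of floats (a hypothetical
    lookup table). Returns None if no token has a vector.
    """
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

Two sentences with high cosine similarity that differ in a single position would then be candidates for the semantic replacement step, in which the unknown token in one sentence is aligned with the standard word in the other.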

Keywords: text information retrieval, natural language processing, new word discovery, information extraction

Procedia PDF Downloads 95
27961 A Comparison between Bèi Passives and Yóu Passives in Mandarin Chinese

Authors: Rui-heng Ray Huang

Abstract:

This study compares the syntax and semantics of two kinds of passives in Mandarin Chinese: bèi passives and yóu passives. To express a Chinese equivalent for ‘The thief was taken away by the police,’ either bèi or yóu can be used, as in Xiǎotōu bèi/yóu jǐngchá dàizǒu le. It is shown in this study that bèi passives and yóu passives differ semantically and syntactically. The semantic observations are based on the theta theory, dealing with thematic roles. On the other hand, the syntactic analysis draws heavily upon the generative grammar, looking into thematic structures. The findings of this study are as follows. First, the core semantics of bèi passives is centered on the Patient NP in the subject position. This Patient NP is essentially an Affectee, undergoing the outcome or consequence brought up by the action represented by the predicate. This may explain why in the sentence Wǒde huà bèi/*yóu tā niǔqū le ‘My words have been twisted by him/her,’ only bèi is allowed. This is because the subject NP wǒde huà ‘my words’ suffers a negative consequence. Yóu passives, in contrast, place the semantic focus on the post-yóu NP, which is not an Affectee though. Instead, it plays a role which has to take certain responsibility without being affected in a way like an Affectee. For example, in the sentence Zhèbù diànyǐng yóu/*bèi tā dānrèn dǎoyǎn ‘This film is directed by him/her,’ only the use of yóu is possible because the post-yóu NP tā ‘s/he’ refers to someone in charge, who is not an Affectee, nor is the sentence-initial NP zhèbù diànyǐng ‘this film’. When it comes to the second finding, the syntactic structures of bèi passives and yóu passives differ in that the former involve a two-place predicate while the latter a three-place predicate. 
The passive morpheme bèi in a case like Xiǎotōu bèi jǐngchá dàizǒu le ‘The thief was taken away by the police’ has been argued by some Chinese syntacticians to be a two-place predicate which selects an Experiencer subject and an Event complement. Under this analysis, the initial NP xiǎotōu ‘the thief’ in the above example is a base-generated subject. This study, however, proposes that yóu passives fall into a three-place unergative structure. In the sentence Xiǎotōu yóu jǐngchá dàizǒu le ‘The thief was taken away by the police,’ the initial NP xiǎotōu ‘the thief’ is a topic which serves as a Patient taken by the verb dàizǒu ‘take away.’ The subject of the sentence is assumed to be an Agent, which is in a null form and may find its reference from the discourse or world knowledge. Regarding the post-yóu NP jǐngchá ‘the police,’ its status is dual. On the one hand, it is a Patient introduced by the light verb yóu; on the other, it is an Agent assigned by the verb dàizǒu ‘take away.’ It is concluded that the findings in this study contribute to better understanding of what makes the distinction between the two kinds of Chinese passives.

Keywords: affectee, passive, patient, unergative

Procedia PDF Downloads 273
27960 Benchmarking Bert-Based Low-Resource Language: Case Uzbek NLP Models

Authors: Jamshid Qodirov, Sirojiddin Komolov, Ravilov Mirahmad, Olimjon Mirzayev

Abstract:

Nowadays, natural language processing tools play a crucial role in our daily lives, including various text processing techniques. Very advanced models exist for languages such as English and Russian. For some languages, such as Uzbek, however, NLP models have been developed only recently, so only a few NLP models are available for the Uzbek language. Moreover, no existing work shows how the Uzbek NLP models behave in different situations or when to use each of them. This work tries to close this gap and compares the Uzbek NLP models that existed at the time this article was written. The authors compare the NLP models in two different scenarios: sentiment analysis and sentence similarity, which are instances of the two most common problems in the industry: classification and similarity. Another outcome of this work is two datasets, for classification and sentence similarity in the Uzbek language, which we generated ourselves and which can be useful in both industry and academia.

Keywords: NLP, benchmark, BERT, vectorization

Procedia PDF Downloads 54
27959 Aspect-Level Sentiment Analysis with Multi-Channel and Graph Convolutional Networks

Authors: Jiajun Wang, Xiaoge Li

Abstract:

The purpose of the aspect-level sentiment analysis task is to identify the sentiment polarity of aspects in a sentence. Currently, most methods mainly focus on using neural networks and attention mechanisms to model the relationship between aspects and context, but they ignore the dependence of words in different ranges in the sentence, resulting in deviation when assigning relationship weights to words other than aspect words. To solve these problems, we propose a new aspect-level sentiment analysis model that combines a multi-channel convolutional network and a graph convolutional network (GCN). Firstly, the context and the degree of association between words are characterized by Long Short-Term Memory (LSTM) and a self-attention mechanism. Besides, a multi-channel convolutional network is used to extract the features of words in different ranges. Finally, a graph convolutional network is used to associate the node information of the dependency tree structure. We conduct experiments on four benchmark datasets. The experimental results are compared with those of other models and show that our model is better and more effective.

Keywords: aspect-level sentiment analysis, attention, multi-channel convolution network, graph convolution network, dependency tree

Procedia PDF Downloads 218
27958 Syntax-Related Problems of Translation

Authors: Anna Kesoyan

Abstract:

The present paper deals with the syntax-related problems of translation from English into Armenian. Although syntax is a part of grammar, syntax-related problems of translation are studied separately during the process of translation. Translation from one language to another is widely accepted as a challenging problem. This becomes even more challenging when the source and target languages are widely different in structure and style, as is the case with English and Armenian. Syntax-related problems of translation from English into Armenian are mainly connected with the syntactical structures of these languages, and particularly with the word order of the sentence. The word order of the sentence in Armenian, which is a synthetic language, is usually characterized as “rather free”, and the word order of English, which is an analytical language, is characterized as “fixed”. This research examines the main means of translation, particularly syntactical transformations, as the translator has to take concrete steps when trying to solve certain syntax-related problems. Most of the means of translation are based on the transformation of grammatical components of the sentence, without changing the main information of the text. Several transformations occur during translation, such as changes to the word order of the sentence and transformations of certain grammatical constructions like the Infinitive participial construction, the Nominative with the Infinitive, and Elliptical constructions, all of which are covered in this research.

Keywords: elliptical constructions, nominative with the infinitive constructions, fixed and free word order, syntactic structures

Procedia PDF Downloads 453
27957 Russian Spatial Impersonal Sentence Models in Translation Perspective

Authors: Marina Fomina

Abstract:

The paper focuses on the category of semantic subject within the framework of a functional approach to linguistics. The semantic subject is related to similar notions such as the grammatical subject and the bearer of predicative feature. It is the multifaceted nature of the category of subject that 1) triggers a number of issues that, syntax-wise, remain to be dealt with (cf. semantic vs. syntactic functions / sentence parts vs. parts of speech issues, etc.); 2) results in a variety of approaches to the category of subject, such as formal grammatical, semantic/syntactic (functional), communicative approaches, etc. Many linguists consider the prototypical approach to the category of subject to be the most instrumental as it reveals the integrity of denotative and linguistic components of the conceptual category. This approach relates to subject as a source of non-passive predicative feature, an element of subject-predicate-object situation that can take on a variety of semantic roles, cf.: 1) an agent (He carefully surveyed the valley stretching before him), 2) an experiencer (I feel very bitter about this), 3) a recipient (I received this book as a gift), 4) a causee (The plane broke into three pieces), 5) a patient (This stove cleans easily), etc. It is believed that the variety of roles stems from the radial (prototypical) structure of the category with some members more central than others. Translation-wise, the most “treacherous” subject types are the peripheral ones. The paper 1) features a peripheral status of spatial impersonal sentence models such as U menia v ukhe zvenit (lit. I-Gen. in ear buzzes) within the category of semantic subject, 2) makes a structural and semantic analysis of the models, 3) focuses on their Russian-English translation patterns, 4) reveals non-prototypical features of subjects in the English equivalents.

Keywords: bearer of predicative feature, grammatical subject, impersonal sentence model, semantic subject

Procedia PDF Downloads 370