Search results for: H. Noun

17 Determining the Gender of Korean Names for Pronoun Generation

Authors: Seong-Bae Park, Hee-Geun Yoon

Abstract:

It is an important task in Korean-English machine translation to classify the gender of names correctly. When a sentence is composed of two or more clauses and only one subject is given as a proper noun, it is important to find the gender of the proper noun for correct translation of the sentence. This is because a singular pronoun has a gender in English while it does not in Korean. Thus, in Korean-English machine translation, the gender of a proper noun should be determined. More generally, this task can be expanded into the classification of the general Korean names. This paper proposes a statistical method for this problem. By considering a name as just a sequence of syllables, it is possible to get a statistics for each name from a collection of names. An evaluation of the proposed method yields the improvement in accuracy over the simple looking-up of the collection. While the accuracy of the looking-up method is 64.11%, that of the proposed method is 81.49%. This implies that the proposed method is more plausible for the gender classification of the Korean names.

Keywords: machine translation, natural language processing, gender of proper nouns, statistical method

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2366

16 Promoting Open Educational Resources (OER) in Theological/Religious Education in Nigeria

Authors: Miracle Ajah

Abstract:

One of the biggest challenges facing Theological/ Religious Education in Nigeria is access to quality learning materials. For instance at the Trinity (Union) Theological College, Umuahia, it was difficult for lecturers to access suitable and qualitative materials for instruction especially the ones that would suit the African context and stimulate a deep rooted interest among the students. Some textbooks written by foreign authors were readily available in the School Library, but were lacking in the College bookshops for students to own copies. Even when the College was able to order some of the books from abroad, it did not usher in the needed enthusiasm expected from the students because they were either very expensive or very difficult to understand during private studies. So it became necessary to develop contextual materials which were affordable and understandable, though with little success. The National Open University of Nigeria (NOUN)’s innovation in the development and sharing of learning resources through its Open Courseware is a welcome development and of great assistance to students. Apart from NOUN students who could easily access the materials, many others from various theological/religious institutes across the nation have benefited immensely. So, the thesis of this paper is that the promotion of open educational resources in theological/religious education in Nigeria would facilitate a better informed/equipped religious leadership, which would in turn impact its adherents for a healthier society and national development. Adopting a narrative and historical approach within the context of Nigeria’s educational system, the paper discusses: educational traditions in Nigeria; challenges facing theological/religious education in Nigeria; and benefits of open educational resources. The study goes further to making recommendations on how OER could positively influence theological/religious education in Nigeria. It is expected that theologians, religious educators, and ODL practitioners would find this work very useful.

Keywords: Nigeria, OER, religious education, theological education.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2408

15 Quranic Braille System

Authors: Abdallah M. Abualkishik, Khairuddin Omar

Abstract:

This article concerned with the translation of Quranic verses to Braille symbols, by using Visual basic program. The system has the ability to translate the special vibration for the Quran. This study limited for the (Noun + Scoon) vibrations. It builds on an existing translation system that combines a finite state machine with left and right context matching and a set of translation rules. This allows to translate the Arabic language from text to Braille symbols after detect the vibration for the Quran verses.

Keywords: Braille, Quran vibration, Finite State Machine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3060

14 Error Analysis of English Inflection among Thai University Students

Authors: Suwaree Yordchim, Toby J. Gibbs

Abstract:

The linguistic competence of Thai university students majoring in Business English was examined in the context of knowledge of English language inflection, and also various linguistic elements. Errors analysis was applied to the results of the testing. Levels of errors in inflection, tense and linguistic elements were shown to be significantly high for all noun, verb and adjective inflections. Findings suggest that students do not gain linguistic competence in their use of English language inflection, because of interlanguage interference. Implications for curriculum reform and treatment of errors in the classroom are discussed.

Keywords: Interlanguage, error analysis, inflection, second language acquisition, Thai students.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3629

13 The Impact of Grammatical Differences on English-Mandarin Chinese Simultaneous Interpreting

Authors: Miao Sabrina Wang

Abstract:

This paper examines the impact of grammatical differences on simultaneous interpreting from English into Mandarin Chinese by drawing upon an empirical study of professional and student interpreters. The research focuses on the effects of three grammatical categories including passives, adverbial components and noun phrases on simultaneous interpreting. For each category, interpretations of instances in which the grammatical structures are the same across the two languages are compared with interpretations of instances in which the grammatical structures differ across the two languages in terms of content accuracy and delivery appropriateness. The results indicate that grammatical differences have a significant impact on the interpreting performance of both professionals and students.

Keywords: Grammatical differences, simultaneous interpreting, content accuracy, delivery appropriateness.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1733

12 Approach for a Safety Element out of Context for an Actuator Circuit Control Module

Authors: H. Noun, C. Urban-Seelmann, M. Abdelfattah, G. Zeller, G. Rajesh, I. Mozgova, R. Lachmayer

Abstract:

Several modules in automotive are usually modified and adapted for various project-specific applications. Due to a standardized safety concept a high reusability is accessible. A safety element out of context (SEooC) according to ISO 26262 can be a suitable approach. Based on the same safety concept and analysis, common modules can reach high reusability. For developing according to a module out of context, an appropriate and detailed development approach is required. This paper shows how to deduce this development processes for platform modules. Therefore, the detailed approach of the SEooC is derived. The aim is to create a detailed workflow for all phases of the development and integration of any kind of system modules. As an application example, an automotive project for an actuator control module is considered.

Keywords: Functional Safety, Safety Element out of Context, System Engineering, Hardware Engineering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 379

11 A Text Clustering System based on k-means Type Subspace Clustering and Ontology

Authors: Liping Jing, Michael K. Ng, Xinhua Yang, Joshua Zhexue Huang

Abstract:

This paper presents a text clustering system developed based on a k-means type subspace clustering algorithm to cluster large, high dimensional and sparse text data. In this algorithm, a new step is added in the k-means clustering process to automatically calculate the weights of keywords in each cluster so that the important words of a cluster can be identified by the weight values. For understanding and interpretation of clustering results, a few keywords that can best represent the semantic topic are extracted from each cluster. Two methods are used to extract the representative words. The candidate words are first selected according to their weights calculated by our new algorithm. Then, the candidates are fed to the WordNet to identify the set of noun words and consolidate the synonymy and hyponymy words. Experimental results have shown that the clustering algorithm is superior to the other subspace clustering algorithms, such as PROCLUS and HARP and kmeans type algorithm, e.g., Bisecting-KMeans. Furthermore, the word extraction method is effective in selection of the words to represent the topics of the clusters.

Keywords: Subspace Clustering, Text Mining, Feature Weighting, Cluster Interpretation, Ontology

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2461

10 Words of Peace in the Speeches of the Egyptian President, Abdulfattah El-Sisi: A Corpus-Based Study

Authors: Mohamed S. Negm, Waleed S. Mandour

Abstract:

The present study aims primarily at investigating words of peace (lexemes of peace) in the formal speeches of the Egyptian president Abdulfattah El-Sisi in a two-year span of time, from 2018 to 2019. This paper attempts to shed light not only on the contextual use of the antonyms, war and peace, but also it underpins quantitative analysis through the current methods of corpus linguistics. As such, the researchers have deployed a corpus-based approach in collecting, encoding, and processing 30 presidential speeches over the stated period (23,411 words and 25,541 tokens in total). Further, semantic fields and collocational networkzs are identified and compared statistically. Results have shown a significant propensity of adopting peace, including its relevant collocation network, textually and therefore, ideationally, at the expense of war concept which in most cases surfaces euphemistically through the noun conflict. The president has not justified the action of war with an honorable cause or a valid reason. Such results, so far, have indicated a positive sociopolitical mindset the Egyptian president possesses and moreover, reveal national and international fair dealing on arising issues.

Keywords: Corpus-assisted discourse studies, critical discourse analysis, collocation network, corpus linguistics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1628

9 Distributional Semantics Approach to Thai Word Sense Disambiguation

Authors: Sunee Pongpinigpinyo, Wanchai Rivepiboon

Abstract:

Word sense disambiguation is one of the most important open problems in natural language processing applications such as information retrieval and machine translation. Many approach strategies can be employed to resolve word ambiguity with a reasonable degree of accuracy. These strategies are: knowledgebased, corpus-based, and hybrid-based. This paper pays attention to the corpus-based strategy that employs an unsupervised learning method for disambiguation. We report our investigation of Latent Semantic Indexing (LSI), an information retrieval technique and unsupervised learning, to the task of Thai noun and verbal word sense disambiguation. The Latent Semantic Indexing has been shown to be efficient and effective for Information Retrieval. For the purposes of this research, we report experiments on two Thai polysemous words, namely /hua4/ and /kep1/ that are used as a representative of Thai nouns and verbs respectively. The results of these experiments demonstrate the effectiveness and indicate the potential of applying vector-based distributional information measures to semantic disambiguation.

Keywords: Distributional semantics, Latent Semantic Indexing, natural language processing, Polysemous words, unsupervisedlearning, Word Sense Disambiguation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1813

8 Tagging by Combining Rules- Based Method and Memory-Based Learning

Authors: Tlili-Guiassa Yamina

Abstract:

Many natural language expressions are ambiguous, and need to draw on other sources of information to be interpreted. Interpretation of the e word تعاون to be considered as a noun or a verb depends on the presence of contextual cues. To interpret words we need to be able to discriminate between different usages. This paper proposes a hybrid of based- rules and a machine learning method for tagging Arabic words. The particularity of Arabic word that may be composed of stem, plus affixes and clitics, a small number of rules dominate the performance (affixes include inflexional markers for tense, gender and number/ clitics include some prepositions, conjunctions and others). Tagging is closely related to the notion of word class used in syntax. This method is based firstly on rules (that considered the post-position, ending of a word, and patterns), and then the anomaly are corrected by adopting a memory-based learning method (MBL). The memory_based learning is an efficient method to integrate various sources of information, and handling exceptional data in natural language processing tasks. Secondly checking the exceptional cases of rules and more information is made available to the learner for treating those exceptional cases. To evaluate the proposed method a number of experiments has been run, and in order, to improve the importance of the various information in learning.

Keywords: Arabic language, Based-rules, exceptions, Memorybased learning, Tagging.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1622

7 Addressing Scalability Issues of Named Entity Recognition Using Multi-Class Support Vector Machines

Authors: Mona Soliman Habib

Abstract:

This paper explores the scalability issues associated with solving the Named Entity Recognition (NER) problem using Support Vector Machines (SVM) and high-dimensional features. The performance results of a set of experiments conducted using binary and multi-class SVM with increasing training data sizes are examined. The NER domain chosen for these experiments is the biomedical publications domain, especially selected due to its importance and inherent challenges. A simple machine learning approach is used that eliminates prior language knowledge such as part-of-speech or noun phrase tagging thereby allowing for its applicability across languages. No domain-specific knowledge is included. The accuracy measures achieved are comparable to those obtained using more complex approaches, which constitutes a motivation to investigate ways to improve the scalability of multiclass SVM in order to make the solution more practical and useable. Improving training time of multi-class SVM would make support vector machines a more viable and practical machine learning solution for real-world problems with large datasets. An initial prototype results in great improvement of the training time at the expense of memory requirements.

Keywords: Named entity recognition, support vector machines, language independence, bioinformatics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1689

6 Linguistic Devices Reflecting Violence in Border–Provinces of Southern Thailand on the Front Page of Local and National Newspapers

Authors: Chanokporn Angsuviriya

Abstract:

The objective of the study is to analyze linguistic devices reflecting the violence in the south border provinces; namely Pattani, Yala, Narathiwat and Songkla on 1,344 front pages of three local newspapers; namely ChaoTai, Focus PhakTai and Samila Time and of two national newspapers, including ThaiRath and Matichon, between 2004 and 2005, and 2011 and 2012. The study shows that there are two important linguistic devices: 1) lexical choices consisting of the use of verbs describing violence, the use of quantitative words and the use of words naming someone who committed violent acts, and 2) metaphors consisting of “A VIOLENT PROBLEM IS HEAT”, “A VICTIM IS A LEAF”, and “A TERRORIST IS A DOG”. Comparing linguistic devices between two types of newspapers, national newspapers choose to use words more violently than local newspapers do. Moreover, they create more negative images of the south of Thailand by using stative verbs. In addition, in term of metaphors “A TERRORIST IS A FOX.” is only found in national newspapers. As regards naming terrorists “southern insurgents”, this noun phrase which is collectively called by national newspapers has strongly negative meaning. Moreover, “southern insurgents” have been perceived by the Thais in the whole country while “insurgents” that are not modified have been only used by local newspapers.

Keywords: Linguistic Devices, Local Newspapers, National Newspapers, Violence.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1287

5 Correction of Frequent English Writing Errors by Using Coded Indirect Corrective Feedback and Error Treatment

Authors: Chaiwat Tantarangsee

Abstract:

The purposes of this study are 1) to study the frequent English writing errors of students registering the course: Reading and Writing English for Academic Purposes II, and 2) to find out the results of writing error correction by using coded indirect corrective feedback and writing error treatments. Samples include 28 2nd year English Major students, Faculty of Education, Suan Sunandha Rajabhat University. Tool for experimental study includes the lesson plan of the course; Reading and Writing English for Academic Purposes II, and tool for data collection includes 4 writing tests of short texts. The research findings disclose that frequent English writing errors found in this course comprise 7 types of grammatical errors, namely Fragment sentence, Subject-verb agreement, Wrong form of verb tense, Singular or plural noun endings, Run-ons sentence, Wrong form of verb pattern and Lack of parallel structure. Moreover, it is found that the results of writing error correction by using coded indirect corrective feedback and error treatment reveal the overall reduction of the frequent English writing errors and the increase of students’ achievement in the writing of short texts with the significance at .05.

Keywords: Coded indirect corrective feedback, error correction, error treatment, frequent English writing errors.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2197

4 A Corpus-Based Analysis on Code-Mixing Features in Mandarin-English Bilingual Children in Singapore

Authors: Xunan Huang, Caicai Zhang

Abstract:

This paper investigated the code-mixing features in Mandarin-English bilingual children in Singapore. First, it examined whether the code-mixing rate was different in Mandarin Chinese and English contexts. Second, it explored the syntactic categories of code-mixing in Singapore bilingual children. Moreover, this study investigated whether morphological information was preserved when inserting syntactic components into the matrix language. Data are derived from the Singapore Bilingual Corpus, in which the recordings and transcriptions of sixty English-Mandarin 5-to-6-year-old children were preserved for analysis. Results indicated that the rate of code-mixing was asymmetrical in the two language contexts, with the rate being significantly higher in the Mandarin context than that in the English context. The asymmetry is related to language dominance in that children are more likely to code-mix when using their nondominant language. Concerning the syntactic categories of code-mixing words in the Singaporean bilingual children, we found that noun-mixing, verb-mixing, and adjective-mixing are the three most frequently used categories in code-mixing in the Mandarin context. This pattern mirrors the syntactic categories of code-mixing in the Cantonese context in Cantonese-English bilingual children, and the general trend observed in lexical borrowing. Third, our results also indicated that English vocabularies that carry morphological information are embedded in bare forms in the Mandarin context. These findings shed light upon how bilingual children take advantage of the two languages in mixed utterances in a bilingual environment.

Keywords: Code-mixing, Mandarin Chinese, English, bilingual children.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1121

3 Correction of Frequent English Writing Errors by Using Coded Indirect Corrective Feedback and Error Treatment: The Case of Reading and Writing English for Academic Purposes II

Authors: Chaiwat Tantarangsee

Abstract:

The purposes of this study are 1) to study the frequent English writing errors of students registering the course: Reading and Writing English for Academic Purposes II, and 2) to find out the results of writing error correction by using coded indirect corrective feedback and writing error treatments. Samples include 28 2nd year English Major students, Faculty of Education, Suan Sunandha Rajabhat University. Tool for experimental study includes the lesson plan of the course; Reading and Writing English for Academic Purposes II, and tool for data collection includes 4 writing tests of short texts. The research findings disclose that frequent English writing errors found in this course comprise 7 types of grammatical errors, namely Fragment sentence, Subject-verb agreement, Wrong form of verb tense, Singular or plural noun endings, Run-ons sentence, Wrong form of verb pattern and Lack of parallel structure. Moreover, it is found that the results of writing error correction by using coded indirect corrective feedback and error treatment reveal the overall reduction of the frequent English writing errors and the increase of students’ achievement in the writing of short texts with the significance at .05.

Keywords: Coded indirect corrective feedback, error correction, and error treatment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1093

2 Improved Dynamic Bayesian Networks Applied to Arabic on Line Characters Recognition

Authors: Redouane Tlemsani, Abdelkader Benyettou

Abstract:

Work is in on line Arabic character recognition and the principal motivation is to study the Arab manuscript with on line technology.

This system is a Markovian system, which one can see as like a Dynamic Bayesian Network (DBN). One of the major interests of these systems resides in the complete models training (topology and parameters) starting from training data.

Our approach is based on the dynamic Bayesian Networks formalism. The DBNs theory is a Bayesians networks generalization to the dynamic processes. Among our objective, amounts finding better parameters, which represent the links (dependences) between dynamic network variables.

In applications in pattern recognition, one will carry out the fixing of the structure, which obliges us to admit some strong assumptions (for example independence between some variables). Our application will relate to the Arabic isolated characters on line recognition using our laboratory database: NOUN. A neural tester proposed for DBN external optimization.

The DBN scores and DBN mixed are respectively 70.24% and 62.50%, which lets predict their further development; other approaches taking account time were considered and implemented until obtaining a significant recognition rate 94.79%.

Keywords: Arabic on line character recognition, dynamic Bayesian network, pattern recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1781

1 Redundancy in Malay Morphology: School Grammar versus Corpus Grammar

Authors: Zaharani Ahmad, Nor Hashimah Jalaluddin

Abstract:

The aim of this paper is to examine and identify the issue of linguistic redundancy in two competing grammars of Malay, namely the school grammar and the corpus grammar. The former is a normative grammar which is formally and prescriptively taught in the classroom, whereas the latter is a descriptive grammar that is informally acquired and mastered by the students as native speakers of the language outside the classroom. Corpus grammar is depicted based on its actual used in natural occurring texts, as attested in the corpus. It is observed that the grammar taught in schools is incompatible with the grammar used in the corpus. For instance, a noun phrase containing nominal reduplicated form which denotes plurality (i.e. murid-murid ‘students’ which is derived from murid ‘student’) and a modifier categorized as quantifiers (i.e. semua ‘all’, seluruh ‘entire’, and kebanyakan ‘most’) is not acceptable in the school grammar because the formation (i.e. semua murid-murid ‘all the students’ kebanyakan pelajar-pelajar ‘most of the students’) is claimed to be redundant, and redundancy is prohibited in the grammar. Redundancy is generally construed as the property of speech and language by which more information is provided than is precisely required for the message to be understood, so that, if some information is omitted, the remaining information will still be sufficient for the message to be comprehended. Thus, the correct construction to be used is strictly the reduplicated form (i.e. murid-murid ‘students’) or the quantifier plus the root (i.e. semua murid ‘all the students’) with the intention that the grammatical meaning of plural is not repeated. Nevertheless, the so-called redundant form (i.e. kebanyakan pelajar-pelajar ‘most of the students’) is frequently used in the corpus grammar. This study shows that there are a number of redundant forms occur in the morphology of the language, particularly in affixation, reduplication and combination of both. Apparently, the so-called redundancy has grammatical and socio-cultural functions in communication that is to give emphasis and to stress the importance of the information delivered by the speakers or writers.

Keywords: Corpus grammar, morphology, redundancy, school grammar.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1791