Search results for: LFG grammar
40 Redundancy in Malay Morphology: School Grammar versus Corpus Grammar
Authors: Zaharani Ahmad, Nor Hashimah Jalaluddin
Abstract:
The aim of this paper is to examine and identify the issue of linguistic redundancy in two competing grammars of Malay, namely the school grammar and the corpus grammar. The former is a normative grammar which is formally and prescriptively taught in the classroom, whereas the latter is a descriptive grammar that is informally acquired and mastered by the students as native speakers of the language outside the classroom. Corpus grammar is depicted based on its actual use in naturally occurring texts, as attested in the corpus. It is observed that the grammar taught in schools is incompatible with the grammar used in the corpus. For instance, a noun phrase containing a nominal reduplicated form which denotes plurality (e.g. murid-murid ‘students’, which is derived from murid ‘student’) and a modifier categorized as a quantifier (e.g. semua ‘all’, seluruh ‘entire’, and kebanyakan ‘most’) is not acceptable in the school grammar because the formation (e.g. semua murid-murid ‘all the students’, kebanyakan pelajar-pelajar ‘most of the students’) is claimed to be redundant, and redundancy is prohibited in the grammar. Redundancy is generally construed as the property of speech and language by which more information is provided than is precisely required for the message to be understood, so that, if some information is omitted, the remaining information will still be sufficient for the message to be comprehended. Thus, the correct construction to be used is strictly the reduplicated form (e.g. murid-murid ‘students’) or the quantifier plus the root (e.g. semua murid ‘all the students’), so that the grammatical meaning of plural is not repeated. Nevertheless, the so-called redundant form (e.g. kebanyakan pelajar-pelajar ‘most of the students’) is frequently used in the corpus grammar. This study shows that a number of redundant forms occur in the morphology of the language, particularly in affixation, reduplication and the combination of both. Apparently, the so-called redundancy has grammatical and socio-cultural functions in communication, namely to give emphasis and to stress the importance of the information delivered by the speakers or writers.
Keywords: Corpus grammar, morphology, redundancy, school grammar.
Downloads 1791
39 Models and Metamodels for Computer-Assisted Natural Language Grammar Learning
Authors: Evgeny Pyshkin, Maxim Mozgovoy, Vladislav Volkov
Abstract:
The paper follows a discourse on computer-assisted language learning. We examine problems of foreign language teaching and learning and introduce a metamodel that can be used to define learning models of language grammar structures in order to support teacher/student interaction. Special attention is paid to the concept of a virtual language lab. Our approach to language education aims to encourage learners to experiment with a language and to learn by discovering patterns of grammatically correct structures created and managed by a language expert.
Keywords: Computer-assisted instruction, Language learning, Natural language grammar models, HCI.
Downloads 2194
38 The Relationship between Iranian EFL Learners' Multiple Intelligences and Their Performance on Grammar Tests
Authors: Rose Shayeghi, Pejman Hosseinioun
Abstract:
The Multiple Intelligences theory characterizes human intelligence as a multifaceted entity that exists in all human beings to varying degrees. The most important contribution of this theory to the field of English Language Teaching (ELT) is its role in identifying individual differences and designing more learner-centered programs. The present study aims at investigating the relationship between different elements of multiple intelligence and grammar scores. To this end, 63 female Iranian EFL learners selected from among intermediate students participated in the study. The instruments employed were a Nelson English language test, the Michigan Grammar Test, and the Teele Inventory for Multiple Intelligences (TIMI). The results of Pearson Product-Moment Correlation revealed a significant positive correlation between grammatical accuracy and linguistic as well as interpersonal intelligence. The results of Stepwise Multiple Regression indicated that linguistic intelligence contributed to the prediction of grammatical accuracy.
Keywords: Multiple intelligence, grammar, ELT, EFL, TIMI.
Downloads 2420
37 PIELG: A Protein Interaction Extraction System using a Link Grammar Parser from Biomedical Abstracts
Authors: Rania A. Abul Seoud, Nahed H. Solouma, Abou-Baker M. Youssef, Yasser M. Kadah
Abstract:
Due to the ever growing amount of publications about protein-protein interactions, information extraction from text is increasingly recognized as one of the crucial technologies in bioinformatics. This paper presents a Protein Interaction Extraction System using a Link Grammar Parser from biomedical abstracts (PIELG). PIELG uses the linkage given by the Link Grammar Parser to start a case-based analysis of the contents of various syntactic roles as well as their linguistically significant and meaningful combinations. The system uses phrasal-prepositional verb patterns to overcome preposition combination problems. The recall and precision are 74.4% and 62.65%, respectively. Experimental evaluations with two other state-of-the-art extraction systems indicate that the PIELG system achieves better performance. For further evaluation, the system is augmented with a graphical package (Cytoscape) for extracting protein interaction information from sequence databases. The result shows that the performance is remarkably promising.
Keywords: Link Grammar Parser, Interaction extraction, protein-protein interaction, Natural language processing.
Downloads 2254
36 Pictorial Multimodal Analysis of Selected Paintings of Salvador Dali
Authors: Shaza Melies, Abeer Refky, Nihad Mansoor
Abstract:
Multimodality involves the communication between verbal and visual components in various discourses. A painting represents a form of communication between the artist and the viewer in terms of colors, shades, objects, and the title. This paper aims to present how multimodality can be used to decode the verbal and visual dimensions a painting holds. For that purpose, this study uses Kress and van Leeuwen’s theoretical framework of visual grammar for the analysis of the multimodal semiotic resources of selected paintings of Salvador Dali. This study investigates the visual decoding of the selected paintings of Salvador Dali and analyzes their social and political meanings using Kress and van Leeuwen’s framework of visual grammar. The paper attempts to answer the following questions: 1. How far can multimodality decode the verbal and non-verbal meanings of surrealistic art? 2. How can Kress and van Leeuwen’s theoretical framework of visual grammar be applied to analyze Dali’s paintings? 3. To what extent is Kress and van Leeuwen’s theoretical framework of visual grammar apt to deliver political and social messages of Dali? The paper reached the following findings: the framework’s descriptive tools (representational, interactive, and compositional meanings) can be used to analyze the paintings’ titles and their visual elements. Social and political messages were delivered by appropriate usage of color, gesture, vectors, modality, and the way social actors were represented.
Keywords: Multimodality, multimodal analysis, paintings analysis, Salvador Dali, visual grammar.
Downloads 752
35 A Thai to English Machine Translation System Using Thai LFG Tree Structure as Interlingua
Authors: Tawee Chimsuk, Surapong Auwatanamongkol
Abstract:
Machine Translation (MT) between the Thai and English languages has been a challenging research topic in natural language processing. Most research has been done on English to Thai machine translation, but not the other way around. This paper presents a Thai to English Machine Translation System that translates a Thai sentence into an interlingua, a Thai LFG tree, using an LFG grammar and a bottom-up parser. The Thai LFG tree is then transformed into the corresponding English LFG tree by pattern matching and node transformation. Finally, an equivalent English sentence is created using structural information prescribed by the English LFG tree. Based on the results of experiments designed to evaluate the performance of the proposed system, it can be stated that the system has proven effective in providing a useful translation from Thai to English.
Keywords: Interlingua, LFG grammar, Machine translation, Pattern matching.
Downloads 2296
34 Examining the Usefulness of an ESP Textbook for Information Technology: Learner Perspectives
Authors: Yun-Husan Huang
Abstract:
Many English for Specific Purposes (ESP) textbooks are distributed globally, as content development is often obliged to compromise between commercial and pedagogical demands. Therefore, the issue of regional application and usefulness of globally published ESP textbooks has received much debate. For ESP instructors, textbook selection is definitely a priority consideration for curriculum design. An appropriate ESP textbook can facilitate teaching and learning, while an inappropriate one may cause a disaster for both teachers and students. This study aims to investigate the regional application and usefulness of an ESP textbook for information technology (IT). Participants were 51 sophomores majoring in Applied Informatics and Multimedia at a university in Taiwan. As they were non-English majors, their English proficiency was mostly at elementary and elementary-to-intermediate levels. This course was offered for two semesters. The textbook selected was Oxford English for Information Technology. At the end of the course, the students were required to complete a survey comprising five choices of Very Easy, Easy, Neutral, Difficult, and Very Difficult for each item. Based on the content design of the textbook, the survey investigated how the students viewed the difficulty of grammar, listening, speaking, reading, and writing materials of the textbook. In terms of difficulty, results reveal that only 22% of them found the grammar section difficult or very difficult. For listening, 71% responded difficult or very difficult. For general reading, 55% responded difficult or very difficult. For speaking, 56% responded difficult or very difficult. For writing, 78% responded difficult or very difficult. For advanced reading, 90% reported difficult or very difficult. These results indicate that, except for the grammar section, more than half of the students found the textbook contents difficult in terms of listening, speaking, reading, and writing materials. Such contradictory results between the easy grammar section and the difficult four language skills sections imply that the textbook designers do not understand well the English learning background of regional ESP learners. For the participants, the learning contents of the grammar section were at the general grammar level of junior high school, while the learning contents of the four language skills sections were closer to the level of college English majors. Implications from the findings are obtained for instructors and textbook designers. First of all, existing ESP textbooks for IT are few and thus textbook selections for instructors are insufficient. Second, existing globally published textbooks for IT cannot be applied to learners of all English proficiency levels, especially the low level. Third, with limited textbook selections, instructors should modify the selected textbook contents or supplement extra ESP materials to meet the proficiency level of target learners. Fourth, local ESP publishers should collaborate with local ESP instructors who understand best the learning background of their students in order to develop appropriate ESP textbooks for local learners. In conclusion, even though the instructor reduced learning contents and simplified tests in curriculum design, the students still found the textbook difficult. This implies that in addition to the instructor’s professional experience, there is a need to understand the usefulness of the textbook from learner perspectives.
Keywords: ESP textbooks, ESP materials, ESP textbook design, learner perspectives on ESP textbooks.
Downloads 1897
33 Syntactic Recognition of Distorted Patterns
Authors: Marek Skomorowski
Abstract:
In syntactic pattern recognition a pattern can be represented by a graph. Given an unknown pattern represented by a graph g, the problem of recognition is to determine if the graph g belongs to a language L(G) generated by a graph grammar G. The so-called IE graphs have been defined in [1] for a description of patterns. The IE graphs are generated by so-called ETPL(k) graph grammars defined in [1]. An efficient parsing algorithm for ETPL(k) graph grammars for syntactic recognition of patterns represented by IE graphs has been presented in [1]. In practice, structural descriptions may contain pattern distortions, so that the assignment of a graph g, representing an unknown pattern, to a graph language L(G) generated by an ETPL(k) graph grammar G is rejected by the ETPL(k) type parsing. Therefore, there is a need for constructing effective parsing algorithms for recognition of distorted patterns. The purpose of this paper is to present a new approach to syntactic recognition of distorted patterns. To take into account all variations of a distorted pattern under study, a probabilistic description of the pattern is needed. A random IE graph approach is proposed here for such a description ([2]).
Keywords: Syntactic pattern recognition, Distorted patterns, Random graphs, Graph grammars.
Downloads 1395
32 Inductive Grammar, Student-Centered Reading, and Interactive Poetry: The Effects of Teaching English with Fun in Schools of Two Villages in Lebanon
Authors: Talar Agopian
Abstract:
Teaching English as a Second Language (ESL) is a common practice in many Lebanese schools. However, ESL teaching is done in traditional ways. Methods such as constructivism are seldom used, especially in villages. Here lies the significance of this research which joins constructivism and Piaget’s theory of cognitive development in ESL classes in Lebanese villages. The purpose of the present study is to explore the effects of applying constructivist student-centered strategies in teaching grammar, reading comprehension, and poetry on students in elementary ESL classes in two villages in Lebanon, Zefta in South Lebanon and Boqaata in Mount Lebanon. 20 English teachers participated in a training titled “Teaching English with Fun”, which focused on strategies that create a student-centered class where active learning takes place and there is increased learner engagement and autonomy. The training covered three main areas in teaching English: grammar, reading comprehension, and poetry. After participating in the training, the teachers applied the new strategies and methods in their ESL classes. The methodology comprised two phases: in phase one, practice-based research was conducted as the teachers attended the training and applied the constructivist strategies in their respective ESL classes. Phase two included the reflections of the teachers on the effects of the application of constructivist strategies. The results revealed the educational benefits of constructivist student-centered strategies; the students of teachers who applied these strategies showed improved engagement, positive attitudes towards poetry, increased motivation, and a better sense of autonomy. Future research is required in applying constructivist methods in the areas of writing, spelling, and vocabulary in ESL classrooms of Lebanese villages.
Keywords: Active learning, constructivism, learner engagement, student-centered strategies.
Downloads 770
31 3D Multi-User Virtual Environment in Language Teaching
Authors: Hana Maresova, Daniel Ecler, Miroslava Mensikova
Abstract:
This article focuses on the use of a 3D multi-user virtual environment in language teaching and presents the results of a four-year research project at the Palacky University Olomouc Faculty of Education (Czech Republic). Language teaching was conducted in an experimental form in the 3D virtual worlds of Second Life and Kitely (experimental group) and, in parallel to this, traditional teaching was conducted on identical topics in the form of lectures using a textbook (control group). The didactic test, which was presented to both groups in an identical form before the start of teaching and after its implementation, verified the effect of teaching in the experimental group by comparing the achieved results of both groups. Out of the three components of mother tongue teaching (grammar, literature, composition and communication education), students in the experimental group achieved partially better results in literature (in the case of items focused on the visualization of the subject matter, these were statistically significant). Students from the control group performed better in grammar and composition. Based on the achieved results, we can state that the most appropriate use of a multi-user virtual environment (MUVE) can be seen in teaching those topics that have the possibility of dramatization, experiential learning and group cooperation.
Keywords: 3D virtual reality, multiuser environments, online education, language education.
Downloads 474
30 Structural Parsing of Natural Language Text in Tamil Using Phrase Structure Hybrid Language Model
Authors: Selvam M, Natarajan. A M, Thangarajan R
Abstract:
Parsing is important in Linguistics and Natural Language Processing to understand the syntax and semantics of a natural language grammar. Parsing natural language text is challenging because of problems like ambiguity and inefficiency. Also, the interpretation of natural language text depends on context-based techniques. A probabilistic component is essential to resolve ambiguity in both syntax and semantics, thereby increasing the accuracy and efficiency of the parser. The Tamil language has some inherent features which are more challenging. In order to obtain solutions, a lexicalized and statistical approach is to be applied in the parsing with the aid of a language model. Statistical models mainly focus on the semantics of the language and are suitable for large vocabulary tasks, whereas structural methods focus on syntax and model small vocabulary tasks. A statistical language model based on trigrams for the Tamil language with a medium vocabulary of 5000 words has been built. Though statistical parsing gives better performance through trigram probabilities and large vocabulary size, it has some disadvantages, such as a focus on semantics rather than syntax and a lack of support for free word order and long-term relationships. To overcome these disadvantages, a structural component is to be incorporated in statistical language models, which leads to the implementation of hybrid language models. This paper has attempted to build a phrase-structured hybrid language model which resolves the above-mentioned disadvantages. In the development of the hybrid language model, a new part-of-speech tag set for the Tamil language has been developed with more than 500 tags, which has wider coverage. A phrase-structured Treebank has been developed with 326 Tamil sentences, which covers more than 5000 words. A hybrid language model has been trained with the phrase-structured Treebank using the immediate head parsing technique. A lexicalized and statistical parser which employs this hybrid language model and the immediate head parsing technique gives better results than pure grammar and trigram based models.
Keywords: Hybrid Language Model, Immediate Head Parsing, Lexicalized and Statistical Parsing, Natural Language Processing, Parts of Speech, Probabilistic Context Free Grammar, Tamil Language, Tree Bank.
Downloads 3643
29 Teaching English to Engineers: Between English Language Teaching and Psychology
Authors: Irina-Ana Drobot
Abstract:
Teaching English to Engineers is part of English for Specific Purposes, a domain which is under the attention of English students especially under the current conditions of finding jobs and establishing partnerships outside Romania. The paper will analyse the existing textbooks together with the teaching strategies they adopt. Teaching English to Engineering students can intersect with domains such as psychology and cultural studies in order to teach them efficiently. Textbooks for students of ESP, ranging from those at the Faculty of Economics to those at the Faculty of Engineers, have shifted away from using specialized vocabulary, drills for grammar and reading comprehension questions and toward communicative methods and the practical use of language. At present, in Romania, grammar is neglected in favour of communicative methods. The current interest in translation studies may indicate a return to this type of method, since only translation specialists can distinguish among specialized terms and determine which are most suitable in a translation. Engineers are currently encouraged to learn English in order to do their own translations in their own field. This paper will analyse the issue of the extent to which it is useful to teach Engineering students to do translations in their field using cognitive psychology applied to language teaching, including issues such as motivation and social psychology. Teaching general English to engineering students can result in lack of interest, but they can be motivated by practical aspects which will help them in their field. This is why this paper needs to take into account an interdisciplinary approach to teaching English to Engineers.
Keywords: Cognition, ESP, motivation, psychology.
Downloads 3124
28 Generating a Functional Grammar for Architectural Design from Structural Hierarchy in Combination of Square and Equal Triangle
Authors: Sanaz Ahmadzadeh Siyahrood, Arghavan Ebrahimi, Mohammadjavad Mahdavinejad
Abstract:
Islamic culture was accountable for a plethora of developments in astronomy and science in the medieval period, and in geometry likewise. Geometric patterns are reputable in a considerable number of cultures, but in the Islamic culture the patterns have specific features that connect the Islamic faith to mathematics. In Islamic art, three fundamental shapes are generated from the circle shape: triangle, square and hexagon. Originating from their quiddity, each of these geometric shapes has its own specific structure. Even though the geometric patterns were generated from such simple forms as the circle and the square, they can be combined, duplicated, interlaced, and arranged in intricate combinations. So, in order to explain the principles of geometrical interaction between the square and the equal triangle, all types of their linear forces are illustrated, individually in the first step and between the two shapes in the second. In this analysis, some angles are created from the intersection of their directions. All angles are categorized into groups and the mathematical expressions among them are analyzed. Since most geometric patterns in Islamic art and architecture are based on the repetition of a single motif, the evaluation results obtained from a small portion are attributable to a large-scale domain, while the development of infinitely repeating patterns can represent the unchanging laws. Geometric ornamentation in Islamic art offers the possibility of infinite growth and can accommodate the incorporation of other types of architectural layout as well, so the logic and mathematical relationships obtained from this analysis are applicable in designing some architecture layers and developing the plan design.
Keywords: Angle, architecture, design, equal triangle, generating, grammar, square and structural hierarchy.
Downloads 896
27 Age and Second Language Acquisition: A Case Study from Maldives
Authors: Aaidha Hammad
Abstract:
The age at which a child should be exposed to a second language is a controversial issue in communities such as the Maldives where English is taught as a second language. It has been observed that different stakeholders have different viewpoints towards the issue. Some believe that the earlier children are exposed to a second language, the better they learn, while others disagree with the notion. Hence, this case study investigates whether children learn a second language better when they are exposed at an earlier age or not. The spoken and written data collected confirm that earlier exposure helps in mastering the sound pattern and speaking fluency with a more native-like accent, while a later age is better for learning more abstract and concrete aspects such as grammar and syntactic rules.
Keywords: Age, development of language skills, fluency, second language acquisition.
Downloads 3638
26 The Sign in the Communication Process
Authors: S. Pesina, T. Solonchak
Abstract:
In the process of information transmission (concept verbalization) we deal mostly with the substance (contents), and then pay attention to the form. Recalling events from the remote past, often we cannot exactly reproduce specific heard or pronounced words, as well as the syntactic structures. We remember events, feelings, images; we recall the general contents of the discourse. The thought gets a specific language form only during the concept verbalization phase. With minimum time for pondering, depending on the language competence level, the grammar and syntactic shaping often occurs automatically with the use of famous models and stereotypes. This means that the language form adapts itself to the consciousness, and not vice versa.
Keywords: Lexical eidos, phenomenology, noema, polysemantic word, semantic core.
Downloads 1942
25 Automatic Intelligent Analysis of Malware Behaviour
Authors: H. Dornhackl, K. Kadletz, R. Luh, P. Tavolato
Abstract:
In this paper, we describe the use of formal methods to model malware behaviour. The modelling of harmful behaviour rests upon syntactic structures that represent malicious procedures inside malware. The malicious activities are modelled by a formal grammar, where API calls’ components are the terminals and the sets of API calls used in combination to achieve a goal are designated non-terminals. The combination of different non-terminals in various ways and tiers makes up the attack vectors that are used by harmful software. Based on these syntactic structures, a parser can be generated which takes execution traces as input for pattern recognition.
Keywords: Malware behaviour, modelling, parsing, search, pattern matching.
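The grammar-over-API-calls idea in this abstract can be illustrated with a minimal sketch. It is not the authors' grammar or parser: the rule names (`OpenTarget`, `InjectCode`, `RunPayload`, `ProcessInjection`) are hypothetical, whole API call names stand in for the "API call components" the abstract uses as terminals, and the matcher assumes a small, non-left-recursive grammar and contiguous calls in the trace.

```python
# Hypothetical toy grammar: keys are non-terminals, values are lists of
# productions; any symbol that is not a key is treated as a terminal
# (here, a Windows API call name appearing in an execution trace).
GRAMMAR = {
    "OpenTarget": [["OpenProcess"]],
    "InjectCode": [["VirtualAllocEx", "WriteProcessMemory"]],
    "RunPayload": [["CreateRemoteThread"]],
    "ProcessInjection": [["OpenTarget", "InjectCode", "RunPayload"]],
}


def match(symbol, trace, start):
    """Return every end index reachable by deriving `symbol` from trace[start:]."""
    if symbol not in GRAMMAR:  # terminal: must equal the next call in the trace
        if start < len(trace) and trace[start] == symbol:
            return {start + 1}
        return set()
    ends = set()
    for production in GRAMMAR[symbol]:
        positions = {start}
        for part in production:
            # Advance every candidate position through the next symbol.
            positions = {e for p in positions for e in match(part, trace, p)}
            if not positions:
                break
        ends |= positions
    return ends


def find_pattern(symbol, trace):
    """Return (start, end) spans of the trace that `symbol` derives."""
    return [(s, e) for s in range(len(trace)) for e in sorted(match(symbol, trace, s))]


trace = [
    "CreateFileW", "OpenProcess", "VirtualAllocEx",
    "WriteProcessMemory", "CreateRemoteThread", "CloseHandle",
]
print(find_pattern("ProcessInjection", trace))
```

A real trace would interleave unrelated calls between the relevant ones, so a practical matcher would need to tolerate gaps; this sketch only shows how terminals, non-terminals, and tiers of non-terminals fit together.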
Downloads 1524
24 Definition of a Computing Independent Model and Rules for Transformation Focused on the Model-View-Controller Architecture
Authors: Vanessa Matias Leite, Jandira Guenka Palma, Flávio Henrique de Oliveira
Abstract:
This paper presents a model-oriented approach to software development in the Model-View-Controller (MVC) architectural standard. This approach aims to expose a process of extracting information from the models which, through rules and syntax defined in this work, assists in the design of the initial model and its future conversions. The paper presents a syntax based on natural language, according to the rules agreed in the classic grammar of the Portuguese language, added to the conversion rules, generating models that follow the norms of the Object Management Group (OMG) and the Meta-Object Facility (MOF).
Keywords: Model driven architecture, model-view-controller, BNF syntax, model, transformation, UML.
Downloads 920
23 Real-Time 3D City Generation using Shape Grammars with LOD Variations
Authors: Pearl Goswell, Jun Jo
Abstract:
Creating 3D environments, including characters and cities, is a significantly time consuming process due to a large amount of work involved in designing and modelling. There have been a number of attempts to automatically generate 3D objects employing shape grammars. However it is still too early to apply the mechanism to real problems such as real-time computer games. The purpose of this research is to introduce a time efficient and cost effective method to automatically generate various 3D objects for real-time 3D games. This Shape grammar-based real-time City Generation (RCG) model is a conceptual model for generating 3D environments in real-time and can be applied to 3D games or animations. The RCG system can generate even a large city by applying fundamental principles of shape grammars to building elements in various levels of detail in real-time.
Keywords: real-time city generation, shape grammars, 3D games, 3D modelling.
Downloads 2326
22 Data Extraction of XML Files using Searching and Indexing Techniques
Authors: Sushma Satpute, Vaishali Katkar, Nilesh Sahare
Abstract:
XML files contain data in a well-formatted manner. Studying the format or semantics of the grammar helps in fast retrieval of the data. There are many algorithms which describe searching for data in XML files. There are a number of approaches which use the data structure or are related to the contents of the document. In these cases the user must know about the structure of the document, and information retrieval techniques using NLP are related to the content of the document. Hence the result may be irrelevant or not so successful and may take more time to search. This paper presents fast XML retrieval techniques using a new indexing technique and the concept of RXML. When indexing an XML document, the system takes into account both the document content and the document structure and assigns a value to each tag from the file. To query the system, a user is not constrained to a fixed query format.
Keywords: XML Retrieval, Indexed Search, Information Retrieval.
Downloads 1783
21 Words Reordering based on Statistical Language Model
Authors: Theologos Athanaselis, Stelios Bakamidis, Ioannis Dologlou
Abstract:
There are multiple reasons to expect that detecting word order errors in a text will be a difficult problem, and detection rates reported in the literature are in fact low. Although grammatical rules constructed by computer linguists improve the performance of grammar checkers in word order diagnosis, the repairing task is still very difficult. This paper presents an approach for repairing word order errors in English text by reordering words in a sentence and choosing the version that maximizes the number of trigram hits according to a language model. The novelty of this method concerns the use of an efficient confusion matrix technique for reordering the words. The comparative advantage of this method is that it works with a large set of words and avoids the laborious and costly process of collecting word order errors for creating error patterns.
Keywords: Permutations filtering, Statistical language model N-grams, Word order errors.
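The selection criterion described in this abstract (pick the reordering with the most trigram hits) can be sketched as follows. This is an illustration under stated assumptions, not the authors' system: the two-sentence corpus is invented, the "language model" is reduced to a set of trigrams seen in that corpus, and the candidate orderings are enumerated by brute force instead of being filtered with the confusion matrix technique the abstract mentions, so it is only practical for very short sentences.

```python
from itertools import permutations


def trigrams(words):
    """Return the consecutive word triples of a word sequence."""
    return [tuple(words[i:i + 3]) for i in range(len(words) - 2)]


def build_trigram_set(corpus_sentences):
    """Collect every trigram seen in a (toy) training corpus."""
    seen = set()
    for sentence in corpus_sentences:
        seen.update(trigrams(sentence.lower().split()))
    return seen


def trigram_hits(words, known_trigrams):
    """Count how many trigrams of `words` occur in the known set."""
    return sum(1 for t in trigrams(words) if t in known_trigrams)


def best_reordering(sentence, known_trigrams):
    """Return the word order with the most trigram hits.

    Brute force over all permutations (factorial cost); ties are resolved
    in favour of the ordering generated first, which starts with the
    original word order.
    """
    words = sentence.lower().split()
    best = max(permutations(words), key=lambda p: trigram_hits(p, known_trigrams))
    return " ".join(best)


# Invented two-sentence corpus standing in for a real language model.
known = build_trigram_set(["the cat sat on the mat", "the dog sat on the rug"])
print(best_reordering("sat the cat on the mat", known))
```

With a real corpus the trigram set would be far larger, and pruning the permutation space before scoring becomes the main engineering problem, which is where the paper's filtering step would come in.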
Downloads 1587
20 The Code-Mixing of Japanese, English and Thai in Line Chat
Authors: Premvadee Na Nakornpanom
Abstract:
Code-mixing in spontaneous speech has been widely discussed, but not in virtual settings, especially in the context of third-language learners. This study therefore explores the linguistic characteristics of the mixing of Japanese, English and Thai in a mobile Line chat room by students with English as L2, Japanese as L3 and Thai as their mother tongue. The results show that the insertion of Thai content words into sentences in the other two languages is a very common phenomenon. Because chatting is 'relational' or 'interactional', it made lexical choices more speech-like, more personal and more emotionally related. Japanese personal pronouns are often mixed into the sentences. The Japanese sentence-final question particle か "ka" was added to the end of sentences following Thai grammar rules. Some unique characteristics were created while chatting.
Keywords: Code-mixing, Japanese, English, Thai, Line chat.
Downloads 3448
19 Natural Language Database Interface for Selection of Data Using Grammar and Parsing
Authors: N. D. Karande, G. A. Patil
Abstract:
Databases have become ubiquitous. Almost all IT applications store information in and retrieve information from databases. Retrieving information from a database requires knowledge of technical languages such as Structured Query Language (SQL). However, the majority of users who interact with databases do not have a technical background and are intimidated by the idea of using languages such as SQL. This has led to the development of a few Natural Language Database Interfaces (NLDBIs). An NLDBI allows the user to query the database in a natural language. This paper describes the architecture of a new NLDBI system and its implementation, and discusses the results obtained. In most typical NLDBI systems, the natural language statement is converted into an internal representation based on syntactic and semantic knowledge of the natural language. This representation is then converted into queries using a representation converter. A natural language query is translated into an equivalent SQL query after processing through various stages. The system has been tested on primitive database queries with certain constraints.
Keywords: Natural language database interface, representation converter, syntactic and semantic knowledge
Downloads 2705
18 Using Multi-Linguistic Techniques for Thailand Herb and Traditional Medicine Registration Systems
Authors: Thanapol Wisuttikul, Choochart Haruechaiyasak, Santipong Thaiprayoon
Abstract:
Thailand has developed a rich body of unique culture and knowledge, foremost of which is Thai traditional medicine (TTM). Recently, a number of researchers have tried to preserve this indigenous knowledge. However, systems for doing so remain scarce. To preserve this ancient knowledge, we therefore devised and integrated multi-linguistic techniques to create a system that collects all of the recipes. The application extracts medical recipes from antique scriptures and then normalizes their antiquarian words, primitive grammar and antiquated measurements to modern equivalents. We then apply ingredient-duplication calculation, proportion-similarity calculation and score ranking to detect duplicate recipes. We collected questionnaires from registrants and members of the public to investigate user satisfaction, and the results were satisfactory. The application assists registrants in checking for copyright violations in the TTM registration process and also helps people treat their illnesses, which aids both Thai people and all of mankind in fighting intractable diseases.
Keywords: Medicine Registration, Search Engine, Text Approximation, Traditional Medicine.
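The abstract names ingredient-duplication calculation, proportion-similarity calculation and score ranking but does not define them. The following is only a rough sketch under assumed definitions: Jaccard overlap for ingredient duplication, cosine similarity for proportions, and a weighted sum for the ranking score. The function names, the `weight` parameter and the sample recipes are all hypothetical.

```python
from math import sqrt


def ingredient_overlap(a, b):
    """Jaccard overlap of the ingredient names of two recipes (assumed measure)."""
    names_a, names_b = set(a), set(b)
    union = names_a | names_b
    return len(names_a & names_b) / len(union) if union else 0.0


def proportion_similarity(a, b):
    """Cosine similarity of ingredient amounts (assumed measure)."""
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in set(a) | set(b))
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(query, registered, weight=0.5):
    """Rank registered recipes by a weighted sum of both scores, best first."""
    scored = [
        (weight * ingredient_overlap(query, recipe)
         + (1 - weight) * proportion_similarity(query, recipe), name)
        for name, recipe in registered.items()
    ]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    # Recipes map normalized ingredient names to amounts in a common unit.
    query = {"ginger": 2.0, "pepper": 1.0}
    registered = {
        "A": {"ginger": 4.0, "pepper": 2.0},   # same recipe, doubled
        "B": {"ginger": 1.0, "clove": 1.0},
    }
    for score, name in rank_candidates(query, registered):
        print(name, round(score, 3))
```

Cosine similarity ignores overall scale, so a recipe that is simply doubled still counts as a likely duplicate, which seems to match the intent of comparing proportions rather than absolute amounts.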
Downloads 2008
17 A Graphical Environment for Petri Nets INA Tool Based on Meta-Modelling and Graph Grammars
Authors: Raida El Mansouri, Elhillali Kerkouche, Allaoua Chaoui
Abstract:
The Petri net tool INA is well known in the Petri net community. However, it lacks a graphical environment to create and analyse INA models. Building a modelling tool for design and analysis from scratch (for the INA tool, for example) is generally a prohibitive task. The meta-modelling approach is useful for dealing with such problems since it allows the modelling of the formalisms themselves. In this paper, we propose an approach based on the combined use of meta-modelling and graph grammars to automatically generate a visual modelling tool for INA for analysis purposes. In our approach, the UML Class diagram formalism is used to define a meta-model of INA models. The meta-modelling tool AToM3 is used to generate a visual modelling tool according to the proposed INA meta-model. We also propose a graph grammar to automatically generate the INA description of the graphically specified Petri net models. This allows the user to avoid the errors that arise when this description is written manually. The INA tool is then used to perform the simulation and analysis of the resulting INA description. Our environment is illustrated through an example.
Keywords: INA, Meta-modelling, Graph Grammars, AToM3, Automatic Code Generation.
Downloads 1867
16 A Web-Based Self-Learning Grammar for Spoken Language Understanding
Authors: S. M. Biondi, V. Catania, R. Di Natale, A. R. Intilisano, D. Panno
Abstract:
One of the major goals of Spoken Dialog Systems (SDS) is to understand what the user utters. In the SDS domain, the Spoken Language Understanding (SLU) module classifies user utterances by means of predefined conceptual knowledge. The SLU module is able to recognize only the meanings previously included in its knowledge base. Due to the vastness of that knowledge, storing the information is a very expensive process. Updating and managing the knowledge base are time-consuming and error-prone processes because of the rapidly growing number of entities such as proper nouns and domain-specific nouns. This paper proposes a solution to the problem of Named Entity Recognition (NER) applied to an SDS domain. The proposed solution attempts to automatically recognize the meaning associated with an utterance by using the PANKOW (Pattern based Annotation through Knowledge On the Web) method at runtime. The proposed method extracts information from the Web to extend the SLU knowledge module and reduces the development effort. In particular, the Google Search Engine is used to extract information from the Facebook social network.
Keywords: Spoken Dialog System, Spoken Language Understanding, Web Semantic, Name Entity Recognition.
Downloads 1776
15 BugCatcher.Net: Detecting Bugs and Proposing Corrective Solutions
Authors: Sheetal Chavan, P. J. Kulkarni, Vivek Shanbhag
Abstract:
Although achieving a zero-defect software release is practically impossible, the software industry should take the utmost care to detect defects/bugs well ahead of time, allowing only a bare minimum to creep into the released version. This clearly indicates that time plays an important role in bug detection. In addition, software quality is a major factor in the software engineering process. Moreover, early detection can be achieved only through static code analysis as opposed to conventional testing. BugCatcher.Net is a static analysis tool that detects bugs in .NET® languages through MSIL (Microsoft Intermediate Language) inspection. The tool utilizes a parser based on Finite State Automata to carry out bug detection. Once detected, bugs need to be corrected immediately. BugCatcher.Net facilitates correction by proposing to end users a corrective solution, with minimum side effects, for each reported warning/bug. The tool is also capable of analyzing the bug trend of a program under inspection.
Keywords: Dependence, Early solution, Finite State Automata, Grammar, Late solution, Parser State Transition Diagram, Static Program Analysis.
Downloads 1510
14 Aspect Oriented Software Architecture
Authors: Pradip Peter Dey, Ronald F. Gonzales, Gordon W. Romney, Mohammad Amin, Bhaskar Raj Sinha
Abstract:
Natural language processing systems pose a unique challenge for software architectural design as system complexity has increased continually and systems cannot be easily constructed from loosely coupled modules. Lexical, syntactic, semantic, and pragmatic aspects of linguistic information are tightly coupled in a manner that requires separation of concerns in a special way in design, implementation and maintenance. An aspect oriented software architecture is proposed in this paper after critically reviewing relevant architectural issues. For the purpose of this paper, the syntactic aspect is characterized by an augmented context-free grammar. The semantic aspect is composed of multiple perspectives including denotational, operational, axiomatic and case frame approaches. Case frame semantics matured in India from deep thematic analysis. It is argued that lexical, syntactic, semantic and pragmatic aspects work together in a mutually dependent way and their synergy is best represented in the aspect oriented approach. The software architecture is presented with an augmented Unified Modeling Language.
Keywords: Language engineering, parsing, software design, user experience.
Downloads 1743
13 Morpho-Phonological Modelling in Natural Language Processing
Authors: Eleni Galiotou, Angela Ralli
Abstract:
In this paper we propose a computational model for the representation and processing of morpho-phonological phenomena in a natural language, such as Modern Greek. We aim at a unified treatment of inflection, compounding, and word-internal phonological changes, in a model that is used for both analysis and generation. After discussing certain difficulties caused by well-known finite-state approaches, such as Koskenniemi's two-level model [7], when applied to a computational treatment of compounding, we argue that a morphology-based model provides a more adequate account of word-internal phenomena. In contrast to finite-state approaches, which cannot handle hierarchical word constituency in a satisfactory way, we propose a unification-based word grammar as the nucleus of our strategy, which takes into consideration word representations that are based on affixation and [stem stem] or [stem word] compounds. In our formalism, feature-passing operations are formulated with the use of the unification device, and phonological rules modeling the correspondence between lexical and surface forms apply at morpheme boundaries. In the paper, examples from Modern Greek illustrate our approach. Morpheme structures, stress, and morphologically conditioned phoneme changes are analyzed and generated in a principled way.
Keywords: Morpho-Phonology, Natural Language Processing.
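Feature passing by unification, as mentioned in the abstract, can be illustrated with a minimal sketch over nested dictionaries. This is a generic textbook-style unifier without reentrancy, not the formalism of the paper; the feature names and values are hypothetical.

```python
def unify(a, b):
    """Unify two feature structures given as nested dicts.

    Returns the merged structure, or None if two atomic values clash.
    Neither input is modified.
    """
    result = dict(a)
    for key, b_val in b.items():
        if key not in result:
            result[key] = b_val
            continue
        a_val = result[key]
        if isinstance(a_val, dict) and isinstance(b_val, dict):
            sub = unify(a_val, b_val)
            if sub is None:
                return None
            result[key] = sub
        elif a_val != b_val:
            return None  # incompatible values: unification fails
    return result


if __name__ == "__main__":
    # Hypothetical stem and inflectional suffix for a noun.
    stem = {"cat": "N", "agr": {"gender": "fem"}}
    suffix = {"cat": "N", "agr": {"number": "pl", "case": "nom"}}
    print(unify(stem, suffix))

    # A verbal suffix cannot combine with a nominal stem.
    print(unify(stem, {"cat": "V"}))
```

A word-level structure is obtained only when the features contributed by each morpheme are compatible, which is how a unification-based grammar can rule out ill-formed stem and affix combinations.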
Downloads 2130
12 The Algorithm of Semi-Automatic Thai Spoonerism Words for Bi-Syllable
Authors: Nutthapat Kaewrattanapat, Wannarat Bunchongkien
Abstract:
The purpose of this research is to study and develop an algorithm for Thai spoonerism words in a semi-automatic computer program: the input syllables are already separated, and the developed algorithm performs the spoonerism itself. The algorithm establishes rules and mechanisms for bi-syllable Thai spoonerism words by analysing the elements of the syllables, namely the cluster consonant, vowel, intonation mark and final consonant. The study finds that bi-syllable Thai spoonerism has one spoonerism mechanism, namely transposition of the vowel, intonation mark and consonant between the two syllables while keeping the consonant value and the cluster (if any). The rules and mechanisms were then used to develop Thai spoonerism software in PHP. In a performance test, the program produced correct bi-syllable Thai spoonerisms for 99% of the words used in the test and failed on 1%, because the words obtained from spoonerism may not be spelled in conformity with Thai grammar and a Thai spoonerism may have more than one answer.
Keywords: Algorithm, Spoonerism, Computational Linguistics.
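The transposition mechanism described in the abstract can be sketched as follows. The sketch is written in Python rather than the PHP used by the authors, and it assumes one reading of the abstract: the vowel, intonation mark and final consonant are exchanged between the two syllables, while each initial consonant or cluster stays in place. The `Syllable` type and the romanized sample syllables are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Syllable:
    initial: str  # initial consonant or consonant cluster (kept in place)
    vowel: str
    tone: str     # intonation (tone) mark, "" if none
    final: str    # final consonant, "" if none


def spoonerize(first, second):
    """Swap vowel, tone mark and final consonant between two syllables."""
    new_first = replace(first, vowel=second.vowel,
                        tone=second.tone, final=second.final)
    new_second = replace(second, vowel=first.vowel,
                         tone=first.tone, final=first.final)
    return new_first, new_second


if __name__ == "__main__":
    # Input syllables are assumed to be pre-segmented, as in the abstract.
    a = Syllable(initial="kr", vowel="a", tone="", final="p")
    b = Syllable(initial="m", vowel="o", tone="1", final="ng")
    print(spoonerize(a, b))
```

Applying `spoonerize` twice returns the original pair, which gives a simple sanity check for any implementation of the swap.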
Downloads 2359
11 Tibyan Automated Arabic Correction Using Machine-Learning in Detecting Syntactical Mistakes
Authors: Ashwag O. Maghraby, Nida N. Khan, Hosnia A. Ahmed, Ghufran N. Brohi, Hind F. Assouli, Jawaher S. Melibari
Abstract:
The Arabic language is one of the most important languages. Learning it matters to many people around the world because of its religious and economic importance, and the real challenge lies in using it without grammatical or syntactical mistakes. This research focuses on detecting and correcting syntactic mistakes in Arabic according to their position in the sentence, concentrating on two of the main syntactical rules in Arabic: Dual and Plural. It analyzes each sentence in the text using the Stanford CoreNLP morphological analyzer and a machine-learning approach in order to detect syntactical mistakes and then correct them. A prototype of the proposed system was implemented and evaluated. It uses a support vector machine (SVM) algorithm to detect Arabic grammatical errors and corrects them using a rule-based approach. The prototype system has an accuracy of 81%. In general, it offers a set of useful grammatical suggestions that the user may overlook while writing, either through lack of familiarity with the grammar or because of the speed of writing, such as alerting the user when a plural term is used to refer to one person.
Keywords: Arabic Language acquisition and learning, natural language processing, morphological analyzer, part-of-speech.
Downloads 1047