Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 48

Search results for: sentences

48 Sentence Modality Recognition in French based on Prosody

Authors: Pavel Král, Jana Klečková, Christophe Cerisara

Abstract:

This paper deals with automatic sentence modality recognition in French. In this work, only prosodic features are considered. The sentences are classified into three modalities: declarative, interrogative and exclamatory. This information will be used to animate a talking head for deaf and hearing-impaired children. We first statistically study a real radio corpus in order to assess the feasibility of the automatic modeling of sentence types. Then, we test two sets of prosodic features as well as two different classifiers and their combination. We further focus our attention on question recognition, as this modality is certainly the most important one for the target application.

Keywords: Automatic sentences modality recognition (ASMR), fundamental frequency (F0), energy, modal corpus, prosody.

Downloads: 1441
47 SySRA: A System of a Continuous Speech Recognition in Arab Language

Authors: Samir Abdelhamid, Noureddine Bouguechal

Abstract:

We report in this paper the model adopted by SySRA, our system for continuous speech recognition in Arabic, and the results obtained so far. This system uses the database Arabdic-10, a manually segmented corpus of words for the Arabic language. Phonetic decoding is performed by an expert system whose knowledge base is expressed as production rules. This expert system transforms a vocal signal into a phonetic lattice. The higher level of the system handles the recognition of the lattice thus obtained, rendering it as written sentences (orthographic form). This level initially contains the lexical analyzer, which is none other than the recognition module. We subjected this analyzer to a set of spectrograms obtained by dictating a score of sentences in Arabic. The recognition rate for these sentences is about 70%, which is, to our knowledge, the best result for the recognition of the Arabic language. The test set consists of twenty sentences from four speakers who did not take part in the training.

Keywords: Continuous speech recognition, lexical analyzer, phonetic decoding, phonetic lattice, vocal signal.

Downloads: 1158
46 Self-Assembling Hypernetworks for Cognitive Learning of Linguistic Memory

Authors: Byoung-Tak Zhang, Chan-Hoon Park

Abstract:

Hypernetworks are a generalized graph structure representing higher-order interactions between variables. We present a method for self-organizing hypernetworks to learn an associative memory of sentences and to recall the sentences from this memory. This learning method is inspired by the “mental chemistry” model of cognition and the “molecular self-assembly” technology in biochemistry. Simulation experiments are performed on a corpus of natural-language dialogues of approximately 300K sentences collected from TV drama captions. We report on the sentence completion performance as a function of the order of word-interaction and the size of the learning corpus, and discuss the plausibility of this architecture as a cognitive model of language learning and memory.

Keywords: Linguistic recall memory, sentence completion task, self-organizing hypernetworks, cognitive learning and memory.

Downloads: 1240
45 Melodic and Temporal Structure of Indonesian Sentences of Sitcom "International Class" Actors: Prosodic Study with Experimental Phonetics Approach

Authors: Tri Sulistyaningtyas, Yani Suryani, Dana Waskita, Linda Handayani Sukaemi, Ferry Fauzi Hermawan

Abstract:

The enthusiasm of foreigners studying Indonesian for Foreign Speakers (BIPA) was documented in the sitcom "International Class". Their tone and stress when speaking Indonesian are distinctive and differ from native Indonesian pronunciation. Using the Praat program, this research aims to describe the prosody of Indonesian as spoken by the "International Class" actors: Abbas from Nigeria, Lee from Korea, and Kotaro from Japan. Data for the research are taken from video of the sitcom "International Class", which aired on Indonesian television. The results revealed that pitch movement when pronouncing Indonesian sentences sometimes rose and fell gradually and sometimes rose and fell sharply. In terms of stress, the respondents tended to use many stresses when pronouncing Indonesian sentences. Meanwhile, in terms of temporal structure, their duration when pronouncing Indonesian sentences tends to be longer than that of Indonesian speakers.

Keywords: Melodic structure, temporal structure, prosody, experimental phonetics, international class.

Downloads: 658
44 Jurisprudencial Analysis of Torture in Spain and in the European Human Rights System

Authors: María José Benítez Jiménez

Abstract:

Article 3 of the European Convention for the Protection of Human Rights and Fundamental Freedoms (E.C.H.R.) proclaims that no one may be subjected to torture or to inhuman or degrading treatment or punishment. The legislative correlate in Spain is embodied in Article 15 of the Spanish Constitution, and ideally the two precepts should be interpreted consistently. While it is true that there are not many cases in which the European Court of Human Rights (E.Ct.H.R., the Strasbourg Court) has sanctioned Spain for its failure to investigate complaints of torture, it must be emphasized that violations of Article 3 of the Convention appear to be on the rise, which makes it necessary to identify the factors that may be driving them. This paper analyzes sentences that directly or indirectly reveal a violation of Article 3 of the European Convention. To carry out the analysis, sentences of the Strasbourg Court from 2012 to 2016 have been consulted, along with earlier sentences where they provided information necessary for the study. The review makes clear that two key groups of applicants turn to the Strasbourg Court on the understanding that they have been tortured or degradingly treated: immigrants and terrorists. Both phenomena, immigration and terrorism, respond to patterns that have mutated in recent years, and it is important for this study to know whether national regulations are becoming dysfunctional.

Keywords: European convention for the protection of human rights and fundamental freedoms, European Court of Human Rights, sentences, Spanish Constitution, torture.

Downloads: 708
43 The Role of Paraphrase in Interpreting Students’ Writing

Authors: Maya Lisa Aryanti, S. S. M. Hum

Abstract:

Of the skills students need to improve, writing is the most challenging to develop. The reason is that besides helping the students to develop their skill, this activity also helps them to express themselves. This paper shows how paraphrasing helps in interpreting students’ writing. Syntactic units, tenses and meanings change once the writings are paraphrased. The objectives of this research are to reveal the inappropriate structure of syntactic units, to show what types of sentences the students often make, and to show how paraphrasing can help to infer the message. The methodology of this research is descriptive qualitative research. In addition, theories of linguistics are included: the theory of Syntax to describe syntactic units and tenses, and the theory of Semantics to describe theories of meaning and how paraphrasing works. Theories of general linguistics, grammar and writing are also provided to support the theories of Syntax and Semantics. The results of this research are concerned with how the message is received in the end. The message written in the students’ essays is not clear because of the improper structure of syntactic units and the incorrect use of tenses. The students tend to use simple sentences, compound sentences and complex sentences with a few mistakes in their writing. In addition, they tend to create unnecessary phrases. The last point is that this research shows how paraphrase works to attain the complete meaning of a sentence.

Keywords: Paraphrase, meanings, syntactic units and tenses.

Downloads: 714
42 N-Grams: A Tool for Repairing Word Order Errors in Ill-formed Texts

Authors: Theologos Athanaselis, Stelios Bakamidis, Ioannis Dologlou, Konstantinos Mamouras

Abstract:

This paper presents an approach for repairing word order errors in English text by reordering words in a sentence and choosing the version that maximizes the number of trigram hits according to a language model. One possible way of reordering the words is to use all the permutations. The problem is that for a sentence of N words the number of permutations is N!. The novelty of this method concerns the use of an efficient confusion matrix technique for reordering the words. The confusion matrix technique has been designed in order to reduce the search space among permuted sentences. The search space is limited using the statistical inference of N-grams. The results of this technique are very interesting and show that the number of permuted sentences can be reduced by 98.16%. For experimental purposes a test set of TOEFL sentences was used, and the results show that more than 95% of them can be repaired using the proposed method.
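The scoring idea in this abstract can be illustrated with a minimal sketch. It covers only the brute-force baseline (try every permutation and keep the one with the most trigram hits), not the paper's confusion matrix technique, whose details the abstract does not give. The toy trigram set `KNOWN` and the function names are hypothetical and stand in for a real language model.

```python
from itertools import permutations


def trigrams(words):
    """Return the consecutive word triples of a word sequence."""
    return [tuple(words[i:i + 3]) for i in range(len(words) - 2)]


def trigram_hits(words, known):
    """Count how many trigrams of the sequence appear in the known set."""
    return sum(1 for t in trigrams(words) if t in known)


def repair(sentence, known):
    """Brute force: score all N! word orders and keep the best one."""
    words = sentence.split()
    best = max(permutations(words), key=lambda p: trigram_hits(p, known))
    return " ".join(best)


# Hypothetical stand-in for a language model: trigrams of one correct sentence.
KNOWN = set(trigrams("the cat sat on the mat".split()))

print(repair("mat the on sat cat the", KNOWN))
```

A six-word sentence already requires 720 candidate orders, which is why the abstract stresses pruning the search space rather than enumerating it.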

Keywords: Permutations filtering, Statistical language model N-grams, Word order errors, TOEFL

Downloads: 1461
41 The Phonology and Phonetics of Second Language Intonation in Case of “Downstep”

Authors: Tayebeh Norouzi

Abstract:

This study aims to investigate the acquisition process of intonation. It examines the intonation structure of Tokyo Japanese and its realization by Iranian learners of Japanese. Seven Iranian learners of Japanese, differing in fluency, and two Japanese speakers participated in the experiment. Two sentences were used to test the phonological and phonetic characteristics of lexical pitch-accent as well as the intonation patterns produced by the speakers. Both sentences consisted of similar words with the same number of syllables and lexical pitch-accents but different syntactic structure. Speakers were asked to read each sentence three times at normal speed, and the data were analyzed by Praat. The results show that lexical pitch-accent, Accentual Phrase (AP) and AP boundary tone realization vary depending on sentence type. For sentences of type XdeYwo, the lexical pitch-accent is realized properly. However, there is a rise in AP boundary tone regardless of speakers’ level of fluency. In contrast, in sentences of type XnoYwo, the lexical pitch-accent and AP boundary tone vary depending on the speakers’ fluency level. Advanced speakers are better at grouping words into phrases and produce more native-like intonation patterns, though they are not able to realize downstep properly. The non-native speakers tried to realize proper intonation patterns by making changes in lexical accent and boundary tone.

Keywords: Intonation, Iranian learners, Japanese prosody, lexical accent, second language acquisition.

Downloads: 614
40 Maya Semantic Technique: A Mathematical Technique Used to Determine Partial Semantics for Declarative Sentences

Authors: Marcia T. Mitchell

Abstract:

This research uses computational linguistics, an area of study that employs a computer to process natural language, and aims at discerning the patterns that exist in declarative sentences used in technical texts. The approach is mathematical, and the focus is on instructional texts found on web pages. The technique developed by the author and named the MAYA Semantic Technique is used here and organized into four stages. In the first stage, the parts of speech in each sentence are identified. In the second stage, the subject of the sentence is determined. In the third stage, MAYA performs a frequency analysis on the remaining words to determine the verb and its object. In the fourth stage, MAYA does statistical analysis to determine the content of the web page. The advantage of the MAYA Semantic Technique lies in its use of mathematical principles to represent grammatical operations which assist processing and accuracy if performed on unambiguous text. The MAYA Semantic Technique is part of a proposed architecture for an entire web-based intelligent tutoring system. On a sample set of sentences, partial semantics derived using the MAYA Semantic Technique were approximately 80% accurate. The system currently processes technical text in one domain, namely Cµ programming. In this domain all the keywords and programming concepts are known and understood.

Keywords: Natural language understanding, computational linguistics, knowledge representation, linguistic theories.

Downloads: 1438
39 Thematic Role Extraction Using Shallow Parsing

Authors: Mehrnoush Shamsfard, Maryam Sadr Mousavi

Abstract:

Extracting thematic (semantic) roles is one of the major steps in representing text meaning. It refers to finding the semantic relations between a predicate and syntactic constituents in a sentence. In this paper we present a rule-based approach to extract semantic roles from Persian sentences. The system exploits a two-phase architecture to (1) identify the arguments and (2) label them for each predicate. For the first phase we developed a rule-based shallow parser to chunk Persian sentences and for the second phase we developed a knowledge-based system to assign 16 selected thematic roles to the chunks. The experimental results of testing each phase are shown at the end of the paper.

Keywords: Natural Language Processing, Semantic Role Labeling, Shallow parsing, Thematic Roles.

Downloads: 1788
38 Object Recognition Approach Based on Generalized Hough Transform and Color Distribution Serving in Generating Arabic Sentences

Authors: Nada Farhani, Naim Terbeh, Mounir Zrigui

Abstract:

The recognition of objects in images has always been a research challenge because of the variability in the shape, position and contrast of objects, among other difficulties. In this paper, we focus on object recognition. The classical Hough Transform (HT) is a tool for detecting straight line segments in images. The HT technique has been generalized (GHT) for the detection of arbitrary forms. With GHT, the forms sought are not necessarily defined analytically but rather by a particular silhouette. For more precision, we propose to combine the results from the GHT with the results from a calculation of similarity between the histograms and the spatiograms of the images. The main purpose of our work is to use the concepts from recognition to generate sentences in Arabic that summarize the content of the image.

Keywords: Recognition of shape, generalized Hough transform, histogram, spatiogram, learning.

Downloads: 331
37 The Code-Mixing of Japanese, English and Thai in Line Chat

Authors: Premvadee Na Nakornpanom

Abstract:

Code-mixing in spontaneous speech has been widely discussed, but not in virtual situations, especially in the context of students learning a third language. Thus, this study is an attempt to explore the linguistic characteristics of the mixing of Japanese, English and Thai in a mobile Line chat room by students with English as L2, Japanese as L3 and Thai as mother tongue. The results show that insertion of Thai content words is a very common linguistic phenomenon, embedded with the other two languages in the sentences. As chatting is ‘relational’ or ‘interactional’, it pushed lexical choices towards a speech-like, more personal and emotional style. A personal pronoun in Japanese is often mixed into the sentences. The Japanese sentence-final question particle か “ka” was added to the end of the sentence based on Thai grammar rules. Some unique characteristics were created while chatting.

Keywords: Code-mixing, Japanese, English, Thai, Line chat.

Downloads: 3203
36 Event Template Generation for News Articles

Authors: A. Kowcika, E. Umamaheswari, T.V. Geetha

Abstract:

In this paper we focus on event extraction from Tamil news articles. This system utilizes a scoring scheme for extracting and grouping event-specific sentences. Using this scoring scheme, event-specific clustering is performed for multiple documents. Events are extracted from each document using a scoring scheme based on feature score and condition score. Similarly, event-specific sentences are clustered from multiple documents using this scoring scheme. The proposed system builds the Event Template based on a user-specified query. The templates are filled with event-specific details like person, location and timeline extracted from the formed clusters. The proposed system applies these methodologies to Tamil news articles that have been enconverted into UNL graphs using a Tamil to UNL-enconverter. The main intention of this work is to generate an event-based template.

Keywords: Event Extraction, Score based Clustering, Segmentation, Template Generation.

Downloads: 1399
35 Neuro-Fuzzy Based Model for Phrase Level Emotion Understanding

Authors: Vadivel Ayyasamy

Abstract:

The present approach deals with the identification of emotions and the classification of emotional patterns at phrase level with respect to positive and negative orientation. The proposed approach considers emotion-triggering terms, their co-occurring terms and the associated sentences for recognizing emotions. It uses Part of Speech Tagging and Emotion Actifiers for classification. Here sentence patterns are broken into phrases and a Neuro-Fuzzy model is used for classification, which results in 16 patterns of emotional phrases. Suitable intensities are assigned to capture the degree of emotional content in the semantics of the patterns. These emotional phrases are assigned weights, which help in deciding the positive or negative orientation of emotions. The approach uses web documents for experimental purposes, and the proposed classification approach performs well and achieves good F-Scores.

Keywords: Emotions, sentences, phrases, classification, patterns, fuzzy, positive orientation, negative orientation.

Downloads: 827
34 Automatic Extraction of Features and Opinion-Oriented Sentences from Customer Reviews

Authors: Khairullah Khan, Baharum B. Baharudin, Aurangzeb Khan, Fazal_e_Malik

Abstract:

Opinion extraction about products from customer reviews is becoming an interesting area of research. Customer reviews about products are nowadays available from blogs and review sites. Tools are also being developed for extracting opinions from these reviews to help users as well as merchants track the most suitable choice of product. Therefore, efficient methods and techniques are needed to extract opinions from reviews and blogs. As reviews of products mostly contain discussion about features, functions and services, efficient techniques are required to extract user comments about the desired features, functions and services. In this paper we propose a novel idea to find features of a product from user reviews in an efficient way. Our focus in this paper is to get the features and opinion-oriented words about products from text through auxiliary verbs (AV) {is, was, are, were, has, have, had}. From the results of our experiments we found that 82% of features and 85% of opinion-oriented sentences include AVs. Thus these AVs are good indicators of features and opinion orientation in customer reviews.
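A minimal sketch of the auxiliary-verb cue described in this abstract: it flags sentences that contain one of the seven listed AVs and reports the share of flagged sentences. Only the AV list comes from the abstract; the tokenization, the function names and the sample reviews are assumptions made for illustration, and the paper's actual feature extraction step is not reproduced.

```python
import re

# Auxiliary verbs listed in the abstract.
AUXILIARY_VERBS = {"is", "was", "are", "were", "has", "have", "had"}


def contains_av(sentence):
    """Return True if any whole-word token of the sentence is a listed AV."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return any(token in AUXILIARY_VERBS for token in tokens)


def av_share(sentences):
    """Fraction of sentences that contain at least one listed AV."""
    if not sentences:
        return 0.0
    return sum(contains_av(s) for s in sentences) / len(sentences)


# Hypothetical review sentences, not data from the paper.
reviews = [
    "The battery is excellent.",
    "The screen was too dim outdoors.",
    "This phone rocks!",
    "Delivery took two weeks.",
]
print([s for s in reviews if contains_av(s)])
print(av_share(reviews))
```

Matching on whole tokens rather than substrings matters here: "This" contains the letters "is" but is not counted as an auxiliary verb.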

Keywords: Classification, Customer Reviews, Helping Verbs, Opinion Mining.

Downloads: 1823
33 Examining the Value of Attribute Scores for Author-Supplied Keyphrases in Automatic Keyphrase Extraction

Authors: Vicky Min-How Lim, Siew Fan Wong, Tong Ming Lim

Abstract:

Automatic keyphrase extraction is useful in efficiently locating specific documents in online databases. While several techniques have been introduced over the years, improvement on accuracy rate is minimal. This research examines attribute scores for author-supplied keyphrases to better understand how the scores affect the accuracy rate of automatic keyphrase extraction. Five attributes are chosen for examination: Term Frequency, First Occurrence, Last Occurrence, Phrase Position in Sentences, and Term Cohesion Degree. The results show that First Occurrence is the most reliable attribute. Term Frequency, Last Occurrence and Term Cohesion Degree display a wide range of variation but are still usable with suggested tweaks. Only Phrase Position in Sentences shows a totally unpredictable pattern. The results imply that the commonly used ranking approach which directly extracts top ranked potential phrases from candidate keyphrase list as the keyphrases may not be reliable.

Keywords: Accuracy, Attribute Score, Author-supplied keyphrases, Automatic keyphrase extraction.

Downloads: 1130
32 Speaker Identification by Atomic Decomposition of Learned Features Using Computational Auditory Scene Analysis Principals in Noisy Environments

Authors: Thomas Bryan, Veton Kepuska, Ivica Kostanic

Abstract:

Speaker recognition is performed in high Additive White Gaussian Noise (AWGN) environments using principles of Computational Auditory Scene Analysis (CASA). CASA methods often classify sounds from images in the time-frequency (T-F) plane using spectrograms or cochleargrams as the image. In this paper atomic decomposition implemented by matching pursuit performs a transform from time series speech signals to the T-F plane. The atomic decomposition creates a sparsely populated T-F vector in “weight space” where each populated T-F position contains an amplitude weight. The weight space vector along with the atomic dictionary represents a denoised, compressed version of the original signal. The arrangement of the atomic indices in the T-F vector is used for classification. Unsupervised feature learning implemented by a sparse autoencoder learns a single dictionary of basis features from a collection of envelope samples from all speakers. The approach is demonstrated using pairs of speakers from the TIMIT data set. Pairs of speakers are selected randomly from a single district. Each speaker has 10 sentences. Two are used for training and 8 for testing. Atomic index probabilities are created for each training sentence and also for each test sentence. Classification is performed by finding the lowest Euclidean distance between the probabilities from the training sentences and the test sentences. Training is done at a 30 dB Signal-to-Noise Ratio (SNR). Testing is performed at SNRs of 0 dB, 5 dB, 10 dB and 30 dB. The algorithm has a baseline classification accuracy of ~93% averaged over 10 pairs of speakers from the TIMIT data set. The baseline accuracy is attributable to short sequences of training and test data as well as the overall simplicity of the classification algorithm. The accuracy is not affected by AWGN and remains ~93% at 0 dB SNR.
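The final classification step described in this abstract (lowest Euclidean distance between atomic index probabilities) can be sketched as follows. This is an illustration under stated assumptions: the matching pursuit decomposition and the sparse autoencoder are not shown, the atomic indices are taken as given lists of integers, and the function names, dictionary size and toy index lists are hypothetical.

```python
import math
from collections import Counter


def index_probabilities(indices, dictionary_size):
    """Relative frequency of each atomic index within one sentence."""
    counts = Counter(indices)
    total = len(indices)
    return [counts[i] / total for i in range(dictionary_size)]


def euclidean(p, q):
    """Euclidean distance between two equal-length probability vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def classify(test_probs, training):
    """Return the speaker whose training vector is closest to the test vector.

    `training` is a list of (speaker_label, probability_vector) pairs.
    """
    return min(training, key=lambda item: euclidean(test_probs, item[1]))[0]


# Hypothetical atomic indices for two speakers and a dictionary of 4 atoms.
training = [
    ("speaker_A", index_probabilities([0, 0, 1, 2], 4)),
    ("speaker_B", index_probabilities([3, 3, 2, 3], 4)),
]
test_vector = index_probabilities([0, 1, 0, 0], 4)
print(classify(test_vector, training))
```

With two training sentences per speaker, as in the abstract, `training` would simply hold two labelled vectors for each speaker, and the nearest one decides the label.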

Keywords: Time-frequency plane, atomic decomposition, envelope sampling, Gabor atoms, matching pursuit, sparse dictionary learning, sparse autoencoder.

Downloads: 1081
31 A Prevalence of Phonological Disorder in Children with Specific Language Impairment

Authors: Etim, Victoria Enefiok, Dada, Oluseyi Akintunde, Bassey Okon

Abstract:

Phonological disorder is a serious and disturbing issue to many parents and teachers. Efforts towards resolving the problem have been undermined by other specific disabilities which were hidden to many regular and special education teachers. It is against this background that this study was motivated to provide data on the prevalence of phonological disorders in children with specific language impairment (CWSLI) as the first step towards critical intervention. The study was a survey of 15 CWSLI from St. Louise Inclusive schools, Ikot Ekpene in Akwa Ibom State of Nigeria. Phonological Processes Diagnostic Scale (PPDS) with 17 short sentences, which cut across the five phonological processes that were examined, were validated by experts in test measurement, phonology and special education. The respondents were made to read the sentences with emphasis on the targeted sounds. Their utterances were recorded and analyzed in the language laboratory using Praat Software. Data were also collected through friendly interactions at different times from the clients. The theory of generative phonology was adopted for the descriptive analysis of the phonological processes. Data collected were analyzed using simple percentage and composite bar chart for better understanding of the result. The study found out that CWSLI exhibited the five phonological processes under investigation. It was revealed that 66.7%, 80%, 73.3%, 80%, and 86.7% of the respondents have severe deficit in fricative stopping, velar fronting, liquid gliding, final consonant deletion and cluster reduction, respectively. It was therefore recommended that a nationwide survey should be carried out to have national statistics of CWSLI with phonological deficits and develop intervention strategies for effective therapy to remediate the disorder.

Keywords: Language disorders, phonology, phonological processes, specific language impairment.

Downloads: 749
30 Author Profiling: Prediction of Learners’ Gender on a MOOC Platform Based on Learners’ Comments

Authors: Tahani Aljohani, Jialin Yu, Alexandra I. Cristea

Abstract:

The more an educational system knows about a learner, the more personalised interaction it can provide, which leads to better learning. However, asking a learner directly is potentially disruptive, and such requests are often ignored by learners. Especially in the booming realm of MOOC (Massive Open Online Course) platforms, only a very low percentage of users disclose demographic information about themselves. Thus, in this paper, we aim to predict learners’ demographic characteristics by proposing an approach using linguistically motivated Deep Learning Architectures for Learner Profiling, particularly targeting gender prediction on a FutureLearn MOOC platform. Additionally, we tackle here the difficult problem of predicting the gender of learners based on their comments only – which are often available across MOOCs. The most common current approaches to text classification use the Long Short-Term Memory (LSTM) model, considering sentences as sequences. However, human language also has structure. In this research, rather than considering sentences as plain sequences, we hypothesise that higher semantic- and syntactic-level sentence processing based on linguistics will render a richer representation. We thus evaluate the traditional LSTM against other bleeding-edge models that take into account syntactic structure, such as tree-structured LSTM, Stack-augmented Parser-Interpreter Neural Network (SPINN) and the Structure-Aware Tag Augmented model (SATA). Additionally, we explore using different word-level encoding functions. We have implemented these methods on our MOOC dataset, on which the models perform better than on a public sentiment analysis dataset, which is further used to cross-examine the models' results.

Keywords: Deep learning, data mining, gender prediction, MOOCs.

Downloads: 952
29 A Computational Model for Resolving Pronominal Anaphora in Turkish Using Hobbs’ Naïve Algorithm

Authors: Pınar Tüfekçi, Yılmaz Kılıçaslan

Abstract:

In this paper we present a computational model for pronominal anaphora resolution in Turkish. The model is based on Hobbs’ Naïve Algorithm [4, 5, 6], which exploits only the surface syntax of sentences in a given text.

Keywords: Anaphora Resolution, Pronoun Resolution, Syntax based Algorithms, Naïve Algorithm.

Downloads: 2988
28 Continuous Text Translation Using Text Modeling in the Thetos System

Authors: Nina Suszczanska, Przemyslaw Szmal, Slawomir Kulikow

Abstract:

In this paper a method of modeling text for Polish is discussed. The method is aimed at transforming continuous input text into a text consisting of sentences in so-called canonical form, whose characteristics include, among others, a complete structure and the absence of anaphora and ellipses. The transformation is lossless as to the content of the text being transformed. The modeling method has been worked out for the needs of the Thetos system, which translates Polish written texts into the Polish sign language. We believe that the method can also be used in various applications that deal with natural language, e.g. in a text summary generator for Polish.

Keywords: anaphora, machine translation, NLP, sign language, text syntax.

Downloads: 1426
27 Using Heuristic Rules from Sentence Decomposition of Experts’ Summaries to Detect Students’ Summarizing Strategies

Authors: Norisma Idris, Sapiyan Baba, Rukaini Abdullah

Abstract:

Summarizing skills have been introduced to the English syllabus in secondary schools in Malaysia to evaluate students’ comprehension of a given text, which requires students to employ several strategies to produce the summary. This paper reports on our effort to develop a computer-based summarization assessment system that detects the strategies used by the students in producing their summaries. Sentence decomposition of expert-written summaries is used to analyze how experts produce their summary sentences. From the analysis, we identified seven summarizing strategies and their rules, which are then transformed into a set of heuristic rules for determining the summarizing strategies. We developed an algorithm based on the heuristic rules and performed some experiments to evaluate and support the technique proposed.

Keywords: Summarizing strategies, heuristic rules, sentence decomposition.

Downloads: 1518
26 Computer Aided Language Learning System for Arabic for Second Language Learners

Authors: Osama Abufanas

Abstract:

This paper aims to build an Arabic language learning tool using Flash CS4 Professional software with the ActionScript 3.0 programming language, based on Computer Aided Language Learning (CALL) material. An additional aim is to provide a primary tool focused on teaching Arabic as a second language to adults. It contains letters, words and sentences at the first stage. This includes interactive practice, which evaluates learners’ comprehension of the Arabic language. The system was examined and it was found that the language structure was correct and learners were satisfied with the system tools. The learners found the system tools efficient and simple to use. The paper's main conclusion is that CALL can be applied without hesitation to second language learners.

Keywords: Arabic Language, Computer Aided Language Learning (CALL), Learner, Material.

Downloads 2460
25 Estimating Word Translation Probabilities for Thai – English Machine Translation using EM Algorithm

Authors: Chutchada Nusai, Yoshimi Suzuki, Haruaki Yamazaki

Abstract:

Selecting, from a set of target language words, the translation that conveys the correct sense of the source word and yields more fluent target language output is one of the core problems in machine translation. In this paper we compare three methods of estimating word translation probabilities for selecting the translation word in Thai – English machine translation: (1) a method based on the frequency of word translation, (2) a method based on the collocation of word translation, and (3) a method based on the Expectation Maximization (EM) algorithm. For evaluation we used Thai – English parallel sentences generated by NECTEC. The EM-based method performed best of the three and gave satisfactory results.
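As an illustration of the third method, the sketch below estimates word translation probabilities with EM in the style of IBM Model 1. This is a swapped-in textbook formulation, not necessarily the authors' exact procedure; the function name `train_word_translation` and the tiny German-English toy corpus (standing in for the Thai – English data) are assumptions made for illustration only.

```python
from collections import defaultdict


def train_word_translation(pairs, iterations=10):
    """Estimate t(target_word | source_word) with EM (IBM Model 1 style).

    `pairs` is a list of (source_tokens, target_tokens) sentence pairs.
    """
    src_vocab = {s for src, _ in pairs for s in src}
    tgt_vocab = {t for _, tgt in pairs for t in tgt}

    # Start from a uniform distribution over target words.
    prob = {(t, s): 1.0 / len(tgt_vocab) for s in src_vocab for t in tgt_vocab}

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (target, source) links
        total = defaultdict(float)  # expected count of each source word

        # E-step: distribute each target word over the source words
        # of its sentence, in proportion to the current probabilities.
        for src, tgt in pairs:
            for t in tgt:
                norm = sum(prob[(t, s)] for s in src)
                for s in src:
                    frac = prob[(t, s)] / norm
                    count[(t, s)] += frac
                    total[s] += frac

        # M-step: renormalize expected counts into probabilities.
        for (t, s) in prob:
            prob[(t, s)] = count[(t, s)] / total[s]

    return prob


# Hypothetical toy corpus, not the NECTEC data.
toy_pairs = [
    ("das haus".split(), "the house".split()),
    ("das buch".split(), "the book".split()),
    ("ein buch".split(), "a book".split()),
]
toy_prob = train_word_translation(toy_pairs)
best_for_haus = max(["the", "house", "book", "a"], key=lambda t: toy_prob[(t, "haus")])
```

On this toy corpus, repeated iterations shift probability mass toward the consistent pairings, so `best_for_haus` ends up as `"house"` even though "haus" only ever co-occurs with both "the" and "house".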

Keywords: Machine translation, EM algorithm.

Downloads 1463
24 Resolving Dependency Ambiguity of Subordinate Clauses using Support Vector Machines

Authors: Sang-Soo Kim, Seong-Bae Park, Sang-Jo Lee

Abstract:

In this paper, we propose a method of resolving dependency ambiguities of Korean subordinate clauses based on Support Vector Machines (SVMs). Dependency analysis of clauses is well known to be one of the most difficult tasks in parsing sentences, especially in Korean. To solve this problem, we assume that the dependency relation of Korean subordinate clauses is the dependency relation among the verb phrase, verb and endings in the clauses. As a result, the problem is represented as a binary classification task. To apply SVMs to this problem, we selected two kinds of features: static and dynamic features. The experimental results on the STEP2000 corpus show that our system achieves an accuracy of 73.5%.

Keywords: Dependency analysis, subordinate clauses, binary classification, support vector machines.

Downloads 1345
23 Hybrid Modeling Algorithm for Continuous Tamil Speech Recognition

Authors: M. Kalamani, S. Valarmathy, M. Krishnamoorthi

Abstract:

In this paper, a hybrid modeling algorithm based on Fuzzy C-Means clustering with an Expectation Maximization-Gaussian Mixture Model is proposed for Continuous Tamil Speech Recognition. Speech sentences from various speakers are used for the training and testing phases, and objective measures are compared between the proposed and existing Continuous Speech Recognition algorithms. From the simulated results, it is observed that the proposed algorithm improves the recognition accuracy and F-measure by up to 3% compared with the existing algorithms for speech signals from various speakers. In addition, it reduces the Word Error Rate, Error Rate and Error by up to 4% compared with the existing algorithms. In all aspects, the proposed hybrid modeling for Tamil speech recognition provides significant improvements for speech-to-text conversion in various applications.
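For readers unfamiliar with the Word Error Rate mentioned above, a minimal sketch of the standard definition follows: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a recognizer hypothesis, divided by the number of reference words. The function name `word_error_rate` and the English example sentences are assumptions for illustration; the abstract does not state how the authors computed their measures.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Because insertions are counted, this measure can exceed 1.0 when the hypothesis is much longer than the reference.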

Keywords: Speech Segmentation, Feature Extraction, Clustering, HMM, EM-GMM, CSR.

Downloads 1794
22 Word Recognition and Learning based on Associative Memories and Hidden Markov Models

Authors: Zöhre Kara Kayikci, Günther Palm

Abstract:

A word recognition architecture based on a network of neural associative memories and hidden Markov models has been developed. The input stream, composed of subword-units like word-internal triphones consisting of diphones and triphones, is provided to the network of neural associative memories by hidden Markov models. The word recognition network derives words from this input stream. The architecture has the ability to handle ambiguities on subword-unit level and is also able to add new words to the vocabulary during performance. The architecture is implemented to perform the word recognition task in a language processing system for understanding simple command sentences like “bot show apple”.

Keywords: Hebbian learning, hidden Markov models, neural associative memories, word recognition.

Downloads 1280
21 Specialized Translation Teaching Strategies: A Corpus-Based Approach

Authors: Yingying Ding

Abstract:

This study presents a methodology for specialized translation with the objective of helping teachers improve their strategies for teaching translation. To acquire the skills needed to translate specialized texts, students need to become familiar with the semantic and syntactic features of source texts and target texts. The aim of our study is to use a corpus-based approach in the teaching of specialized translation between Chinese and Italian. This study proposes to construct a specialized Chinese - Italian comparable corpus consisting of 50 economic contracts from the food domain. With the help of AntConc, we propose to compile a comparable corpus for translation teaching purposes. This paper attempts to provide insight into how teachers could benefit from a comparable corpus in the teaching of specialized translation from Italian into Chinese and, through some examples of passive sentences, how students could learn to apply different strategies for translating the voice appropriately.

Keywords: Corpus-based approach, translation teaching, specialized translation.

Downloads 819
20 On-line Speech Enhancement by Time-Frequency Masking under Prior Knowledge of Source Location

Authors: Min Ah Kang, Sangbae Jeong, Minsoo Hahn

Abstract:

This paper presents a source extraction system that extracts only target signals, under constraints on source localization, in on-line systems. The proposed system is a method for enhancing a target signal and suppressing other interference signals. The authors report that its performance is superior to that of other methods and that the extraction of the target source is comparatively complete. The method builds on a beamforming concept and uses an improved time-frequency (TF) mask-based BSS algorithm to separate a target signal from multiple noise sources. The target sources are assumed to be in front, and the test data were recorded in a reverberant room. The experimental results of the proposed method were evaluated by the PESQ score of real-recording sentences and showed a noticeable speech enhancement.

Keywords: Beam forming, Non-stationary noise reduction, Source separation, TF mask.

Downloads 1782
19 Assessment of the Validity of Sentiment Analysis as a Tool to Analyze the Emotional Content of Text

Authors: Trisha Malhotra

Abstract:

Sentiment analysis is a recent field of study that computationally assesses the emotional nature of a body of text. To assess its test-validity, sentiment analysis was carried out on the emotional corpus of text from a personal 15-day mood diary. Self-reported mood scores varied more or less in line with the daily mood evaluation scores given by the software. On further assessment, it was found that while sentiment analysis was good at assessing ‘global’ mood, it was not able to ‘locally’ identify and differentially score synonyms of various emotional words. It is further critiqued for treating the intensity of an emotion as universal across cultures. Finally, the software is shown not to account for emotional complexity in sentences, because it treats emotions as strictly positive or negative. Hence, it is posited that a better output could be two affect scores (one positive and one negative) for the same body of text.
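The two-score output suggested at the end of the abstract can be sketched as follows. This is only an illustrative lexicon-based scorer, not the software evaluated in the study; the word lists, their weights, and the function name `affect_scores` are all hypothetical.

```python
import re

# Hypothetical mini-lexicons with made-up intensity weights.
POSITIVE = {"happy": 1.0, "calm": 0.5, "hopeful": 0.5}
NEGATIVE = {"sad": 1.0, "anxious": 1.0, "tired": 0.5}


def affect_scores(text):
    """Return separate (positive, negative) scores for one body of text.

    Keeping the two scores apart, rather than summing them into a single
    polarity value, preserves mixed emotions within the same entry.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    positive = sum(POSITIVE.get(token, 0.0) for token in tokens)
    negative = sum(NEGATIVE.get(token, 0.0) for token in tokens)
    return positive, negative


# A mixed diary entry scores on both dimensions instead of cancelling out.
mixed_entry = affect_scores("I felt happy but also anxious and tired.")
```

A single net polarity for the same entry would be slightly negative and would hide the fact that a clearly positive emotion was reported alongside the negative ones.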

Keywords: Analysis, data, diary, emotions, mood, sentiment.

Downloads 750