Search results for: Corpus of Spoken Lithuanian
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 127

Search results for: Corpus of Spoken Lithuanian

97 The Study of Formal and Semantic Errors of Lexis by Persian EFL Learners

Authors: Mohammad J. Rezai, Fereshteh Davarpanah

Abstract:

Producing a text in a language which is not one’s mother tongue can be a demanding task for language learners. Examining lexical errors committed by EFL learners is a challenging area of investigation which can shed light on the process of second language acquisition. Despite the considerable number of investigations into grammatical errors, few studies have tackled formal and semantic errors of lexis committed by EFL learners. The current study aimed at examining Persian learners’ formal and semantic errors of lexis in English. To this end, 60 students at three different proficiency levels were asked to write on 10 different topics in 10 separate sessions. Finally, 600 essays written by Persian EFL learners were collected, acting as the corpus of the study. An error taxonomy comprising formal and semantic errors was selected to analyze the corpus. The formal category covered misselection and misformation errors, while the semantic errors were classified into lexical, collocational and lexicogrammatical categories. Each category was further classified into subcategories depending on the identified errors. The results showed that there were 2583 errors in the corpus of 9600 words, among which, 2030 formal errors and 553 semantic errors were identified. The most frequent errors in the corpus included formal error commitment (78.6%), which were more prevalent at the advanced level (42.4%). The semantic errors (21.4%) were more frequent at the low intermediate level (40.5%). Among formal errors of lexis, the highest number of errors was devoted to misformation errors (98%), while misselection errors constituted 2% of the errors. Additionally, no significant differences were observed among the three semantic error subcategories, namely collocational, lexical choice and lexicogrammatical. The results of the study can shed light on the challenges faced by EFL learners in the second language acquisition process.

Keywords: Collocational errors, lexical errors, Persian EFL learners, semantic errors.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1188
96 Comparison among Various Question Generations for Decision Tree Based State Tying in Persian Language

Authors: Nasibeh Nasiri, Dawood Talebi Khanmiri

Abstract:

Performance of any continuous speech recognition system is highly dependent on performance of the acoustic models. Generally, development of the robust spoken language technology relies on the availability of large amounts of data. Common way to cope with little data for training each state of Markov models is treebased state tying. This tying method applies contextual questions to tie states. Manual procedure for question generation suffers from human errors and is time consuming. Various automatically generated questions are used to construct decision tree. There are three approaches to generate questions to construct HMMs based on decision tree. One approach is based on misrecognized phonemes, another approach basically uses feature table and the other is based on state distributions corresponding to context-independent subword units. In this paper, all these methods of automatic question generation are applied to the decision tree on FARSDAT corpus in Persian language and their results are compared with those of manually generated questions. The results show that automatically generated questions yield much better results and can replace manually generated questions in Persian language.

Keywords: Decision Tree, Markov Models, Speech Recognition, State Tying.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1694
95 Self-Assembling Hypernetworks for Cognitive Learning of Linguistic Memory

Authors: Byoung-Tak Zhang, Chan-Hoon Park

Abstract:

Hypernetworks are a generalized graph structure representing higher-order interactions between variables. We present a method for self-organizing hypernetworks to learn an associative memory of sentences and to recall the sentences from this memory. This learning method is inspired by the “mental chemistry" model of cognition and the “molecular self-assembly" technology in biochemistry. Simulation experiments are performed on a corpus of natural-language dialogues of approximately 300K sentences collected from TV drama captions. We report on the sentence completion performance as a function of the order of word-interaction and the size of the learning corpus, and discuss the plausibility of this architecture as a cognitive model of language learning and memory.

Keywords: Linguistic recall memory, sentence completion task, self-organizing hypernetworks, cognitive learning and memory.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1476
94 Absence of Developmental Change in Epenthetic Vowel Duration in Japanese Speakers’ English

Authors: Takayuki Konishi, Kakeru Yazawa, Mariko Kondo

Abstract:

This study examines developmental change in the production of epenthetic vowels by Japanese learners of English in relation to acquisition of L2 English speech rhythm. Seventy-two Japanese learners of English in the J-AESOP corpus were divided into lower- and higher-level learners according to their proficiency score and the frequency of vowel epenthesis. Three learners were excluded because no vowel epenthesis was observed in their utterances. The analysis of their read English speech data showed no statistical difference between lower- and higher-level learners, implying the absence of any developmental change in durations of epenthetic vowels. This result, together with the findings of previous studies, will be discussed in relation to the transfer of L1 phonology and manifestation of L2 English rhythm.

Keywords: Vowel epenthesis, Japanese learners of English, L2 speech corpus, speech rhythm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1098
93 The Analysis of Deceptive and Truthful Speech: A Computational Linguistic Based Method

Authors: Seham El Kareh, Miramar Etman

Abstract:

Recently, detecting liars and extracting features which distinguish them from truth-tellers have been the focus of a wide range of disciplines. To the author’s best knowledge, most of the work has been done on facial expressions and body gestures but only few works have been done on the language used by both liars and truth-tellers. This paper sheds light on four axes. The first axis copes with building an audio corpus for deceptive and truthful speech for Egyptian Arabic speakers. The second axis focuses on examining the human perception of lies and proving our need for computational linguistic-based methods to extract features which characterize truthful and deceptive speech. The third axis is concerned with building a linguistic analysis program that could extract from the corpus the inter- and intra-linguistic cues for deceptive and truthful speech. The program built here is based on selected categories from the Linguistic Inquiry and Word Count program. Our results demonstrated that Egyptian Arabic speakers on one hand preferred to use first-person pronouns and present tense compared to the past tense when lying and their lies lacked of second-person pronouns, and on the other hand, when telling the truth, they preferred to use the verbs related to motion and the nouns related to time. The results also showed that there is a need for bigger data to prove the significance of words related to emotions and numbers.

Keywords: Egyptian Arabic corpus, computational analysis, deceptive features, forensic linguistics, human perception, truthful features.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1160
92 Optimizing Dialogue Strategy Learning Using Learning Automata

Authors: G. Kumaravelan, R. Sivakumar

Abstract:

Modeling the behavior of the dialogue management in the design of a spoken dialogue system using statistical methodologies is currently a growing research area. This paper presents a work on developing an adaptive learning approach to optimize dialogue strategy. At the core of our system is a method formalizing dialogue management as a sequential decision making under uncertainty whose underlying probabilistic structure has a Markov Chain. Researchers have mostly focused on model-free algorithms for automating the design of dialogue management using machine learning techniques such as reinforcement learning. But in model-free algorithms there exist a dilemma in engaging the type of exploration versus exploitation. Hence we present a model-based online policy learning algorithm using interconnected learning automata for optimizing dialogue strategy. The proposed algorithm is capable of deriving an optimal policy that prescribes what action should be taken in various states of conversation so as to maximize the expected total reward to attain the goal and incorporates good exploration and exploitation in its updates to improve the naturalness of humancomputer interaction. We test the proposed approach using the most sophisticated evaluation framework PARADISE for accessing to the railway information system.

Keywords: Dialogue management, Learning automata, Reinforcement learning, Spoken dialogue system

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1580
91 Speaker Independent Quranic Recognizer Basedon Maximum Likelihood Linear Regression

Authors: Ehab Mourtaga, Ahmad Sharieh, Mousa Abdallah

Abstract:

An automatic speech recognition system for the formal Arabic language is needed. The Quran is the most formal spoken book in Arabic, it is spoken all over the world. In this research, an automatic speech recognizer for Quranic based speakerindependent was developed and tested. The system was developed based on the tri-phone Hidden Markov Model and Maximum Likelihood Linear Regression (MLLR). The MLLR computes a set of transformations which reduces the mismatch between an initial model set and the adaptation data. It uses the regression class tree, as well as, estimates a set of linear transformations for the mean and variance parameters of a Gaussian mixture HMM system. The 30th Chapter of the Quran, with five of the most famous readers of the Quran, was used for the training and testing of the data. The chapter includes about 2000 distinct words. The advantages of using the Quranic verses as the database in this developed recognizer are the uniqueness of the words and the high level of orderliness between verses. The level of accuracy from the tested data ranged 68 to 85%.

Keywords: Hidden Markov Model (HMM), MaximumLikelihood Linear Regression (MLLR), Quran, Regression ClassTree, Speech Recognition, Speaker-independent.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1886
90 Evaluation of the Burden of Taxation Received by Households in Lithuania

Authors: V. Boguslauskas, G. Jakstonyte, L. Giriunas

Abstract:

Many foreign and Lithuanian scientists, analyzing the evaluation of the tax system in respect of the burden of taxation, agree that the latter, in principle, depends on how many individuals and what units of the residents constitute a household. Therefore, the aim of scientific research is to substantiate or to deny the significance of a household, but not a resident, as a statistical unit, during the evaluation of tax system, to be precise, determination of the value of the burden of taxation. A performed scientific research revealed that evaluation of the tax system in respect of a household, but not a resident, as a statistical unit, allows not only to evaluate the efficiency of the tax system more objectively, but also to forecast practicably existing poverty line, burden of taxation, and to capacitate the initiation of efficient decisions in social and tax fields creating the environment of existence.

Keywords: burden of taxation, household, tax systemevaluation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1432
89 Distributional Semantics Approach to Thai Word Sense Disambiguation

Authors: Sunee Pongpinigpinyo, Wanchai Rivepiboon

Abstract:

Word sense disambiguation is one of the most important open problems in natural language processing applications such as information retrieval and machine translation. Many approach strategies can be employed to resolve word ambiguity with a reasonable degree of accuracy. These strategies are: knowledgebased, corpus-based, and hybrid-based. This paper pays attention to the corpus-based strategy that employs an unsupervised learning method for disambiguation. We report our investigation of Latent Semantic Indexing (LSI), an information retrieval technique and unsupervised learning, to the task of Thai noun and verbal word sense disambiguation. The Latent Semantic Indexing has been shown to be efficient and effective for Information Retrieval. For the purposes of this research, we report experiments on two Thai polysemous words, namely  /hua4/ and /kep1/ that are used as a representative of Thai nouns and verbs respectively. The results of these experiments demonstrate the effectiveness and indicate the potential of applying vector-based distributional information measures to semantic disambiguation.

Keywords: Distributional semantics, Latent Semantic Indexing, natural language processing, Polysemous words, unsupervisedlearning, Word Sense Disambiguation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1781
88 Learning to Order Terms: Supervised Interestingness Measures in Terminology Extraction

Authors: Jérôme Azé, Mathieu Roche, Yves Kodratoff, Michèle Sebag

Abstract:

Term Extraction, a key data preparation step in Text Mining, extracts the terms, i.e. relevant collocation of words, attached to specific concepts (e.g. genetic-algorithms and decisiontrees are terms associated to the concept “Machine Learning" ). In this paper, the task of extracting interesting collocations is achieved through a supervised learning algorithm, exploiting a few collocations manually labelled as interesting/not interesting. From these examples, the ROGER algorithm learns a numerical function, inducing some ranking on the collocations. This ranking is optimized using genetic algorithms, maximizing the trade-off between the false positive and true positive rates (Area Under the ROC curve). This approach uses a particular representation for the word collocations, namely the vector of values corresponding to the standard statistical interestingness measures attached to this collocation. As this representation is general (over corpora and natural languages), generality tests were performed by experimenting the ranking function learned from an English corpus in Biology, onto a French corpus of Curriculum Vitae, and vice versa, showing a good robustness of the approaches compared to the state-of-the-art Support Vector Machine (SVM).

Keywords: Text-mining, Terminology Extraction, Evolutionary algorithm, ROC Curve.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1637
87 Investigating Iraqi EFL University Students' Productive Knowledge of Grammatical Collocations in English

Authors: Adnan Z. Mkhelif

Abstract:

Grammatical collocations (GCs) are word combinations containing a preposition or a grammatical structure, such as an infinitive (e.g. smile at, interested in, easy to learn, etc.). Such collocations tend to be difficult for Iraqi EFL university students (IUS) to master. To help address this problem, it is important to identify the factors causing it. This study aims at investigating the effects of L2 proficiency, frequency of GCs and their transparency on IUSs’ productive knowledge of GCs. The study involves 112 undergraduate participants with different proficiency levels, learning English in formal contexts in Iraq. The data collection instruments include (but not limited to) a productive knowledge test (designed by the researcher using the British National Corpus (BNC)), as well as the grammar part of the Oxford Placement Test (OPT). The study findings have shown that all the above-mentioned factors have significant effects on IUSs’ productive knowledge of GCs. In addition to establishing evidence of which factors of L2 learning might be relevant to learning GCs, it is hoped that the findings of the present study will contribute to more effective methods of teaching that can better address and help overcome the problems IUSs encounter in learning GCs. The study is thus hoped to have significant theoretical and pedagogical implications for researchers, syllabus designers as well as teachers of English as a foreign/second language.

Keywords: Corpus linguistics, frequency, grammatical collocations, L2 vocabulary learning, productive knowledge, proficiency, transparency.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 819
86 A Corpus-Based Analysis on Code-Mixing Features in Mandarin-English Bilingual Children in Singapore

Authors: Xunan Huang, Caicai Zhang

Abstract:

This paper investigated the code-mixing features in Mandarin-English bilingual children in Singapore. First, it examined whether the code-mixing rate was different in Mandarin Chinese and English contexts. Second, it explored the syntactic categories of code-mixing in Singapore bilingual children. Moreover, this study investigated whether morphological information was preserved when inserting syntactic components into the matrix language. Data are derived from the Singapore Bilingual Corpus, in which the recordings and transcriptions of sixty English-Mandarin 5-to-6-year-old children were preserved for analysis. Results indicated that the rate of code-mixing was asymmetrical in the two language contexts, with the rate being significantly higher in the Mandarin context than that in the English context. The asymmetry is related to language dominance in that children are more likely to code-mix when using their nondominant language. Concerning the syntactic categories of code-mixing words in the Singaporean bilingual children, we found that noun-mixing, verb-mixing, and adjective-mixing are the three most frequently used categories in code-mixing in the Mandarin context. This pattern mirrors the syntactic categories of code-mixing in the Cantonese context in Cantonese-English bilingual children, and the general trend observed in lexical borrowing. Third, our results also indicated that English vocabularies that carry morphological information are embedded in bare forms in the Mandarin context. These findings shed light upon how bilingual children take advantage of the two languages in mixed utterances in a bilingual environment.

Keywords: Code-mixing, Mandarin Chinese, English, bilingual children.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1081
85 Re-telling Goa's History: The Margin Narrative

Authors: Anna Beatriz Paula

Abstract:

This paper presents the first reflexions about Margaret Mascarenhas-s novel, “Skin", based on post-colonial critic perception of History and its agents. By doing so, this study will put light on a literary corpus of Indian Literatures: the Goan Literature whose cultural basis creates an unique historiographic metafiction conducted by different characters that one by one plays the narrator role.

Keywords: Goa, History, Literature, Metafiction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2105
84 A Corpus-Based Approach to Understanding Market Access in Fisheries and Aquaculture: A Systematic Literature Review

Authors: Cheryl Marie Cordeiro

Abstract:

Although fisheries and aquaculture studies might seem marginal to international business (IB) studies in general, fisheries and aquaculture IB (FAIB) management is currently facing increasing pressure to meet global demand and consumption for fish in the next coming decades. In part address to this challenge, the purpose of this systematic review of literature (SLR) study is to investigate the use of the term ‘market access’ in its context of use in the generic literature and business sector discourse, in comparison to the more specific literature and discourse in fisheries, aquaculture and seafood. This SLR aims to uncover the knowledge/interest gaps between the academic subject discourses and business sector practices. Corpus driven in methodology and using a triangulation method of three different text analysis software including AntConc, VOSviewer and Web of Science (WoS) analytics, the SLR results indicate a gap in conceptual knowledge and business practices in how ‘market access’ is conceived and used in the context of the pharmaceutical healthcare industry and FAIB research and practice. While it is acknowledged that the product orientation of different business sectors might differ, this SLR study works with the assumption that both business sectors are global in orientation. These business sectors are complex in their operations from product to market. This SLR suggests a conceptual model in understanding the challenges, the potential barriers as well as avenues for solutions to developing market access for FAIB.

Keywords: Market access, fisheries and aquaculture, international business, systematic literature review.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 906
83 The Effects of Gender and Socioeconomic Status on Academic Motivation: The Case of Lithuania

Authors: Ausra Turcinskaite-Balciuniene, Jonas Balciunas, Gediminas Merkys

Abstract:

The problematic of gender and socioeconomic status biased differences in academic motivation patterns is discussed. Gender identity is understood according to symbolic interactionism perspective: as a result of reflected appraisals, social comparisons, self-attributions, and identifications, shaped by social environment and family context. The effects of socioeconomic status on academic motivation are conceptualized according to Bourdieu’s habitus concept, reflecting the role of unconscious and internalized cultural signals, proper to low and high socioeconomic status family contexts. Since families differ by various socioeconomic features, the hypothesis about possible impact of parents’ socioeconomic status on their children’s academic motivation interfering with gender socialization effects is held. The survey, aiming to seize gender differences in academic motivation and self-recorded improvementoriented efforts as a result of socialization processes operating in the families of low and high socioeconomic status, was designed. The results of Lithuanian higher education students’ survey are presented and discussed.

Keywords: Academic Motivation, Gender, Socialization, Socioeconomic Status.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2869
82 Age and Second Language Acquisition: A Case Study from Maldives

Authors: Aaidha Hammad

Abstract:

The age a child to be exposed to a second language is a controversial issue in communities such as the Maldives where English is taught as a second language. It has been observed that different stakeholders have different viewpoints towards the issue. Some believe that the earlier children are exposed to a second language, the better they learn, while others disagree with the notion. Hence, this case study investigates whether children learn a second language better when they are exposed at an earlier age or not. The spoken and written data collected confirm that earlier exposure helps in mastering the sound pattern and speaking fluency with more native-like accent, while a later age is better for learning more abstract and concrete aspects such as grammar and syntactic rules.

Keywords: Age, development of language skills, fluency, second language acquisition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3597
81 Analysis of Combined Use of NN and MFCC for Speech Recognition

Authors: Safdar Tanweer, Abdul Mobin, Afshar Alam

Abstract:

The performance and analysis of speech recognition system is illustrated in this paper. An approach to recognize the English word corresponding to digit (0-9) spoken by 2 different speakers is captured in noise free environment. For feature extraction, speech Mel frequency cepstral coefficients (MFCC) has been used which gives a set of feature vectors from recorded speech samples. Neural network model is used to enhance the recognition performance. Feed forward neural network with back propagation algorithm model is used. However other speech recognition techniques such as HMM, DTW exist. All experiments are carried out on Matlab.

Keywords: Speech Recognition, MFCC, Neural Network, classifier.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3237
80 Named Entity Recognition using Support Vector Machine: A Language Independent Approach

Authors: Asif Ekbal, Sivaji Bandyopadhyay

Abstract:

Named Entity Recognition (NER) aims to classify each word of a document into predefined target named entity classes and is now-a-days considered to be fundamental for many Natural Language Processing (NLP) tasks such as information retrieval, machine translation, information extraction, question answering systems and others. This paper reports about the development of a NER system for Bengali and Hindi using Support Vector Machine (SVM). Though this state of the art machine learning technique has been widely applied to NER in several well-studied languages, the use of this technique to Indian languages (ILs) is very new. The system makes use of the different contextual information of the words along with the variety of features that are helpful in predicting the four different named (NE) classes, such as Person name, Location name, Organization name and Miscellaneous name. We have used the annotated corpora of 122,467 tokens of Bengali and 502,974 tokens of Hindi tagged with the twelve different NE classes 1, defined as part of the IJCNLP-08 NER Shared Task for South and South East Asian Languages (SSEAL) 2. In addition, we have manually annotated 150K wordforms of the Bengali news corpus, developed from the web-archive of a leading Bengali newspaper. We have also developed an unsupervised algorithm in order to generate the lexical context patterns from a part of the unlabeled Bengali news corpus. Lexical patterns have been used as the features of SVM in order to improve the system performance. The NER system has been tested with the gold standard test sets of 35K, and 60K tokens for Bengali, and Hindi, respectively. Evaluation results have demonstrated the recall, precision, and f-score values of 88.61%, 80.12%, and 84.15%, respectively, for Bengali and 80.23%, 74.34%, and 77.17%, respectively, for Hindi. Results show the improvement in the f-score by 5.13% with the use of context patterns. Statistical analysis, ANOVA is also performed to compare the performance of the proposed NER system with that of the existing HMM based system for both the languages.

Keywords: Named Entity (NE), Named Entity Recognition (NER), Support Vector Machine (SVM), Bengali, Hindi.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3350
79 Evaluation of Features Extraction Algorithms for a Real-Time Isolated Word Recognition System

Authors: Tomyslav Sledevič, Artūras Serackis, Gintautas Tamulevičius, Dalius Navakauskas

Abstract:

Paper presents an comparative evaluation of features extraction algorithm for a real-time isolated word recognition system based on FPGA. The Mel-frequency cepstral, linear frequency cepstral, linear predictive and their cepstral coefficients were implemented in hardware/software design. The proposed system was investigated in speaker dependent mode for 100 different Lithuanian words. The robustness of features extraction algorithms was tested recognizing the speech records at different signal to noise rates. The experiments on clean records show highest accuracy for Mel-frequency cepstral and linear frequency cepstral coefficients. For records with 15 dB signal to noise rate the linear predictive cepstral coefficients gives best result. The hard and soft part of the system is clocked on 50 MHz and 100 MHz accordingly. For the classification purpose the pipelined dynamic time warping core was implemented. The proposed word recognition system satisfy the real-time requirements and is suitable for applications in embedded systems.

Keywords: Isolated word recognition, features extraction, MFCC, LFCC, LPCC, LPC, FPGA, DTW.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3517
78 Analysis of Steles with Libyan Inscriptions of Grande Kabylia, Algeria

Authors: Samia Ait Ali Yahia

Abstract:

Several steles with Libyan inscriptions were discovered in Grande Kabylia (Algeria), but very few researchers were interested in these inscriptions. Our work is to list, if possible all these steles in order to do a descriptive study of the corpus. The steles analysis will be focused on the iconographic and epigraphic level and on the different forms of Libyan characters in order to highlight the alphabet used by the Grande Kabylia.

Keywords: Epigraphy, stele, Libyan inscription, Grande Kabylia.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1056
77 Effective Features for Disambiguation of Turkish Verbs

Authors: Zeynep Orhan, Zeynep Altan

Abstract:

This paper summarizes the results of some experiments for finding the effective features for disambiguation of Turkish verbs. Word sense disambiguation is a current area of investigation in which verbs have the dominant role. Generally verbs have more senses than the other types of words in the average and detecting these features for verbs may lead to some improvements for other word types. In this paper we have considered only the syntactical features that can be obtained from the corpus and tested by using some famous machine learning algorithms.

Keywords: Word sense disambiguation, feature selection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1726
76 Achieving Maximum Performance through the Practice of Entrepreneurial Ethics: Evidence from SMEs in Nigeria

Authors: S. B. Tende, H. L. Abubakar

Abstract:

It is acknowledged that small and medium enterprises (SMEs) may encounter different ethical issues and pressures that could affect the way in which they strategize or make decisions concerning the outcome of their business. Therefore, this research aimed at assessing entrepreneurial ethics in the business of SMEs in Nigeria. Secondary data were adopted as source of corpus for the analysis. The findings conclude that a sound entrepreneurial ethics system has a significant effect on the level of performance of SMEs in Nigeria. The Nigerian Government needs to provide both guiding and physical structures; as well as learning systems that could inculcate these entrepreneurial ethics.

Keywords: Entrepreneurial ethics, culture, performance, SME.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 921
75 Convergence and Divergence in Telephone Conversations: A Case of Persian

Authors: Anna Mirzaiyan, Vahid Parvaresh, Mahmoud Hashemian, Masoud Saeedi

Abstract:

People usually have a telephone voice, which means they adjust their speech to fit particular situations and to blend in with other interlocutors. The question is: Do we speak differently to different people? This possibility has been suggested by social psychologists within Accommodation Theory [1]. Converging toward the speech of another person can be regarded as a polite speech strategy while choosing a language not used by the other interlocutor can be considered as the clearest example of speech divergence [2]. The present study sets out to investigate such processes in the course of everyday telephone conversations. Using Joos-s [3] model of formality in spoken English, the researchers try to explore convergence to or divergence from the addressee. The results propound the actuality that lexical choice, and subsequently, patterns of style vary intriguingly in concordance with the person being addressed.

Keywords: Convergence, divergence, lexical formality, speechaccommodation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3491
74 Echo State Networks for Arabic Phoneme Recognition

Authors: Nadia Hmad, Tony Allen

Abstract:

This paper presents an ESN-based Arabic phoneme recognition system trained with supervised, forced and combined supervised/forced supervised learning algorithms. Mel-Frequency Cepstrum Coefficients (MFCCs) and Linear Predictive Code (LPC) techniques are used and compared as the input feature extraction technique. The system is evaluated using 6 speakers from the King Abdulaziz Arabic Phonetics Database (KAPD) for Saudi Arabia dialectic and 34 speakers from the Center for Spoken Language Understanding (CSLU2002) database of speakers with different dialectics from 12 Arabic countries. Results for the KAPD and CSLU2002 Arabic databases show phoneme recognition performances of 72.31% and 38.20% respectively.

Keywords: Arabic phonemes recognition, echo state networks (ESNs), neural networks (NNs), supervised learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2390
73 Software Architectural Design Ontology

Authors: Muhammad Irfan Marwat, Sadaqat Jan, Syed Zafar Ali Shah

Abstract:

Software Architecture plays a key role in software development but absence of formal description of Software Architecture causes different impede in software development. To cope with these difficulties, ontology has been used as artifact. This paper proposes ontology for Software Architectural design based on IEEE model for architecture description and Kruchten 4+1 model for viewpoints classification. For categorization of style and views, ISO/IEC 42010 has been used. Corpus method has been used to evaluate ontology. The main aim of the proposed ontology is to classify and locate Software Architectural design information.

Keywords: Software Architecture Ontology, Semantic based Software Architecture, Software Architecture, Ontology, Software Engineering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4143
72 A Talking Head System for Korean Text

Authors: Sang-Wan Kim, Hoon Lee, Kyung-Ho Choi, Soon-Young Park

Abstract:

A talking head system (THS) is presented to animate the face of a speaking 3D avatar in such a way that it realistically pronounces the given Korean text. The proposed system consists of SAPI compliant text-to-speech (TTS) engine and MPEG-4 compliant face animation generator. The input to the THS is a unicode text that is to be spoken with synchronized lip shape. The TTS engine generates a phoneme sequence with their duration and audio data. The TTS applies the coarticulation rules to the phoneme sequence and sends a mouth animation sequence to the face modeler. The proposed THS can make more natural lip sync and facial expression by using the face animation generator than those using the conventional visemes only. The experimental results show that our system has great potential for the implementation of talking head for Korean text.

Keywords: Talking head, Lip sync, TTS, MPEG4.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1464
71 Video Quality Assessment using Visual Attention Approach for Sign Language

Authors: Julia Kucerova, Jaroslav Polec, Darina Tarcsiova

Abstract:

Visual information is very important in human perception of surrounding world. Video is one of the most common ways to capture visual information. The video capability has many benefits and can be used in various applications. For the most part, the video information is used to bring entertainment and help to relax, moreover, it can improve the quality of life of deaf people. Visual information is crucial for hearing impaired people, it allows them to communicate personally, using the sign language; some parts of the person being spoken to, are more important than others (e.g. hands, face). Therefore, the information about visually relevant parts of the image, allows us to design objective metric for this specific case. In this paper, we present an example of an objective metric based on human visual attention and detection of salient object in the observed scene.

Keywords: sign language, objective video quality, visual attention, saliency

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1543
70 A Case Study of Reactive Focus on Form through Negotiation on Spoken Errors: Does It Work for All Learners?

Authors: Vahid Parvaresh, Zohre Kassaian, Saeed Ketabi, Masoud Saeedi

Abstract:

This case study investigates the effects of reactive focus on form through negotiation on the linguistic development of an adult EFL learner in an exclusive private EFL classroom. The findings revealed that in this classroom negotiated feedback occurred significantly more often than non-negotiated feedback. However, it was also found that in the long run the learner was significantly more successful in correcting his own errors when he had received nonnegotiated feedback than negotiated feedback. This study, therefore, argues that although negotiated feedback seems to be effective for some learners in the short run, it is non-negotiated feedback which seems to be more effective in the long run. This long lasting effect might be attributed to the impact of schooling system which is itself indicative of the dominant culture, or to the absence of other interlocutors in the course of interaction.

Keywords: error, feedback, focus on form, interaction, schooling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1203
69 Recognition by Online Modeling – a New Approach of Recognizing Voice Signals in Linear Time

Authors: Jyh-Da Wei, Hsin-Chen Tsai

Abstract:

This work presents a novel means of extracting fixedlength parameters from voice signals, such that words can be recognized in linear time. The power and the zero crossing rate are first calculated segment by segment from a voice signal; by doing so, two feature sequences are generated. We then construct an FIR system across these two sequences. The parameters of this FIR system, used as the input of a multilayer proceptron recognizer, can be derived by recursive LSE (least-square estimation), implying that the complexity of overall process is linear to the signal size. In the second part of this work, we introduce a weighting factor λ to emphasize recent input; therefore, we can further recognize continuous speech signals. Experiments employ the voice signals of numbers, from zero to nine, spoken in Mandarin Chinese. The proposed method is verified to recognize voice signals efficiently and accurately.

Keywords: Speech Recognition, FIR system, Recursive LSE, Multilayer Perceptron

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1390
68 Part of Speech Tagging Using Statistical Approach for Nepali Text

Authors: Archit Yajnik

Abstract:

Part of Speech Tagging has always been a challenging task in the era of Natural Language Processing. This article presents POS tagging for Nepali text using Hidden Markov Model and Viterbi algorithm. From the Nepali text, annotated corpus training and testing data set are randomly separated. Both methods are employed on the data sets. Viterbi algorithm is found to be computationally faster and accurate as compared to HMM. The accuracy of 95.43% is achieved using Viterbi algorithm. Error analysis where the mismatches took place is elaborately discussed.

Keywords: Hidden Markov model, Viterbi algorithm, POS tagging, natural language processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1674