Search results for: Spoken dialogue system
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8403

8403 Optimizing Dialogue Strategy Learning Using Learning Automata

Authors: G. Kumaravelan, R. Sivakumar

Abstract:

Modeling dialogue management behavior with statistical methods is a growing research area in spoken dialogue system design. This paper presents an adaptive learning approach to optimizing dialogue strategy. At the core of our system is a method that formalizes dialogue management as sequential decision making under uncertainty, whose underlying probabilistic structure is a Markov chain. Researchers have mostly focused on model-free machine learning algorithms, such as reinforcement learning, for automating the design of dialogue management. Model-free algorithms, however, face a dilemma in balancing exploration against exploitation. Hence we present a model-based online policy learning algorithm that uses interconnected learning automata to optimize dialogue strategy. The proposed algorithm can derive an optimal policy that prescribes which action to take in each state of the conversation so as to maximize the expected total reward for attaining the goal, and it balances exploration and exploitation in its updates to improve the naturalness of human-computer interaction. We test the proposed approach using the PARADISE evaluation framework on the task of accessing a railway information system.

Keywords: Dialogue management, Learning automata, Reinforcement learning, Spoken dialogue system
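
The abstract does not spell out the authors' update rule, so the following is only a minimal sketch of one standard scheme, linear reward-inaction (L_R-I), for a single learning automaton. The class name, the learning rate, and the idea of holding one automaton per dialogue state are assumptions made for illustration, not the paper's algorithm.

```python
import random


class LearningAutomaton:
    """Linear reward-inaction (L_R-I) automaton over a fixed action set."""

    def __init__(self, n_actions, learning_rate=0.1, rng=None):
        # Start from a uniform action-probability vector.
        self.probs = [1.0 / n_actions] * n_actions
        self.learning_rate = learning_rate  # hypothetical default
        self.rng = rng or random.Random()

    def choose(self):
        # Sample an action index according to the current probabilities.
        return self.rng.choices(range(len(self.probs)), weights=self.probs)[0]

    def update(self, action, reward):
        # reward is expected in [0, 1]; a reward of 0 changes nothing
        # (the "inaction" part of reward-inaction).
        step = self.learning_rate * reward
        for i in range(len(self.probs)):
            if i == action:
                self.probs[i] += step * (1.0 - self.probs[i])
            else:
                self.probs[i] -= step * self.probs[i]
```

Because the probability vector is sampled on every turn, exploration fades gradually as one action's probability approaches 1, rather than being switched off by a separate schedule.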

Downloads: 1556
8402 On Dialogue Systems Based on Deep Learning

Authors: Yifan Fan, Xudong Luo, Pingping Lin

Abstract:

Dialogue systems are increasingly the way humans access computer systems, allowing people to interact with computers in natural language. A dialogue system consists of three parts: understanding what humans say in natural language, managing dialogue, and generating responses in natural language. In this paper, we survey deep learning based methods for dialogue management, response generation and dialogue evaluation. Specifically, these methods are based on neural networks, long short-term memory networks, deep reinforcement learning, pre-training and generative adversarial networks. We compare these methods and point out directions for further research.

Keywords: Dialogue management, response generation, reinforcement learning, deep learning, evaluation.

Downloads: 708
8401 User Satisfaction and Acceptability of Dialogue Systems for Detecting Counterfeit Drugs

Authors: Oyelami M. Olufemi

Abstract:

The menace of counterfeiting pharmaceuticals/drugs has become a major threat to consumers, healthcare providers, drug manufacturers and governments. It is a source of public health concern both in the developed and developing nations. Several solutions for detecting and authenticating counterfeit drugs have been adopted by different nations of the world. In this article, a dialogue system-based drug counterfeiting detection system was developed and the results of the user satisfaction and acceptability of the system are presented. The results show that the users were satisfied with the system and the system was widely accepted as a means of fighting counterfeited drugs.

Keywords: Counterfeiting, dialogue system, drugs, voice application.

Downloads: 2016
8400 A Survey of WhatsApp as a Tool for Instructor-Learner Dialogue, Learner-Content Dialogue, and Learner-Learner Dialogue

Authors: Ebrahim Panah, Muhammad Yasir Babar

Abstract:

Thanks to the development of online technology and social networks, people are able to communicate as well as learn. WhatsApp is a popular social network that continues to gain popularity. This app can be used for communication as well as education. It can be used for instructor-learner, learner-learner, and learner-content interactions; however, very little is known about this potential of WhatsApp. The current study was undertaken to investigate university students’ perceptions of WhatsApp used as a tool for instructor-learner dialogue, learner-content dialogue, and learner-learner dialogue. The study adopted a survey approach and distributed a questionnaire developed with Google Forms to 54 (11 male and 43 female) university students. The obtained data were analyzed using SPSS version 20. The results indicate that students have positive attitudes towards WhatsApp as a tool for Instructor-Learner Dialogue: it is easy to reach the lecturer (4.07), the instructor gives me valuable feedback on my assignment (4.02), the instructor is supportive during course discussion and offers continuous support with the class (4.00); Learner-Content Dialogue: WhatsApp allows me to academically engage with lecturers anytime, anywhere (4.00), it helps to send graphics such as pictures or charts directly to the students (3.98), it also provides out-of-class, extra learning materials and homework (3.96); and Learner-Learner Dialogue: WhatsApp is a good tool for sharing knowledge with others (4.09), WhatsApp allows me to academically engage with peers anytime, anywhere (4.07), and we can interact with others through the use of group discussion (4.02). It was also found that there are significant positive correlations between students’ perceptions of Instructor-Learner Dialogue (ILD), Learner-Content Dialogue (LCD), Learner-Learner Dialogue (LLD) and the WhatsApp application in the classroom. The findings of the study have implications for lecturers, policy makers and curriculum developers.

Keywords: Instructor-learner dialogue, learners-contents dialogue, learner-learner dialogue, WhatsApp.

Downloads: 578
8399 A Web-Based Self-Learning Grammar for Spoken Language Understanding

Authors: S. M. Biondi, V. Catania, R. Di Natale, A. R. Intilisano, D. Panno

Abstract:

One of the major goals of Spoken Dialog Systems (SDS) is to understand what the user utters. In the SDS domain, the Spoken Language Understanding (SLU) module classifies user utterances by means of predefined conceptual knowledge. The SLU module is able to recognize only the meanings previously included in its knowledge base. Due to the vastness of that knowledge, storing the information is a very expensive process. Updating and managing the knowledge base are time-consuming and error-prone processes because of the rapidly growing number of entities such as proper nouns and domain-specific nouns. This paper proposes a solution to the problem of Named Entity Recognition (NER) applied to an SDS domain. The proposed solution attempts to automatically recognize the meaning associated with an utterance by using the PANKOW (Pattern based Annotation through Knowledge On the Web) method at runtime. The proposed method extracts information from the Web to extend the knowledge of the SLU module and to reduce the development effort. In particular, the Google Search Engine is used to extract information from the Facebook social network.

Keywords: Spoken Dialog System, Spoken Language Understanding, Web Semantic, Name Entity Recognition.

Downloads: 1724
8398 Dialogue Meetings as an Arena for Collaboration and Reflection among Researchers and Practitioners

Authors: Kerstin Grunden, Ann Svensson, Berit Forsman, Christina Karlsson, Ayman Obeid

Abstract:

This article explores whether the dialogue meetings method could be relevant for reflective learning among researchers and practitioners when welfare technology is to be implemented in municipalities. A testbed was planned for a retirement home in a Swedish municipality, and the practitioners worked on a pre-study of that testbed. In the article, the dialogue between the researchers and the practitioners in the dialogue meetings is described and analyzed. The potential of dialogue meetings as an arena for learning and reflection among researchers and practitioners is discussed. The research methodology is participatory action research with mixed methods (dialogue meetings, focus groups, participant observations). The main findings from the dialogue meetings were that the researchers learned more about the use of traditional research methods, and the practitioners learned more about how they could improve their use of the methods to facilitate change processes in their organization. For both the researchers and the practitioners, these findings could lead to more relevant use of research methods in change processes in organizations. It is concluded that dialogue meetings could be relevant for reflective learning among researchers and practitioners when welfare technology is to be implemented in a health care organization.

Keywords: Dialogue meetings, implementation, reflection, test bed, welfare technology, participatory action research.

Downloads: 388
8397 The Effects of a Digital Dialogue Game on Higher Education Students’ Argumentation-Based Learning

Authors: Omid Noroozi

Abstract:

Digital dialogue games have opened up opportunities for learning skills by engaging students in complex problem solving that mimics real-world situations, without importing the unwanted constraints and risks of the real world. Digital dialogue games can motivate and engage students through fun, creative thinking, and learning. This study explored how undergraduate students engage with argumentative discourse activities designed to intensify debate. A pre-test, post-test design was used with students who were assigned to groups of four and asked to debate a controversial topic with the aim of exploring various pros and cons of Genetically Modified Organisms (GMOs). Findings reveal that the digital dialogue game can facilitate argumentation-based learning. The digital dialogue game was also evaluated positively in terms of students’ satisfaction and learning experiences.

Keywords: Argumentation, dialogue, digital game, learning, motivation.

Downloads: 1140
8396 Comparison of Parameterization Methods in Recognizing Spoken Arabic Digits

Authors: Ali Ganoun

Abstract:

This paper evaluates sound parameterization methods for recognizing spoken Arabic words, namely the digits from zero to nine. Each isolated spoken word is represented by a single template based on a specific recognition feature, and the recognition is based on the Euclidean distance from those templates. The performance analysis of recognition is based on four parameterization features: the Burg Spectrum Analysis, the Walsh Spectrum Analysis, the Thomson Multitaper Spectrum Analysis and the Mel Frequency Cepstral Coefficients (MFCC) features. The main aim of this paper was to compare, analyze, and discuss the outcomes of spoken Arabic digit recognition systems based on the selected recognition features. The results acquired confirm that the use of MFCC features is a very promising method for recognizing spoken Arabic digits.

Keywords: Speech Recognition, Spectrum Analysis, Burg Spectrum, Walsh Spectrum Analysis, Thomson Multitaper Spectrum, MFCC.
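
The template-matching step described in the abstract can be sketched as a nearest-template search under Euclidean distance. This is an illustrative sketch only: the function names and the toy two-dimensional feature vectors are hypothetical, and the feature extraction itself (Burg, Walsh, multitaper or MFCC) is assumed to have happened already.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(features, templates):
    """Return the label whose single template is closest to `features`.

    templates: dict mapping a word label to its template feature vector.
    """
    return min(templates, key=lambda label: euclidean(features, templates[label]))


# Toy usage with made-up templates (real ones would come from training data).
toy_templates = {"zero": [0.0, 1.0], "one": [1.0, 0.0]}
label = recognize([0.9, 0.2], toy_templates)
```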

Downloads: 1529
8395 Dialogue Journals as an EFL Learning Strategy in the Preparatory Year Program: Learners' Attitudes and Perceptions

Authors: Asma Alyahya

Abstract:

This study attempts to elicit the perceptions and attitudes of EFL learners of the Preparatory Year Program at KSU towards dialogue journal writing as an EFL learning strategy. The descriptive research design used incorporated both qualitative and quantitative instruments to accomplish the objectives of the study. A learners’ attitude questionnaire and follow-up interviews with learners from a randomly selected representative sample of the participants were employed. The participants were 55 female Saudi university students in the Preparatory Year Program at King Saud University. The analysis of the results indicated that the PYP learners had highly positive attitudes towards dialogue journal writing in their EFL classes and positive perceptions of the benefits of the use of dialogue journal writing as an EFL learning strategy. The results also revealed that dialogue journals are considered an effective EFL learning strategy since they fulfill various needs for both learners and instructors. Interestingly, the analysis of the results also revealed that Saudi university level students tend to write about personal topics in their dialogue journals more than academic ones.

Keywords: Dialogue journals, EFL, learning strategy, writing.

Downloads: 1977
8394 Grammatically Coded Corpus of Spoken Lithuanian: Methodology and Development

Authors: L. Kamandulytė-Merfeldienė

Abstract:

The paper deals with the main methodological issues of the Corpus of Spoken Lithuanian, whose development began in 2006. At present, the corpus consists of 300,000 grammatically annotated word forms. The creation of the corpus consists of three main stages: collecting the data, transcribing the recorded data, and grammatical annotation. Data collection was based on the principles of balance and naturalness. The recorded speech was transcribed according to the CHAT requirements of CHILDES. The transcripts were double-checked and annotated grammatically using CHILDES. The development of the Corpus of Spoken Lithuanian has led to a steady increase in studies on spontaneous communication, and various papers have dealt with the distribution of parts of speech, the use of different grammatical forms, variation of inflectional paradigms, the distribution of fillers, syntactic functions of adjectives, and the mean length of utterances.

Keywords: CHILDES, Corpus of Spoken Lithuanian, grammatical annotation, grammatical disambiguation, lexicon, Lithuanian.

Downloads: 894
8393 Roadmapping as a Collaborative Strategic Decision-Making Process: Shaping Social Dialogue Options for the European Banking Sector

Authors: Christos A. Ioannou, Panagiotis Panagiotopoulos, Lampros Stergioulas

Abstract:

The new status generated by technological advancements and changes in the global economy raises important issues on how communities and organisations need to innovate upon their traditional processes in order to adapt to the challenges of the Knowledge Society. The DialogoS+ European project aims to study the role of and promote social dialogue in the banking sector, strengthen the link between old and new members and make social dialogue at the European level a force for innovation and change, also given the context of the international crisis emerging in 2008-2009. Under the scope of DialogoS+, this paper describes how the community of Europe's banking sector trade unions attempted to adapt to the challenges of the Knowledge Society by exploiting the benefits of new channels of communication, learning, knowledge generation and diffusion focusing on the concept of roadmapping. Important dimensions of social dialogue such as collective bargaining and working conditions are addressed.

Keywords: Banking sector, knowledge society, road mapping, social dialogue.

Downloads: 2075
8392 An Enhanced Tool for Implementing Dialogue Forms in Conversational Applications

Authors: Ilias Spais, George Bafas

Abstract:

Natural Language Understanding (NLU) systems will not be widely deployed unless they are technically mature and cost effective to develop. Cost effective development hinges on the availability of tools and techniques enabling the rapid production of NLU applications with minimal human resources. Further, these tools and techniques should allow quick development of applications in a user friendly way and should be easy to upgrade in order to continuously follow the evolving technologies and standards. This paper presents a visual tool for the structuring and editing of dialog forms, the key element driving conversation in NLU applications based on IBM technology. The main focus is on the basic component used to describe human-machine interactions of that kind, the Dialogue Manager. In essence, the paper describes a tool that enables the visual representation of the Dialogue Manager, mainly during the implementation phase.

Keywords: Conversational Applications, Forms Dialogue Manager (FDM), Natural Language Processing, Natural Language Understanding.

Downloads: 1395
8391 Identifying Missing Component in the Bechdel Test Using Principal Component Analysis Method

Authors: Raghav Lakhotia, Chandra Kanth Nagesh, Krishna Madgula

Abstract:

Much has been said about the rationale and significance of the Bechdel score. It became a digital sensation in 2013, when Swedish cinemas began to showcase the Bechdel test score of a film alongside its rating. The test has drawn criticism from experts and the film fraternity regarding its use to rate the female presence in a movie. Critics believe that the score is too simplistic and that the underlying criteria for a film to pass the test, namely that it must include 1) at least two women, 2) who have at least one dialogue, 3) about something other than a man, are egregious. In this research, we have considered a few more parameters that highlight how females are represented in film, such as the number of female dialogues in a movie, dialogue genre, and part-of-speech tags in the dialogue. These parameters are missing from the existing criteria used to calculate the Bechdel score. The research analyzes 342 movie scripts to test the hypothesis that these extra parameters, together with the current Bechdel criteria, are significant in calculating the female representation score. The Principal Component Analysis results indicate that female dialogue content is a key component and should be considered when measuring the representation of women in a work of fiction.

Keywords: Bechdel test, dialogue genre, parts of speech tags, principal component analysis.
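
As a rough illustration of what Principal Component Analysis computes, the sketch below finds the first principal component of a small data matrix by power iteration on the sample covariance matrix. It is not the authors' pipeline: the function name and the toy data are hypothetical, and power iteration can stall if the starting vector happens to be orthogonal to the leading direction.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (unit direction, variance along it) for the leading component.

    rows: list of equal-length numeric feature vectors, one per observation.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration from a normalized all-ones start vector.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance in the data
        v = [x / norm for x in w]
    variance = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    return v, variance


# Toy data lying exactly on the line y = 2x.
direction, variance = first_principal_component([[0, 0], [1, 2], [2, 4], [3, 6]])
```

Features with large weights in the leading directions are the ones that explain most of the variation, which is the sense in which a parameter can be called a key component.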

Downloads: 728
8390 Speaker Independent Quranic Recognizer Based on Maximum Likelihood Linear Regression

Authors: Ehab Mourtaga, Ahmad Sharieh, Mousa Abdallah

Abstract:

An automatic speech recognition system for the formal Arabic language is needed. The Quran is the most formal spoken book in Arabic, and it is spoken all over the world. In this research, a speaker-independent automatic speech recognizer for Quranic speech was developed and tested. The system was developed based on the tri-phone Hidden Markov Model and Maximum Likelihood Linear Regression (MLLR). The MLLR computes a set of transformations which reduces the mismatch between an initial model set and the adaptation data. It uses a regression class tree and estimates a set of linear transformations for the mean and variance parameters of a Gaussian mixture HMM system. The 30th Chapter of the Quran, recited by five of the most famous readers of the Quran, was used for training and testing. The chapter includes about 2000 distinct words. The advantages of using the Quranic verses as the database in this recognizer are the uniqueness of the words and the high level of orderliness between verses. The accuracy on the test data ranged from 68% to 85%.

Keywords: Hidden Markov Model (HMM), Maximum Likelihood Linear Regression (MLLR), Quran, Regression Class Tree, Speech Recognition, Speaker-independent.
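
In MLLR mean adaptation, each Gaussian mean is replaced by an affine transform of itself, and all Gaussians in the same regression class share one transform. The sketch below shows only how already-estimated transforms would be applied; estimating them from adaptation data and the variance transform are not shown, and all names and numbers are hypothetical.

```python
def transform_mean(mean, matrix, bias):
    """Return the adapted mean A * mean + b for one Gaussian component."""
    if len(matrix) != len(bias) or any(len(row) != len(mean) for row in matrix):
        raise ValueError("shape mismatch between mean, matrix and bias")
    return [sum(a * m for a, m in zip(row, mean)) + b
            for row, b in zip(matrix, bias)]


def adapt_means(means, class_of, transforms):
    """Adapt every Gaussian mean with the transform of its regression class.

    means:      dict gaussian id -> mean vector
    class_of:   dict gaussian id -> regression class id
    transforms: dict regression class id -> (matrix, bias)
    """
    return {g: transform_mean(mu, *transforms[class_of[g]])
            for g, mu in means.items()}


# Toy usage: two Gaussians sharing one made-up transform.
adapted = adapt_means(
    {"g1": [1.0, 3.0], "g2": [0.0, 0.0]},
    {"g1": "c", "g2": "c"},
    {"c": ([[2.0, 0.0], [0.0, 1.0]], [1.0, -1.0])},
)
```

Sharing transforms across a class is what lets a small amount of adaptation speech update Gaussians that were never observed in it.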

Downloads: 1857
8389 Online Multilingual Dictionary Using Hamburg Notation for Avatar-Based Indian Sign Language Generation System

Authors: Sugandhi, Parteek Kumar, Sanmeet Kaur

Abstract:

Sign Language (SL) is used by deaf people and by others who cannot speak but can hear, or who have difficulty with spoken languages due to a disability. It is a visual gesture language that makes use of one or both hands, arms, face and body to convey meanings and thoughts. An SL automation system is an effective way to provide an interface for communicating with hearing people using a computer. In this paper, an avatar-based dictionary is proposed for a text to Indian Sign Language (ISL) generation system. This work also presents a literature review of the SL corpora available for various SLs over the years. An ISL generation system requires a written form of SL, and there are certain techniques available for writing SL. The system uses the Hamburg sign language Notation System (HamNoSys) and Signing Gesture Mark-up Language (SiGML) for ISL generation. It is developed in PHP using Web Graphics Library (WebGL) technology for 3D avatar animation. A multilingual ISL dictionary is developed using HamNoSys for both English and Hindi. This dictionary will be used as a database to associate signs with words or phrases of a spoken language. It provides an admin panel interface to manage the dictionary, i.e., modification, addition, or deletion of a word. Through this interface, HamNoSys notations can be developed and stored in a database, and these notations can be converted into their corresponding SiGML files manually. The system takes a natural language input sentence in English or Hindi and generates a 3D sign animation using an avatar. SL generation systems have potential applications in many domains such as the healthcare sector, media, educational institutes, commercial sectors, transportation services, etc. This research work will help researchers to understand various techniques used for writing SL and for building Sign Language generation systems.

Keywords: Avatar, dictionary, HamNoSys, hearing-impaired, Indian Sign Language, sign language.

Downloads: 1243
8388 A Survey of Response Generation of Dialogue Systems

Authors: Yifan Fan, Xudong Luo, Pingping Lin

Abstract:

An essential task in the field of artificial intelligence is to allow computers to interact with people through natural language. Therefore, research on virtual assistants and dialogue systems has received widespread attention from industry and academia. Response generation plays a crucial role in dialogue systems, so to push forward research on this topic, this paper surveys various methods for response generation. We sort these methods into three categories. The first includes finite state machine methods, framework methods, and instance methods. The second contains full-text indexing methods, ontology methods, vast knowledge base methods, and some other methods. The third covers retrieval methods and generative methods. We also discuss some hybrid methods based on knowledge and deep learning. We compare their advantages and disadvantages and point out ways in which these studies can be improved further. Our discussion covers some studies published in leading conferences such as IJCAI and AAAI in recent years.

Keywords: Retrieval, generative, deep learning, response generation, knowledge.

Downloads: 1117
8387 Design of Cooperative Processes of Innovation

Authors: Suzanne Yaganeh, Janni Nielsen, Leif Bloch Rasmussen

Abstract:

This paper invites dialogue and reflection on innovation and entrepreneurship by presenting concepts of innovation leading to the introduction of a complex theoretical framework: Cooperative Innovation (CO-IN). CO-IN is a didactic model that enhances and scaffolds processes of cooperation that create innovation, drawing on a Scandinavian tradition. CO-IN is based on a cross-sectorial and multidisciplinary approach. We introduce the concept of complementarity to help capture the validity of diversity and we suggest the concept of “the space in between” to understand the creation of identity as a collective mind. We see dialogue and the use of multimodal techniques as essential tools for conceptualization, making it possible to clarify the complexity and diversity and leading to decision making based on knowledge as commons. We introduce the didactic design and present our empirical findings from an innovation workshop in Argentina. In a final paragraph we reflect on the design as a support for the development of common ground, collective mind and collective action and the creation of knowledge as commons to facilitate innovation and entrepreneurship.

Keywords: CO-operative Innovation, didactic design, dialogue and ICT.

Downloads: 1666
8386 A Talking Head System for Korean Text

Authors: Sang-Wan Kim, Hoon Lee, Kyung-Ho Choi, Soon-Young Park

Abstract:

A talking head system (THS) is presented that animates the face of a speaking 3D avatar so that it realistically pronounces the given Korean text. The proposed system consists of a SAPI-compliant text-to-speech (TTS) engine and an MPEG-4-compliant face animation generator. The input to the THS is Unicode text that is to be spoken with synchronized lip shapes. The TTS engine generates a phoneme sequence with durations and audio data. The TTS applies the coarticulation rules to the phoneme sequence and sends a mouth animation sequence to the face modeler. By using the face animation generator, the proposed THS can produce more natural lip sync and facial expressions than systems that use conventional visemes only. The experimental results show that our system has great potential for implementing a talking head for Korean text.

Keywords: Talking head, Lip sync, TTS, MPEG4.

Downloads: 1437
8385 Users’ Preferences for Map Navigation Gestures

Authors: Y. Y. Pang, N. A. Ismail

Abstract:

A map is a powerful and convenient tool for navigating to different places, but the use of indirect devices often makes it cumbersome to use. This study proposes a new map navigation dialogue that uses hand gestures. A set of dialogues was developed from the users’ perspective to give users complete freedom in panning, zooming, rotating, tilting and direction-finding operations. A participatory design experiment was conducted in which one-hand and two-hand gesture dialogues were analysed to develop a set of usable dialogues. The major finding was that users prefer one-hand gestures to two-hand gestures in map navigation.

Keywords: Hand gesture, map navigation, participatory design, intuitive interaction.

Downloads: 1378
8384 Recognition by Online Modeling – a New Approach of Recognizing Voice Signals in Linear Time

Authors: Jyh-Da Wei, Hsin-Chen Tsai

Abstract:

This work presents a novel means of extracting fixed-length parameters from voice signals, such that words can be recognized in linear time. The power and the zero crossing rate are first calculated segment by segment from a voice signal; by doing so, two feature sequences are generated. We then construct an FIR system across these two sequences. The parameters of this FIR system, used as the input of a multilayer perceptron recognizer, can be derived by recursive LSE (least-square estimation), implying that the complexity of the overall process is linear in the signal size. In the second part of this work, we introduce a weighting factor λ to emphasize recent input; therefore, we can further recognize continuous speech signals. Experiments employ the voice signals of numbers, from zero to nine, spoken in Mandarin Chinese. The proposed method is verified to recognize voice signals efficiently and accurately.

Keywords: Speech Recognition, FIR system, Recursive LSE, Multilayer Perceptron
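
The first stage of the method, computing power and zero crossing rate segment by segment, can be sketched as below. This is a minimal illustration under stated assumptions: non-overlapping segments of a hypothetical fixed length, mean squared amplitude as the power measure, and a sign change between neighbouring samples as a zero crossing. The FIR fitting by recursive LSE is not shown.

```python
def segment_features(signal, segment_length):
    """Return (powers, zero_crossing_rates), one value per full segment.

    Each sample is visited once, so the cost is linear in len(signal).
    """
    if segment_length < 2:
        raise ValueError("segment_length must be at least 2")
    powers, zcrs = [], []
    for start in range(0, len(signal) - segment_length + 1, segment_length):
        seg = signal[start:start + segment_length]
        # Power: mean squared amplitude of the segment.
        powers.append(sum(x * x for x in seg) / segment_length)
        # Zero crossing rate: fraction of neighbouring pairs that change sign.
        crossings = sum(1 for a, b in zip(seg, seg[1:]) if (a >= 0) != (b >= 0))
        zcrs.append(crossings / (segment_length - 1))
    return powers, zcrs


# Toy usage: an alternating segment followed by a constant one.
powers, zcrs = segment_features([1, -1, 1, -1, 2, 2, 2, 2], 4)
```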

Downloads: 1368
8383 Inventing a Method of Problem Solving: The Natural Movement of the Mind to Solve a Problem

Authors: Amir Farkhonde

Abstract:

The major objective of this study was to devise a method for solving mathematical problems. Three concepts, namely the faculty of understanding, the faculty of guess, and the free mind or beginner's mind, provided the foundation for this method. An explanatory approach along with a hermeneutic process was taken in this study to support the assumption that mathematical knowledge is constantly developing and that it seems essential for students to solve math problems on their own using their faculty of understanding (interpretive dialogue) and faculty of guess. To do so, a kind of movement from the mathematical problem to mathematical knowledge should be adopted for teaching students a new math topic. The research method of this paper combines review, description and concept development. This paper first reviews the research findings on the NRICH project (NRICH is part of the family of activities in the Millennium Mathematics Project) with the aim that these findings form the theoretical basis of the problem-solving method. Then, the curriculum, the conceptual structure of the new method, how to design the problem and an example of it are discussed. In this way, students are immersed in the story of discovering and understanding the problem formula, and interpretive dialogue with the text continues by following the questions posed by the problem and constantly reconstructing the answer to find a formula or solution to solve the problem.

Keywords: Interpretive dialogue, NRICH, inventing, a method of problem solving.

Downloads: 173
8382 Echo State Networks for Arabic Phoneme Recognition

Authors: Nadia Hmad, Tony Allen

Abstract:

This paper presents an ESN-based Arabic phoneme recognition system trained with supervised, forced and combined supervised/forced supervised learning algorithms. Mel-Frequency Cepstrum Coefficients (MFCCs) and Linear Predictive Code (LPC) techniques are compared as input feature extraction techniques. The system is evaluated using 6 speakers from the King Abdulaziz Arabic Phonetics Database (KAPD) for the Saudi Arabian dialect and 34 speakers from the Center for Spoken Language Understanding (CSLU2002) database of speakers with different dialects from 12 Arabic countries. Results for the KAPD and CSLU2002 Arabic databases show phoneme recognition performances of 72.31% and 38.20% respectively.

Keywords: Arabic phonemes recognition, echo state networks (ESNs), neural networks (NNs), supervised learning.
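
For readers unfamiliar with echo state networks, the sketch below shows the basic reservoir update: a fixed random recurrent layer driven by the input, whose states would then feed a trained linear readout (not shown). It is not the paper's configuration. The sizes, seed and weight ranges are hypothetical, and instead of the usual spectral-radius scaling the recurrent weights are kept small enough that each row's absolute sum stays below 1, a conservative sufficient condition for fading memory.

```python
import math
import random


def make_reservoir(n_inputs, n_units, seed=0):
    """Create fixed random input and recurrent weight matrices."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
            for _ in range(n_units)]
    # Each recurrent row has absolute sum <= 0.9, so the update is contractive.
    scale = 0.9 / n_units
    w = [[rng.uniform(-scale, scale) for _ in range(n_units)]
         for _ in range(n_units)]
    return w_in, w


def run_reservoir(inputs, w_in, w):
    """Return the reservoir state after each input frame.

    Update rule: x(t) = tanh(W_in u(t) + W x(t-1)), starting from x = 0.
    """
    state = [0.0] * len(w)
    states = []
    for u in inputs:
        state = [
            math.tanh(sum(wi * ui for wi, ui in zip(w_in[k], u))
                      + sum(wk * s for wk, s in zip(w[k], state)))
            for k in range(len(w))
        ]
        states.append(state)
    return states


# Toy usage: three one-dimensional frames through a four-unit reservoir.
w_in, w = make_reservoir(n_inputs=1, n_units=4)
states = run_reservoir([[1.0], [0.5], [0.0]], w_in, w)
```

Only the readout is trained in an ESN, which is why training reduces to a regression problem on the collected states.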

Downloads: 2362
8381 Analysis of Combined Use of NN and MFCC for Speech Recognition

Authors: Safdar Tanweer, Abdul Mobin, Afshar Alam

Abstract:

This paper presents the performance and analysis of a speech recognition system. English words corresponding to the digits 0-9, spoken by two different speakers, are captured in a noise-free environment and recognized. For feature extraction, Mel frequency cepstral coefficients (MFCC) are used, which give a set of feature vectors from the recorded speech samples. A neural network model, specifically a feed-forward neural network trained with the back-propagation algorithm, is used to enhance recognition performance. Other speech recognition techniques, such as HMM and DTW, also exist. All experiments are carried out in Matlab.

Keywords: Speech Recognition, MFCC, Neural Network, classifier.

Downloads 3214
8380 Recognition of Noisy Words Using the Time Delay Neural Networks Approach

Authors: Khenfer-Koummich Fatima, Mesbahi Larbi, Hendel Fatiha

Abstract:

This paper presents a recognition system for isolated words such as robot commands. It is built on Time Delay Neural Networks (TDNN) and is intended for teleoperating a robot for specific tasks such as turn, close, etc. in an industrial environment, taking into account the noise coming from the machine. The choice of TDNN is based on its generalization in terms of accuracy; moreover, it acts as a filter that passes certain desirable frequency characteristics of speech. The goal is to determine the parameters of this filter so that the system adapts to the variability of the speech signal, and especially to noise; for this, the back-propagation technique was used in the learning phase. The approach was applied to commands pronounced in two languages separately: French and Arabic. The results for two test bases of 300 spoken words each are 87% and 97.6% in a neutral environment, and 77.67% and 92.67% when white Gaussian noise was added with an SNR of 35 dB.
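
As an illustration of the noise condition reported above, the following minimal sketch adds white Gaussian noise to a signal at a target SNR. It is not the authors' code: the function names, the synthetic 440 Hz tone standing in for a speech frame, and the 16 kHz sampling rate are all assumptions made for the example.

```python
import math
import random


def add_white_gaussian_noise(signal, snr_db, rng=None):
    """Return a copy of `signal` with white Gaussian noise added at `snr_db`."""
    rng = rng or random.Random()
    # Average power of the clean signal.
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR(dB) = 10 * log10(P_signal / P_noise), solved for P_noise.
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Estimate the SNR in dB from a clean signal and its noisy version."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


# Hypothetical stand-in for speech: one second of a 440 Hz tone at 16 kHz.
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noisy = add_white_gaussian_noise(clean, snr_db=35, rng=random.Random(0))
print(round(measured_snr_db(clean, noisy), 1))
```

The measured value should land close to the requested 35 dB; it differs slightly because the noise power is estimated from a finite number of samples.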

Keywords: Neural networks, Noise, Speech Recognition.

Downloads 1887
8379 Architecture of Speech-based Registration System

Authors: Mayank Kumar, D B Mahesh Kumar, Ashwin S Kumar, N K Srinath

Abstract:

In this era of technology, fueled by the pervasive usage of the internet, security is a prime concern. The number of new attacks by so-called "bots", which are automated programs, is increasing at an alarming rate. They are most likely to attack online registration systems. A technology called "CAPTCHA" (Completely Automated Public Turing test to tell Computers and Humans Apart) exists that can differentiate between automated programs and humans and prevent replay attacks. Traditionally, CAPTCHAs have been implemented with a challenge that involves recognizing textual images and reproducing them. We propose an approach in which the visual challenge has to be read out loud; randomly selected keywords from it are then used to verify the correctness of the spoken text and, in turn, to detect the presence of a human. This is supplemented with a speaker recognition system that can also identify the speaker. Thus, this framework fulfills both objectives: it can determine whether the user is a human and, if so, verify their identity.
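
The keyword-based check described above can be sketched roughly as follows. This is only an illustration under strong assumptions: the abstract does not give the selection rule or the matching procedure, so `pick_keywords` and `verify_spoken_text` are hypothetical helpers, and the sketch works on a text transcript (assumed to come from a speech recognizer) rather than performing keyword spotting on audio.

```python
import random


def pick_keywords(challenge_text, count, rng):
    """Randomly choose `count` distinct keywords from the challenge text."""
    # Assumption: very short words are skipped because they are easy to misrecognize.
    words = sorted({w.lower() for w in challenge_text.split() if len(w) > 3})
    return rng.sample(words, min(count, len(words)))


def verify_spoken_text(transcript, keywords):
    """Accept the response only if every selected keyword was spoken."""
    spoken = set(transcript.lower().split())
    return all(keyword in spoken for keyword in keywords)


challenge = "purple river seven garden"
keywords = pick_keywords(challenge, count=2, rng=random.Random(1))
print(verify_spoken_text("purple river seven garden", keywords))
print(verify_spoken_text("hello world", keywords))
```

Because the keywords are drawn at random for each challenge, a recorded response to an earlier challenge is unlikely to contain the right words, which is the property the abstract relies on against replay.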

Keywords: CAPTCHA, automatic speech recognition, keyword spotting.

Downloads 1494
8378 A Proposal on the Educational Transactional Analysis as a Dialogical Vision of Culture: Conceptual Signposts and Practical Tools for Educators

Authors: Marina Sartor Hoffer

Abstract:

The multicultural composition of today's societies poses new challenges to educational contexts. Schools are therefore called first to develop dialogic aptitudes and communicative skills adapted to the complex reality of post-modern societies. It is indispensable for educators and young people to acquire theoretical and practical tools during their schooling that help them know themselves and others, with the aim of recognizing the value of others regardless of their culture. Dialogic skills help to understand and manage individual differences, making it possible to solve problems and prevent conflicts. The Educational Sector of Eric Berne's Transactional Analysis offers a range of methods and techniques for this purpose. Educational Transactional Analysis is firmly anchored in Personalist Philosophy and deserves to be promoted as a theoretical frame suited to the challenges of contemporary education. The goal of this paper is therefore to outline some conceptual and methodological signposts for education to dialogue by drawing concepts and methodologies from educational transactional analysis.

Keywords: Dialogic process, education to dialogue, educational transactional analysis, personalism, the good of the relationship.

Downloads 850
8377 Automatic Lip Contour Tracking and Visual Character Recognition for Computerized Lip Reading

Authors: Harshit Mehrotra, Gaurav Agrawal, M.C. Srivastava

Abstract:

Computerized lip reading has been one of the most actively researched areas of computer vision in the recent past because of its crime-fighting potential and its invariance to the acoustic environment. However, several factors such as fast speech, bad pronunciation, poor illumination, movement of the face, moustaches and beards make lip reading difficult. In the present work, we propose a solution for automatic lip contour tracking and for recognizing letters of the English language spoken by speakers, using the information available from lip movements. A level set method is used for tracking the lip contour using a contour velocity model, and a feature vector of lip movements is then obtained. Character recognition is performed using a modified k-nearest-neighbor algorithm that assigns more weight to nearer neighbors. The proposed system has been found to have an accuracy of 73.3% for character recognition with speaker lip movements as the only input and without using any speech recognition system in parallel. The approach is found to serve the purpose of lip reading well when the size of the database is small.
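
A k-nearest-neighbor classifier that gives nearer neighbors more weight can be sketched as below. The abstract does not state the exact weighting used, so inverse-distance weighting is an assumption here, as are the function name and the toy two-dimensional feature vectors.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train, query, k=3, eps=1e-9):
    """Classify `query` from `train`, a list of (feature_vector, label) pairs."""
    # Keep the k training points closest to the query (Euclidean distance).
    nearest = sorted(
        ((math.dist(features, query), label) for features, label in train),
        key=lambda pair: pair[0],
    )[:k]
    # Each neighbor votes with weight 1 / distance (assumed weighting scheme);
    # eps avoids division by zero for an exact match.
    votes = defaultdict(float)
    for distance, label in nearest:
        votes[label] += 1.0 / (distance + eps)
    return max(votes, key=votes.get)


# Toy data: two 'A' samples near the query, three 'B' samples farther away.
train = [
    ((0.0, 0.0), "A"),
    ((0.1, 0.0), "A"),
    ((1.0, 1.0), "B"),
    ((1.1, 1.0), "B"),
    ((0.9, 1.0), "B"),
]
print(weighted_knn_predict(train, (0.05, 0.0), k=5))
```

With k = 5 an unweighted majority vote would pick "B" (three votes against two), whereas the distance weighting lets the two close "A" samples dominate.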

Keywords: Contour Velocity Model, Lip Contour Tracking, Lip Reading, Visual Character Recognition.

Downloads 2343
8376 Slovenian Text-to-Speech Synthesis for Speech User Interfaces

Authors: Jerneja Žganec Gros, Aleš Mihelič, Nikola Pavešić, Mario Žganec, Stanislav Gruden

Abstract:

The paper presents the design concept of a unit-selection text-to-speech synthesis system for the Slovenian language. Due to its modular and upgradable architecture, the system can be used in a variety of speech user interface applications, ranging from carrier-grade server voice portal applications and desktop user interfaces to specialized embedded devices. Since memory and processing power requirements are important factors for a possible implementation in embedded devices, lexica and speech corpora need to be reduced. We describe a simple and efficient implementation of a greedy subset selection algorithm that extracts a compact subset of high-coverage text sentences. The experiment on a reference text corpus showed that the subset selection algorithm produced a compact sentence subset with little redundancy. The adequacy of the spoken output was evaluated by several subjective tests, as recommended by the International Telecommunication Union (ITU).
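
The idea of greedy subset selection can be sketched as follows: repeatedly pick the sentence that adds the most not-yet-covered units. The abstract does not say which units the authors count, so character bigrams are used purely as an assumption, and `units` and `greedy_select` are hypothetical names.

```python
def units(sentence):
    """Coverage units of a sentence; character bigrams are an assumption."""
    text = sentence.lower()
    return {text[i:i + 2] for i in range(len(text) - 1)}


def greedy_select(sentences, max_sentences):
    """Greedily choose sentences that each add the most uncovered units."""
    covered = set()
    chosen = []
    remaining = list(sentences)
    while remaining and len(chosen) < max_sentences:
        # Sentence with the largest number of units not yet covered.
        best = max(remaining, key=lambda s: len(units(s) - covered))
        gain = units(best) - covered
        if not gain:
            break  # nothing left adds coverage, so the rest is redundant
        chosen.append(best)
        covered |= gain
        remaining.remove(best)
    return chosen, covered


corpus = ["abcd", "abc", "cdef", "xy"]
chosen, covered = greedy_select(corpus, max_sentences=10)
print(chosen, len(covered))
```

In this toy corpus the sentence "abc" is never selected, because every unit it contains is already covered by "abcd"; this is the kind of redundancy the selection is meant to remove.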

Keywords: text-to-speech synthesis, prosody modeling, speech user interface.

Downloads 1389
8375 Assamese Numeral Speech Recognition using Multiple Features and Cooperative LVQ-Architectures

Authors: Manash Pratim Sarma, Kandarpa Kumar Sarma

Abstract:

A set of Artificial Neural Network (ANN) based methods is presented for the design of an effective speech recognition system for numerals of the Assamese language captured under varied recording conditions and moods. The work formulates several ANN models configured to use Linear Predictive Code (LPC), Principal Component Analysis (PCA) and other features to tackle mood and gender variations in uttered numbers as part of an Automatic Speech Recognition (ASR) system in Assamese. The ANN models are designed using a combination of Self Organizing Map (SOM) and Multi Layer Perceptron (MLP), constituting a Learning Vector Quantization (LVQ) block trained in a cooperative environment to handle male and female speech samples of numerals of Assamese, a language spoken by a sizable population in the North-Eastern part of India. The work provides a comparative evaluation of several such combinations when handling speech samples with gender-based differences captured by a microphone in four different conditions, viz. noiseless, noise mixed, stressed and stress-free.

Keywords: Assamese, Recognition, LPC, Spectral, ANN.

Downloads 1941
8374 Productivity and Energy Management in Desert Urban

Authors: Masoud Nasri, Rahele Hekmatpanah

Abstract:

The growing world population has fundamental and often catastrophic impacts on natural habitats. The immethodical consumption of energy, the destruction of forests and the extinction of plant and animal species are consequences of this. Urban sustainability and sustainable urban development, which are much discussed these days, should be considered as a strategy, goal and policy that goes beyond environmental issues and protection alone. The desert's climate has created many problems for its residents. The very hot and dry summers of the Iranian desert areas, at a time when there was no access to modern energy sources and mechanical cooling systems, led Iranian architects to design a natural ventilation system in their buildings. The structure, like a tower rising above the roof, besides its ornamental role in giving the building a beautiful appearance, was used as a spontaneous ventilation system. This paper names the problems and inconveniences of the area, points out some answers to them, and introduces the BADGIR (wind-catcher) as an alternative solution, given that it has played a major role in dealing with these problems.

Keywords: Productivity, Sustainable development, hot arid zones, climate design, BADGIR (wind-catcher)

Downloads 1573