Search results for: speech user interface.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1844

1784 Assamese Numeral Corpus for Speech Recognition using Cooperative ANN Architecture

Authors: Mousmita Sarma, Krishna Dutta, Kandarpa Kumar Sarma

Abstract:

A speech corpus is one of the major components of a speech processing system, where one of the primary requirements is to recognize an input sample. The quality and detail captured in the speech corpus directly affect the precision of recognition. The current work proposes a platform for speech corpus generation using an adaptive LMS filter and LPC cepstrum, as part of an ANN-based speech recognition system designed exclusively to recognize isolated numerals of Assamese, a major language in the North Eastern part of India. The work focuses on designing an optimal feature extraction block and a few ANN-based cooperative architectures so that the performance of the speech recognition system can be improved.

Keywords: Filter, Feature, LMS, LPC, Cepstrum, ANN.

Downloads: 2351
1783 The Capacity of Mel Frequency Cepstral Coefficients for Speech Recognition

Authors: Fawaz S. Al-Anzi, Dia AbuZeina

Abstract:

Speech recognition makes an important contribution to promoting new technologies in human-computer interaction. Today, there is a growing need to employ speech technology in daily life and business activities. However, speech recognition is a challenging task that requires several stages before the desired output is obtained. Among the components of automatic speech recognition (ASR) is the feature extraction process, which parameterizes the speech signal to produce the corresponding feature vectors. Feature extraction aims at approximating the linguistic content conveyed by the input speech signal. In the speech processing field, there are several methods to extract speech features; however, Mel Frequency Cepstral Coefficients (MFCC) is the most popular technique. It has long been observed that MFCC is dominantly used in well-known recognizers such as the Carnegie Mellon University (CMU) Sphinx and the Hidden Markov Model Toolkit (HTK). Hence, this paper focuses on the MFCC method as the standard choice to identify the different speech segments in order to obtain the language phonemes for further training and decoding steps. Owing to its good performance, previous studies show that MFCC dominates Arabic ASR research. In this paper, we demonstrate MFCC as well as the intermediate steps that are performed to get these coefficients using the HTK toolkit.
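
As a rough illustration of one intermediate MFCC step, the sketch below converts between hertz and a commonly used form of the mel scale and spaces filter-bank centre frequencies evenly in mel. It is not taken from the paper; the function names, the filter count and the frequency range are assumptions chosen for the example.

```python
import math


def hz_to_mel(f_hz):
    """Map a frequency in Hz to mel using a common 2595*log10 form."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(num_filters, f_low, f_high):
    """Return centre frequencies (Hz) spaced uniformly on the mel scale.

    num_filters, f_low and f_high are illustrative parameters, not
    values reported in the abstract.
    """
    lo, hi = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (hi - lo) / (num_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(num_filters)]


# Example: 26 filters between 0 Hz and 8 kHz (assumed values).
centers = mel_center_frequencies(26, 0.0, 8000.0)
```

Because the spacing is uniform in mel, the centres are packed densely at low frequencies and spread out at high frequencies, which is the property the mel filter bank relies on.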

Keywords: Speech recognition, acoustic features, Mel Frequency Cepstral Coefficients.

Downloads: 1927
1782 Voice Features as the Diagnostic Marker of Autism

Authors: Elena Lyakso, Olga Frolova, Yuri Matveev

Abstract:

The aim of the study is to determine the acoustic features of voice and speech of children with autism spectrum disorders (ASD) as a possible additional diagnostic criterion. The participants in the study were 95 children with ASD aged 5-16 years, 150 typically developing (TD) children, and 103 adults who listened to the children’s speech samples. Three types of experimental methods for speech analysis were used: spectrographic, perceptual by listeners, and automatic recognition. In the speech of children with ASD, the pitch values, pitch range, and values of frequency and intensity of the third formant (emotional), which lead to the “atypical” spectrogram of vowels, are higher than the corresponding parameters in the speech of TD children. High values of the vowel articulation index (VAI) are specific to ASD children’s speech signals. These acoustic features can be considered a diagnostic marker of autism. The ability of human listeners and of automatic recognition to determine the psychoneurological state of children from their speech is assessed.

Keywords: Autism spectrum disorders, biomarker of autism, child speech, voice features.

Downloads: 556
1781 A Sparse Representation Speech Denoising Method Based on Adapted Stopping Residue Error

Authors: Qianhua He, Weili Zhou, Aiwu Chen

Abstract:

A sparse representation speech denoising method based on an adapted stopping residue error is presented in this paper. Firstly, the cross-correlation between the clean speech spectrum and the noise spectrum was analyzed, and an estimation method was proposed. In the denoising method, an over-complete dictionary of the clean speech power spectrum was learned with the K-singular value decomposition (K-SVD) algorithm. In the sparse representation stage, the stopping residue error was set adaptively according to the estimated cross-correlation and the adjusted noise spectrum, and the orthogonal matching pursuit (OMP) approach was applied to reconstruct the clean speech spectrum from the noisy speech. Finally, the clean speech was re-synthesised via the inverse Fourier transform with the reconstructed speech spectrum and the noisy speech phase. The experimental results show that the proposed method outperforms the conventional methods in terms of subjective and objective measures.

Keywords: Speech denoising, sparse representation, K-singular value decomposition, orthogonal matching pursuit.

Downloads: 976
1780 Eisenhower’s Farewell Speech: Initial and Continuing Communication Effects

Authors: B. Kuiper

Abstract:

When Dwight D. Eisenhower delivered his final Presidential speech in 1961, he was using the opportunity to bid farewell to America, but he was also trying to warn his fellow countrymen about deeper challenges threatening the country. In this analysis, Eisenhower’s speech is examined in light of the impact it had on American culture, communication concepts, and political ramifications. The paper initially highlights the previous literature on the speech, especially in light of its 50th anniversary, and reveals a man whose main concern was how the speech’s words would affect his beloved country. The painstaking approach to the wording of the speech to reveal the intent is key, particularly in light of analyzing the motivations according to “virtuous communication.” This philosophical construct indicates that Eisenhower’s Farewell Address was crafted carefully according to a departing President’s deepest values and concerns, concepts that he wanted to pass along to his successor, to his country, and even to the world.

Keywords: Eisenhower, mass communication, political speech, rhetoric.

Downloads: 1822
1779 The Study of Game Interface Improvement due to the Game Operation Dilemma of Player in the Side-Scrolling Shooting Game

Authors: Shih-Chieh Liao, Cheng-Yan Shuai

Abstract:

A side-scrolling shooting game is characterized by enemies that surround the player and barrages that fill the entire screen. Players run into trouble when they attempt complicated operations because of the physical and system limitations of the joystick. In this study, arcade stick prototypes were designed through a focus group and assessed by experts. We selected the most representative joystick prototype and built the control system for it. We conducted two experimental tests using time and bullet consumption as objective indicators, aiming to demonstrate its efficiency in the game. Finally, the L-1 prototype resolves the operation dilemma that players face in scrolling shooting games when using an arcade stick, and improves the function of the arcade stick.

Keywords: Joystick, user interface, side-scrolling shooting game, improved user experience.

Downloads: 139
1778 Online Collaborative Learning System Using Speech Technology

Authors: Sid-Ahmed. Selouani, Tang-Ho Lê, Chadia Moghrabi, Benoit Lanteigne, Jean Roy

Abstract:

A Web-based learning tool, the Learn IN Context (LINC) system, designed for and used in some institutions' courses in mixed-mode learning, is presented in this paper. This mode combines face-to-face and distance approaches to education. LINC can achieve both collaborative and competitive learning. In order to provide both learners and tutors with a more natural way to interact with e-learning applications, a conversational interface has been included in LINC. Hence, the components and essential features of LINC+, the voice-enhanced version of LINC, are described. We report evaluation experiments of LINC/LINC+ in a real use context of a computer programming course taught at the Université de Moncton (Canada). The findings show that when the learning material is delivered in the form of a collaborative and voice-enabled presentation, the majority of learners seem to be satisfied with this new medium, and confirm that it does not negatively affect their cognitive load.

Keywords: E-learning, Knowledge Network, Speech recognition, Speech synthesis.

Downloads: 1671
1777 Development of a Real-Time Brain-Computer Interface for Interactive Robot Therapy: An Exploration of EEG and EMG Features during Hypnosis

Authors: Maryam Alimardani, Kazuo Hiraki

Abstract:

This study presents a framework for the development of a new generation of therapy robots that can interact with users by monitoring their physiological and mental states. Here, we focused on one of the controversial methods of therapy, hypnotherapy. Hypnosis has been shown to be useful in the treatment of many clinical conditions. Even for healthy people, it can be used as an effective technique for relaxation or for enhancement of memory and concentration. Our aim is to develop a robot that collects information about the user’s mental and physical states using electroencephalogram (EEG) and electromyography (EMG) signals and performs cost-effective hypnosis in the comfort of the user’s house. The presented framework consists of three main steps: (1) find the EEG correlates of the mind state before, during, and after hypnosis and establish a cognitive model for state changes, (2) develop a system that can track the changes in EEG and EMG activities in real time and determine whether the user is ready for suggestion, and (3) implement our system in a humanoid robot that will talk and conduct hypnosis on users based on their mental states. This paper presents a pilot study regarding the first stage, detection of EEG and EMG features during hypnosis.

Keywords: Hypnosis, EEG, robotherapy, brain-computer interface.

Downloads: 1519
1776 Hybrid Modeling Algorithm for Continuous Tamil Speech Recognition

Authors: M. Kalamani, S. Valarmathy, M. Krishnamoorthi

Abstract:

In this paper, a hybrid modeling algorithm based on Fuzzy C-Means clustering with an Expectation Maximization-Gaussian Mixture Model is proposed for Continuous Tamil Speech Recognition. Speech sentences from various speakers are used for the training and testing phases, and objective measures are compared between the proposed and existing Continuous Speech Recognition algorithms. From the simulated results, it is observed that the proposed algorithm improves the recognition accuracy and F-measure by up to 3% compared with the existing algorithms for speech signals from various speakers. In addition, it reduces the Word Error Rate, Error Rate and Error by up to 4% compared with the existing algorithms. In all aspects, the proposed hybrid modeling for Tamil speech recognition provides significant improvements for speech-to-text conversion in various applications.
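
The Word Error Rate mentioned in this abstract is conventionally computed as the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The sketch below shows that conventional definition only; it is an assumption that the paper uses the same formula, and the function name is illustrative.

```python
def word_error_rate(reference, hypothesis):
    """Return (substitutions + deletions + insertions) / reference length.

    Uses a standard dynamic-programming edit distance over words.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the first i-1 reference
    # words and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # match or substitution
        prev = cur
    return prev[-1] / len(ref)
```

Note that the value can exceed 1.0 when the hypothesis contains many inserted words, which is why WER is an error measure rather than a true percentage of wrong words.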

Keywords: Speech Segmentation, Feature Extraction, Clustering, HMM, EM-GMM, CSR.

Downloads: 2100
1775 Neural Network Based Speech to Text in Malay Language

Authors: H. F. A. Abdul Ghani, R. R. Porle

Abstract:

Speech to text in Malay language is a system that converts Malay speech into text. Malay language recognition systems are still limited; thus, this paper aims to investigate recognition performance on ten Malay words obtained from online Malay news. The methodology consists of three stages: preprocessing, feature extraction, and speech classification. In the preprocessing stage, the speech samples are filtered using pre-emphasis. After that, feature extraction is applied to the samples using Mel Frequency Cepstrum Coefficients (MFCC). Lastly, speech classification is performed using a Feedforward Neural Network (FFNN). The accuracy of the classification is further investigated based on the hidden layer size. In the experiments, the classifier with 40 hidden neurons shows the highest classification rate, which is 94%.
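
The pre-emphasis step named in the preprocessing stage is usually a first-order high-pass filter, y[n] = x[n] - a·x[n-1]. The sketch below shows that generic filter; the coefficient 0.97 is a commonly used default and an assumption here, since the abstract does not state the value used.

```python
def pre_emphasis(samples, coeff=0.97):
    """Apply y[n] = x[n] - coeff * x[n-1] to a list of samples.

    The first sample is passed through unchanged because it has no
    predecessor. coeff=0.97 is an assumed, typical value.
    """
    if not samples:
        return []
    out = [samples[0]]
    for n in range(1, len(samples)):
        out.append(samples[n] - coeff * samples[n - 1])
    return out
```

A constant (low-frequency) input is strongly attenuated while rapid sample-to-sample changes pass through, which boosts the high-frequency part of the speech spectrum before feature extraction.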

Keywords: Feed-Forward Neural Network, FFNN, Malay speech recognition, Mel Frequency Cepstrum Coefficient, MFCC, speech-to-text.

Downloads: 678
1774 Usability Guidelines for Arab E-government Websites

Authors: Omyma Al Osaimi, Asma AlSumait

Abstract:

The website developer and designer should follow usability guidelines to provide a user-friendly interface. Many guidelines and heuristics have been developed by previous studies to help both the developer and designer in this task, but E-government websites are special cases that require specialized guidelines. This paper introduces a set of 18 guidelines for evaluating the usability of e-government websites in general and Arabic e-government websites specifically, along with a check list of how to apply them. The validity and effectiveness of these guidelines were evaluated against a variety of user characteristics. The results indicated that the proposed set of guidelines can be used to identify qualitative similarities and differences with user testing and that the new set is best suited for evaluating general and e-governmental usability.

Keywords: E-government, Human Computer Interaction, Usability Evaluation, Usability Guidelines.

Downloads: 2923
1773 On the Effectivity of Different Pseudo-Noise and Orthogonal Sequences for Speech Encryption from Correlation Properties

Authors: V. Anil Kumar, Abhijit Mitra, S. R. Mahadeva Prasanna

Abstract:

We analyze the effectivity of different pseudo noise (PN) and orthogonal sequences for encrypting speech signals in terms of perceptual intelligibility. A speech signal can be viewed as a sequence of correlated samples, and each sample as a sequence of bits. The residual intelligibility of the speech signal can be reduced by removing the correlation among the speech samples. PN sequences have random-like properties that help in reducing the correlation among speech samples. The mean square aperiodic auto-correlation (MSAAC) and the mean square aperiodic cross-correlation (MSACC) measures are used to test the randomness of the PN sequences. Results of the investigation show the effectivity of large Kasami sequences for this purpose among many PN sequences.

Keywords: Speech encryption, pseudo-noise codes, maximal length, Gold, Barker, Kasami, Walsh-Hadamard, autocorrelation, cross-correlation, figure of merit.
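
To make the correlation measures concrete, the sketch below computes the aperiodic autocorrelation of a bipolar (+1/-1) sequence and a simplified mean-square sidelobe figure, using the length-13 Barker code as the example. The normalization of the mean-square figure (two-sided sum of squared sidelobes divided by the squared peak) is an assumption made for illustration and may differ from the MSAAC definition used in the paper.

```python
def aperiodic_autocorrelation(seq):
    """Return r[lag] = sum_i seq[i] * seq[i + lag] for lag = 0 .. N-1."""
    n = len(seq)
    return [sum(seq[i] * seq[i + lag] for i in range(n - lag))
            for lag in range(n)]


def mean_square_sidelobe(seq):
    """Two-sided sum of squared sidelobes, normalized by the peak.

    Assumed, simplified form of a mean-square autocorrelation measure.
    For real sequences the negative lags mirror the positive ones,
    hence the factor of two.
    """
    r = aperiodic_autocorrelation(seq)
    peak = r[0]
    return 2.0 * sum((value / peak) ** 2 for value in r[1:])


# Length-13 Barker code in bipolar form.
BARKER_13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]

barker_acf = aperiodic_autocorrelation(BARKER_13)
barker_measure = mean_square_sidelobe(BARKER_13)
```

A low value indicates that shifted copies of the sequence are nearly uncorrelated with the original, which is the random-like behaviour the abstract relies on for reducing residual intelligibility.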

Downloads: 2009
1772 2-Dimensional Finger Gesture Based Mobile Robot Control Using Touch Screen

Authors: O. Ejale, N.B. Siddique, R. Seals

Abstract:

The purpose of this study was to present a reliable means of human-computer interfacing based on finger gestures made in two dimensions, which could be interpreted and adequately used in controlling a remote robot's movement. The gestures were captured and interpreted using an algorithm based on trigonometric functions, which calculates the angular displacement from one point of touch to another as the user's finger moves within a time interval, thereby allowing for pattern spotting of the captured gesture. In this paper, the design and implementation of such a gesture-based user interface is presented, utilizing the aforementioned algorithm. These techniques were then used to control a remote mobile robot's movement. A resistive touch screen was selected as the gesture sensor, and a programmed microcontroller was used to interpret the gestures.
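
One simple way to obtain an angular displacement between two touch points is the two-argument arctangent, followed by quantization into a small set of directions. The sketch below illustrates that idea only; the four direction labels, the 90-degree sectors and the convention that y increases upward are assumptions, not details from the paper.

```python
import math


def angular_displacement(p_from, p_to):
    """Angle in degrees, in [0, 360), of the move from p_from to p_to.

    Points are (x, y) tuples. The y axis is assumed to point upward;
    many touch screens report y downward, in which case dy would need
    to be negated.
    """
    dx = p_to[0] - p_from[0]
    dy = p_to[1] - p_from[1]
    return math.degrees(math.atan2(dy, dx)) % 360.0


def classify_direction(angle_deg):
    """Quantize an angle into one of four hypothetical direction labels."""
    labels = ["right", "up", "left", "down"]
    sector = int(((angle_deg + 45.0) % 360.0) // 90.0)
    return labels[sector]


def gesture_directions(points):
    """Classify each consecutive pair of sampled touch points."""
    return [classify_direction(angular_displacement(a, b))
            for a, b in zip(points, points[1:])]
```

For example, `gesture_directions([(0, 0), (5, 0), (5, 5)])` yields one label per segment of the stroke, and a sequence of such labels is the kind of pattern that could then be mapped to a robot command.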

Keywords: 2-Dimensional interface, finger gesture, mobile robot control, touch screen.

Downloads: 1887
1771 Styling Influence to the Loyalty for Knowledge Sharing on WikID

Authors: Regine W. Vroom, Bart Bleijerveld, Joost Schulze

Abstract:

WikID is a wiki for industrial design engineers. An important aspect for the viability of a wiki is the loyalty of the user community in sharing their information and knowledge by adding it to the wiki. For the initiators of a wiki it is therefore important to use every aspect to stimulate the user community to actively participate. In this study the focus is on the styling of the website. The central question is: How could the WikID website be visually designed to achieve a user experience which will incite the user to actively participate in the WikID community? After a literature study on the influencing factors of a website, a new interface has been designed by applying the rules found, in order to expand this website's active user community. An online questionnaire regarding the old or the new website gave insights into the opinions of users. As expected, the new website was rated more positively than the old website. However, the differences are limited.

Keywords: Industrial Design Engineering Knowledge, Wiki, Stimulate Knowledge Sharing, Influence of a wiki styling to the willingness of users to participate.

Downloads: 1909
1770 A Novel RLS Based Adaptive Filtering Method for Speech Enhancement

Authors: Pogula Rakesh, T. Kishore Kumar

Abstract:

Speech enhancement is a long-standing problem with numerous applications such as teleconferencing, VoIP, hearing aids and speech recognition. The motivation behind this research work is to obtain a clean speech signal of higher quality by applying the optimal noise cancellation technique. Real-time adaptive filtering algorithms seem to be the best candidates among all categories of speech enhancement methods. In this paper, we propose a speech enhancement method based on Recursive Least Squares (RLS) adaptive filtering of speech signals. Experiments were performed on noisy data which was prepared by adding AWGN, Babble and Pink noise to clean speech samples at -5dB, 0dB, 5dB and 10dB SNR levels. We then compare the noise cancellation performance of the proposed RLS algorithm with the existing NLMS algorithm in terms of Mean Squared Error (MSE), Signal to Noise ratio (SNR) and SNR Loss. Based on the performance evaluation, the proposed RLS algorithm was found to be the better noise cancellation technique for speech signals.
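
For readers unfamiliar with RLS, the sketch below is a textbook exponentially weighted RLS update in plain Python, demonstrated on a small system-identification problem. It is not the authors' implementation; the tap count, forgetting factor `lam`, initialization constant `delta` and the demo signal are all assumptions chosen for illustration.

```python
import random


def rls_filter(x, d, num_taps=4, lam=0.99, delta=1e-2):
    """Textbook RLS: adapt weights so that w . [x[n], x[n-1], ...] tracks d[n].

    Returns the final weights and the a priori error at each step.
    """
    w = [0.0] * num_taps
    # Inverse correlation matrix estimate, initialized to I / delta.
    P = [[(1.0 / delta if i == j else 0.0) for j in range(num_taps)]
         for i in range(num_taps)]
    u = [0.0] * num_taps  # most recent input samples, newest first
    errors = []
    for x_n, d_n in zip(x, d):
        u = [x_n] + u[:-1]
        Pu = [sum(P[i][j] * u[j] for j in range(num_taps))
              for i in range(num_taps)]
        denom = lam + sum(u[i] * Pu[i] for i in range(num_taps))
        k = [value / denom for value in Pu]            # gain vector
        e = d_n - sum(w[i] * u[i] for i in range(num_taps))
        w = [w[i] + k[i] * e for i in range(num_taps)]
        # P <- (P - k u^T P) / lam; P is symmetric, so u^T P equals Pu^T.
        P = [[(P[i][j] - k[i] * Pu[j]) / lam for j in range(num_taps)]
             for i in range(num_taps)]
        errors.append(e)
    return w, errors


# Demo with assumed data: recover a known 4-tap FIR response.
random.seed(0)
true_taps = [0.5, -0.3, 0.2, 0.1]
x = [random.gauss(0.0, 1.0) for _ in range(500)]
d = [sum(true_taps[k] * x[n - k] for k in range(4) if n - k >= 0)
     for n in range(500)]
weights, errors = rls_filter(x, d)
```

In an adaptive noise canceller, `x` would be a noise reference and `d` the noisy speech, and the error sequence would serve as the enhanced output; the demo uses system identification instead because the correct answer is known and easy to check.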

Keywords: Adaptive filter, Adaptive Noise Canceller, Mean Squared Error, Noise reduction, NLMS, RLS, SNR, SNR Loss.

Downloads: 3140
1769 A Modified Speech Enhancement Using Adaptive Gain Equalizer with Non linear Spectral Subtraction for Robust Speech Recognition

Authors: C. Ganesh Babu, P. T. Vanathi

Abstract:

In this paper we present an enhanced noise reduction method for robust speech recognition using an Adaptive Gain Equalizer with Non linear Spectral Subtraction. In the Adaptive Gain Equalizer (AGE) method, the input signal is divided into a number of subbands that are individually weighted in the time domain according to the short-time Signal-to-Noise Ratio (SNR) estimated in each subband at every time instant. The focus is on enhancing the speech rather than on suppressing the noise. When analysis was done under various noise conditions for speech recognition, it was found that the AGE algorithm has an obvious failing point at an SNR of -5 dB, with inadequate levels of noise suppression for SNRs below this point. This work proposes coupling AGE with Non linear Spectral Subtraction (AGE-NSS) for robust speech recognition. The experimental results show that our AGE-NSS outperforms AGE when the SNR drops below the -5 dB level.

Keywords: Adaptive Gain Equalizer, Non Linear Spectral Subtraction, Speech Enhancement, and Speech Recognition.

Downloads: 1671
1768 Evaluation of AR-4BL-MAST with Multiple Markers Interaction Technique for Augmented Reality Based Engineering Application

Authors: Waleed Maqableh, Ahmad Al-Hamad, Manjit Sidhu

Abstract:

Augmented reality (AR) is a modern technology that can provide many benefits in the field of education by aiding learning and improving the learning experience. This paper evaluates an AR-based application with a multiple-marker interaction technique (touch-to-print), which is designed for analyzing the kinematics of the 4BL mechanism in mechanical engineering. The application is termed AR-4BL-MAST, and it allows users to touch symbols on paper as a natural way of interaction. The application was evaluated with mechanical engineering students and human–computer interaction (HCI) experts to test its effectiveness as a tangible user interface application. The statistical results show its ability as an interaction technique, and it gives users more freedom in interacting with the virtual mechanical objects.

Keywords: Augmented reality, engineering, four-bar linkage, Multimedia, user interface, visualization.

Downloads: 1390
1767 A Model-Driven Approach of User Interface for MVP Rich Internet Application

Authors: Sarra Roubi, Mohammed Erramdani, Samir Mbarki

Abstract:

This paper presents an approach for the model-driven generation of Rich Internet Applications (RIA), focusing on the graphical aspect. We used well-known Model-Driven Engineering (MDE) frameworks and technologies, such as Eclipse Modeling Framework (EMF), Graphical Modeling Framework (GMF), Query View Transformation (QVTo) and Acceleo, to enable the design and automatic code generation of the RIA. During the development of the approach, we focused on the graphical aspect of the application in terms of interfaces, while opting for the Model View Presenter pattern, which is designed for graphical interfaces. The paper describes the process followed to define the approach and the supporting tool, and presents the results from a case study.

Keywords: Code generation, Design Pattern, metamodel, Model Driven Engineering, MVP, Rich Internet Application, transformation, User Interface.

Downloads: 1711
1766 Hand Gesture Interpretation Using Sensing Glove Integrated with Machine Learning Algorithms

Authors: Aqsa Ali, Aleem Mushtaq, Attaullah Memon, Monna

Abstract:

In this paper, we present a low cost design for a smart glove that can perform sign language recognition to assist the speech impaired people. Specifically, we have designed and developed an Assistive Hand Gesture Interpreter that recognizes hand movements relevant to the American Sign Language (ASL) and translates them into text for display on a Thin-Film-Transistor Liquid Crystal Display (TFT LCD) screen as well as synthetic speech. Linear Bayes Classifiers and Multilayer Neural Networks have been used to classify 11 feature vectors obtained from the sensors on the glove into one of the 27 ASL alphabets and a predefined gesture for space. Three types of features are used; bending using six bend sensors, orientation in three dimensions using accelerometers and contacts at vital points using contact sensors. To gauge the performance of the presented design, the training database was prepared using five volunteers. The accuracy of the current version on the prepared dataset was found to be up to 99.3% for target user. The solution combines electronics, e-textile technology, sensor technology, embedded system and machine learning techniques to build a low cost wearable glove that is scrupulous, elegant and portable.

Keywords: American sign language, assistive hand gesture interpreter, human-machine interface, machine learning, sensing glove.

Downloads: 2683
1765 Speech Acts and Politeness Strategies in an EFL Classroom in Georgia

Authors: Tinatin Kurdghelashvili

Abstract:

The paper deals with the usage of speech acts and politeness strategies in an EFL classroom in Georgia (Rep of). It explores the students’ and the teachers’ practice of the politeness strategies and the speech acts of apology, thanking, request, compliment / encouragement, command, agreeing / disagreeing, addressing and code switching. The research method includes observation as well as a questionnaire. The target group involves the students from Georgian public schools and two certified, experienced local English teachers. The analysis is based on Searle’s Speech Act Theory and Brown and Levinson’s politeness strategies. The findings show that the students have certain knowledge regarding politeness yet they fail to apply them in English communication. In addition, most of the speech acts from the classroom interaction are used by the teachers and not the students. Thereby, it is suggested that teachers should cultivate the students’ communicative competence and attempt to give them opportunities to practise more English speech acts than they do today.

Keywords: English as a foreign language, Georgia, politeness principles, speech acts.

Downloads: 6130
1764 Voice Driven Applications in Non-stationary and Chaotic Environment

Authors: C. Kwan, X. Li, D. Lao, Y. Deng, Z. Ren, B. Raj, R. Singh, R. Stern

Abstract:

Automated operations based on voice commands will become more and more important in many applications, including robotics, maintenance operations, etc. However, voice command recognition rates drop quite a lot under non-stationary and chaotic noise environments. In this paper, we tried to significantly improve the speech recognition rates under non-stationary noise environments. First, 298 Navy acronyms have been selected for automatic speech recognition. Data sets were collected under 4 types of noisy environments: factory, buccaneer jet, babble noise in a canteen, and destroyer. Within each noisy environment, 4 levels (5 dB, 15 dB, 25 dB, and clean) of Signal-to-Noise Ratio (SNR) were introduced to corrupt the speech. Second, a new algorithm to estimate speech or no speech regions has been developed, implemented, and evaluated. Third, extensive simulations were carried out. It was found that the combination of the new algorithm, the proper selection of language model and a customized training of the speech recognizer based on clean speech yielded very high recognition rates, which are between 80% and 90% for the four different noisy conditions. Fourth, extensive comparative studies have also been carried out.
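
Corrupting speech at a chosen Signal-to-Noise Ratio, as described for the 5 dB, 15 dB and 25 dB conditions, is commonly done by scaling the noise so that the ratio of speech power to noise power matches the target. The sketch below shows that scaling with synthetic stand-in signals; the sine-wave "speech", the Gaussian noise and all function names are assumptions for illustration and do not reproduce the paper's data.

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a list of samples."""
    return sum(v * v for v in signal) / len(signal)


def mix_at_snr(speech, noise, target_snr_db):
    """Scale noise to the target SNR and add it to the speech.

    Returns the noisy signal and the gain applied to the noise.
    """
    target_noise_power = power(speech) / (10.0 ** (target_snr_db / 10.0))
    gain = math.sqrt(target_noise_power / power(noise))
    noisy = [s + gain * n for s, n in zip(speech, noise)]
    return noisy, gain


def measure_snr_db(signal, noise):
    """SNR in dB between a clean signal and a noise component."""
    return 10.0 * math.log10(power(signal) / power(noise))


# Stand-in signals (assumed): a 200 Hz tone at 8 kHz and Gaussian noise.
random.seed(1)
speech = [math.sin(2.0 * math.pi * 200.0 * n / 8000.0) for n in range(800)]
noise = [random.gauss(0.0, 1.0) for _ in range(800)]

achieved = {}
for target in (5, 15, 25):
    noisy, gain = mix_at_snr(speech, noise, target)
    achieved[target] = measure_snr_db(speech, [gain * v for v in noise])
```

The same routine applies to recorded noise such as factory or babble recordings, provided the noise segment is at least as long as the speech segment.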

Keywords: Non-stationary, speech recognition, voice commands.

Downloads: 1498
1763 A Semi- One Time Pad Using Blind Source Separation for Speech Encryption

Authors: Long Jye Sheu, Horng-Shing Chiou, Wei Ching Chen

Abstract:

We propose a new perspective on speech communication using blind source separation. The original speech is mixed with key signals which consist of the mixing matrix, chaotic signals and a random noise. However, parts of the keys (the mixing matrix and the random noise) are not necessary in decryption. In a practical implementation, one can encrypt the speech by changing the noise signal every time. Hence, the present scheme obtains the advantages of a One Time Pad encryption while avoiding its drawbacks in key exchange. It is demonstrated that the proposed scheme is immune against traditional attacks.

Keywords: one time pad, blind source separation, independent component analysis, speech encryption.

Downloads: 1531
1762 WAF: an Interface Web Agent Framework

Authors: Xizhi Li, Qinming He

Abstract:

A trend in agent community or enterprises is that they are shifting from closed to open architectures composed of a large number of autonomous agents. One of its implications could be that interface agent framework is getting more important in multi-agent system (MAS); so that systems constructed for different application domains could share a common understanding in human computer interface (HCI) methods, as well as human-agent and agent-agent interfaces. However, interface agent framework usually receives less attention than other aspects of MAS. In this paper, we will propose an interface web agent framework which is based on our former project called WAF and a Distributed HCI template. A group of new functionalities and implications will be discussed, such as web agent presentation, off-line agent reference, reconfigurable activation map of agents, etc. Their enabling techniques and current standards (e.g. existing ontological framework) are also suggested and shown by examples from our own implementation in WAF.

Keywords: HCI, Interface agent, MAS.

Downloads: 1606
1761 Advances in Artificial Intelligence Using Speech Recognition

Authors: Khaled M. Alhawiti

Abstract:

This research study aims to present a retrospective study about speech recognition systems and artificial intelligence. Speech recognition has become one of the widely used technologies, as it offers great opportunity to interact and communicate with automated machines. Precisely, it can be affirmed that speech recognition facilitates its users and helps them to perform their daily routine tasks, in a more convenient and effective manner. This research intends to present the illustration of recent technological advancements, which are associated with artificial intelligence. Recent researches have revealed the fact that speech recognition is found to be the utmost issue, which affects the decoding of speech. In order to overcome these issues, different statistical models were developed by the researchers. Some of the most prominent statistical models include acoustic model (AM), language model (LM), lexicon model, and hidden Markov models (HMM). The research will help in understanding all of these statistical models of speech recognition. Researchers have also formulated different decoding methods, which are being utilized for realistic decoding tasks and constrained artificial languages. These decoding methods include pattern recognition, acoustic phonetic, and artificial intelligence. It has been recognized that artificial intelligence is the most efficient and reliable methods, which are being used in speech recognition.

Keywords: Speech recognition, acoustic phonetic, artificial intelligence, Hidden Markov Models (HMM), statistical models of speech recognition, human machine performance.

Downloads: 7916
1760 Platform-as-a-Service Sticky Policies for Privacy Classification in the Cloud

Authors: Maha Shamseddine, Amjad Nusayr, Wassim Itani

Abstract:

In this paper, we present a Platform-as-a-Service (PaaS) model for controlling the privacy enforcement mechanisms applied on user data when stored and processed in Cloud data centers. The proposed architecture consists of establishing user configurable ‘sticky’ policies on the Graphical User Interface (GUI) data-bound components during the application development phase to specify the details of privacy enforcement on the contents of these components. Various privacy classification classes on the data components are formally defined to give the user full control on the degree and scope of privacy enforcement including the type of execution containers to process the data in the Cloud. This not only enhances the privacy-awareness of the developed Cloud services, but also results in major savings in performance and energy efficiency due to the fact that the privacy mechanisms are solely applied on sensitive data units and not on all the user content. The proposed design is implemented in a real PaaS cloud computing environment on the Microsoft Azure platform.

Keywords: Privacy enforcement, Platform-as-a-Service privacy awareness, cloud computing privacy.

Downloads 709
1759 Speech Enhancement Using Wavelet Coefficients Masking with Local Binary Patterns

Authors: Christian Arcos, Marley Vellasco, Abraham Alcaim

Abstract:

In this paper, we present an approach based on wavelet coefficient masking with Local Binary Patterns (WLBP) that enhances the temporal spectra of the wavelet coefficients for speech enhancement. The technique exploits the wavelet denoising scheme, which splits the degraded speech into pyramidal subband components and extracts frequency information without losing temporal information. Speech enhancement in each high-frequency subband is performed with binary labels produced by local binary pattern masking, which encodes the ratio between the original value of each coefficient and the values of its neighbouring coefficients. This approach enhances the high-frequency spectra of the wavelet transform instead of eliminating them through a threshold. A comparative analysis with conventional speech enhancement algorithms demonstrates that the proposed technique achieves significant improvements in terms of PESQ, an internationally recommended objective measure for estimating subjective speech quality. Informal listening tests also show that the proposed method improves the quality of speech in an acoustic context and avoids the annoying musical noise present in other speech enhancement techniques. Experimental results obtained with a DNN-based speech recognizer in noisy environments corroborate the superiority of the proposed scheme in the robust speech recognition scenario.

Keywords: Binary labels, local binary patterns, mask, wavelet coefficients, speech enhancement, speech recognition.
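
To make the idea of binary labels concrete, the following is a minimal sketch of a generic one-dimensional local binary pattern applied to a list of subband coefficients. It is not the authors' WLBP method: the abstract describes an encoding based on ratios, whereas this sketch uses the conventional LBP comparison against the centre value. The function name `lbp_1d` and the `radius` parameter are hypothetical.

```python
from typing import List, Sequence


def lbp_1d(coeffs: Sequence[float], radius: int = 1) -> List[int]:
    """Generic 1-D local binary pattern (illustrative, not the paper's WLBP).

    For each coefficient that has `radius` neighbours on both sides,
    every neighbour contributes one bit: 1 if the neighbour is greater
    than or equal to the centre coefficient, otherwise 0.
    """
    codes = []
    for i in range(radius, len(coeffs) - radius):
        center = coeffs[i]
        # Neighbours ordered from left to right, centre excluded.
        neighbours = [coeffs[i + k] for k in range(-radius, radius + 1) if k != 0]
        code = 0
        for bit, value in enumerate(neighbours):
            if value >= center:
                code |= 1 << bit
        codes.append(code)
    return codes


if __name__ == "__main__":
    # Hypothetical subband coefficients, for illustration only.
    print(lbp_1d([1.0, 2.0, 3.0, 2.0, 1.0], radius=1))
```

In a masking scheme, codes of this kind could be used to decide which coefficients to keep or emphasise; how the labels are turned into a mask is specific to the paper and is not reproduced here.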

Downloads 971
1758 Investigation of Combined use of MFCC and LPC Features in Speech Recognition Systems

Authors: K. R. Aida-Zade, C. Ardil, S. S. Rustamov

Abstract:

The paper states the automatic speech recognition problem, the purpose of speech recognition, and its fields of application. For Azerbaijani speech, the principles for building a speech recognition system and the problems arising in such a system are investigated. The algorithms for computing speech features, the main part of a speech recognition system, are analyzed. To this end, algorithms for determining Mel Frequency Cepstral Coefficients (MFCC) and Linear Predictive Coding (LPC) coefficients, which express the basic speech features, are developed. Combined use of MFCC and LPC cepstral features is suggested to improve the reliability of the recognition system. The system is therefore divided into MFCC-based and LPC-based recognition subsystems. Training and recognition are carried out in each subsystem separately, and the system accepts a decision only when both subsystems give the same result. This decreases the error rate during recognition. Training and recognition are performed by artificial neural networks, which are trained by the conjugate gradient method. The paper investigates the problems that the number of speech features causes when training the neural networks of the MFCC-based and LPC-based subsystems. It also analyzes the variation in the results of neural networks trained from different initial points. A methodology for the combined use of neural networks trained from different initial points is suggested to improve the reliability of the recognition system and increase recognition quality, and the practical results obtained are shown.

Keywords: Speech recognition, cepstral analysis, Voice activation detection algorithm, Mel Frequency Cepstral Coefficients, features of speech, Cepstral Mean Subtraction, neural networks, Linear Predictive Coding.
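
The combined decision described in the abstract above can be sketched as a simple agreement rule between two classifiers. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the score-vector inputs, and the choice to return `None` (reject) when the subsystems disagree are all hypothetical.

```python
from typing import Optional, Sequence


def best_class(scores: Sequence[float]) -> int:
    """Index of the highest score, i.e. the class a subsystem votes for."""
    return max(range(len(scores)), key=lambda i: scores[i])


def combined_decision(
    mfcc_scores: Sequence[float], lpc_scores: Sequence[float]
) -> Optional[int]:
    """Accept a class only when both subsystems vote for the same class.

    `mfcc_scores` and `lpc_scores` stand in for the output layers of the
    MFCC-based and LPC-based networks (hypothetical inputs). Returning
    None on disagreement is an assumption made for this sketch.
    """
    mfcc_label = best_class(mfcc_scores)
    lpc_label = best_class(lpc_scores)
    return mfcc_label if mfcc_label == lpc_label else None


if __name__ == "__main__":
    # Made-up score vectors for a three-class example.
    print(combined_decision([0.1, 0.8, 0.1], [0.2, 0.7, 0.1]))
    print(combined_decision([0.1, 0.8, 0.1], [0.6, 0.3, 0.1]))
```

Rejecting on disagreement trades coverage for reliability: fewer inputs receive a label, but a labelled input has been confirmed by two independently trained feature pipelines.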

Downloads 869
1757 Computer Software for Calculating Electron Mobility of Semiconductors Compounds; Case Study for N-Gan

Authors: Emad A. Ahmed

Abstract:

Computer software has been developed to calculate electron mobility with respect to different scattering mechanisms. The software is fully based on a Graphical User Interface (GUI), and its interface was designed in Microsoft Visual Basic 6.0. As a case study, the electron mobility of n-GaN was calculated using this software. The behavior of the mobility of n-GaN due to elastic scattering processes and its relation to temperature and doping concentration are discussed. The results agree with other available theoretical and experimental data.

Keywords: Electron mobility, relaxation time, GaN, Scattering, Computer software, computation physics.

Downloads 3827
1756 Ontology and CDSS Based Intelligent Health Data Management in Health Care Server

Authors: Eun-Jung Ko, Hyung-Jik Lee, Jeun-Woo Lee

Abstract:

In a ubiquitous healthcare environment, a user's health data are transferred to a remote healthcare server by the user's wearable system or mobile phone. The collected health data should be managed and analyzed on the healthcare server so that the caregiver or the user can monitor the user's physiological state. In this paper, we design and develop an intelligent healthcare server that manages the user's health data using a CDSS and an ontology. Our system can analyze the user's health data semantically using the CDSS and the ontology, and report the results of analyzing the user's raw physiological data to the user and the caregiver.

Keywords: u-healthcare, CDSS, healthcare server, health data, ontology.

Downloads 2200
1755 A Study on Cancer-Cell Invasion Based On the Diffuse Interface Model

Authors: Zhang Linan, Jihwan Song, Dongchoul Kim

Abstract:

In this study, a three-dimensional haptotaxis model is proposed to simulate the migration of a population of cancer cells. The invasion of cancer cells is related to the hapto-attractant and to the effect of the interface energies between the cells and the extracellular matrix (ECM). The diffuse interface model, which incorporates the haptotaxis mechanism and the interface energies, is employed. A semi-implicit Fourier spectral scheme is adopted for efficient evaluation of the simulation. The simulation results reveal the dynamics of cancer-cell migration.

Keywords: Haptotaxis, Cancer Cells, Cell Migration, Interface Energy, Diffuse Interface Model

Downloads 1387