Search results for: voice commands.

138 Recognition of Noisy Words Using the Time Delay Neural Networks Approach

Authors: Khenfer-Koummich Fatima, Mesbahi Larbi, Hendel Fatiha

Abstract:

This paper presents a recognition system for isolated words like robot commands. It’s carried out by Time Delay Neural Networks; TDNN. To teleoperate a robot for specific tasks as turn, close, etc… In industrial environment and taking into account the noise coming from the machine. The choice of TDNN is based on its generalization in terms of accuracy, in more it acts as a filter that allows the passage of certain desirable frequency characteristics of speech; the goal is to determine the parameters of this filter for making an adaptable system to the variability of speech signal and to noise especially, for this the back propagation technique was used in learning phase. The approach was applied on commands pronounced in two languages separately: The French and Arabic. The results for two test bases of 300 spoken words for each one are 87%, 97.6% in neutral environment and 77.67%, 92.67% when the white Gaussian noisy was added with a SNR of 35 dB.

Keywords: Neural networks, Noise, Speech Recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1915

137 A Simple Adaptive Atomic Decomposition Voice Activity Detector Implemented by Matching Pursuit

Authors: Thomas Bryan, Veton Kepuska, Ivica Kostanic

Abstract:

A simple adaptive voice activity detector (VAD) is implemented using Gabor and gammatone atomic decomposition of speech for high Gaussian noise environments. Matching pursuit is used for atomic decomposition, and is shown to achieve optimal speech detection capability at high data compression rates for low signal to noise ratios. The most active dictionary elements found by matching pursuit are used for the signal reconstruction so that the algorithm adapts to the individual speakers dominant time-frequency characteristics. Speech has a high peak to average ratio enabling matching pursuit greedy heuristic of highest inner products to isolate high energy speech components in high noise environments. Gabor and gammatone atoms are both investigated with identical logarithmically spaced center frequencies, and similar bandwidths. The algorithm performs equally well for both Gabor and gammatone atoms with no significant statistical differences. The algorithm achieves 70% accuracy at a 0 dB SNR, 90% accuracy at a 5 dB SNR and 98% accuracy at a 20dB SNR using 30d B SNR as a reference for voice activity.

Keywords: Atomic Decomposition, Gabor, Gammatone, Matching Pursuit, Voice Activity Detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1769

136 The Methodology of Flip Chip Using Astro Place and Route Tool

Authors: Rohaya Abdul Wahab, Raja Mohd Fuad Tengku Aziz, Nazaliza Othman, Sharifah Saleh, Nabihah Razali, Rozaimah Baharim, Md Hanif Md Nasir

Abstract:

This paper will discuss flip chip methodology, in which I/O pads, standard cells, macros and bump cells array are placed in the floorplan, then routed using Astro place and route tool. Final DRC and LVS checking is done using Calibre verification tool. The design vehicle to run this methodology is an OpenRISC design targeted to Silterra 0.18 micrometer technology with 6 metal layers for routing. Astro has extensive support for flip chip placement and routing. Astro tool commands for flip chip are straightforward approach like the conventional standard wire bond packaging. However since we do not have flip chip commands in our Astro tool, no LEF file for bump cell and no LEF file for flip chip I/O pad, we create our own methodology to prepare for future flip chip tapeout.

Keywords: Astro, bump cell, Calibre, flip chip, LEF, methodology, SCHEME, TCL.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2730

135 Voice Features as the Diagnostic Marker of Autism

Authors: Elena Lyakso, Olga Frolova, Yuri Matveev

Abstract:

The aim of the study is to determine the acoustic features of voice and speech of children with autism spectrum disorders (ASD) as a possible additional diagnostic criterion. The participants in the study were 95 children with ASD aged 5-16 years, 150 typically development (TD) children, and 103 adults – listening to children’s speech samples. Three types of experimental methods for speech analysis were performed: spectrographic, perceptual by listeners, and automatic recognition. In the speech of children with ASD, the pitch values, pitch range, values of frequency and intensity of the third formant (emotional) leading to the “atypical” spectrogram of vowels are higher than corresponding parameters in the speech of TD children. High values of vowel articulation index (VAI) are specific for ASD children’s speech signals. These acoustic features can be considered as diagnostic marker of autism. The ability of humans and automatic recognition of the psychoneurological state of children via their speech is determined.

Keywords: Autism spectrum disorders, biomarker of autism, child speech, voice features.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 578

134 The Performance of an 802.11g/Wi-Fi Network Whilst Streaming Voice Content

Authors: P. O. Umenne, Odhiambo Marcel O.

Abstract:

A simple network model is developed in OPNET to study the performance of the Wi-Fi protocol. The model is simulated in OPNET and performance factors such as load, throughput and delay are analysed from the model. Four applications such as oracle, http, ftp and voice are applied over the Wireless LAN network to determine the throughput. The voice application utilises a considerable amount of bandwidth of up to 5Mbps, as a result the 802.11g standard of the Wi-Fi protocol was chosen which can support a data rate of up to 54Mbps. Results indicate that when the load in the Wi-Fi network is increased the queuing delay on the point-to-point links in the Wi-Fi network significantly reduces until it is comparable to that of WiMAX. In conclusion, the queuing delay of the Wi-Fi protocol for the network model simulated was about 0.00001secs comparable to WiMAX network values.

Keywords: WLAN-Wireless Local Area Network, MIMO-Multiple Input Multiple Output, Queuing delay, Throughput, AP-Access Point, IP-Internet protocol, TOS-Type of Service.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2104

133 Hand Gesture Recognition: Sign to Voice System (S2V)

Authors: Oi Mean Foong, Tan Jung Low, Satrio Wibowo

Abstract:

Hand gesture is one of the typical methods used in sign language for non-verbal communication. It is most commonly used by people who have hearing or speech problems to communicate among themselves or with normal people. Various sign language systems have been developed by manufacturers around the globe but they are neither flexible nor cost-effective for the end users. This paper presents a system prototype that is able to automatically recognize sign language to help normal people to communicate more effectively with the hearing or speech impaired people. The Sign to Voice system prototype, S2V, was developed using Feed Forward Neural Network for two-sequence signs detection. Different sets of universal hand gestures were captured from video camera and utilized to train the neural network for classification purpose. The experimental results have shown that neural network has achieved satisfactory result for sign-to-voice translation.

Keywords: Hand gesture detection, neural network, signlanguage, sequence detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1821

132 Bandwidth Estimation Algorithms for the Dynamic Adaptation of Voice Codec

Authors: Davide Pierattoni, Ivan Macor, Pier Luca Montessoro

Abstract:

In the recent years multimedia traffic and in particular VoIP services are growing dramatically. We present a new algorithm to control the resource utilization and to optimize the voice codec selection during SIP call setup on behalf of the traffic condition estimated on the network path. The most suitable methodologies and the tools that perform realtime evaluation of the available bandwidth on a network path have been integrated with our proposed algorithm: this selects the best codec for a VoIP call in function of the instantaneous available bandwidth on the path. The algorithm does not require any explicit feedback from the network, and this makes it easily deployable over the Internet. We have also performed intensive tests on real network scenarios with a software prototype, verifying the algorithm efficiency with different network topologies and traffic patterns between two SIP PBXs. The promising results obtained during the experimental validation of the algorithm are now the basis for the extension towards a larger set of multimedia services and the integration of our methodology with existing PBX appliances.

Keywords: Integrated voice-data communication, computernetwork performance, resource optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1666

131 Implementation of the SIP Express Router with Mediaproxy Method on VoIP

Authors: Heru Nurwarsito, R. Arief Setyawan, Rakhmadhany Primananda

Abstract:

Voice Over IP (VoIP) is a technology that could pass the voice traffic and data packet form over an IP network. Network can be used for intranet or Internet. Phone calls using VoIP has advantages in terms of cheaper cost of PSTN phone to more than half, because the cost is calculated by the cost of the global nature of the Internet. Session Initiation Protocol (SIP) is a signaling protocol at the application layer which serves to establish, modify, and terminate a multimedia session involving one or more users. This SIP signaling has SIP message in text form that is used for session management by the SIP components, such as User Agent, Registrar, Redirect Server, and Proxy Server. To build a SIP communication is required SIP Express Router (SER) to be able to receive SIP messages, for handling the basic functions of SIP messages. Problems occur when the NAT through which affects the voice communication will be blocked starting from the sound that is not sent or one side of the sound are sent (half duplex). How that could be used to penetrate NAT is to use a given mediaproxy random RTP port to penetrate NAT.

Keywords: VoIP, SIP, SIP Express Router, NAT, Mediaproxy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2535

130 Speech Coding and Recognition

Authors: M. Satya Sai Ram, P. Siddaiah, M. Madhavi Latha

Abstract:

This paper investigates the performance of a speech recognizer in an interactive voice response system for various coded speech signals, coded by using a vector quantization technique namely Multi Switched Split Vector Quantization Technique. The process of recognizing the coded output can be used in Voice banking application. The recognition technique used for the recognition of the coded speech signals is the Hidden Markov Model technique. The spectral distortion performance, computational complexity, and memory requirements of Multi Switched Split Vector Quantization Technique and the performance of the speech recognizer at various bit rates have been computed. From results it is found that the speech recognizer is showing better performance at 24 bits/frame and it is found that the percentage of recognition is being varied from 100% to 93.33% for various bit rates.

Keywords: Linear predictive coding, Speech Recognition, Voice banking, Multi Switched Split Vector Quantization, Hidden Markov Model, Linear Predictive Coefficients.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1808

129 Automation System for Optimization of Electrical and Thermal Energy Production in Cogenerative Gas Power Plants

Authors: Ion Miciu

Abstract:

The system is made with main distributed components: First Level: Industrial Computers placed in Control Room (monitors thermal and electrical processes based on the data provided by the second level); Second Level: PLCs which collects data from process and transmits information on the first level; also takes commands from this level which are further, passed to execution elements from third level; Third Level: field elements consisting in 3 categories: data collecting elements; data transfer elements from the third level to the second; execution elements which take commands from the second level PLCs and executes them after which transmits the confirmation of execution to them. The purpose of the automatic functioning is the optimization of the co-generative electrical energy commissioning in the national energy system and the commissioning of thermal energy to the consumers. The integrated system treats the functioning of all the equipments and devices as a whole: Gas Turbine Units (GTU); MT 20kV Medium Voltage Station (MVS); 0,4 kV Low Voltage Station (LVS); Main Hot Water Boilers (MHW); Auxiliary Hot Water Boilers (AHW); Gas Compressor Unit (GCU); Thermal Agent Circulation Pumping Unit (TPU); Water Treating Station (WTS).

Keywords: Automation System, Cogenerative Power Plant, Control, Monitoring, Real Time

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1952

128 e-Learning Program with Voice Assistance for a Tactile Braille

Authors: Yutaka Takaoka, Mika Ohta, Aki Sugano, Tsuyoshi Oda, Eiichi Maeda, Sumiyo Hanaoka, Masako Matsuura

Abstract:

Along with the increased morbidity of glaucoma or diabetic retinitis pigmentosa, etc., number of people with vision loss is also increasing in Japan. It is difficult for the visually impaired to learn and acquire braille because most of them are middle-aged. In addition, number of braille teachers are not sufficient and reducing in Japan, and this situation makes more difficult for the visually impaired. Therefore, we research and develop a Web-based e-learning program for tactile braille, that cooperate with braille display and voice assistance.

Keywords: Acquired visually impaired, Braille, e-learning, Tactile braille

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1672

127 Efficient DTW-Based Speech Recognition System for Isolated Words of Arabic Language

Authors: Khalid A. Darabkh, Ala F. Khalifeh, Baraa A. Bathech, Saed W. Sabah

Abstract:

Despite the fact that Arabic language is currently one of the most common languages worldwide, there has been only a little research on Arabic speech recognition relative to other languages such as English and Japanese. Generally, digital speech processing and voice recognition algorithms are of special importance for designing efficient, accurate, as well as fast automatic speech recognition systems. However, the speech recognition process carried out in this paper is divided into three stages as follows: firstly, the signal is preprocessed to reduce noise effects. After that, the signal is digitized and hearingized. Consequently, the voice activity regions are segmented using voice activity detection (VAD) algorithm. Secondly, features are extracted from the speech signal using Mel-frequency cepstral coefficients (MFCC) algorithm. Moreover, delta and acceleration (delta-delta) coefficients have been added for the reason of improving the recognition accuracy. Finally, each test word-s features are compared to the training database using dynamic time warping (DTW) algorithm. Utilizing the best set up made for all affected parameters to the aforementioned techniques, the proposed system achieved a recognition rate of about 98.5% which outperformed other HMM and ANN-based approaches available in the literature.

Keywords: Arabic speech recognition, MFCC, DTW, VAD.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4050

126 Performance Evaluation of Acoustic-Spectrographic Voice Identification Method in Native and Non-Native Speech

Authors: E. Krasnova, E. Bulgakova, V. Shchemelinin

Abstract:

The paper deals with acoustic-spectrographic voice identification method in terms of its performance in non-native language speech. Performance evaluation is conducted by comparing the result of the analysis of recordings containing native language speech with recordings that contain foreign language speech. Our research is based on Tajik and Russian speech of Tajik native speakers due to the character of the criminal situation with drug trafficking. We propose a pilot experiment that represents a primary attempt enter the field.

Keywords: Speaker identification, acoustic-spectrographic method, non-native speech.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 845

125 Voice Over IP Technology Development in Offshore Industry: System Dynamics Approach

Authors: B. Kiyani, R. H. Amiri, S. H. Hosseini, A. Bourouni, A. Karimi

Abstract:

Nowadays, offshore's complicated facilities need their own communications requirements. Nevertheless, developing and real-world applications of new communications technology are faced with tremendous problems for new technology users, developers and implementers. Traditional systems engineering cannot be capable to develop a new technology effectively because it does not consider the dynamics of the process. This paper focuses on the design of a holistic model that represents the dynamics of new communication technology development within offshore industry. The model shows the behavior of technology development efforts. Furthermore, implementing this model, results in new and useful insights about the policy option analysis for developing a new communications technology in offshore industry.

Keywords: Technology development, Offshore industry, Systemdynamics, Voice Over IP.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1616

124 Automatic Distance Compensation for Robust Voice-based Human-Computer Interaction

Authors: Randy Gomez, Keisuke Nakamura, Kazuhiro Nakadai

Abstract:

Distant-talking voice-based HCI system suffers from performance degradation due to mismatch between the acoustic speech (runtime) and the acoustic model (training). Mismatch is caused by the change in the power of the speech signal as observed at the microphones. This change is greatly influenced by the change in distance, affecting speech dynamics inside the room before reaching the microphones. Moreover, as the speech signal is reflected, its acoustical characteristic is also altered by the room properties. In general, power mismatch due to distance is a complex problem. This paper presents a novel approach in dealing with distance-induced mismatch by intelligently sensing instantaneous voice power variation and compensating model parameters. First, the distant-talking speech signal is processed through microphone array processing, and the corresponding distance information is extracted. Distance-sensitive Gaussian Mixture Models (GMMs), pre-trained to capture both speech power and room property are used to predict the optimal distance of the speech source. Consequently, pre-computed statistic priors corresponding to the optimal distance is selected to correct the statistics of the generic model which was frozen during training. Thus, model combinatorics are post-conditioned to match the power of instantaneous speech acoustics at runtime. This results to an improved likelihood in predicting the correct speech command at farther distances. We experiment using real data recorded inside two rooms. Experimental evaluation shows voice recognition performance using our method is more robust to the change in distance compared to the conventional approach. In our experiment, under the most acoustically challenging environment (i.e., Room 2: 2.5 meters), our method achieved 24.2% improvement in recognition performance against the best-performing conventional method.

Keywords: Human Machine Interaction, Human Computer Interaction, Voice Recognition, Acoustic Model Compensation, Acoustic Speech Enhancement.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1854

123 Tele-Operated Anthropomorphic Arm and Hand Design

Authors: Namal A. Senanayake, Khoo B. How, Quah W. Wai

Abstract:

In this project, a tele-operated anthropomorphic robotic arm and hand is designed and built as a versatile robotic arm system. The robot has the ability to manipulate objects such as pick and place operations. It is also able to function by itself, in standalone mode. Firstly, the robotic arm is built in order to interface with a personal computer via a serial servo controller circuit board. The circuit board enables user to completely control the robotic arm and moreover, enables feedbacks from user. The control circuit board uses a powerful integrated microcontroller, a PIC (Programmable Interface Controller). The PIC is firstly programmed using BASIC (Beginner-s All-purpose Symbolic Instruction Code) and it is used as the 'brain' of the robot. In addition a user friendly Graphical User Interface (GUI) is developed as the serial servo interface software using Microsoft-s Visual Basic 6. The second part of the project is to use speech recognition control on the robotic arm. A speech recognition circuit board is constructed with onboard components such as PIC and other integrated circuits. It replaces the computers- Graphical User Interface. The robotic arm is able to receive instructions as spoken commands through a microphone and perform operations with respect to the commands such as picking and placing operations.

Keywords: Tele-operated Anthropomorphic Robotic Arm and Hand, Robot Motion System, Serial Servo Controller, Speech Recognition Controller.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1734

122 Quality of Service in Multioperator GPON Access Networks with Triple-Play Services

Authors: Germán Santos-Boada, Jordi Domingo-Pascual

Abstract:

Recently, in some places, optical-fibre access networks have been used with GPON technology belonging to organizations (in most cases public bodies) that act as neutral operators. These operators simultaneously provide network services to various telecommunications operators that offer integrated voice, data and television services. This situation creates new problems related to quality of service, since the interests of the users are intermingled with the interests of the operators. In this paper, we analyse this problem and consider solutions that make it possible to provide guaranteed quality of service for voice over IP, data services and interactive digital television.

Keywords: GPON networks, multioperator, quality of service, triple-play services.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3403

121 Normalized Cumulative Spectral Distribution in Music

Authors: Young-Hwan Song, Hyung-Jun Kwon, Myung-Jin Bae

Abstract:

As the remedy used music becomes active and meditation effect through the music is verified, people take a growing interest about psychological balance or remedy given by music. From traditional studies, it is verified that the music of which spectral envelop varies approximately as 1/f (f is frequency) down to a frequency of low frequency bandwidth gives psychological balance. In this paper, we researched signal properties of music which gives psychological balance. In order to find this, we derived the property from voice. Music composed by voice shows large value in NCSD. We confirmed the degree of deference between music by curvature of normalized cumulative spectral distribution. In the music that gives psychological balance, the curvature shows high value, otherwise, the curvature shows low value.

Keywords: Cognitive Psychology, Normalized Cumulative Spectral Distribution, Curvature.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2170

120 Blind Speech Separation Using SRP-PHAT Localization and Optimal Beamformer in Two-Speaker Environments

Authors: Hai Quang Hong Dam, Hai Ho, Minh Hoang Le Ngo

Abstract:

This paper investigates the problem of blind speech separation from the speech mixture of two speakers. A voice activity detector employing the Steered Response Power - Phase Transform (SRP-PHAT) is presented for detecting the activity information of speech sources and then the desired speech signals are extracted from the speech mixture by using an optimal beamformer. For evaluation, the algorithm effectiveness, a simulation using real speech recordings had been performed in a double-talk situation where two speakers are active all the time. Evaluations show that the proposed blind speech separation algorithm offers a good interference suppression level whilst maintaining a low distortion level of the desired signal.

Keywords: Blind speech separation, voice activity detector, SRP-PHAT, optimal beamformer.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1364

119 Music in the Early Stages of Life: Considerations from Working with Groups of Mothers and Babies

Authors: Ana Paula Melchiors Stahlschmidt

Abstract:

This paper discusses the role of music as a ludic activity and constituent element of voice in the construction and consolidation of the relationship of the baby and his/her mother or caretaker, evaluating its implications in his/her psychic structure and constitution as a subject. The work was based on the research developed as part of the author’s doctoral activities carried out from her insertion in a project of the Music Department of Federal University of Rio Grande do Sul - UFRGS, which objective was the development of musical activities with groups of babies from 0 to 24 months old and their caretakers. Observations, video recordings of the meetings, audio testemonies, and evaluation tools applied to group participants were used as instruments for this research. Information was collected on the participation of 195 babies, among which 8 were more focused on through interviews with their mothers or caretakers. These interviews were analyzed based on the referential of French Discourse Analysis, Psychoanalysis, Psychology of Development and Musical Education. The results of the research were complemented by other posterior experiences that the author developed with similar groups, in a context of a private clinic. The information collected allowed the observation of the ludic and structural functions of musical activities, when developed in a structured environment, as well as the importance of the musicality of the mother’s voice to the psychical structuring of the baby, allowing his/her insertion in the language and his/her constitution as a subject.

Keywords: Music and babies, maternal voice, Psychoanalysis and music, Psychology and music.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2837

118 Power Line Carrier Equipment Supporting IP Traffic Transmission in the Enterprise Networks of Energy Companies

Authors: M. S. Anton Merkulov

Abstract:

This article discusses the questions concerning of creating small packet networks for energy companies with application of high voltage power line carrier equipment (PLC) with functionality of IP traffic transmission. The main idea is to create converged PLC links between substations and dispatching centers where packet data and voice are transmitted in one data flow. The article contents description of basic conception of the network, evaluation of voice traffic transmission parameters, and discussion of header compression techniques in relation to PLC links. The results of exploration show us, that convergent packet PLC links can be very useful in the construction of small packet networks between substations in remote locations, such as deposits or low populated areas.

Keywords: packet PLC, VoIP, time delay, packet traffic, overhead compression

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2136

117 Riding the Crest of the Wave: Inclusive Education in New Zealand

Authors: Barbara A. Perry

Abstract:

In 1996, the New Zealand government and the Ministry of Education announced that they were setting up a "world class system of inclusive education". As a parent of a son with high and complex needs, a teacher, school Principal and Disability studies Lecturer, this author will track the changes in the journey towards inclusive education over the last 20 years. Strategies for partnering with families to ensure educational success along with insights from one of those on the crest of the wave will be presented. Using a narrative methodology the author will illuminate how far New Zealand has come towards this world class system of inclusion promised and share from personal experience some of the highlights and risks in the system. This author has challenged the old structures and been part of the setting up of new structures particularly for providing parent voice and insight; this paper provides a unique view from an insider’s voice as well as a professional in the system.

Keywords: Disability studies, inclusive education, special education, working with families with children with disability.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1210

116 Online Collaborative Learning System Using Speech Technology

Authors: Sid-Ahmed. Selouani, Tang-Ho Lê, Chadia Moghrabi, Benoit Lanteigne, Jean Roy

Abstract:

A Web-based learning tool, the Learn IN Context (LINC) system, designed and being used in some institution-s courses in mixed-mode learning, is presented in this paper. This mode combines face-to-face and distance approaches to education. LINC can achieve both collaborative and competitive learning. In order to provide both learners and tutors with a more natural way to interact with e-learning applications, a conversational interface has been included in LINC. Hence, the components and essential features of LINC+, the voice enhanced version of LINC, are described. We report evaluation experiments of LINC/LINC+ in a real use context of a computer programming course taught at the Université de Moncton (Canada). The findings show that when the learning material is delivered in the form of a collaborative and voice-enabled presentation, the majority of learners seem to be satisfied with this new media, and confirm that it does not negatively affect their cognitive load.

Keywords: E-leaning, Knowledge Network, Speech recognition, Speech synthesis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1687

115 Real-Time Implementation of STANAG 4539 High-Speed HF Modem

Authors: S. Saraç, F. Kara, C.Vural

Abstract:

High-frequency (HF) communications have been used by military organizations for more than 90 years. The opportunity of very long range communications without the need for advanced equipment makes HF a convenient and inexpensive alternative of satellite communications. Besides the advantages, voice and data transmission over HF is a challenging task, because the HF channel generally suffers from Doppler shift and spread, multi-path, cochannel interference, and many other sources of noise. In constructing an HF data modem, all these effects must be taken into account. STANAG 4539 is a NATO standard for high-speed data transmission over HF. It allows data rates up to 12800 bps over an HF channel of 3 kHz. In this work, an efficient implementation of STANAG 4539 on a single Texas Instruments- TMS320C6747 DSP chip is described. The state-of-the-art algorithms used in the receiver and the efficiency of the implementation enables real-time high-speed data / digitized voice transmission over poor HF channels.

Keywords: High frequency, modem, STANAG 4539.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5302

114 Smart Help at theWorkplace for Persons with Disabilities (SHW-PWD)

Authors: Ghassan Kbar, Shady Aly, Ibraheem Elsharawy, Akshay Bhatia, Nur Alhasan, Ronaldo Enriquez

Abstract:

The Smart Help for persons with disability (PWD) is a part of the project SMARTDISABLE which aims to develop relevant solution for PWD that target to provide an adequate workplace environment for them. It would support PWD needs smartly through smart help to allow them access to relevant information and communicate with other effectively and flexibly, and smart editor that assist them in their daily work. It will assist PWD in knowledge processing and creation as well as being able to be productive at the work place. The technical work of the project involves design of a technological scenario for the Ambient Intelligence (AmI) - based assistive technologies at the workplace consisting of an integrated universal smart solution that suits many different impairment conditions and will be designed to empower the Physically disabled persons (PDP) with the capability to access and effectively utilize the ICTs in order to execute knowledge rich working tasks with minimum efforts and with sufficient comfort level. The proposed technology solution for PWD will support voice recognition along with normal keyboard and mouse to control the smart help and smart editor with dynamic auto display interface that satisfies the requirements for different PWD group. In addition, a smart help will provide intelligent intervention based on the behavior of PWD to guide them and warn them about possible misbehavior. PWD can communicate with others using Voice over IP controlled by voice recognition. Moreover, Auto Emergency Help Response would be supported to assist PWD in case of emergency. This proposed technology solution intended to make PWD very effective at the work environment and flexible using voice to conduct their tasks at the work environment. The proposed solution aims to provide favorable outcomes that assist PWD at the work place, with the opportunity to participate in PWD assistive technology innovation market which is still small and rapidly growing as well as upgrading their quality of life to become similar to the normal people at the workplace. Finally, the proposed smart help solution is applicable in all workplace setting, including offices, manufacturing, hospital, etc.

Keywords: Ambient Intelligence, ICT, Persons with disability PWD, Smart application.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2511

113 Trusting Smart Speakers: Analysing the Different Levels of Trust between Technologies

Authors: Alec Wells, Aminu Bello Usman, Justin McKeown

Abstract:

The growing usage of smart speakers raises many privacy and trust concerns compared to other technologies such as smart phones and computers. In this study, a proxy measure of trust is used to gauge users’ opinions on three different technologies based on an empirical study, and to understand which technology most people are most likely to trust. The collected data were analysed using the Kruskal-Wallis H test to determine the statistical differences between the users’ trust level of the three technologies: smart speaker, computer and smart phone. The findings of the study revealed that despite the wide acceptance, ease of use and reputation of smart speakers, people find it difficult to trust smart speakers with their sensitive information via the Direct Voice Input (DVI) and would prefer to use a keyboard or touchscreen offered by computers and smart phones. Findings from this study can inform future work on users’ trust in technology based on perceived ease of use, reputation, perceived credibility and risk of using technologies via DVI.

Keywords: Direct voice input, risk, security, technology and trust.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 557

112 Environmentally Adaptive Acoustic Echo Suppression for Barge-in Speech Recognition

Authors: Jong Han Joo, Jeong Hun Lee, Young Sun Kim, Jae Young Kang, Seung Ho Choi

Abstract:

In this study, we propose a novel technique for acoustic echo suppression (AES) during speech recognition under barge-in conditions. Conventional AES methods based on spectral subtraction apply fixed weights to the estimated echo path transfer function (EPTF) at the current signal segment and to the EPTF estimated until the previous time interval. However, the effects of echo path changes should be considered for eliminating the undesired echoes. We describe a new approach that adaptively updates weight parameters in response to abrupt changes in the acoustic environment due to background noises or double-talk. Furthermore, we devised a voice activity detector and an initial time-delay estimator for barge-in speech recognition in communication networks. The initial time delay is estimated using log-spectral distance measure, as well as cross-correlation coefficients. The experimental results show that the developed techniques can be successfully applied in barge-in speech recognition systems.

Keywords: Acoustic echo suppression, barge-in, speech recognition, echo path transfer function, initial delay estimator, voice activity detector.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2285

111 An Investigation of Community Radio Broadcasting in Phutthamonthon District, Nakhon Pathom, Thailand

Authors: Anchana Sooksomchitra

Abstract:

This study aims to explore and compare the current condition of community radio stations in Phutthamonthon district, Nakhon Pathom province, Thailand, as well as the challenges they are facing. Qualitative research tools including in-depth interviews; documentary analysis; focus group interviews; and observation, are used to examine the content, programming, and management structure of three community radio stations currently in operation within the district. Research findings indicate that the management and operational approaches adopted by the two non-profit stations included in the study, Salaya Pattana and Voice of Dhamma, are more structured and effective than that of the for-profit Tune Radio. Salaya Pattana – backed by the Faculty of Engineering, Mahidol University, and the charity-funded Voice of Dhamma, are comparatively free from political and commercial influence, and able to provide more relevant and consistent community-oriented content to meet the real demand of the audience. Tune Radio, on the other hand, has to rely solely on financial support from political factions and business groups, which heavily influence its content.

Keywords: Radio broadcasting, programming, management, community radio, Thailand.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1112

110 A Classification Scheme for Game Input and Output

Authors: P. Prema, B. Ramadoss

Abstract:

Computer game industry has experienced exponential growth in recent years. A game is a recreational activity involving one or more players. Game input is information such as data, commands, etc., which is passed to the game system at run time from an external source. Conversely, game outputs are information which are generated by the game system and passed to an external target, but which is not used internally by the game. This paper identifies a new classification scheme for game input and output, which is based on player-s input and output. Using this, relationship table for game input classifier and output classifier is developed.

Keywords: Game Classification, Game Input, Game Output, Game Testing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1960

109 Modeling and Control of a Quadrotor UAV with Aerodynamic Concepts

Authors: Wei Dong, Guo-Ying Gu, Xiangyang Zhu, Han Ding

Abstract:

This paper presents preliminary results on modeling and control of a quadrotor UAV. With aerodynamic concepts, a mathematical model is firstly proposed to describe the dynamics of the quadrotor UAV. Parameters of this model are identified by experiments with Matlab Identify Toolbox. A group of PID controllers are then designed based on the developed model. To verify the developed model and controllers, simulations and experiments for altitude control, position control and trajectory tracking are carried out. The results show that the quadrotor UAV well follows the referenced commands, which clearly demonstrates the effectiveness of the proposed approach.

Keywords: Quadrotor UAV, Modeling, Control, Aerodynamics, System Identification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7061