Search results for: closed-set tex-independent speaker identification system (CISI)
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 19917

19917 An Intelligent Text Independent Speaker Identification Using VQ-GMM Model Based Multiple Classifier System

Authors: Ben Soltane Cheima, Ittansa Yonas Kelbesa

Abstract:

Speaker Identification (SI) is the task of establishing the identity of an individual based on his/her voice characteristics. The SI task is typically achieved by two-stage signal processing: training and testing. The training process calculates speaker-specific feature parameters from the speech and generates speaker models accordingly. In the testing phase, speech samples from unknown speakers are compared with the models and classified. Even though the performance of speaker identification systems has improved due to recent advances in speech processing techniques, there is still room for improvement. In this paper, a Closed-Set Text-Independent Speaker Identification System (CISI) based on a Multiple Classifier System (MCS) is proposed, using Mel Frequency Cepstrum Coefficients (MFCC) for feature extraction and a suitable combination of vector quantization (VQ) and Gaussian Mixture Model (GMM), together with the Expectation Maximization (EM) algorithm, for speaker modeling. The use of a Voice Activity Detector (VAD) with a hybrid approach based on Short Time Energy (STE) and statistical modeling of background noise in the pre-processing step of feature extraction yields a better and more robust automatic speaker identification system. In addition, using the Linde-Buzo-Gray (LBG) clustering algorithm to initialize the GMM parameters estimated in the EM step improved the convergence rate and the system's performance. The system also uses a relative index as a confidence measure when the GMM and VQ identification results contradict each other. Simulation results obtained on the voxforge.org speech database using MATLAB highlight the efficacy of the proposed method compared to earlier work.
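The Short Time Energy (STE) part of the voice activity detection step can be illustrated with a minimal sketch. This is not the authors' MATLAB implementation: the frame length, hop size, and the relative threshold rule (`ratio`) are assumptions chosen for illustration, and the statistical background-noise model mentioned in the abstract is not reproduced here.

```python
import numpy as np

def short_time_energy(signal, frame_len=400, hop=160):
    """Return the mean energy of each frame of a 1-D signal."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    energies = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len]
        energies[i] = np.mean(frame ** 2)
    return energies

def energy_vad(signal, frame_len=400, hop=160, ratio=0.1):
    """Mark a frame as speech when its energy exceeds ratio * max energy.

    The relative threshold is a hypothetical rule used only for this sketch.
    """
    energies = short_time_energy(signal, frame_len, hop)
    return energies > ratio * energies.max()

# Synthetic input: 0.5 s of low-level noise, then 0.5 s of a louder tone (16 kHz).
rng = np.random.default_rng(0)
fs = 16000
noise = 0.01 * rng.standard_normal(fs // 2)
tone = 0.5 * np.sin(2 * np.pi * 220 * np.arange(fs // 2) / fs)
signal = np.concatenate([noise, tone])
mask = energy_vad(signal)  # boolean array, one entry per frame
```

Frames that lie entirely in the noise segment are rejected and frames that lie entirely in the tone segment are kept; only the retained frames would be passed on to feature extraction.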

Keywords: feature extraction, speaker modeling, feature matching, Mel frequency cepstrum coefficient (MFCC), Gaussian mixture model (GMM), vector quantization (VQ), Linde-Buzo-Gray (LBG), expectation maximization (EM), pre-processing, voice activity detection (VAD), short time energy (STE), background noise statistical modeling, closed-set text-independent speaker identification system (CISI)

Procedia PDF Downloads 309
19916 Speaker Recognition Using LIRA Neural Networks

Authors: Nestor A. Garcia Fragoso, Tetyana Baydyk, Ernst Kussul

Abstract:

This article contains information from our investigation in the field of voice recognition. For this purpose, we created a voice database that contains different phrases in two languages, English and Spanish, for men and women. As a classifier, the LIRA (Limited Receptive Area) grayscale neural classifier was selected. The LIRA grayscale neural classifier was developed for image recognition tasks and demonstrated good results. Therefore, we decided to develop a voice recognition system using this classifier. From a specific set of speakers, we can recognize the speaker’s voice. For this purpose, the system takes spectrograms of the voice signals as input, extracts the characteristics, and identifies the speaker. The results are described and analyzed in this article. The classifier can be used for speaker identification in security systems or smart buildings for different types of intelligent devices.

Keywords: extreme learning, LIRA neural classifier, speaker identification, voice recognition

Procedia PDF Downloads 177
19915 USE-Net: SE-Block Enhanced U-Net Architecture for Robust Speaker Identification

Authors: Kilari Nikhil, Ankur Tibrewal, Srinivas Kruthiventi S. S.

Abstract:

Conventional speaker identification systems often fall short of capturing the diverse variations present in speech data due to fixed-scale architectures. In this research, we propose a CNN-based architecture, USENet, designed to overcome these limitations. Leveraging two key techniques, our approach achieves superior performance on the VoxCeleb 1 Dataset without any pre-training. Firstly, we adopt a U-net-inspired design to extract features at multiple scales, empowering our model to capture speech characteristics effectively. Secondly, we introduce the squeeze and excitation block to enhance spatial feature learning. The proposed architecture showcases significant advancements in speaker identification, outperforming existing methods, and holds promise for future research in this domain.

Keywords: multi-scale feature extraction, squeeze and excitation, VoxCeleb1 speaker identification, mel-spectrograms, USENet

Procedia PDF Downloads 74
19914 Acoustic Analysis for Comparison and Identification of Normal and Disguised Speech of Individuals

Authors: Surbhi Mathur, J. M. Vyas

Abstract:

Although forensic speaker recognition technology has developed rapidly, many problems remain to be solved. The biggest problem arises when cases involving disguised voice samples come up for examination and identification. Such voice samples of anonymous callers are frequently encountered in crimes involving kidnapping, blackmailing, hoax extortion and many more, where the speaker makes a deliberate effort to manipulate their natural voice in order to conceal their identity for fear of being caught. Voice disguise causes serious damage to the natural vocal parameters of the speakers and thus complicates the process of identification. The sole objective of this doctoral project is to determine whether definite opinions can be rendered in cases involving disguised speech by experimentally determining the effects of different disguise forms on personal identification and the percentage rate of speaker recognition for various voice disguise techniques, such as raised pitch, lowered pitch, increased nasality, covering the mouth, constricting the vocal tract, an obstacle in the mouth, etc., by analyzing and comparing the amount of phonetic and acoustic variation between artificial (disguised) and natural samples of an individual, by auditory as well as spectrographic analysis.

Keywords: forensic, speaker recognition, voice, speech, disguise, identification

Procedia PDF Downloads 368
19913 Developed Text-Independent Speaker Verification System

Authors: Mohammed Arif, Abdessalam Kifouche

Abstract:

Speech is a very convenient way of communication between people and machines. It conveys information about the identity of the talker. Since speaker recognition technology is increasingly securing our everyday lives, the objective of this paper is to develop two automatic text-independent speaker verification (TI SV) systems using low-level spectral features and machine learning methods. (i) The first system is based on a support vector machine (SVM), which has been widely used in voice signal processing for speaker recognition, that is, verifying the identity of a speaker based on voice characteristics, and (ii) the second is based on a Gaussian Mixture Model (GMM) and Universal Background Model (UBM) to combine different functions from different resources to implement the SVM-based system.

Keywords: speaker verification, text-independent, support vector machine, Gaussian mixture model, cepstral analysis

Procedia PDF Downloads 58
19912 Comparative Methods for Speech Enhancement and the Effects on Text-Independent Speaker Identification Performance

Authors: R. Ajgou, S. Sbaa, S. Ghendir, A. Chemsa, A. Taleb-Ahmed

Abstract:

Speech enhancement algorithms aim to improve speech quality. In this paper, we review some speech enhancement methods and evaluate their performance based on Perceptual Evaluation of Speech Quality (PESQ, ITU-T P.862) scores. All methods were evaluated in the presence of different kinds of noise using the TIMIT database and the NOIZEUS noisy speech corpus. The noise was taken from the AURORA database and includes suburban train noise, babble, car, exhibition hall, restaurant, street, airport and train station noise. Simulation results showed improved speech enhancement performance for the tracking of non-stationary noise approach in comparison with various methods in terms of the PESQ measure. Moreover, we evaluated the effects of the speech enhancement technique on a speaker identification system based on an autoregressive (AR) model and Mel-frequency cepstral coefficients (MFCC).

Keywords: speech enhancement, pesq, speaker recognition, MFCC

Procedia PDF Downloads 424
19911 A Two-Step Framework for Unsupervised Speaker Segmentation Using BIC and Artificial Neural Network

Authors: Ahmad Alwosheel, Ahmed Alqaraawi

Abstract:

This work proposes a new speaker segmentation approach for two speakers. It is an online approach that does not require prior information about speaker models. It has two phases: in the first phase, a conventional approach such as unsupervised BIC-based segmentation is used to detect speaker changes and train a neural network; in the second phase, the trained parameters output by the neural network are used to predict the next incoming audio stream. With this approach, accuracy comparable to similar BIC-based approaches is achieved with a significant improvement in computation time.
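For readers unfamiliar with BIC-based change detection, a commonly used formulation of the criterion is given below. The abstract does not state which variant the authors use, so this should be read as an assumed textbook form rather than their exact criterion. Here a window of N feature vectors of dimension d is split at a candidate point t into N_1 and N_2 vectors (N = N_1 + N_2), with full-covariance Gaussian estimates Sigma, Sigma_1 and Sigma_2 for the whole window and the two halves, and lambda a penalty weight (often set near 1).

```latex
\Delta\mathrm{BIC}(t) = \frac{N}{2}\log\lvert\Sigma\rvert
  - \frac{N_1}{2}\log\lvert\Sigma_1\rvert
  - \frac{N_2}{2}\log\lvert\Sigma_2\rvert
  - \frac{\lambda}{2}\left(d + \frac{d(d+1)}{2}\right)\log N
```

Under this convention, a speaker change is hypothesized at t when the value is positive, that is, when two separate Gaussians explain the window better than one even after the model-complexity penalty.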

Keywords: artificial neural network, diarization, speaker indexing, speaker segmentation

Procedia PDF Downloads 502
19910 Effect of Clinical Depression on Automatic Speaker Verification

Authors: Sheeraz Memon, Namunu C. Maddage, Margaret Lech, Nicholas Allen

Abstract:

The effect of a clinical environment on the accuracy of speaker verification was tested. The speaker verification tests were performed within homogeneous environments containing clinically depressed speakers only and non-depressed speakers only, as well as within mixed environments containing different mixtures of both clinically depressed and non-depressed speakers. The speaker verification framework included MFCC features and the GMM modeling and classification method. The speaker verification experiments within homogeneous environments showed a 5.1% increase in the EER within the clinically depressed environment compared to the non-depressed environment. This indicates that clinical depression increases intra-speaker variability and makes the speaker verification task more challenging. Experiments with mixed environments indicated that increasing the percentage of depressed individuals within a mixed environment increases the speaker verification equal error rates.
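The equal error rate (EER) reported above is the operating point where the false acceptance rate equals the false rejection rate. A minimal sketch of how it can be approximated from verification scores follows; the function name and the toy scores are made up for illustration and are unrelated to the study's data.

```python
import numpy as np

def equal_error_rate(genuine, impostor):
    """Approximate the EER by sweeping a threshold over all observed scores."""
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    best_gap, eer = np.inf, 1.0
    for t in thresholds:
        far = np.mean(impostor >= t)  # impostors wrongly accepted
        frr = np.mean(genuine < t)    # genuine speakers wrongly rejected
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer

# Toy scores (higher means "more likely the claimed speaker").
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]
eer = equal_error_rate(genuine_scores, impostor_scores)  # 0.25 for these scores
```

With these toy scores, one genuine trial falls below the best impostor score, so both error rates meet at 25%.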

Keywords: speaker verification, GMM, EM, clinical environment, clinical depression

Procedia PDF Downloads 375
19909 A Cross-Dialect Statistical Analysis of Final Declarative Intonation in Tuvinian

Authors: D. Beziakina, E. Bulgakova

Abstract:

This study continues the research on Tuvinian intonation and presents a general cross-dialect analysis of intonation of Tuvinian declarative utterances, specifically the character of the tone movement in order to test the hypothesis about the prevalence of level tone in some Tuvinian dialects. The results of the analysis of basic pitch characteristics of Tuvinian speech (in general and in comparison with two other Turkic languages - Uzbek and Azerbaijani) are also given in this paper. The goal of our work was to obtain the ranges of pitch parameter values typical for Tuvinian speech. Such language-specific values can be used in speaker identification systems in order to get more accurate results of ethnic speech analysis. We also present the results of a cross-dialect analysis of declarative intonation in the poorly studied Tuvinian language.

Keywords: speech analysis, statistical analysis, speaker recognition, identification of person

Procedia PDF Downloads 470
19908 The Effect of The Speaker's Speaking Style as A Factor of Understanding and Comfort of The Listener

Authors: Made Rahayu Putri Saron, Mochamad Nizar Palefi Ma’ady

Abstract:

Communication skills are important in everyday life. Communication can be verbal, in oral or written form, or nonverbal, in the form of expressions or body movements. Good communication should convey information clearly and involve feedback between the speaker and the listener. However, the information conveyed is often unclear and listeners give no feedback, so it cannot be ensured that the communication is effective and understandable. The speaker's understanding of the topic is one of the supporting factors that help the listener grasp the meaning of the conversation. However, the literature review found that the following factors also have an important influence on a speaker’s speaking style: (i) environmental conditions; (ii) voice, articulation, and accent; (iii) gender; (iv) personality; (v) speech disorders (dysarthria). It can be concluded that the factors supporting the understanding and comfort of the listener depend on the nature of the speaker (environmental conditions, voice, gender, personality) and on whether the speaker has a speech disorder.

Keywords: listener, public speaking, speaking style, understanding, and comfortable factor

Procedia PDF Downloads 166
19907 Multi-Modal Feature Fusion Network for Speaker Recognition Task

Authors: Xiang Shijie, Zhou Dong, Tian Dan

Abstract:

Speaker recognition is a crucial task in the field of speech processing, aimed at identifying individuals based on their vocal characteristics. However, existing speaker recognition methods face numerous challenges. Traditional methods primarily rely on audio signals, which often suffer from limitations in noisy environments, variations in speaking style, and insufficient sample sizes. Additionally, relying solely on audio features can sometimes fail to capture the unique identity of the speaker comprehensively, impacting recognition accuracy. To address these issues, we propose a multi-modal network architecture that simultaneously processes both audio and text signals. By gradually integrating audio and text features, we leverage the strengths of both modalities to enhance the robustness and accuracy of speaker recognition. Our experiments demonstrate significant improvements with this multi-modal approach, particularly in complex environments, where recognition performance has been notably enhanced. Our research not only highlights the limitations of current speaker recognition methods but also showcases the effectiveness of multi-modal fusion techniques in overcoming these limitations, providing valuable insights for future research.

Keywords: feature fusion, memory network, multimodal input, speaker recognition

Procedia PDF Downloads 32
19906 Experimental Study on the Heat Transfer Characteristics of the 200W Class Woofer Speaker

Authors: Hyung-Jin Kim, Dae-Wan Kim, Moo-Yeon Lee

Abstract:

The objective of this study is to experimentally investigate the heat transfer characteristics of 200 W class woofer speaker units driven by input voice signals. The temperature and heat transfer characteristics of the 200 W class woofer speaker unit were tested with input voice signals of 1500 Hz, 2500 Hz, and 5000 Hz. The experiments show that the temperature of the woofer speaker unit, including the voice-coil part, increases as the input voice signal frequency decreases. The temperature difference between the measured points of the voice coil also increases as the input voice signal frequency decreases. In addition, the heat transfer characteristics of the woofer speaker with the 1500 Hz input voice signal are 40% higher than those with the 5000 Hz input voice signal at a measuring time of 200 seconds. It can be concluded from the experiments that the temperature initially increases rapidly with time and, after a certain period, increases exponentially. During this time-dependent temperature change, the high-frequency voice signal is observed to be more stable than the low-frequency one.

Keywords: heat transfer, temperature, voice coil, woofer speaker

Procedia PDF Downloads 360
19905 System Identification and Quantitative Feedback Theory Design of a Lathe Spindle

Authors: M. Khairudin

Abstract:

This paper investigates system identification and quantitative feedback theory (QFT) design for the robust control of a lathe spindle. The dynamics of the lathe spindle are uncertain and time-varying due to variation in the depth of cut during the cutting process. System identification was used to obtain the dynamic model of the lathe spindle. In this work, real-time system identification is used to construct a linear model of the system from the nonlinear system. These linear models and their uncertainty bounds can then be used for controller synthesis. The real-time nonlinear system identification process yields a set of linear models of the lathe spindle that represent the operating ranges of the dynamic system. With a selected input signal, the output response data are acquired, and nonlinear system identification is performed using MATLAB to obtain a linear model of the system. Practical design steps are presented in which the QFT-based conditions are formulated to obtain a compensator and pre-filter to control the lathe spindle. The performance of the proposed controller is evaluated in terms of the velocity responses of the lathe machine spindle, incorporating the depth of cut in the cutting process.

Keywords: lathe spindle, QFT, robust control, system identification

Procedia PDF Downloads 543
19904 Performance Evaluation of Acoustic-Spectrographic Voice Identification Method in Native and Non-Native Speech

Authors: E. Krasnova, E. Bulgakova, V. Shchemelinin

Abstract:

The paper deals with the acoustic-spectrographic voice identification method in terms of its performance on non-native language speech. Performance evaluation is conducted by comparing the results of the analysis of recordings containing native language speech with those of recordings containing foreign language speech. Our research is based on Tajik and Russian speech of Tajik native speakers due to the character of the criminal situation with drug trafficking. We propose a pilot experiment that represents a first attempt to enter the field.

Keywords: speaker identification, acoustic-spectrographic method, non-native speech, performance evaluation

Procedia PDF Downloads 446
19903 The Effect of Measurement Distribution on System Identification and Detection of Behavior of Nonlinearities of Data

Authors: Mohammad Javad Mollakazemi, Farhad Asadi, Aref Ghafouri

Abstract:

In this paper, we apply parametric modeling to experimental data from dynamical systems. We investigate different distributions of the output measurements from these systems. By processing the variance of the experimental data, we obtain the region of nonlinearity in the data, and identification of the output section is then applied under different situations and data distributions. Finally, the effect of the spread of the measurements, such as the variance, on identification is explained, along with the limitations of this approach.

Keywords: Gaussian process, nonlinearity distribution, particle filter, system identification

Procedia PDF Downloads 516
19902 A Transform Domain Function Controlled VSSLMS Algorithm for Sparse System Identification

Authors: Cemil Turan, Mohammad Shukri Salman

Abstract:

The convergence rate of the least-mean-square (LMS) algorithm deteriorates if the input signal to the filter is correlated. In a system identification problem, this convergence rate can be improved if the signal is white and/or if the system is sparse. We recently proposed a sparse transform domain LMS-type algorithm that uses a variable step-size for sparse system identification. The proposed algorithm provided high performance even if the input signal is highly correlated. In this work, we investigate the performance of the proposed TD-LMS algorithm for a large number of filter taps, which is also a critical issue for the standard LMS algorithm. Additionally, the optimum value of the most important parameter is calculated for all experiments. Moreover, the convergence analysis of the proposed algorithm is provided. The performance of the proposed algorithm has been compared to different algorithms in a sparse system identification setting with different sparsity levels and different numbers of filter taps. Simulations have shown that the proposed algorithm has prominent performance compared to the other algorithms.
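As background, the sketch below shows the standard fixed-step LMS algorithm identifying a sparse FIR system from a white input. It is deliberately the plain baseline, not the transform-domain variable-step-size algorithm proposed in the paper; the filter length, step size `mu`, and the sparse impulse response `h_true` are assumed values for illustration.

```python
import numpy as np

def lms_identify(x, d, n_taps, mu):
    """Estimate FIR coefficients w so that sum_k w[k] * x[n-k] tracks d[n]."""
    w = np.zeros(n_taps)
    for n in range(n_taps - 1, len(x)):
        u = x[n - n_taps + 1 : n + 1][::-1]  # regressor, most recent sample first
        e = d[n] - w @ u                     # a priori estimation error
        w = w + mu * e * u                   # LMS coefficient update
    return w

# Unknown sparse system with two non-zero taps out of 16 (assumed for the demo).
rng = np.random.default_rng(1)
h_true = np.zeros(16)
h_true[[2, 9]] = [1.0, -0.5]
x = rng.standard_normal(5000)            # white input signal
d = np.convolve(x, h_true)[: len(x)]     # noise-free system output
w_hat = lms_identify(x, d, n_taps=16, mu=0.01)
```

With a white, noise-free input the estimate converges closely to `h_true`; with a strongly correlated input the same code converges much more slowly, which is the limitation the abstract's transform-domain approach targets.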

Keywords: adaptive filtering, sparse system identification, TD-LMS algorithm, VSSLMS algorithm

Procedia PDF Downloads 360
19901 Ultracapacitor State-of-Energy Monitoring System with On-Line Parameter Identification

Authors: N. Reichbach, A. Kuperman

Abstract:

The paper describes the design of a monitoring system for super capacitor packs in propulsion systems that allows the instantaneous energy capacity under power loading to be determined. The system contains a real-time recursive-least-squares identification mechanism that estimates the pack capacitance and equivalent series resistance. These values are required for accurate calculation of the state-of-energy.
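A rough sketch of recursive-least-squares estimation of capacitance and equivalent series resistance is given below. The series R-C pack model, the forward-Euler discretization, and all numerical values are assumptions made for this illustration; the paper's actual model and regressor are not given in the abstract. Under the assumed model, v[k] - v[k-1] = R (i[k] - i[k-1]) + (T / C) i[k-1], which is linear in the parameters R and 1/C.

```python
import numpy as np

def rls_update(theta, P, phi, y, lam=1.0):
    """One recursive-least-squares step for the linear model y = phi . theta."""
    k = P @ phi / (lam + phi @ P @ phi)    # gain vector
    theta = theta + k * (y - phi @ theta)  # correct estimate with prediction error
    P = (P - np.outer(k, phi @ P)) / lam   # update inverse-correlation matrix
    return theta, P

# Simulated series R-C pack (assumed model and values, not taken from the paper).
R_true, C_true, T = 0.02, 50.0, 0.1        # ohm, farad, sample period in seconds
rng = np.random.default_rng(2)
i = rng.uniform(-20.0, 20.0, 500)          # load current in amperes
vc = 10.0 + np.concatenate([[0.0], np.cumsum(i[:-1]) * T / C_true])
v = vc + R_true * i                        # measured terminal voltage

theta, P = np.zeros(2), 1e6 * np.eye(2)    # theta = [R, 1/C]
for k in range(1, len(i)):
    phi = np.array([i[k] - i[k - 1], T * i[k - 1]])
    theta, P = rls_update(theta, P, phi, v[k] - v[k - 1])

R_hat, C_hat = theta[0], 1.0 / theta[1]
# State-of-energy from the estimated internal capacitor voltage: E = C * Vc^2 / 2.
energy_joules = 0.5 * C_hat * (v[-1] - R_hat * i[-1]) ** 2
```

The forgetting factor `lam` is left at 1.0 here; a value slightly below 1 would let the estimates track slow parameter drift at the cost of noisier estimates.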

Keywords: real-time monitoring, RLS identification algorithm, state-of-energy, super capacitor

Procedia PDF Downloads 535
19900 Identification of Impact Load and Partial System Parameters Using 1D-CNN

Authors: Xuewen Yu, Danhui Dan

Abstract:

The identification of impact load and some hard-to-obtain system parameters is crucial for the activities of analysis, validation, and evaluation in the engineering field. This paper proposes a method that utilizes neural networks based on 1D-CNN to identify the impact load and partial system parameters from measured responses. To this end, forward computations are conducted to provide datasets consisting of the triples (parameter θ, input u, output y). Then neural networks are trained to learn the mapping from output to input, fu|{θ} : y → u, as well as from input and output to parameter, fθ : (u, y) → θ. Afterward, feeding the trained neural networks the measured output response, the input impact load and system parameter can be calculated, respectively. The method is tested on two simulated examples and shows sound accuracy in estimating the impact load (waveform and location) and system parameters.

Keywords: convolutional neural network, impact load identification, system parameter identification, inverse problem

Procedia PDF Downloads 123
19899 Kalman Filter Design in Structural Identification with Unknown Excitation

Authors: Z. Masoumi, B. Moaveni

Abstract:

This article addresses the first step of structural health monitoring: identifying a structural system in the presence of unknown input. Structural system identification involves identifying structural parameters such as stiffness and damping. In this study, a Kalman filter (KF) design for structural systems with unknown excitation is presented. External excitations, such as earthquakes, wind, or other forces, are not measured or not available. The strength of this filter is its ability to estimate the state variables of the system in the presence of unknown input. The least squares estimation (LSE) method with unknown input is also studied. Estimates of parameters have been adopted. Finally, the advantages and drawbacks of both methods are studied using two examples.

Keywords: Kalman filter (KF), least square estimation (LSE), structural health monitoring (SHM), structural system identification

Procedia PDF Downloads 317
19898 Application of the Discrete Rationalized Haar Transform to Distributed Parameter System

Authors: Joon-Hoon Park

Abstract:

In this paper, the rationalized Haar transform is applied to distributed parameter system identification and estimation. A distributed parameter system is a dynamical and mathematical model described by a partial differential equation, and system identification concerns the problem of determining mathematical models from observed data. The Haar function is inconvenient for calculation because it contains irrational numbers; for this reason, the rationalized Haar function, which contains only rational numbers, is used. The algorithm adopted in this paper is based on the transform and operational matrix of the rationalized Haar function. This approach provides more convenient and efficient computational results.
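To make the "only rational numbers" point concrete, the sketch below builds a rationalized Haar matrix whose entries are all -1, 0 or +1, using a recursive Kronecker-product construction. This is an illustrative construction under the usual definition of the rationalized Haar functions; the function name is hypothetical and the paper's operational matrix is not reproduced.

```python
import numpy as np

def rationalized_haar(n):
    """Return the n x n rationalized Haar matrix (entries -1, 0, +1).

    n must be a power of two.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = np.array([[1]])
    while h.shape[0] < n:
        m = h.shape[0]
        top = np.kron(h, [1, 1])                         # stretch coarser functions
        bottom = np.kron(np.eye(m, dtype=int), [1, -1])  # add finest-scale functions
        h = np.vstack([top, bottom])
    return h

H = rationalized_haar(4)
D = H @ H.T                      # diagonal, because the rows are mutually orthogonal
x = np.array([3, 1, 4, 1])
c = H @ x                        # forward transform (integer arithmetic only)
x_back = H.T @ (c / np.diag(D))  # inverse transform via the diagonal scaling
```

The square-root normalization factors of the ordinary Haar matrix are absorbed into the diagonal matrix `D`, so the forward transform needs only additions and subtractions.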

Keywords: distributed parameter system, rationalized Haar transform, operational matrix, system identification

Procedia PDF Downloads 509
19897 Modeling of a UAV Longitudinal Dynamics through System Identification Technique

Authors: Asadullah I. Qazi, Mansoor Ahsan, Zahir Ashraf, Uzair Ahmad

Abstract:

System identification of an Unmanned Aerial Vehicle (UAV), to acquire its mathematical model, is a significant step in the process of aircraft flight automation. A reliable mathematical model is an established requirement for autopilot design, flight simulator development, aircraft performance appraisal, analysis of aircraft modifications, preflight testing of prototype aircraft, investigation of fatigue life and stress distribution, etc. This research is aimed at system identification of a fixed wing UAV by means of a specifically designed flight experiment. The purposely designed flight maneuvers were performed on the UAV, and aircraft states were recorded during these flights. Acquired data were preprocessed for noise filtering and bias removal, followed by parameter estimation of the longitudinal dynamics transfer functions using the MATLAB system identification toolbox. Black box identification based transfer function models, in response to elevator and throttle inputs, were estimated using the least square error technique. The identification results show a high confidence level and goodness of fit between the estimated model and the actual aircraft response.

Keywords: fixed wing UAV, system identification, black box modeling, longitudinal dynamics, least square error

Procedia PDF Downloads 325
19896 Forensic Speaker Verification in Noisy Environmental by Enhancing the Speech Signal Using ICA Approach

Authors: Ahmed Kamil Hasan Al-Ali, Bouchra Senadji, Ganesh Naik

Abstract:

We propose a system to address real environmental noise and channel mismatch in forensic speaker verification systems. This method is based on suppressing various types of real environmental noise by using the independent component analysis (ICA) algorithm. The enhanced speech signal is applied to mel frequency cepstral coefficients (MFCC) or MFCC feature warping to extract the essential characteristics of the speech signal. Channel effects are reduced using an intermediate vector (i-vector) and probabilistic linear discriminant analysis (PLDA) approach for classification. The proposed algorithm is evaluated using an Australian forensic voice comparison database, combined with car, street and home noises from QUT-NOISE at signal to noise ratios (SNR) ranging from -10 dB to 10 dB. Experimental results indicate that MFCC feature warping-ICA achieves a reduction in equal error rate of about 48.22%, 44.66%, and 50.07% over MFCC feature warping alone when the test speech signals are corrupted with random sessions of street, car, and home noises, respectively, at -10 dB SNR.
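The evaluation above corrupts speech with noise at prescribed SNR levels. A minimal sketch of how a mixture at a target SNR can be generated is shown below; it is not the authors' procedure, and the synthetic "speech" and "noise" signals are stand-ins for real recordings.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale noise so that speech power / noise power matches snr_db, then add it."""
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    gain = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + gain * noise

# Stand-in signals: a tone as "speech" and white noise as the corrupting "noise".
rng = np.random.default_rng(3)
speech = np.sin(2 * np.pi * 200 * np.arange(16000) / 16000)
noise = rng.standard_normal(16000)
noisy = mix_at_snr(speech, noise, snr_db=-10.0)

# Check the achieved SNR from the added noise component.
added = noisy - speech
achieved_snr_db = 10 * np.log10(np.mean(speech ** 2) / np.mean(added ** 2))
```

At -10 dB the noise component carries ten times the power of the speech, which gives a sense of how severe the hardest test condition in the abstract is.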

Keywords: noisy forensic speaker verification, ICA algorithm, MFCC, MFCC feature warping

Procedia PDF Downloads 408
19895 Modified Form of Margin Based Angular Softmax Loss for Speaker Verification

Authors: Jamshaid ul Rahman, Akhter Ali, Adnan Manzoor

Abstract:

Learning-based systems have received increasing interest in recent years; recognition structures, including end-to-end speaker recognition, are one of the hot topics in this area. A well-known work on end-to-end speaker verification using Angular Softmax Loss gained significant importance and is considered useful for directly training a discriminative model instead of using the traditional i-vector approach. The margin-based strategy in angular softmax is beneficial for learning discriminative speaker embeddings, but the random selection of margin values is a major issue in additive angular margin and multiplicative angular margin. As a better solution, we present an alternative approach that introduces a somewhat similar form of an additive parameter originally introduced for face recognition; it can adjust automatically with the corresponding margin values and can learn more discriminative features than the Softmax. Experiments are conducted on part of the Fisher dataset, where it is observed that using the additive parameter with angular softmax to train the front-end and probabilistic linear discriminant analysis (PLDA) in the back-end boosts the performance of the structure.

Keywords: additive parameter, angular softmax, speaker verification, PLDA

Procedia PDF Downloads 103
19894 A Critical Discourse Analysis of President Muhammad Buhari's Speeches

Authors: Joy Aworo-Okoroh

Abstract:

Politics is about trust, and trust is challenged by the speaker’s ability to manipulate language before the electorate. Critical discourse analysis investigates the role of language in constructing social relationships between a political speaker and his audience. This paper explores the linguistic choices made by President Muhammad Buhari that enshrine his ideologies as well as the socio-political relations of power between him and Nigerians in his speeches. Two speeches of President Buhari, the inaugural and Independence Day speeches, are analyzed using Norman Fairclough’s perspective on Halliday’s systemic functional grammar. The analysis is at two levels. The first level of analysis is the identification of transitivity and modality choices in the speeches and how they reveal the covert ideologies. The second level is premised on Norman Fairclough’s model: the clauses are analyzed to identify elements of power, hesitation, persuasion, threat and religious statement. It was discovered that Buhari is a dominant character who manipulates the material processes a lot.

Keywords: politics, critical discourse analysis, Norman Fairclough, systemic functional grammar

Procedia PDF Downloads 551
19893 Digital Recording System Identification Based on Audio File

Authors: Michel Kulhandjian, Dimitris A. Pados

Abstract:

The objective of this work is to develop a theoretical framework for reliable digital recording system identification from digital audio files alone, for forensic purposes. A digital recording system consists of a microphone and a digital sound processing card. We view the cascade as a system of unknown transfer function. We expect same manufacturer and model microphone-sound card combinations to have very similar/near identical transfer functions, bar any unique manufacturing defect. Input voice (or other) signals are modeled as non-stationary processes. The technical problem under consideration becomes blind deconvolution with non-stationary inputs as it manifests itself in the specific application of digital audio recording equipment classification.

Keywords: blind system identification, audio fingerprinting, blind deconvolution, blind dereverberation

Procedia PDF Downloads 304
19892 Application of Low-order Modeling Techniques and Neural-Network Based Models for System Identification

Authors: Venkatesh Pulletikurthi, Karthik B. Ariyur, Luciano Castillo

Abstract:

System identification from turbulent wakes provides a tactical advantage in preparing for, and predicting the trajectory of, opponents’ movements. A low-order modeling technique, POD, is used to predict the object based on the wake pattern and is compared with a pre-trained image recognition neural network (NN) that classifies the wake patterns into objects. It is demonstrated that low-order modeling with POD predicts the objects better than the pre-trained NN by ~30%.

Keywords: bluff body wakes, low-order modeling, neural network, system identification

Procedia PDF Downloads 180
19891 Smart Unmanned Parking System Based on Radio Frequency Identification Technology

Authors: Yu Qin

Abstract:

To tackle the ever-growing shortage of parking space, this paper presents the design and implementation of a smart unmanned parking system based on RFID (radio frequency identification) technology and wireless communication technology. The system uses RFID technology to achieve the identification function (transmitted by a 2.4 G wireless module) and is equipped with an STM32L053 microcontroller as the main control chip of the smart vehicle. This chip can accomplish automatic parking (in/out), charging and other functions. On this basis, it can also help users easily query the information stored in the database through the Internet. Experimental tests have shown that the system has the features of low power consumption and stable operation, among others. It can effectively improve the level of automation control of the parking lot management system and has enormous application prospects.

Keywords: RFID, embedded system, unmanned, parking management

Procedia PDF Downloads 333
19890 Face Tracking and Recognition Using Deep Learning Approach

Authors: Degale Desta, Cheng Jian

Abstract:

The most important factor in identifying a person is their face. Even identical twins have their own distinct faces. As a result, identification and face recognition are needed to tell one person from another. A face recognition system is a verification tool used to establish a person's identity using biometrics. Nowadays, face recognition is a common technique used in a variety of applications, including home security systems, criminal identification, and phone unlock systems. This system is more secure because it only requires a facial image instead of other dependencies like a key or card. Face detection and face identification are the two phases that typically make up a human recognition system. This paper explains the idea behind designing and creating a face recognition system using deep learning with Azure ML Python's OpenCV. Face recognition is a task that can be accomplished using deep learning, and given the accuracy of this method, it appears to be a suitable approach. To show how accurate the suggested face recognition system is, experimental results are given: 98.46% accuracy using Fast-RCNN, with the performance of the algorithms reported under different training conditions.

Keywords: deep learning, face recognition, identification, fast-RCNN

Procedia PDF Downloads 140
19889 The Difference of Learning Outcomes in Reading Comprehension between Text and Film as The Media in Indonesian Language for Foreign Speaker in Intermediate Level

Authors: Siti Ayu Ningsih

Abstract:

This study aims to find the differences in learning outcomes for reading comprehension with text and film as media in Indonesian Language for Foreign Speakers (BIPA) learning at the intermediate level. The study uses quantitative and qualitative research methods, and its single respondent is a grade nine secondary-level student from D'Royal Morocco Integrative Islamic School. The quantitative method is used to calculate the learning outcomes after the appropriate action cycle has been given, whereas the qualitative method is used to interpret and describe the findings derived from the quantitative method. The techniques used in this study are observation and testing. Based on the research, the use of text as a medium is more effective than film for intermediate-level learners of Indonesian Language for Foreign Speakers. This is because, when using film, the learner does not have enough time to note down difficult vocabulary or to look up its meaning in the dictionary. Text as a medium is more effective because it does not require additional time to note down difficult words. For words that are difficult or unfamiliar, the learner can immediately find the meaning in the dictionary. The presence of the text also helps learners of Indonesian Language for Foreign Speakers find the answers to the questions more easily, by matching the vocabulary of the question to the text references.

Keywords: Indonesian language for foreign speaker, learning outcome, media, reading comprehension

Procedia PDF Downloads 197
19888 An Automatic Speech Recognition of Conversational Telephone Speech in Malay Language

Authors: M. Draman, S. Z. Muhamad Yassin, M. S. Alias, Z. Lambak, M. I. Zulkifli, S. N. Padhi, K. N. Baharim, F. Maskuriy, A. I. A. Rahim

Abstract:

The performance of a Malay automatic speech recognition (ASR) system for the call centre environment is presented. The system uses the Kaldi toolkit as the platform for the entire library and the algorithms used in performing the ASR task. The acoustic model implemented in this system uses a deep neural network (DNN) method to model the acoustic signal, and the standard n-gram model is used for language modelling. With 80 hours of training data from call centre recordings, the ASR system achieves 72% accuracy, which corresponds to a 28% word error rate (WER). The testing was done using 20 hours of audio data. Despite the use of a DNN, the system shows low accuracy owing to the variety of noise, accents and dialects that typically occur in the Malaysian call centre environment. This significant variation among speakers is reflected in the large standard deviation of the average word error rate (WERav) (i.e., ~ 10%). The lowest WER (13.8%) was obtained from a recording sample of a native speaker with a standard Malay dialect (central Malaysia), compared to 49% for the sample with the highest WER, which contains the conversation of a speaker who uses a non-standard Malay dialect.
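
The relationship between the reported accuracy and WER can be illustrated with a minimal Python sketch of the usual WER definition: the word-level edit distance (substitutions, deletions and insertions) between the reference transcript and the recognizer output, divided by the number of reference words. The function name and the whitespace tokenization are assumptions made for this example, and treating accuracy as 1 - WER is inferred from the abstract's figures (72% and 28%) rather than stated by the authors.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("a b c d", "a x c")` counts one substitution and one deletion against four reference words. Because insertions are counted too, WER computed this way can exceed 100% on very poor output.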

Keywords: conversational speech recognition, deep neural network, Malay language, speech recognition

Procedia PDF Downloads 322