Search results for: cell recognition
5254 Speaker Recognition Using LIRA Neural Networks
Authors: Nestor A. Garcia Fragoso, Tetyana Baydyk, Ernst Kussul
Abstract:
This article contains information from our investigation in the field of voice recognition. For this purpose, we created a voice database that contains different phrases in two languages, English and Spanish, for men and women. As a classifier, the LIRA (Limited Receptive Area) grayscale neural classifier was selected. The LIRA grayscale neural classifier was developed for image recognition tasks and demonstrated good results. Therefore, we decided to develop a recognition system using this classifier for voice recognition. From a specific set of speakers, we can recognize the speaker’s voice. For this purpose, the system uses spectrograms of the voice signals as input, extracts the characteristics and identifies the speaker. The results are described and analyzed in this article. The classifier can be used for speaker identification in security systems or smart buildings for different types of intelligent devices.
Keywords: extreme learning, LIRA neural classifier, speaker identification, voice recognition
Procedia PDF Downloads 177
5253 Up-Regulation of SCUBE2 Expression in Co-Cultures of Human Mesenchymal Stem Cell and Breast Cancer Cells
Authors: Hirowati Ali, Aisyah Ellyanti, Dewi Rusnita, Septelia Inawati Wanandi
Abstract:
Stem cells are known for their potential to differentiate into many cell types. Recently, stem cells have been used in many treatments in degenerative medicine. It remains controversial whether stem cells can be used for therapy or whether they can activate cancer stem cells. SCUBE2 is a novel secreted and membrane-anchored protein that has been reported to play a role in better prognosis and in the inhibition of cancer cell proliferation. Our study aims to observe whether stem cells can up-regulate the SCUBE2 gene in the MCF-7 breast cancer cell line. We conducted an in vitro study using MCF-7 cells treated with stem cells derived from placenta Wharton's jelly, which are known for their stemness and are widely used. Our results showed that the MCF-7 cell line grows rapidly in a 6-well culture dish. Stem cells were also cultured in a 6-well dish. Once the MCF-7 cells reached 50%-60% confluence, we co-cultured them with stem cells for 24 hours and 48 hours. We hypothesize that the SCUBE2 gene, previously known for its higher expression in breast cancer with better prognosis, is up-regulated after the addition of stem cells to MCF-7 culture dishes.
Keywords: breast cancer cells, inhibition of cancer cells, mesenchymal stem cells, SCUBE2
Procedia PDF Downloads 340
5252 New Approaches for the Handwritten Digit Image Features Extraction for Recognition
Authors: U. Ravi Babu, Mohd Mastan
Abstract:
This paper proposes a novel approach for a handwritten digit recognition system. It extracts digit image features based on a distance measure and derives an algorithm to classify the digit images. The distance measure is computed on the thinned image; thinning is one of the preprocessing techniques in image processing. The paper concentrates mainly on extracting features from the digit image for effective recognition of the numeral. To assess its effectiveness, the proposed method was tested on the MNIST, CENPARMI, and CEDAR databases and on newly collected data. The method was applied to more than one lakh (100,000) digit images and achieves good comparative recognition results, with a recognition rate of about 97.32%.
Keywords: handwritten digit recognition, distance measure, MNIST database, image features
Procedia PDF Downloads 461
5251 Emotion Recognition in Video and Images in the Wild
Authors: Faizan Tariq, Moayid Ali Zaidi
Abstract:
Facial emotion recognition algorithms are expanding rapidly nowadays. People are using different algorithms in different combinations to obtain the best results. Six basic emotions are studied in this area. We tried to recognize facial expressions using object detection algorithms instead of traditional algorithms. Two object detection algorithms were chosen: Faster R-CNN and YOLO. For pre-processing, we used image rotation and batch normalization. The dataset chosen for the experiments is Static Facial Expressions in the Wild (SFEW). Our approach worked well, but there is still a lot of room for improvement, which will be a future direction.
Keywords: face recognition, emotion recognition, deep learning, CNN
Procedia PDF Downloads 187
5250 An Improved Face Recognition Algorithm Using Histogram-Based Features in Spatial and Frequency Domains
Authors: Qiu Chen, Koji Kotani, Feifei Lee, Tadahiro Ohmi
Abstract:
In this paper, we propose an improved face recognition algorithm using histogram-based features in spatial and frequency domains. To add spatial information of the face and improve recognition performance, a region-division (RD) method is utilized. The facial area is first divided into several regions; feature vectors of each facial part are then generated by a Binary Vector Quantization (BVQ) histogram using DCT coefficients in low frequency domains, as well as a Local Binary Pattern (LBP) histogram in the spatial domain. Recognition results for the different regions are first obtained separately and then fused by weighted averaging. The publicly available ORL database is used to evaluate the proposed algorithm; it consists of 40 subjects with 10 images per subject containing variations in lighting, pose, and expression. It is demonstrated that face recognition using the RD method can achieve a much higher recognition rate.
Keywords: binary vector quantization (BVQ), DCT coefficients, face recognition, local binary patterns (LBP)
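The LBP histogram and the weighted-average fusion named in this abstract can be illustrated with a minimal sketch. It assumes the basic 3x3, 8-neighbour LBP variant and plain Python lists as images; the abstract does not state which LBP variant, region layout, or weights the authors used, so the function names, the neighbour ordering, and the example weights are all illustrative assumptions.

```python
def lbp_code(img, r, c):
    """8-neighbour LBP code for the interior pixel at (r, c); img is a list of rows."""
    center = img[r][c]
    # Neighbours visited clockwise starting at the top-left pixel (an assumed ordering).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels of one region."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist


def fuse_scores(scores, weights):
    """Weighted average of per-region matching scores (weights are hypothetical)."""
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)
```

In a region-division setting, one histogram would be computed per facial region and the per-region match scores combined with `fuse_scores`.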
Procedia PDF Downloads 349
5249 Deep-Learning Based Approach to Facial Emotion Recognition through Convolutional Neural Network
Authors: Nouha Khediri, Mohammed Ben Ammar, Monji Kherallah
Abstract:
Recently, facial emotion recognition (FER) has become increasingly essential to understand the state of the human mind. Accurately classifying emotion from the face is a challenging task. In this paper, we present a facial emotion recognition approach named CV-FER, benefiting from deep learning, especially CNN and VGG16. First, the data is pre-processed with data cleaning and data rotation. Then, we augment the data and proceed to our FER model, which contains five convolutional layers and five pooling layers. Finally, a softmax classifier is used in the output layer to recognize emotions. Based on the above contents, this paper reviews the works of facial emotion recognition based on deep learning. Experiments show that our model outperforms the other methods using the same FER2013 database and yields a recognition rate of 92%. We also put forward some suggestions for future work.
Keywords: CNN, deep-learning, facial emotion recognition, machine learning
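The softmax output layer mentioned in this abstract can be shown in isolation. The sketch below is a generic, numerically stable softmax and not the CV-FER implementation; the function name and any logits passed to it are illustrative assumptions.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    # Subtracting the maximum avoids overflow in exp() without changing the result.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

The predicted emotion is then the index of the largest probability, for example `max(range(len(p)), key=p.__getitem__)` for a probability list `p`.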
Procedia PDF Downloads 95
5248 An Insight into the Conformational Dynamics of Glycan through Molecular Dynamics Simulation
Authors: K. Veluraja
Abstract:
The glycan of glycolipids and glycoproteins plays a significant role in living systems, particularly in molecular recognition processes. Molecular recognition processes are attributed to their occurrence on the surface of the cell, the sequential arrangement and type of sugar molecules present in the oligosaccharide structure, glycosidic linkage diversity (glycoinformatics) and conformational diversity (glycoconformatics). Molecular Dynamics Simulation is a theoretical-cum-computational tool successfully utilized to establish the glycoconformatics of glycan. The study of various oligosaccharides of glycan clearly indicates that oligosaccharides exist in multiple conformational states and that these states arise from the flexibility associated with the glycosidic torsional angles (φ,ψ). As an example, a single disaccharide structure, NeuNacα(2-3) Gal, exists in three different conformational states due to differences in the preferred values of the glycosidic torsional angles (φ,ψ). Hence, establishing three-dimensional structural and conformational models for glycan (Cartesian coordinates of every individual atom of an oligosaccharide structure in a preferred conformation) is crucial to understanding various molecular recognition processes such as glycan-toxin interaction and glycan-virus interaction. The glycoconformatics models obtained for various glycans through Molecular Dynamics Simulation are stored in 3DSDSCAR (3DSDSCAR.ORG), our public domain database; its utility in understanding molecular recognition processes and in drug design ventures will be discussed.
Keywords: glycan, glycoconformatics, molecular dynamics simulation, oligosaccharide
Procedia PDF Downloads 137
5247 Facial Emotion Recognition Using Deep Learning
Authors: Ashutosh Mishra, Nikhil Goyal
Abstract:
A 3D facial emotion recognition model based on deep learning is proposed in this paper. Two convolution layers and a pooling layer are employed in the deep learning architecture. Pooling is performed after the convolution process. The probabilities for the various classes of human faces are calculated using the sigmoid activation function. To verify the efficiency of deep learning-based systems, a set of faces is used: the Kaggle dataset serves to verify the accuracy of the deep learning-based face recognition model. The model's accuracy is about 65 percent, which is lower than that of other facial expression recognition techniques, despite significant gains in representation precision due to the nonlinearity of deep image representations.
Keywords: facial recognition, computational intelligence, convolutional neural network, depth map
Procedia PDF Downloads 231
5246 Efficient Pre-Processing of Single-Cell Assay for Transposase Accessible Chromatin with High-Throughput Sequencing Data
Authors: Fan Gao, Lior Pachter
Abstract:
The primary tool currently used to pre-process 10X Chromium single-cell ATAC-seq data is Cell Ranger, which can take a very long time to run on standard datasets. To facilitate rapid pre-processing that enables reproducible workflows, we present a suite of tools called scATAK for pre-processing single-cell ATAC-seq data that is 15 to 18 times faster than Cell Ranger on mouse and human samples. Our tool can also calculate chromatin interaction potential matrices and generate open chromatin signal and interaction traces for cell groups. We use the scATAK tool to explore the chromatin regulatory landscape of a healthy adult human brain and unveil cell-type specific features, and show that it provides a convenient and computationally efficient approach for pre-processing single-cell ATAC-seq data.
Keywords: single-cell, ATAC-seq, bioinformatics, open chromatin landscape, chromatin interactome
Procedia PDF Downloads 155
5245 Hand Detection and Recognition for Malay Sign Language
Authors: Mohd Noah A. Rahman, Afzaal H. Seyal, Norhafilah Bara
Abstract:
Interest keeps growing in software applications that interface with computers and peripheral devices through gestures of the human body such as hand movements. Hand gesture detection and recognition based on computer vision techniques remains a very challenging task. The aim is to provide a more natural, innovative and sophisticated means of non-verbal communication, such as sign language, in human-computer interaction. This paper explores hand detection and hand gesture recognition using a vision-based approach. Skin color spaces such as HSV and YCrCb are applied for hand detection and recognition. However, there are limitations that need to be considered. Almost all skin color space models are sensitive to quickly changing or mixed lighting conditions. Certain restrictions must also hold for hand recognition to give better results, such as the distance of the user’s hand from the webcam and the posture and size of the hand.
Keywords: hand detection, hand gesture, hand recognition, sign language
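Skin-color detection in the HSV space, one of the two color spaces named in this abstract, might look like the per-pixel test sketched below. The threshold values and the function name are illustrative assumptions and are not taken from the paper; as the abstract itself notes, fixed thresholds of this kind are sensitive to lighting.

```python
import colorsys


def is_skin_hsv(r, g, b, h_max=50 / 360, s_min=0.15, s_max=0.70, v_min=0.35):
    """Return True if an 8-bit RGB pixel falls inside a rough HSV skin range.

    All thresholds are hypothetical defaults chosen for illustration only.
    """
    # colorsys works on floats in [0, 1] and returns hue as a fraction of a full turn.
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return h <= h_max and s_min <= s <= s_max and v >= v_min
```

Applying the test to every pixel yields a binary mask; a hand detector would then typically look for the largest connected skin region in that mask.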
Procedia PDF Downloads 306
5244 Small Text Extraction from Documents and Chart Images
Authors: Rominkumar Busa, Shahira K. C., Lijiya A.
Abstract:
Text recognition is an important area in computer vision which deals with detecting and recognising text in an image. Optical Character Recognition (OCR) is a saturated area these days, with very good text recognition accuracy. However, when the same OCR methods are applied to text with small font sizes, such as the text in chart images, the recognition rate is less than 30%. This work aims to extract small text in images using a deep learning model, CRNN with CTC loss. The text recognition accuracy is found to improve when image enhancement by super resolution is applied before the CRNN model. We also observe that the text recognition rate increases further by 18% with the proposed method, which involves super resolution and character segmentation followed by CRNN with CTC loss. The efficiency of the proposed method shows that further pre-processing of chart image text and other small text images will improve the accuracy further, thereby helping text extraction from chart images.
Keywords: small text extraction, OCR, scene text recognition, CRNN
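A model trained with CTC loss emits one label distribution per time step, and the text is recovered by collapsing that sequence. The sketch below shows the standard greedy (best-path) CTC decoding rule; it is a generic illustration and not the authors' pipeline, and the function name and the convention that index 0 is the blank symbol are assumptions.

```python
def ctc_greedy_decode(frame_probs, alphabet, blank=0):
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks.

    frame_probs: list of per-frame probability lists, one entry per symbol.
    alphabet:    sequence mapping symbol index to character (index `blank` unused).
    """
    # Pick the most likely symbol index in every frame.
    best = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    out = []
    prev = None
    for idx in best:
        # Emit a character only when the symbol changes and is not the blank.
        if idx != prev and idx != blank:
            out.append(alphabet[idx])
        prev = idx
    return "".join(out)
```

The blank symbol is what allows genuine double letters: two identical characters survive only if a blank frame separates them.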
Procedia PDF Downloads 125
5243 Recognition and Protection of Indigenous Society in Indonesia
Authors: Triyanto, Rima Vien Permata Hartanto
Abstract:
Indonesia is a legal state. The consequence of this status is the recognition and protection of the existence of indigenous peoples. This paper aims to describe the dynamics of legal recognition and protection for indigenous peoples within the framework of Indonesian law. This paper is library research based on literature. The result states that although the constitution has normatively recognized the existence of indigenous peoples and their traditional rights, in reality, not all rights were recognized and protected. The protection and recognition for indigenous people need to be strengthened.
Keywords: indigenous peoples, customary law, state law, state of law
Procedia PDF Downloads 330
5242 Detecting Characters as Objects Towards Character Recognition on Licence Plates
Authors: Alden Boby, Dane Brown, James Connan
Abstract:
Character recognition is a well-researched topic across disciplines. Regardless, creating a solution that can cater to multiple situations is still challenging. Vehicle licence plates lack an international standard, meaning that different countries and regions have their own licence plate format. A problem that arises from this is that the typefaces and designs from different regions make it difficult to create a solution that can cater to a wide range of licence plates. The main issue concerning detection is the character recognition stage. This paper aims to create an object detection-based character recognition model trained on a custom dataset that consists of typefaces of licence plates from various regions. Given that characters have features that are consistently maintained across an array of fonts, YOLO can be trained to recognise characters based on these features, which may provide better performance than OCR methods such as Tesseract OCR.
Keywords: computer vision, character recognition, licence plate recognition, object detection
Procedia PDF Downloads 121
5241 Natural Honey and Effect on the Activity of the Cells
Authors: Abujnah Dukali
Abstract:
Natural honey was assessed in a cell culture system for its anticancer activity. The human leukemic cell line HL-60 was treated with honey and cultured for 5 days, and cytotoxicity was calculated by MTT assay. Honey showed cytotoxicity with a CC50 value of 174.20 µg/ml. Radical modulation activity was assessed by lipid peroxidation assay using egg lecithin. Honey showed antioxidant activity with an EC50 value of 159.73 µg/ml. In addition, treatment of HL-60 cells also resulted in nuclear DNA fragmentation, as seen in agarose gel electrophoresis. This is a hallmark of cells undergoing apoptosis. Apoptosis was confirmed by staining the cells with Annexin V and FACS analysis. Apoptosis is an active, genetically regulated disassembly of the cell from within. Disassembly creates changes in the phospholipid content of the outer leaflet of the cytoplasmic membrane. Phosphatidylserine (PS) is translocated from the inner to the outer surface of the cell for phagocytic cell recognition. The human anticoagulant annexin V is a Ca2+-dependent phospholipid protein with a high affinity for PS. Annexin V labeled with fluorescein can identify apoptotic cells in the population. It is a confirmatory test for apoptosis. Annexin V-positive cells were defined as apoptotic cells. Since honey shows both antioxidant activity and cytotoxicity at almost the same concentration, it can prevent free radical-induced cancer as a prophylactic agent and kill cancer cells by the apoptotic process as a chemotherapeutic agent. Daily intake of honey can prevent cancer induction.
Keywords: anticancer, cells, DNA, honey
Procedia PDF Downloads 206
5240 Relevant LMA Features for Human Motion Recognition
Authors: Insaf Ajili, Malik Mallem, Jean-Yves Didier
Abstract:
Motion recognition from videos is a very complex task due to the high variability of motions. This paper describes the challenges of human motion recognition, especially the motion representation step with relevant features. Our descriptor vector is inspired by the Laban Movement Analysis method. We propose discriminative features using the Random Forest algorithm in order to remove redundant features and make learning algorithms operate faster and more effectively. We validate our method on the MSRC-12 and UTKinect datasets.
Keywords: discriminative LMA features, features reduction, human motion recognition, random forest
Procedia PDF Downloads 195
5239 Effects of Reversible Watermarking on Iris Recognition Performance
Authors: Andrew Lock, Alastair Allen
Abstract:
Fragile watermarking has been proposed as a means of adding additional security or functionality to biometric systems, particularly for authentication and tamper detection. In this paper we describe an experimental study on the effect of watermarking iris images with a particular class of fragile algorithm, reversible algorithms, and the ability to correctly perform iris recognition. We investigate two scenarios, matching watermarked images to unmodified images, and matching watermarked images to watermarked images. We show that different watermarking schemes give very different results for a given capacity, highlighting the importance of investigation. At high embedding rates most algorithms cause significant reduction in recognition performance. However, in many cases, for low embedding rates, recognition accuracy is improved by the watermarking process.
Keywords: biometrics, iris recognition, reversible watermarking, vision engineering
Procedia PDF Downloads 456
5238 Antigen-Presenting Cell Characteristics of Human γδ T Lymphocytes in Chronic Myeloid Leukemia
Authors: Piamsiri Sawaisorn, Tienrat Tangchaikeeree, Waraporn Chan-On, Chaniya Leepiyasakulchai, Rachanee Udomsangpetch, Suradej Hongeng, Kulachart Jangpatarapongsa
Abstract:
Human Vγ9Vδ2 T lymphocytes are regarded as promising effector cells for cancer immunotherapy since they have the ability to eliminate several tumor cells through non-peptide antigen recognition and non-major histocompatibility complex (MHC) restriction. An issue of recent interest is the capability to activate γδ T cells by use of a group of drugs, such as pamidronate, that cause accumulation of phosphoantigen which is recognized by γδ T cell receptors. Moreover, their antigen presenting cell-like phenotype and function have been confirmed in many clinical trials. In this study, Vγ9Vδ2 T cells derived from normal peripheral blood mononuclear cells were activated with pamidronate, and the expanded Vγ9Vδ2 T cells were able to recognize and kill chronic myeloid leukemia (CML) cells treated with pamidronate through their cytotoxic activity. To support the strong role played by Vγ9Vδ2 T cells against cancer, we provide evidence that Vγ9Vδ2 T cells activated with CML cell lysate antigen can efficiently express antigen presenting cell (APC) phenotype and function. In conclusion, pamidronate can be used for intentional activation of human Vγ9Vδ2 T cells and can increase the susceptibility of CML cells to the cytotoxicity of Vγ9Vδ2 T cells. Vγ9Vδ2 T cells activated by cancer cell lysate can show APC characteristics, which greatly increases the interest in exploring their therapeutic potential in hematologic malignancy.
Keywords: γδ T lymphocytes, antigen-presenting cells, chronic myeloid leukemia, cancer, immunotherapy
Procedia PDF Downloads 186
5237 Preparation of Gramine Nanosuspension and Protective Effect of Gramine on Human Oral Cell Lines by Induction of Apoptosis
Authors: K. Suresh, R. Arunkumar
Abstract:
The objective of this study is to investigate the preparation of a gramine nanosuspension and the protective effect of gramine on the apoptosis of laryngeal cancer cell lines (HEp-2 and KB). The growth inhibition rate of HEp-2 and KB cells in vitro was measured by MTT assay, and apoptosis was assessed by levels of reactive oxygen species, mitochondrial membrane potential, morphological changes and flow cytometry. Based on the results, we determined the effective doses of gramine as 127.23 µm/ml for 24 hr and 119.81 µm/ml for 48 hr in the HEp-2 cell line, and 147.58 µm/ml for 24 hr and 123.74 µm/ml for 48 hr in the KB cell line. The cytotoxic effects of gramine were confirmed by treating HEp-2 and KB cells with the IC50 concentration of gramine, which resulted in a sequence of events marked by enhanced apoptosis accompanied by loss of cell viability, modulation of reactive oxygen species and cell cycle arrest through the induction of G0/G1 phase arrest in HEp-2 cells. Our study suggests that the gramine nanosuspension has a stronger cytotoxic effect on cancer cells and is a novel candidate for cancer chemoprevention.
Keywords: apoptosis, HEp-2 cell line, KB cell line, mitochondria, gramine, nanosuspension
Procedia PDF Downloads 453
5236 ICanny: CNN Modulation Recognition Algorithm
Authors: Jingpeng Gao, Xinrui Mao, Zhibin Deng
Abstract:
To address the low recognition rate of composite signal modulation at low signal-to-noise ratio (SNR), this paper proposes a modulation recognition algorithm based on ICanny-CNN. First, the radar signal is transformed into a time-frequency image by the Choi-Williams Distribution (CWD). Second, we propose an image processing algorithm using the Guided Filter and a threshold selection method, combined with hole filling and a mask operation. Finally, a shallow convolutional neural network (CNN) is combined with the ideas of depth-wise convolution (Dw Conv) and point-wise convolution (Pw Conv). The proposed CNN is designed to perform image classification and thereby realize modulation recognition of radar signals. The simulation results show that the proposed algorithm reaches a recognition rate of 90.83% at 0 dB and 71.52% at -8 dB. Therefore, the proposed algorithm has good classification and anti-noise performance in radar signal modulation recognition and other fields.
Keywords: modulation recognition, image processing, composite signal, improved Canny algorithm
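The usual motivation for pairing depth-wise with point-wise convolution is a reduction in weights relative to a standard convolution. The sketch below compares the weight counts of the two designs for a single layer, ignoring biases. The abstract does not give the layer sizes of the proposed CNN, so the function names and any kernel size or channel counts passed in are hypothetical.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution: every filter spans all input channels."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights in a depth-wise convolution followed by a 1 x 1 point-wise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 filters that mix channels
    return depthwise + pointwise
```

For a hypothetical 3 x 3 layer with 32 input and 64 output channels, the standard form needs 18432 weights and the separable form 2336, roughly an eightfold reduction.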
Procedia PDF Downloads 191
5235 Video Based Automatic License Plate Recognition System
Authors: Ali Ganoun, Wesam Algablawi, Wasim BenAnaif
Abstract:
Video-based traffic surveillance using a License Plate Recognition (LPR) system is an essential part of any intelligent traffic management system. The LPR system utilizes computer vision and pattern recognition technologies to obtain traffic and road information by detecting and recognizing vehicles based on their license plates. Generally, the video-based LPR system is a challenging area of research due to the variety of environmental conditions. LPR systems are used in a wide range of commercial applications such as collision warning systems, finding stolen cars, controlling access to car parks and automatic congestion charge systems. This paper presents an automatic LPR system for Libyan license plates. The performance of the proposed system is evaluated with three video sequences.
Keywords: license plate recognition, localization, segmentation, recognition
Procedia PDF Downloads 464
5234 Genetic Algorithm Based Deep Learning Parameters Tuning for Robot Object Recognition and Grasping
Authors: Delowar Hossain, Genci Capi
Abstract:
This paper is concerned with the problem of tuning deep learning parameters using a genetic algorithm (GA) in order to improve the performance of a deep learning (DL) method. We present a GA-based DL method for robot object recognition and grasping. The GA is used to optimize the DL parameters in the learning procedure in terms of a fitness function, until the fitness is good enough. After the evolution process finishes, we obtain the optimal number of DL parameters. To evaluate the performance of our method, we consider object recognition and robot grasping tasks. Experimental results show that our method is efficient for robot object recognition and grasping.
Keywords: deep learning, genetic algorithm, object recognition, robot grasping
Procedia PDF Downloads 353
5233 Face Recognition Using Discrete Orthogonal Hahn Moments
Authors: Fatima Akhmedova, Simon Liao
Abstract:
One of the most critical decision points in the design of a face recognition system is the choice of an appropriate face representation. Effective feature descriptors are expected to convey sufficient, invariant and non-redundant facial information. In this work, we propose a set of Hahn moments as a new approach for feature description. Hahn moments have been widely used in image analysis due to their invariance, non-redundancy and ability to extract features either globally or locally. To assess the applicability of Hahn moments to face recognition, we conduct two experiments on the Olivetti Research Laboratory (ORL) database and the University of Notre-Dame (UND) X1 biometric collection. A fusion of the global features with the features from local facial regions is used as input for a conventional k-NN classifier. The method correctly recognizes 93% of subjects for the ORL database and 94% for the UND database.
Keywords: face recognition, Hahn moments, recognition-by-parts, time-lapse
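The final classification stage, a conventional k-NN classifier over fused feature vectors, can be sketched as follows. This is a generic majority-vote k-NN with Euclidean distance; the abstract does not state the value of k or the distance used, so those choices, the function name, and any feature vectors supplied are illustrative assumptions. Computing the Hahn moments themselves is outside the scope of this sketch.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label by majority vote among the k nearest training samples.

    train: list of (feature_vector, label) pairs.
    query: feature vector with the same length as the training vectors.
    """
    # Sort training samples by Euclidean distance to the query and keep the k closest.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

In a face recognition setting, each training pair would hold the fused global and local feature vector of one enrolled image together with its subject identifier.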
Procedia PDF Downloads 375
5232 Topology-Based Character Recognition Method for Coin Date Detection
Authors: Xingyu Pan, Laure Tougne
Abstract:
For recognizing coins, the engraved release date is important information for precisely identifying the monetary type. However, reading characters on coins faces many more obstacles than traditional character recognition tasks in other fields, such as reading scanned documents or license plates. To address this challenging issue in a numismatic context, we propose a training-free approach dedicated to the detection and recognition of the release date of the coin. In the first step, the date zone is detected by comparing histogram features; in the second step, a topology-based algorithm is introduced to recognize coin numbers with various font types represented by a binary gradient map. Our method obtained a recognition rate of 92% on synthetic data and of 44% on real noisy data.
Keywords: coin, detection, character recognition, topology
Procedia PDF Downloads 253
5231 Exploring Multi-Feature Based Action Recognition Using Multi-Dimensional Dynamic Time Warping
Authors: Guoliang Lu, Changhou Lu, Xueyong Li
Abstract:
In action recognition, previous studies have demonstrated the effectiveness of using multiple features to improve recognition performance. We focus on two practical issues: i) most studies directly concatenate or accumulate multiple features to evaluate the similarity between two actions. This approach could be too strong, since each kind of feature can have different dimensions, quantities, etc.; ii) in many studies, the classification methods employed lack a flexible and effective mechanism for adding new feature(s) to the classification. In this paper, we explore a unified scheme based on the recently proposed multi-dimensional dynamic time warping (MD-DTW). Experiments demonstrated the scheme's effectiveness in combining multiple features and its flexibility in adding new feature(s) to increase recognition performance. In addition, the explored scheme provides an open architecture for using new advanced classification methods in the future to enhance action recognition.
Keywords: action recognition, multi features, dynamic time warping, feature combination
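The core of multi-dimensional dynamic time warping is the classic DTW recurrence with a local cost computed over all feature dimensions of a frame at once. The sketch below is a minimal version of that general idea, assuming Euclidean local cost and the unconstrained step pattern; it is not the scheme explored in the paper, and the function name is an assumption.

```python
import math


def md_dtw(a, b):
    """DTW distance between two sequences of equal-dimension feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local cost uses every dimension of the two frames together.
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

Because the warping path may repeat frames, two performances of the same action at different speeds can still align with zero cost, which plain frame-by-frame comparison cannot achieve.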
Procedia PDF Downloads 437
5230 Voice Commands Recognition of Mentor Robot in Noisy Environment Using HTK
Authors: Khenfer-Koummich Fatma, Hendel Fatiha, Mesbahi Larbi
Abstract:
This paper presents an approach based on Hidden Markov Models (HMMs) using HTK tools. The goal is to create a man-machine interface with a voice recognition system that allows the operator to tele-operate a mentor robot to execute specific tasks such as rotate, raise, close, etc. This system should take into account different levels of environmental noise. The approach has been applied to isolated words representing the robot commands spoken in two languages: French and Arabic. The recognition rate obtained is the same for both languages, Arabic and French, for the neutral words. However, there is a slight difference in favor of the Arabic speech when Gaussian white noise is added at a Signal to Noise Ratio (SNR) of 30 dB: the recognition rate is 69% for Arabic speech and 80% for French speech. This can be explained by how the phonetic context of each language behaves when noise is added.
Keywords: voice command, HMM, TIMIT, noise, HTK, Arabic, speech recognition
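Adding Gaussian white noise at a target SNR, as in the 30 dB condition described here, amounts to scaling the noise power to the signal power. The sketch below shows one common way to do this on a list of samples; it is an illustration under assumptions (function names, a fixed random seed for reproducibility) and not the procedure used by the authors or by HTK.

```python
import math
import random


def add_white_noise(signal, target_snr_db, seed=0):
    """Return a copy of `signal` with Gaussian white noise at the requested SNR (dB)."""
    rng = random.Random(seed)
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR_dB = 10 * log10(P_signal / P_noise), solved for the noise power.
    noise_power = signal_power / (10 ** (target_snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB between a clean signal and its noisy version."""
    signal_power = sum(x * x for x in clean)
    noise_power = sum((y - x) ** 2 for x, y in zip(clean, noisy))
    return 10 * math.log10(signal_power / noise_power)
```

The measured SNR of a finite noisy signal only approximates the target, since the noise is a random draw; longer signals give a closer match.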
Procedia PDF Downloads 382
5229 Multi-Modal Feature Fusion Network for Speaker Recognition Task
Authors: Xiang Shijie, Zhou Dong, Tian Dan
Abstract:
Speaker recognition is a crucial task in the field of speech processing, aimed at identifying individuals based on their vocal characteristics. However, existing speaker recognition methods face numerous challenges. Traditional methods primarily rely on audio signals, which often suffer from limitations in noisy environments, variations in speaking style, and insufficient sample sizes. Additionally, relying solely on audio features can sometimes fail to capture the unique identity of the speaker comprehensively, impacting recognition accuracy. To address these issues, we propose a multi-modal network architecture that simultaneously processes both audio and text signals. By gradually integrating audio and text features, we leverage the strengths of both modalities to enhance the robustness and accuracy of speaker recognition. Our experiments demonstrate significant improvements with this multi-modal approach, particularly in complex environments, where recognition performance has been notably enhanced. Our research not only highlights the limitations of current speaker recognition methods but also showcases the effectiveness of multi-modal fusion techniques in overcoming these limitations, providing valuable insights for future research.
Keywords: feature fusion, memory network, multimodal input, speaker recognition
Procedia PDF Downloads 32
5228 Improved Dynamic Bayesian Networks Applied to Arabic On Line Characters Recognition
Authors: Redouane Tlemsani, Abdelkader Benyettou
Abstract:
This work addresses online Arabic character recognition; the principal motivation is to study Arabic handwriting with online technology. The system is a Markovian system, which can be viewed as a Dynamic Bayesian Network (DBN). One of the major interests of these systems lies in training complete models (topology and parameters) from training data. Our approach is based on the dynamic Bayesian network formalism. DBN theory is a generalization of Bayesian networks to dynamic processes. One of our objectives is to find better parameters, which represent the links (dependences) between the dynamic network variables. In pattern recognition applications, the structure is fixed in advance, which obliges us to admit some strong assumptions (for example, independence between some variables). Our application concerns online recognition of isolated Arabic characters using our laboratory database, NOUN. A neural tester is proposed for external optimization of the DBN. The DBN scores and DBN mixed are respectively 70.24% and 62.50%, which lets us anticipate their further development; other approaches that take time into account were considered and implemented until a significant recognition rate of 94.79% was obtained.
Keywords: Arabic on line character recognition, dynamic Bayesian network, pattern recognition, computer vision
Procedia PDF Downloads 428
5227 Bidirectional Dynamic Time Warping Algorithm for the Recognition of Isolated Words Impacted by Transient Noise Pulses
Authors: G. Tamulevičius, A. Serackis, T. Sledevič, D. Navakauskas
Abstract:
We consider the biggest challenge in speech recognition: noise reduction. Traditionally, detected transient noise pulses are removed together with the corrupted speech using pulse models. In this paper we propose to cope with the problem directly in the Dynamic Time Warping domain. A bidirectional Dynamic Time Warping algorithm for the recognition of isolated words impacted by transient noise pulses is proposed. It uses a simple transient noise pulse detector, employs bidirectional computation of dynamic time warping and directly manipulates the warping results. Experimental investigation with several alternative solutions confirms the effectiveness of the proposed algorithm in reducing the impact of noise on the recognition process: a 3.9% increase in noisy speech recognition is achieved.
Keywords: transient noise pulses, noise reduction, dynamic time warping, speech recognition
Procedia PDF Downloads 558
5226 Advanced Mouse Cursor Control and Speech Recognition Module
Authors: Prasad Kalagura, B. Veeresh kumar
Abstract:
We constructed an interface system that allows a paralyzed user to interact with a computer with almost full functional capability. A real-time tracking algorithm is implemented based on adaptive skin detection and motion analysis. Mouse clicks are triggered by the user's eye blinks, detected through a sensor. The keyboard function is implemented by a voice recognition kit.
Keywords: embedded ARM7 processor, mouse pointer control, voice recognition
Procedia PDF Downloads 578
5225 Object Recognition Approach Based on Generalized Hough Transform and Color Distribution Serving in Generating Arabic Sentences
Authors: Nada Farhani, Naim Terbeh, Mounir Zrigui
Abstract:
The recognition of objects contained in images has always been a research challenge because of the variability of shape, position, contrast of objects, etc. This paper is concerned with object recognition. The classical Hough Transform (HT) provides a tool for detecting straight line segments in images. The HT technique has been generalized (GHT) for the detection of arbitrary forms. With the GHT, the forms sought are not necessarily defined analytically but rather by a particular silhouette. For greater precision, we propose to combine the results from the GHT with the results of a similarity calculation between the histograms and the spatiograms of the images. The main purpose of our work is to use the concepts from recognition to generate sentences in Arabic that summarize the content of the image.
Keywords: recognition of shape, generalized hough transformation, histogram, spatiogram, learning
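A similarity calculation between color histograms can take several forms, and the abstract does not say which one is used. The sketch below uses histogram intersection on normalized histograms as one common choice; the measure, the function names, and the bin layout are assumptions, and spatiograms (which add spatial statistics per bin) are not covered.

```python
def normalize(hist):
    """Scale a histogram of counts so that its bins sum to 1."""
    total = sum(hist)
    return [h / total for h in hist]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: 1 for identical distributions, 0 for disjoint ones."""
    p, q = normalize(h1), normalize(h2)
    # Each bin contributes the mass the two distributions have in common.
    return sum(min(a, b) for a, b in zip(p, q))
```

A score of this kind could be combined with the GHT vote count to rank candidate detections, although how the two are weighted would be a design decision not specified here.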
Procedia PDF Downloads 158