Search results for: chord recognition
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1645

1645 Robustness of the Deep Chroma Extractor and Locally-Normalized Quarter Tone Filters in Automatic Chord Estimation under Reverberant Conditions

Authors: Luis Alvarado, Victor Poblete, Isaac Gonzalez, Yetzabeth Gonzalez

Abstract:

In MIREX 2016 (http://www.music-ir.org/mirex), the Deep Chroma Extractor, a deep neural network (DNN) proposed by Korzeniowski and Widmer, reached the highest score in an audio chord recognition task. In the present paper, this tool is assessed under reverberant acoustic environments and distinct source-microphone distances. The evaluation dataset comprises The Beatles and Queen datasets. These datasets are sequentially re-recorded with a single microphone in a real reverberant chamber at four reverberation times (approximately 0 (anechoic), 1, 2, and 3 s), as well as four source-microphone distances (32, 64, 128, and 256 cm). It is expected that the performance of the trained DNN will dramatically decrease under these acoustic conditions with signals degraded by room reverberation and distance to the source. Recently, the effect of the bio-inspired Locally-Normalized Cepstral Coefficients (LNCC) has been assessed in a text-independent speaker verification task using speech signals degraded by additive noise at different signal-to-noise ratios with variations of recording distance, and it has also been assessed under reverberant conditions with variations of recording distance. LNCC showed performance as high as that of the state-of-the-art Mel Frequency Cepstral Coefficient filters. Based on these results, this paper proposes a variation of locally-normalized triangular filters called Locally-Normalized Quarter Tone (LNQT) filters. By using the LNQT spectrogram, the trained Deep Chroma Extractor is expected to be more robust than with classical triangular filters, thus compensating for the degradation of the music signal and improving the accuracy of the chord recognition system.
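
As a rough illustration of what a quarter-tone triangular filter bank looks like, the sketch below places 24 filter centres per octave and evaluates a classical triangular weight. It does not reproduce the local normalization of the LNQT filters, which the abstract does not specify; the starting frequency of 110 Hz, the band count, and all function names are assumptions made only for this example.

```python
def quarter_tone_centres(f_min, n_bands):
    """Centre frequencies spaced a quarter tone apart (24 per octave)."""
    return [f_min * 2 ** (k / 24) for k in range(n_bands)]


def triangular_weight(freq, left, centre, right):
    """Weight of a classical triangular filter with the given three vertices."""
    if left < freq <= centre:
        return (freq - left) / (centre - left)
    if centre < freq < right:
        return (right - freq) / (right - centre)
    return 0.0


def band_energy(freqs, power, left, centre, right):
    """Weighted sum of spectral power falling inside one triangular filter."""
    return sum(p * triangular_weight(f, left, centre, right)
               for f, p in zip(freqs, power))


# Hypothetical settings: two octaves starting at A2 (110 Hz).
centres = quarter_tone_centres(110.0, 49)
```

Each filter uses its two neighbouring centres as the lower and upper vertices, so adjacent filters overlap by half their width.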

Keywords: chord recognition, deep neural networks, feature extraction, music information retrieval

Procedia PDF Downloads 193
1644 Topological Language for Classifying Linear Chord Diagrams via Intersection Graphs

Authors: Michela Quadrini

Abstract:

Chord diagrams occur in mathematics, from the study of RNA to knot theory. They are widely used in the theory of knots and links for studying finite type invariants, whereas in molecular biology one important motivation to study chord diagrams is to deal with the problem of RNA structure prediction. An RNA molecule is a linear polymer, referred to as the backbone, that consists of four types of nucleotides. Each nucleotide is represented by a point, whereas each chord of the diagram stands for one Watson-Crick base-pair interaction between two nonconsecutive nucleotides. A chord diagram is an oriented circle with a set of n pairs of distinct points, considered up to orientation-preserving diffeomorphisms of the circle. A linear chord diagram (LCD) is a special kind of graph obtained by cutting the oriented circle of a chord diagram. It consists of a line segment, called its backbone, to which are attached a number of chords with distinct endpoints. There is a natural fattening on any linear chord diagram; the backbone lies on the real axis, while all the chords are in the upper half-plane. Each linear chord diagram has a natural genus, that of its associated surface. To each chord diagram and linear chord diagram, it is possible to associate an intersection graph. It is a graph whose vertices correspond to the chords of the diagram, whereas the chord intersections are represented by edges between the vertices. Such an intersection graph carries a lot of information about the diagram. Our goal is to define an LCD equivalence class in terms of identity of intersection graphs, on which many chord diagram invariants depend. For studying these invariants, we introduce a new representation of linear chord diagrams based on a set of appropriate topological operators that makes it possible to model an LCD in terms of the relations among chords. This set is composed of crossing, nesting, and concatenation.
The crossing operator is able to generate the whole space of linear chord diagrams, and a multiple context-free grammar is defined that uniquely generates each LCD starting from a linear chord diagram, adding a chord for each production of the grammar. In other words, it associates a unique algebraic term with each linear chord diagram, while the remaining operators allow the term to be rewritten through a set of appropriate rewriting rules. Such rules define an LCD equivalence class in terms of the identity of intersection graphs. Starting from a modelled RNA molecule and the linear chord, some authors proposed a topological classification and folding. Our LCD equivalence class could contribute to the RNA folding problem, leading to the definition of an algorithm that calculates the free energy of the molecule more accurately than the existing ones. Such an LCD equivalence class could be useful to obtain a more accurate estimate of the link between the crossing number and the topological genus and to study the relation among other invariants.
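
The intersection graph described above can be sketched in a few lines. This is a minimal illustration, not the grammar or rewriting system of the paper: chords are assumed to be given as pairs of distinct integer positions on the backbone, and the function names are hypothetical.

```python
from itertools import combinations


def chords_cross(a, b):
    """Two chords cross when exactly one endpoint of one lies between the endpoints of the other."""
    (a1, a2), (b1, b2) = sorted(a), sorted(b)
    return a1 < b1 < a2 < b2 or b1 < a1 < b2 < a2


def intersection_graph(chords):
    """Adjacency sets of the intersection graph; vertex i stands for chords[i]."""
    graph = {i: set() for i in range(len(chords))}
    for i, j in combinations(range(len(chords)), 2):
        if chords_cross(chords[i], chords[j]):
            graph[i].add(j)
            graph[j].add(i)
    return graph


# Three two-chord diagrams on backbone positions 1..4.
crossing = [(1, 3), (2, 4)]
nesting = [(1, 4), (2, 3)]
concatenation = [(1, 2), (3, 4)]
```

In this small example, the nested and concatenated diagrams produce the same edgeless intersection graph, while the crossing diagram produces a single edge. This is the sense in which distinct diagrams can fall into one class when compared by intersection graph.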

Keywords: chord diagrams, linear chord diagram, equivalence class, topological language

Procedia PDF Downloads 171
1643 Investigation of Chord Protocol in Peer to Peer Wireless Mesh Network with Mobility

Authors: P. Prasanna Murali Krishna, M. V. Subramanyam, K. Satya Prasad

Abstract:

File sharing in networks is generally achieved using Peer-to-Peer (P2P) applications. Structured P2P approaches are widely used in ad hoc networks due to their distributed and scalable nature. Efficient mechanisms are required to handle the huge amount of data distributed to all peers. The intrinsic characteristics of a P2P system make for easier content distribution when compared to a client-server architecture. All the nodes in a P2P network act as both client and server; thus, distributing data takes less time when compared to the client-server method. The CHORD protocol is a resource-routing-based approach in which nodes and data items are structured into a 1-dimensional ring. The structured lookup algorithm of Chord is advantageous for distributed P2P networking applications. Although the structured approach improves lookup performance in a high-bandwidth wired network, it could contribute to unnecessary overhead in overlay networks, leading to degradation of network performance. In this paper, the performance of the existing CHORD protocol on a Wireless Mesh Network (WMN) when nodes are static and dynamic is investigated.
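
The ring structure mentioned above can be illustrated with a small sketch of Chord's key placement: a key is stored on its successor, the first node met going clockwise from the key's identifier, and each node keeps a finger table of successors at power-of-two offsets. The 6-bit identifier space and the node identifiers below are illustrative assumptions and are not taken from the paper's experiments.

```python
from bisect import bisect_left

M = 6  # identifier bits, so the ring has 2**M = 64 positions (illustrative)


def successor(node_ids, key):
    """First node identifier clockwise from key; node_ids must be sorted."""
    i = bisect_left(node_ids, key % (2 ** M))
    return node_ids[i % len(node_ids)]  # wrap around past the largest identifier


def finger_table(node_ids, n):
    """Entry i points to the successor of n + 2**i on the ring."""
    return [successor(node_ids, (n + 2 ** i) % (2 ** M)) for i in range(M)]


nodes = sorted([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])
owner_of_54 = successor(nodes, 54)   # node responsible for key 54
fingers_of_8 = finger_table(nodes, 8)
```

Because finger offsets double at each entry, a lookup can roughly halve the remaining ring distance at every hop, which is where the logarithmic lookup cost of structured overlays comes from.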

Keywords: wireless mesh network (WMN), structured P2P networks, peer to peer resource sharing, CHORD Protocol, DHT

Procedia PDF Downloads 444
1642 Aural Skills Pedagogy for Students with Absolute Pitch

Authors: Rika Uchida

Abstract:

In teaching sophomore-level aural skills, I have dealt with students with absolute pitch who do poorly in my courses, particularly in harmonic dictation. They can identify triads; however, identifying the quality of seventh chords or chromatic chords poses serious challenges. Most often, they need to spell all the pitches before identifying the chord qualities and Roman numerals. Growing up in a country where acquiring absolute pitch is considered essential, I started my early music training with the fixed do system at age three and learned all my music with solfege. When I was assigned as a TA in aural skills courses at graduate school in the US, I had to learn relative pitch quickly. My survival method was listening to music with absolute pitch first, then quickly "translating" to relative pitch. In teaching my courses, I have been using chord progressions (5-8 chords total), in which students are asked to sing chord arpeggiation with movable do solfege. I use the same progressions for harmonic dictation; I hoped that students would learn to integrate singing and listening skills by working with the same materials in both. This method has proven to be successful for most students; in particular, it has helped students with absolute pitch to hear chord quality and function. Although the original progressions are written with C as the tonic, they can identify chords in harmonic dictation in other keys as well. In short, I believe singing chord progressions with movable do arpeggiation helps students with absolute pitch to improve their hearing of the function and quality of chords in harmonic dictation.

Keywords: aural skills pedagogy, music theory, absolute pitch, harmonic dictation

Procedia PDF Downloads 108
1641 Handwriting Recognition of Gurmukhi Script: A Survey of Online and Offline Techniques

Authors: Ravneet Kaur

Abstract:

Character recognition is a very interesting area of pattern recognition. Over the past few decades, intensive research on character recognition for Roman, Chinese, Japanese, and Indian scripts has been reported. In this paper, a review of handwritten character recognition work on the Indian script Gurmukhi is presented. Most of the published papers are summarized, various methodologies are analysed, and their results are reported.

Keywords: Gurmukhi character recognition, online, offline, HCR survey

Procedia PDF Downloads 395
1640 OCR/ICR Text Recognition Using ABBYY FineReader as an Example Text

Authors: A. R. Bagirzade, A. Sh. Najafova, S. M. Yessirkepova, E. S. Albert

Abstract:

This article describes a text recognition method based on Optical Character Recognition (OCR). The features of the OCR method were examined using the ABBYY FineReader program. It describes automatic text recognition in images. OCR is necessary because optical input devices can only transmit raster graphics as a result. Text recognition is the task of recognizing the letters shown in such graphics and assigning each of them a numerical value in accordance with the usual text encoding (ASCII, Unicode). Using ABBYY FineReader as an example, the study confirmed and demonstrated in practice the improvement of digital text recognition platforms developed by Electronic Publication.

Keywords: ABBYY FineReader system, algorithm symbol recognition, OCR/ICR techniques, recognition technologies

Procedia PDF Downloads 134
1639 An Improved OCR Algorithm on Appearance Recognition of Electronic Components Based on Self-adaptation of Multifont Template

Authors: Zhu-Qing Jia, Tao Lin, Tong Zhou

Abstract:

Optical Character Recognition has been extensively utilized, but it is rarely employed specifically in the recognition of electronic components. This paper suggests a highly effective algorithm for appearance identification of integrated circuit components based on existing methods of character recognition, and analyzes its pros and cons.

Keywords: optical character recognition, fuzzy page identification, mutual correlation matrix, confidence self-adaptation

Procedia PDF Downloads 507
1638 Horizontal Circular Curve Computations Using a Developed Calculator

Authors: Adil Hassabo

Abstract:

In this paper, a calculator for horizontal circular curve computations is developed for Microsoft Windows. The developed calculator can be used for determining the information required for setting out horizontal curves. Three methods are applied in the developed program, namely the incremental chord method, the total chord method, and the coordinates method. Computations of horizontal curves by the developed calculator are faster, easier, more accurate, and less subject to errors than the traditional method of calculation. Finally, the results obtained by the traditional method and by the developed calculator are presented for checking the behavior of the developed calculator.
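
The three setting-out methods named above rest on standard circular-curve geometry, which can be sketched as follows. This is not the author's program: it assumes a simple curve of radius R and deflection angle Δ, measures arc length from the start of the curve, and takes x along the initial tangent; the function names and the example values (R = 100, Δ = 60°, 20-unit stations) are hypothetical.

```python
import math


def curve_elements(radius, delta_deg):
    """Tangent length, arc length and long chord of a simple circular curve."""
    delta = math.radians(delta_deg)
    return {
        "tangent": radius * math.tan(delta / 2),
        "arc_length": radius * delta,
        "long_chord": 2 * radius * math.sin(delta / 2),
    }


def setting_out(radius, delta_deg, interval):
    """One row per station: chords for both chord methods plus tangent-based coordinates."""
    total = radius * math.radians(delta_deg)
    arcs = [k * interval for k in range(1, int(total // interval) + 1)]
    if not arcs or arcs[-1] < total:
        arcs.append(total)  # close the table at the end of the curve
    rows, previous = [], 0.0
    for arc in arcs:
        deflection = arc / (2 * radius)  # radians, measured from the initial tangent
        rows.append({
            "arc": arc,
            "deflection_deg": math.degrees(deflection),
            # incremental chord: from the previous station to this one
            "incremental_chord": 2 * radius * math.sin((arc - previous) / (2 * radius)),
            # total chord: from the start of the curve to this station
            "total_chord": 2 * radius * math.sin(deflection),
            # coordinates: x along the tangent, y perpendicular to it
            "x": radius * math.sin(2 * deflection),
            "y": radius * (1 - math.cos(2 * deflection)),
        })
        previous = arc
    return rows


elements = curve_elements(100.0, 60.0)
rows = setting_out(100.0, 60.0, 20.0)
```

At the last station the total chord equals the long chord and the deflection angle equals Δ/2, which gives a quick consistency check on any table produced this way.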

Keywords: calculator, circular, computations, curve

Procedia PDF Downloads 128
1637 Facial Recognition on the Basis of Facial Fragments

Authors: Tetyana Baydyk, Ernst Kussul, Sandra Bonilla Meza

Abstract:

There are many articles that attempt to establish the role of different facial fragments in face recognition. Various approaches are used to estimate this role. Frequently, authors calculate the entropy corresponding to the fragment. This approach can only give an approximate estimate. In this paper, we propose to use a more direct measure of the importance of different fragments for face recognition. We propose to select a recognition method and a face database and experimentally investigate the recognition rate using different fragments of faces. We present two such experiments in the paper. We selected the PCNC neural classifier as a method for face recognition and parts of the LFW (Labeled Faces in the Wild) face database as training and testing sets. The recognition rate of the best experiment is comparable with the recognition rate obtained using the whole face.

Keywords: face recognition, labeled faces in the wild (LFW) database, random local descriptor (RLD), random features

Procedia PDF Downloads 328
1636 Numerical Study of 5kW Vertical Axis Wind Turbine Using DOE Method

Authors: Yan-Ting Lin, Wei-Nian Su

Abstract:

The purpose of this paper is to demonstrate the design of a 5 kW vertical axis wind turbine (VAWT) using the DOE method. The NACA0015 airfoil was used for the design and 3D simulation. The critical design parameters in this investigation are chord length, tip speed ratio (TSR), aspect ratio (AR), and pitch angle. The RNG k-ε turbulence model and the sliding mesh method are adopted in the CFD simulation. The results show that the model with zero pitch, a chord length of 0.3 m, a TSR of 3, and an AR of 10 demonstrated the optimum aerodynamic power under a uniform inlet velocity of 10 m/s. The aerodynamic power is 3.61 kW and 3.89 kW at TSRs of 3 and 4, respectively. The aerodynamic power decreased dramatically when the TSR increased to 5.

Keywords: vertical axis wind turbine, CFD, DOE, VAWT

Procedia PDF Downloads 401
1635 DBN-Based Face Recognition System Using Light Field

Authors: Bing Gu

Abstract:

Most conventional facial recognition systems are based on image features, such as LBP and SIFT. Recently, some DBN-based 2D facial recognition systems have been proposed. However, we find that there are few DBN-based 3D facial recognition systems and little related research. 3D facial images include all the individual biometric information. We can use this information to build more accurate features, so we present our DBN-based face recognition system using Light Field. Light Field can be seen as another representation of a 3D image, and the Light Field Camera gives us a way to capture a Light Field. We use a commercially available Light Field Camera as the collector of our face recognition system, and the system achieves state-of-the-art performance while being as convenient as a conventional 2D face recognition system.

Keywords: DBN, face recognition, light field, Lytro

Procedia PDF Downloads 429
1634 Increment of Panel Flutter Margin Using Adaptive Stiffeners

Authors: S. Raja, K. M. Parammasivam, V. Aghilesh

Abstract:

Fluid-structure interaction is a crucial consideration in the design of many engineering systems such as flight vehicles and bridges. Aircraft lifting surfaces and turbine blades can fail due to oscillations caused by fluid-structure interaction. Hence, the present research focuses on the study of fluid-structure interaction. First, the effect of free vibration over the panel is studied. It is well known that the deformation of a panel and flow-induced forces affect one another. The selected panel has a span of 300 mm, a chord of 300 mm, and a thickness of 2 mm. A study of the effect of the stiffener cross-sectional area and the stiffener location is carried out for the same panel. The stiffener spacing is varied along both the chord-wise and span-wise directions. Then, for the optimal location, the ideal stiffener length is identified. The effect of stiffener cross-section shapes (T, I, Hat, Z) on flutter velocity has also been studied. The flutter velocities of the selected panel with two rectangular stiffeners in a cantilever configuration are estimated using the MSC NASTRAN software package. As the flow passes over the panel, deformation takes place, which further changes the flow structure over it. With increasing velocity, the deformation goes on increasing, but the stiffness of the system tries to dampen the excitation and maintain equilibrium. Beyond a critical velocity, however, the system damping suddenly becomes ineffective, and the system loses its equilibrium. This is estimated in NASTRAN using the PK method. The first 10 modal frequencies of a simple panel and a stiffened panel are estimated numerically and are validated against open literature. A grid independence study is also carried out, and the modal frequency values remain the same for element lengths less than 20 mm. The current investigation concludes that the span-wise stiffener placement is more effective than the chord-wise placement.
The maximum flutter velocity achieved for chord-wise placement is 204 m/s, while for a span-wise arrangement it is augmented to 963 m/s for stiffeners located at ¼ and ¾ of the chord from the panel edge (50% of chord from either side of the mid-chord line). The flutter velocity is directly proportional to the stiffener cross-sectional area. A significant increment in flutter velocity from 218 m/s to 1024 m/s is observed for stiffener lengths varying from 50% to 60% of the span. The maximum flutter velocity above Mach 3 is achieved. It is also observed that for a stiffened panel, the full effect of the stiffener can be achieved only when the stiffener end is clamped. Stiffeners with a Z cross section increased the flutter velocity from 142 m/s (panel with no stiffener) to 328 m/s, which is 2.3 times that of the simple panel.

Keywords: stiffener placement, stiffener cross-sectional area, stiffener length, stiffener cross sectional area shape

Procedia PDF Downloads 265
1633 Numerical Simulation of the Effect of Single and Dual Synthetic Jet on Stall Phenomenon On NACA (National Advisory Committee for Aeronautics) GA(W)-2 Airfoil

Authors: Abbasali Abouei Mehrizi, Hamid Hassanzadeh Afrouzi

Abstract:

Reducing the drag force increases the efficiency of the aircraft and improves its performance. Flow control methods delay flow separation and consequently reduce the reversed flow in the separation region, enhancing the lift force while decreasing the drag force and thus improving the aircraft efficiency. Flow control methods can be divided into active and passive types. The synthetic jet actuator (SJA) used in this study for the NACA GA(W)-2 airfoil is one of the active flow control methods to prevent the stall phenomenon on the airfoil. In this research, the airfoil at different angles of attack, with and without jets, has been compared using OpenFOAM. Also, after finding the proper SJA position on the airfoil suction surface, the simultaneous effect of two SJAs has been discussed. The SJA was found to have the best effect at 12% chord (C), close to the airfoil’s leading edge (LE). At 12% chord, the SJA decreases the drag significantly while increasing lift, and the average lift increase was higher than in other situations and was equal to 10.4%. The highest drag reduction was about 5%, with the SJA at 0.25C. Then, due to the positive effects of the SJA in the 12% and 25% chord regions, these regions were considered for applying dual jets at two post-stall angles of attack, i.e., 16° and 22°.

Keywords: active and passive flow control methods, computational fluid dynamics, flow separation, synthetic jet

Procedia PDF Downloads 44
1632 Face Tracking and Recognition Using Deep Learning Approach

Authors: Degale Desta, Cheng Jian

Abstract:

The most important factor in identifying a person is their face. Even identical twins have their own distinct faces. As a result, identification and face recognition are needed to tell one person from another. A face recognition system is a verification tool used to establish a person's identity using biometrics. Nowadays, face recognition is a common technique used in a variety of applications, including home security systems, criminal identification, and phone unlock systems. This system is more secure because it only requires a facial image instead of other dependencies like a key or card. Face detection and face identification are the two phases that typically make up a human recognition system. The idea behind designing and creating a face recognition system using deep learning with Azure ML Python's OpenCV is explained in this paper. Face recognition is a task that can be accomplished using deep learning, and given the accuracy of this method, it appears to be a suitable approach. To show how accurate the suggested face recognition system is, experimental results are given, showing 98.46% accuracy using Fast-RCNN, together with the performance of the algorithms under different training conditions.

Keywords: deep learning, face recognition, identification, fast-RCNN

Procedia PDF Downloads 89
1631 Comparing Emotion Recognition from Voice and Facial Data Using Time Invariant Features

Authors: Vesna Kirandziska, Nevena Ackovska, Ana Madevska Bogdanova

Abstract:

Emotion recognition is a challenging problem. It is still an open problem from the aspect of both intelligent systems and psychology. In this paper, both voice features and facial features are used for building an emotion recognition system. Support Vector Machine classifiers are built using raw data from video recordings. The results obtained for emotion recognition are given, and a discussion about the validity and the expressiveness of different emotions is presented. A comparison is made between the classifiers built from facial data only, from voice data only, and from the combination of both. The need for a better combination of the information from facial expression and voice data is argued.

Keywords: emotion recognition, facial recognition, signal processing, machine learning

Procedia PDF Downloads 289
1630 Possibilities, Challenges and the State of the Art of Automatic Speech Recognition in Air Traffic Control

Authors: Van Nhan Nguyen, Harald Holone

Abstract:

Over the past few years, a lot of research has been conducted to bring Automatic Speech Recognition (ASR) into various areas of Air Traffic Control (ATC), such as air traffic control simulation and training, monitoring live operators with the aim of improving safety, air traffic controller workload measurement, and analysis of large quantities of controller-pilot speech. Due to the high accuracy requirements of the ATC context and its unique challenges, automatic speech recognition has not been widely adopted in this field. With the aim of providing a good starting point for researchers who are interested in bringing automatic speech recognition into ATC, this paper gives an overview of the possibilities and challenges of applying automatic speech recognition in air traffic control. To provide this overview, we present an updated literature review of speech recognition technologies in general, as well as specific approaches relevant to the ATC context. Based on this literature review, criteria for selecting speech recognition approaches for the ATC domain are presented, and remaining challenges and possible solutions are discussed.

Keywords: automatic speech recognition, asr, air traffic control, atc

Procedia PDF Downloads 363
1629 A Contribution to Human Activities Recognition Using Expert System Techniques

Authors: Malika Yaici, Soraya Aloui, Sara Semchaoui

Abstract:

This paper deals with human activity recognition from sensor data. It is an active research area, and the main objective is to obtain a high recognition rate. In this work, a recognition system based on expert systems is proposed; the recognition is performed using the objects, object states, and gestures and taking into account the context (the location of the objects and of the person performing the activity, the duration of the elementary actions and the activity). The system recognizes complex activities after decomposing them into simple, easy-to-recognize activities. The proposed method can be applied to any type of activity. The simulation results show the robustness of our system and its speed of decision.

Keywords: human activity recognition, ubiquitous computing, context-awareness, expert system

Procedia PDF Downloads 62
1628 Switching to the Latin Alphabet in Kazakhstan: A Brief Overview of Character Recognition Methods

Authors: Ainagul Yermekova, Liudmila Goncharenko, Ali Baghirzade, Sergey Sybachin

Abstract:

In this article, we address the problem of Kazakhstan's transition to the Latin alphabet. The transition process started in 2017 and is scheduled to be completed in 2025. In connection with these events, the problem of recognizing the characters of the new alphabet is raised. Well-known character recognition programs such as ABBYY FineReader, FormReader, and MyScript Stylus did not recognize specific Kazakh letters that were used in Cyrillic. The authors assess well-known character recognition methods that could be in demand as part of the country's transition to the Latin alphabet. Three methods of character recognition (template, structured, and feature-based) are considered through their algorithms of operation. At the end of the article, a general conclusion is made about the possibility of applying a certain method to a particular recognition process: for example, in the process of population census, recognition of typographic text in Latin, or recognition of photos of car numbers, store signs, etc.

Keywords: text detection, template method, recognition algorithm, structured method, feature method

Procedia PDF Downloads 154
1627 Recognizing an Individual, Their Topic of Conversation and Cultural Background from 3D Body Movement

Authors: Gheida J. Shahrour, Martin J. Russell

Abstract:

The 3D body movement signals captured during human-human conversation include clues not only to the content of people’s communication but also to their culture and personality. This paper is concerned with automatic extraction of this information from body movement signals. For the purpose of this research, we collected a novel corpus from 27 subjects and arranged them into groups according to their culture. We arranged each group into pairs, and each pair communicated with each other about different topics. A state-of-the-art recognition system is applied to the problems of person, culture, and topic recognition. We borrowed modeling, classification, and normalization techniques from speech recognition. We used Gaussian Mixture Modeling (GMM) as the main technique for building our three systems, obtaining 77.78%, 55.47%, and 39.06% accuracy from the person, culture, and topic recognition systems respectively. In addition, we combined the above GMM systems with Support Vector Machines (SVM) to obtain 85.42%, 62.50%, and 40.63% accuracy for person, culture, and topic recognition respectively. Although direct comparison among these three recognition systems is difficult, it seems that our person recognition system performs best for both GMM and GMM-SVM, suggesting that inter-subject differences (i.e. subject’s personality traits) are a major source of variation. When removing these traits from the culture and topic recognition systems using the Nuisance Attribute Projection (NAP) and the Intersession Variability Compensation (ISVC) techniques, we obtained 73.44% and 46.09% accuracy from the culture and topic recognition systems respectively.

Keywords: person recognition, topic recognition, culture recognition, 3D body movement signals, variability compensation

Procedia PDF Downloads 510
1626 Human Activities Recognition Based on Expert System

Authors: Malika Yaici, Soraya Aloui, Sara Semchaoui

Abstract:

Recognition of human activities from sensor data is an active research area, and the main objective is to obtain a high recognition rate. In this work, we propose a recognition system based on expert systems. The proposed system makes the recognition based on the objects, object states, and gestures, taking into account the context (the location of the objects and of the person performing the activity, the duration of the elementary actions, and the activity). This work focuses on complex activities which are decomposed into simple easy to recognize activities. The proposed method can be applied to any type of activity. The simulation results show the robustness of our system and its speed of decision.

Keywords: human activity recognition, ubiquitous computing, context-awareness, expert system

Procedia PDF Downloads 100
1625 Enhanced Face Recognition with Daisy Descriptors Using 1BT Based Registration

Authors: Sevil Igit, Merve Meric, Sarp Erturk

Abstract:

In this paper, it is proposed to improve Daisy descriptor based face recognition using a novel One-Bit Transform (1BT) based pre-registration approach. The 1BT based pre-registration procedure is fast and has low computational complexity. It is shown that the face recognition accuracy is improved with the proposed approach. The proposed approach can facilitate highly accurate face recognition using DAISY descriptor with simple matching and thereby facilitate a low-complexity approach.

Keywords: face recognition, Daisy descriptor, One-Bit Transform, image registration

Procedia PDF Downloads 335
1624 Modern Machine Learning Conniptions for Automatic Speech Recognition

Authors: S. Jagadeesh Kumar

Abstract:

This expose presents a luculent of recent machine learning practices as employed in the modern and as pertinent to prospective automatic speech recognition schemes. The aspiration is to promote additional traverse ablution among the machine learning and automatic speech recognition factions that have transpired in the precedent. The manuscript is structured according to the chief machine learning archetypes that are furthermore trendy by now or have latency for building momentous hand-outs to automatic speech recognition expertise. The standards offered and convoluted in this article embraces adaptive and multi-task learning, active learning, Bayesian learning, discriminative learning, generative learning, supervised and unsupervised learning. These learning archetypes are aggravated and conferred in the perspective of automatic speech recognition tools and functions. This manuscript bequeaths and surveys topical advances of deep learning and learning with sparse depictions; further limelight is on their incessant significance in the evolution of automatic speech recognition.

Keywords: automatic speech recognition, deep learning methods, machine learning archetypes, Bayesian learning, supervised and unsupervised learning

Procedia PDF Downloads 412
1623 Advances in Artificial intelligence Using Speech Recognition

Authors: Khaled M. Alhawiti

Abstract:

This research study aims to present a retrospective study of speech recognition systems and artificial intelligence. Speech recognition has become one of the most widely used technologies, as it offers a great opportunity to interact and communicate with automated machines. It can be affirmed that speech recognition helps its users to perform their daily routine tasks in a more convenient and effective manner. This research intends to illustrate recent technological advancements associated with artificial intelligence. Recent research has revealed that speech recognition is found to be the utmost issue, which affects the decoding of speech. In order to overcome these issues, different statistical models were developed by researchers. Some of the most prominent statistical models include the acoustic model (AM), the language model (LM), the lexicon model, and hidden Markov models (HMM). The research will help in understanding all of these statistical models of speech recognition. Researchers have also formulated different decoding methods, which are being utilized for realistic decoding tasks and constrained artificial languages. These decoding methods include pattern recognition, acoustic phonetic, and artificial intelligence. It has been recognized that artificial intelligence is the most efficient and reliable method used in speech recognition.

Keywords: speech recognition, acoustic phonetic, artificial intelligence, hidden Markov models (HMM), statistical models of speech recognition, human machine performance

Procedia PDF Downloads 442
1622 Biometric Recognition Techniques: A Survey

Authors: Shabir Ahmad Sofi, Shubham Aggarwal, Sanyam Singhal, Roohie Naaz

Abstract:

Biometric recognition refers to the automatic recognition of individuals based on feature vector(s) derived from their physiological and/or behavioral characteristics. Biometric recognition systems should provide reliable personal recognition schemes to either confirm or determine the identity of an individual. These features are used to provide authentication for computer-based security systems. Applications of such systems include computer systems security, secure electronic banking, mobile phones, credit cards, secure access to buildings, and health and social services. By using biometrics, a person can be identified based on 'who she/he is' rather than 'what she/he has' (card, token, key) or 'what she/he knows' (password, PIN). This paper presents a brief overview of biometric methods, both unimodal and multimodal, and their advantages and disadvantages.

Keywords: biometric, DNA, fingerprint, ear, face, retina scan, gait, iris, voice recognition, unimodal biometric, multimodal biometric

Procedia PDF Downloads 726
1621 Printed Thai Character Recognition Using Particle Swarm Optimization Algorithm

Authors: Phawin Sangsuvan, Chutimet Srinilta

Abstract:

This paper presents an application of the Particle Swarm Optimization (PSO) method to Thai optical character recognition (OCR). OCR consists of pre-processing, character recognition, and post-processing. Before entering the recognition process, each character must be prepared by pre-processing. PSO is an optimization method that belongs to the swarm intelligence family and is based on the imitation of the social behavior patterns of animals. The route of each particle is determined by its own data and that of neighboring particles. This interaction between particles and their neighbors is the advantage that lets the swarm determine the best solution. PSO has therefore attracted the interest of many researchers working on difficult problems, including character recognition. As in previous work, this research used a projection histogram to extract features of printed digits and defined a simple fitness function for PSO. The results show that PSO achieves 67.73% on the testing dataset. Future work can explore improving the performance of PSO by refining the fitness function.
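
The particle update described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the fitness function or the neighborhood structure, so this example assumes a global-best topology, commonly used coefficient values (w, c1, c2), and a stand-in sphere function as the fitness to minimize. All function and parameter names are hypothetical.

```python
import random


def pso_minimize(fitness, dim, n_particles=30, n_iters=200,
                 bounds=(-5.0, 5.0), w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `fitness` with a basic global-best PSO (illustrative only)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position (its "individual data").
    pbest = [p[:] for p in pos]
    pbest_val = [fitness(p) for p in pos]
    # Assumption: every particle is a neighbor of every other (global best).
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward own best + pull toward swarm best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and keep the particle inside the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Stand-in fitness: sum of squares, with its minimum at the origin."""
    return sum(v * v for v in x)


best_pos, best_val = pso_minimize(sphere, dim=2)
print(best_pos, best_val)
```

In a recognition setting, the fitness would instead score how well a candidate solution matches the extracted character features; that design is left open here because the abstract does not specify it.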

Keywords: character recognition, histogram projection, particle swarm optimization, pattern recognition techniques

Procedia PDF Downloads 438
1620 Enhanced Thai Character Recognition with Histogram Projection Feature Extraction

Authors: Benjawan Rangsikamol, Chutimet Srinilta

Abstract:

This research paper deals with extraction of Thai character features using the proposed histogram projection so as to improve the recognition performance. The process starts with transformation of image files into binary files before thinning. After character thinning, the skeletons are entered into the proposed extraction using histogram projection (horizontal and vertical) to extract unique features which are inputs of the subsequent recognition step. The recognition rate with the proposed extraction technique is as high as 97 percent since the technique works very well with the idiosyncrasies of Thai characters.
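
As a rough illustration of horizontal and vertical histogram projection, the sketch below counts foreground pixels per row and per column of a small binary image. It assumes the common convention that the horizontal projection sums along rows and the vertical projection sums along columns; the paper's exact definition, the thinning step, and the classifier are not shown, and the function name and toy glyph are hypothetical.

```python
def projection_histograms(image):
    """Return (horizontal, vertical) projections of a binary image.

    `image` is a list of equal-length rows; 1 marks a foreground (ink)
    pixel and 0 marks background.
    """
    horizontal = [sum(row) for row in image]        # ink count per row
    vertical = [sum(col) for col in zip(*image)]    # ink count per column
    return horizontal, vertical


# Toy 4x4 "skeleton" standing in for a thinned character.
glyph = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [1, 1, 1, 0],
]

h_proj, v_proj = projection_histograms(glyph)
features = h_proj + v_proj  # concatenated feature vector for a classifier
print(h_proj)    # [2, 1, 1, 3]
print(v_proj)    # [1, 4, 2, 0]
```

Concatenating the two projections gives a fixed-length vector for fixed-size images, which is one simple way such features could be fed to the subsequent recognition step.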

Keywords: character recognition, histogram projection, multilayer perceptron, Thai character features extraction

Procedia PDF Downloads 431
1619 Speaker Recognition Using LIRA Neural Networks

Authors: Nestor A. Garcia Fragoso, Tetyana Baydyk, Ernst Kussul

Abstract:

This article reports on our investigation in the field of voice recognition. For this purpose, we created a voice database that contains different phrases in two languages, English and Spanish, spoken by men and women. As a classifier, the LIRA (Limited Receptive Area) grayscale neural classifier was selected. The LIRA grayscale neural classifier was developed for image recognition tasks and demonstrated good results. Therefore, we decided to develop a voice recognition system using this classifier. The system can recognize a speaker's voice from a specific set of speakers. To do so, it takes spectrograms of the voice signals as input, extracts their characteristics, and identifies the speaker. The results are described and analyzed in this article. The classifier can be used for speaker identification in security systems or smart buildings for different types of intelligent devices.
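
To make the spectrogram input concrete, here is a minimal sketch of a magnitude spectrogram computed with a naive discrete Fourier transform. The frame length, hop size, and Hann window are assumptions chosen for illustration, not the settings used by the authors, and the function name is hypothetical. The resulting time-frequency grid is what an image-oriented classifier would treat as a grayscale picture.

```python
import cmath
import math


def magnitude_spectrogram(signal, frame_len=16, hop=8):
    """Return a list of frames, each a list of DFT magnitudes (bins 0..N/2)."""
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        # Periodic Hann window to reduce spectral leakage.
        windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len))
                    for n, x in enumerate(frame)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            # Naive DFT for bin k (O(N^2) overall; fine for a short example).
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n, x in enumerate(windowed))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


# Synthetic test tone: exactly 2 cycles per 16-sample frame.
tone = [math.sin(2 * math.pi * 2 * n / 16) for n in range(64)]
spec = magnitude_spectrogram(tone)
peak_bins = [max(range(len(f)), key=lambda k: f[k]) for f in spec]
print(len(spec), len(spec[0]), peak_bins)
```

A practical system would use an FFT routine and much longer frames; the naive transform is used here only to keep the example dependency-free.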

Keywords: extreme learning, LIRA neural classifier, speaker identification, voice recognition

Procedia PDF Downloads 141
1618 New Approaches for the Handwritten Digit Image Features Extraction for Recognition

Authors: U. Ravi Babu, Mohd Mastan

Abstract:

This paper proposes a novel approach for a handwritten digit recognition system. It extracts digit image features based on a distance measure and derives an algorithm to classify the digit images. The distance measure is computed on the thinned image; thinning is one of the preprocessing techniques in image processing. The paper concentrates mainly on extracting features from the digit image for effective recognition of the numeral. To assess its effectiveness, the proposed method was tested on the MNIST, CENPARMI, and CEDAR databases and on newly collected data. The proposed method was applied to more than one lakh (100,000) digit images and obtains good comparative recognition results, with a recognition rate of about 97.32%.

Keywords: handwritten digit recognition, distance measure, MNIST database, image features

Procedia PDF Downloads 432
1617 Emotion Recognition in Video and Images in the Wild

Authors: Faizan Tariq, Moayid Ali Zaidi

Abstract:

Facial emotion recognition algorithms are expanding rapidly nowadays. Researchers use different algorithms in different combinations to obtain the best results. Six basic emotions are studied in this area. We tried to recognize facial expressions using object detection algorithms instead of traditional algorithms. Two object detection algorithms were chosen: Faster R-CNN and YOLO. For pre-processing, we used image rotation and batch normalization. The dataset chosen for the experiments is Static Facial Expressions in the Wild (SFEW). Our approach worked well, but there is still considerable room for improvement, which is a direction for future work.

Keywords: face recognition, emotion recognition, deep learning, CNN

Procedia PDF Downloads 156
1616 An Improved Face Recognition Algorithm Using Histogram-Based Features in Spatial and Frequency Domains

Authors: Qiu Chen, Koji Kotani, Feifei Lee, Tadahiro Ohmi

Abstract:

In this paper, we propose an improved face recognition algorithm using histogram-based features in the spatial and frequency domains. To add spatial information of the face and improve recognition performance, a region-division (RD) method is utilized. The facial area is first divided into several regions; feature vectors for each facial part are then generated from a Binary Vector Quantization (BVQ) histogram using DCT coefficients in the low-frequency domain, as well as a Local Binary Pattern (LBP) histogram in the spatial domain. Recognition results for the different regions are first obtained separately and then fused by weighted averaging. The publicly available ORL database is used to evaluate the proposed algorithm; it consists of 40 subjects with 10 images per subject, containing variations in lighting, pose, and expression. It is demonstrated that face recognition using the RD method achieves a much higher recognition rate.
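
Two of the steps above, the LBP histogram and the weighted-average fusion, can be sketched as follows. This is an illustrative simplification, not the authors' method: it assumes the basic 3x3 LBP operator with a "neighbor >= center" test, and it assumes that each region yields a single similarity score to be fused. The BVQ/DCT branch is omitted, and all names, scores, and weights are hypothetical.

```python
def lbp_histogram(img):
    """256-bin histogram of basic 3x3 LBP codes over interior pixels.

    `img` is a list of equal-length rows of grayscale integers.
    """
    # Eight neighbors, visited clockwise from the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            center = img[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbor is at least as bright.
                if img[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
    return hist


def fuse_scores(region_scores, weights):
    """Weighted average of per-region scores (weights need not sum to 1)."""
    return sum(s * w for s, w in zip(region_scores, weights)) / sum(weights)


flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]   # all neighbors equal the center
peak = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]   # center brighter than all neighbors
flat_hist = lbp_histogram(flat)
peak_hist = lbp_histogram(peak)

# Hypothetical similarity scores for two facial regions, weighted 1:3.
fused = fuse_scores([0.2, 0.8], [1, 3])
print(flat_hist[255], peak_hist[0], fused)
```

In a region-division setup, one histogram would be computed per facial region and compared against stored templates; the weights then express how much each region is trusted.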

Keywords: binary vector quantization (BVQ), DCT coefficients, face recognition, local binary patterns (LBP)

Procedia PDF Downloads 313