Search results for: distant speech recognition
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2408

2168 The Effect of Speech-Shaped Noise and Speaker’s Voice Quality on First-Grade Children’s Speech Perception and Listening Comprehension

Authors: I. Schiller, D. Morsomme, A. Remacle

Abstract:

Children’s ability to process spoken language develops until the late teenage years. At school, where efficient spoken language processing is key to academic achievement, listening conditions are often unfavorable. High background noise and teachers’ poor voice quality represent typical sources of interference. It can be assumed that these factors particularly affect primary school children, because their language and literacy skills are still low. While it is generally accepted that background noise and impaired voice impede spoken language processing, there is an increasing need to analyze their impact within specific linguistic areas. Against this background, the aim of the study was to investigate the effect of speech-shaped noise and imitated dysphonic voice on first-grade primary school children’s speech perception and sentence comprehension. Via headphones, 5- to 6-year-old children, recruited within the French-speaking community of Belgium, listened to and performed a minimal-pair discrimination task and a sentence-picture matching task. Stimuli were randomly presented according to four experimental conditions: (1) normal voice / no noise, (2) normal voice / noise, (3) impaired voice / no noise, and (4) impaired voice / noise. The primary outcome measure was task score. How did performance vary with respect to listening condition? Preliminary results will be presented with respect to speech perception and sentence comprehension and carefully interpreted in the light of past findings. This study supports our understanding of children’s language processing skills under adverse conditions. Results shall serve as a starting point for probing new measures to optimize children’s learning environment.

Keywords: impaired voice, sentence comprehension, speech perception, speech-shaped noise, spoken language processing

Procedia PDF Downloads 164
2167 Programmed Speech to Text Summarization Using Graph-Based Algorithm

Authors: Hamsini Pulugurtha, P. V. S. L. Jagadamba

Abstract:

Automatic speech-to-text conversion and text summarization using graph-based algorithms can be used in meetings to obtain a short description of the meeting for future reference. The system performs signature verification with a Siamese neural network to confirm the identity of the user, then converts the user-provided audio file, which is in English, into English text using the speech recognition package available in Python. Sometimes only a summary of the meeting is required; text summarization addresses this need. The transcript is therefore summarized using natural language processing approaches such as unsupervised extractive text summarization algorithms.
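
The abstract does not say which graph-based algorithm is used. As a rough illustration only, the sketch below ranks sentences with a TextRank-style weighted PageRank over a word-overlap graph; the function names, the similarity measure, and the damping and iteration values are assumptions, not the authors' implementation.

```python
import math
import re


def split_sentences(text):
    """Split text into sentences on '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def similarity(a, b):
    """Word-overlap similarity between two sentences (TextRank-style)."""
    words_a = set(re.findall(r"\w+", a.lower()))
    words_b = set(re.findall(r"\w+", b.lower()))
    if len(words_a) < 2 or len(words_b) < 2:
        return 0.0  # avoids a zero denominator, since log(1) == 0
    overlap = len(words_a & words_b)
    return overlap / (math.log(len(words_a)) + math.log(len(words_b)))


def summarize(text, n_sentences=2, damping=0.85, iterations=50):
    """Return the n highest-ranked sentences, in their original order."""
    sentences = split_sentences(text)
    n = len(sentences)
    if n <= n_sentences:
        return sentences
    # Edge weights of the sentence graph (no self-loops).
    weights = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    # Weighted PageRank computed by power iteration.
    for _ in range(iterations):
        scores = [
            (1 - damping) / n + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0)
            for i in range(n)
        ]
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:n_sentences]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(transcript, n_sentences=2)` on a transcript string returns the two sentences that share the most vocabulary with the rest of the text, which is the extractive (rather than generative) behaviour the abstract describes.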

Keywords: Siamese neural network, English speech, English text, natural language processing, unsupervised extractive text summarization

Procedia PDF Downloads 183
2166 Multimodal Employee Attendance Management System

Authors: Khaled Mohammed

Abstract:

This paper presents novel face recognition and identification approaches for the real-time attendance management problem in large companies/factories and government institutions. The proposed system uses the Minimum Ratio (MR) approach for employee identification. Capturing the authentic face variability from a sequence of video frames was used for face recognition and made the system robust against the variability of facial features. Experimental results indicated that the proposed system improved performance by 2% to 5% compared to previous approaches. In addition, it halved the time required compared with previous techniques such as Extreme Learning Machine (ELM) and Multi-Scale Structural Similarity index (MS-SSIM). Finally, it achieved an accuracy of 99%.

Keywords: attendance management system, face detection and recognition, live face recognition, minimum ratio

Procedia PDF Downloads 130
2165 Human Gait Recognition Using Moment with Fuzzy

Authors: Jyoti Bharti, Navneet Manjhi, M. K. Gupta, Bimi Jain

Abstract:

Reliable gait features must be extracted from sequences of gait images. This paper suggests a simple method for gait identification based on moments. Moment values are extracted from different numbers of frames of grayscale and silhouette images from the CASIA database. These moment values are used as feature values. Fuzzy logic and a nearest neighbour classifier are used for classification. Both achieved higher recognition rates.
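
The abstract does not specify which moments are used or how the classifier is configured. The following is a minimal sketch assuming second-order central moments, normalized for scale, as features and a 1-nearest-neighbour rule with squared Euclidean distance; all names are hypothetical and the fuzzy-logic classifier is not shown.

```python
def central_moment(image, p, q):
    """Central moment mu_pq of a 2D grayscale or binary image (list of rows)."""
    m00 = sum(sum(row) for row in image)
    if m00 == 0:
        return 0.0
    # Centroid of the intensity distribution.
    cx = sum(x * v for row in image for x, v in enumerate(row)) / m00
    cy = sum(y * v for y, row in enumerate(image) for v in row) / m00
    return sum((x - cx) ** p * (y - cy) ** q * v
               for y, row in enumerate(image) for x, v in enumerate(row))


def moment_features(image):
    """Scale-normalized second-order moments (eta_20, eta_02, eta_11)."""
    m00 = sum(sum(row) for row in image)
    if m00 == 0:
        return (0.0, 0.0, 0.0)
    # eta_pq = mu_pq / m00 ** (1 + (p + q) / 2); the exponent is 2 when p + q = 2.
    return tuple(central_moment(image, p, q) / m00 ** 2
                 for p, q in ((2, 0), (0, 2), (1, 1)))


def nearest_neighbour(query, training):
    """Return the label of the training feature vector closest to `query`.

    `training` is a list of (feature_vector, label) pairs.
    """
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return min(training, key=lambda item: sq_dist(query, item[0]))[1]
```

In a gait setting, `moment_features` would be applied to each silhouette frame and the per-frame vectors combined (for example, averaged) before classification; that combination step is also an assumption.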

Keywords: gait, fuzzy logic, nearest neighbour, recognition rate, moments

Procedia PDF Downloads 722
2164 A Conglomerate of Multiple Optical Character Recognition Table Detection and Extraction

Authors: Smita Pallavi, Raj Ratn Pranesh, Sumit Kumar

Abstract:

Representing information as tables is a compact and concise method that eases searching, indexing, and storage requirements. Extracting and cloning tables from parsable documents is easier and widely used; however, industry still faces challenges in detecting and extracting tables from OCR (Optical Character Recognition) documents or images. This paper proposes an algorithm that detects and extracts multiple tables from an OCR document. The algorithm uses a combination of image processing techniques, text recognition, and procedural coding to identify distinct tables in the same image and map the text to the corresponding cell in a dataframe, which can be stored as comma-separated values, in a database, as Excel, or in multiple other usable formats.

Keywords: table extraction, optical character recognition, image processing, text extraction, morphological transformation

Procedia PDF Downloads 117
2163 Recognition of Cursive Arabic Handwritten Text Using Embedded Training Based on Hidden Markov Models (HMMs)

Authors: Rabi Mouhcine, Amrouch Mustapha, Mahani Zouhir, Mammass Driss

Abstract:

In this paper, we present a system for offline recognition of cursive Arabic handwritten text based on Hidden Markov Models (HMMs). The system is analytical, works without explicit segmentation, and uses embedded training to build and enhance the character models. Feature extraction, preceded by baseline estimation, yields statistical and geometric features that capture both the peculiarities of the text and the pixel distribution characteristics in the word image. These features are modelled using hidden Markov models and trained by embedded training. Experiments on images from the benchmark IFN/ENIT database show that the proposed system improves recognition.

Keywords: recognition, handwriting, Arabic text, HMMs, embedded training

Procedia PDF Downloads 323
2162 Reconstructed Phase Space Features for Estimating Post Traumatic Stress Disorder

Authors: Andre Wittenborn, Jarek Krajewski

Abstract:

Trauma-related sadness in speech can alter the voice in several ways. The generation of non-linear aerodynamic phenomena within the vocal tract is crucial when analyzing trauma-influenced speech production. These phenomena include non-laminar flow and the formation of jets rather than well-behaved laminar flow. In particular, state-space reconstruction methods based on chaotic dynamics and fractal theory have been suggested to describe these turbulence-related aerodynamic phenomena of the speech production system. To extract the non-linear properties of the speech signal, we used the time delay embedding method to reconstruct the phase space from a scalar time series (reconstructed phase space, RPS). This approach results in the extraction of 7238 features per .wav file (N = 47; 32 m, 15 f). The speech material was elicited by asking participants to talk about autobiographical, sadness-inducing experiences (sampling rate 16 kHz, 8-bit resolution). After combining these features in a support vector machine based machine learning approach (leave-one-sample-out validation), we achieved a correlation of r = .41 with the well-established self-report ground truth measure (RATS) of post-traumatic stress disorder (PTSD).
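
Time delay embedding itself is a standard construction: each point of the reconstructed phase space collects `dim` samples of the signal spaced `tau` steps apart. The sketch below shows only that step; the embedding dimension and delay are hypothetical parameters, since the abstract does not report the values used, and the 7238 features derived from the embedding are not reproduced here.

```python
def delay_embed(series, dim, tau):
    """Time-delay embedding of a scalar series.

    Returns the vectors (x[t], x[t + tau], ..., x[t + (dim - 1) * tau])
    for every t at which all `dim` samples exist.
    """
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    n_vectors = len(series) - (dim - 1) * tau
    return [tuple(series[t + k * tau] for k in range(dim))
            for t in range(max(n_vectors, 0))]
```

For a six-sample series with `dim=3` and `tau=2`, only two complete vectors fit, so `delay_embed([0, 1, 2, 3, 4, 5], 3, 2)` yields `[(0, 2, 4), (1, 3, 5)]`.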

Keywords: non-linear dynamics features, post traumatic stress disorder, reconstructed phase space, support vector machine

Procedia PDF Downloads 79
2161 Speech Perception by Video Hosting Services Actors: Urban Planning Conflicts

Authors: M. Pilgun

Abstract:

The report presents the results of a study of the specifics of speech perception by actors of video hosting services on the material of urban planning conflicts. To analyze the content, the multimodal approach using neural network technologies is employed. Analysis of word associations and associative networks of relevant stimulus revealed the evaluative reactions of the actors. Analysis of the data identified key topics that generated negative and positive perceptions from the participants. The calculation of social stress and social well-being indices based on user-generated content made it possible to build a rating of road transport construction objects according to the degree of negative and positive perception by actors.

Keywords: social media, speech perception, video hosting, networks

Procedia PDF Downloads 121
2160 Functions and Pragmatic Aspects of English Nonsense

Authors: Natalia V. Ursul

Abstract:

In linguistic studies, the question of nonsense is attracting increasing interest. Nonsense is usually defined as spoken or written words that have no meaning. However, this definition is likely to be outdated as any speech act is generated due to the speaker’s pragmatic reasons, thus it cannot be purely illogical or meaningless. In the current paper a new working definition of nonsense as a linguistic medium will be formulated; moreover, the pragmatic peculiarities of newly coined linguistic patterns and possible ways of their interpretation will be discussed.

Keywords: nonsense, nonsense verse, pragmatics, speech act

Procedia PDF Downloads 488
2159 Preliminary Study of the Phonological Development in Three and Four Year Old Bulgarian Children

Authors: Tsvetomira Braynova, Miglena Simonska

Abstract:

The article presents the results of research on phonological processes in three- and four-year-old children. For the purpose of the study, an author's test was developed and administered to 120 children. The study covered three areas: the word level (96 words), sentence repetition (10 sentences), and generation of own speech from a picture (15 pictures). The test also gives additional information about the articulation errors of the assessed children. The main purpose of the study is to analyze all phonological processes that occur at this age in Bulgarian children and to identify which are typical and atypical for this age. The results show that the most common phonological errors that children make are sound substitution, elision of a sound, metathesis of sounds, elision of a syllable, and elision of consonants clustered in a syllable. All examined children were identified with an articulatory disorder of the bilabial lambdacism type. Measuring the correlation between the average length of repeated speech and the average length of generated speech, the analysis shows that the more words a child can repeat in the “repeated speech” part, the more words they can be expected to generate in the “generating sentence” part. The results of this study show that the task of naming a word provides sufficient and representative information to assess the child's phonology.

Keywords: assessment, phonology, articulation, speech-language development

Procedia PDF Downloads 149
2158 Hand Gesture Interpretation Using Sensing Glove Integrated with Machine Learning Algorithms

Authors: Aqsa Ali, Aleem Mushtaq, Attaullah Memon, Monna

Abstract:

In this paper, we present a low-cost design for a smart glove that can perform sign language recognition to assist speech-impaired people. Specifically, we have designed and developed an Assistive Hand Gesture Interpreter that recognizes hand movements relevant to American Sign Language (ASL) and translates them into text for display on a Thin-Film-Transistor Liquid Crystal Display (TFT LCD) screen as well as into synthetic speech. Linear Bayes Classifiers and Multilayer Neural Networks have been used to classify 11 feature vectors obtained from the sensors on the glove into one of the 27 ASL alphabets and a predefined gesture for space. Three types of features are used: bending, measured with six bend sensors; orientation in three dimensions, measured with accelerometers; and contacts at vital points, measured with contact sensors. To gauge the performance of the presented design, the training database was prepared using five volunteers. The accuracy of the current version on the prepared dataset was found to be up to 99.3% for the target user. The solution combines electronics, e-textile technology, sensor technology, embedded systems and machine learning techniques to build a low-cost wearable glove that is accurate, elegant and portable.

Keywords: American sign language, assistive hand gesture interpreter, human-machine interface, machine learning, sensing glove

Procedia PDF Downloads 261
2157 Fitness Action Recognition Based on MediaPipe

Authors: Zixuan Xu, Yichun Lou, Yang Song, Zihuai Lin

Abstract:

MediaPipe is an open-source machine learning computer vision framework that can be ported to multi-platform environments, which makes it easier to use for recognizing human activity. Many human recognition systems have been built on this framework, but the fundamental issue remains the recognition of human behavior and posture. In this paper, two methods are proposed to recognize human gestures based on MediaPipe: the first uses the Adaptive Boosting algorithm to recognize a series of fitness gestures, and the second uses the Fast Dynamic Time Warping algorithm to recognize 413 continuous fitness actions. These two methods are also applicable to any human posture movement recognition.
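
To make the time-warping idea concrete, the sketch below implements classic dynamic time warping, the exact quadratic-time algorithm that Fast Dynamic Time Warping approximates; it is not the fast variant and not the authors' code. The inputs are assumed to be one-dimensional series, such as a single joint angle per frame computed from pose landmarks.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best alignment cost of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]
```

Because the alignment may stretch either sequence, the same movement performed at a different speed scores as close: `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is `0.0`. An action could then be labelled by the reference template with the smallest distance.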

Keywords: computer vision, MediaPipe, adaptive boosting, fast dynamic time warping

Procedia PDF Downloads 79
2156 Words Spotting in the Images Handwritten Historical Documents

Authors: Issam Ben Jami

Abstract:

Information retrieval in digital libraries is very important because the most famous historical documents are of significant value. Word spotting in historical documents is a very difficult task, because such documents are naturally cursive and show wide variability in the scale and translation of words within the same document. We first present a system for automatic recognition based on the extraction of interest points of words from the image model. In the extraction phase, the key points are chosen from the representation of the image as a synthetic description of the shape in a multidimensional space. Accordingly, we use advanced methods that can find and describe interest points invariant to scale, rotation and lighting, which are linked to local configurations of pixels. We test this approach on documents from the 15th century. Our experiments give important results.

Keywords: feature matching, historical documents, pattern recognition, word spotting

Procedia PDF Downloads 246
2155 Effects of Therapeutic Horseback Riding in Speech and Communication Skills of Children with Autism

Authors: Aristi Alopoudi, Sofia Beloka, Vassiliki Pliogou

Abstract:

Autism is a complex neurodevelopmental disorder involving difficulties in many areas, such as social interaction, communication skills and verbal communication (speech). The aim of this study was to examine the impact of therapeutic horseback riding on the verbal and communication skills of children diagnosed with autism over 16 sessions. The researcher examined whether the expression of speech, the use of vocabulary, semantics, pragmatics, echolalia and communication skills were influenced by therapeutic horseback riding as the frequency of the sessions increased. The researcher observed two primary-school-aged subjects with autism, in a two-case observation design, during 16 therapeutic horseback riding sessions (one riding session per week). Compared to baseline, at the end of the 16th session, therapeutic horseback riding had improved both verbal skills, such as vocabulary, semantics, pragmatics and formation of sentences, and communication skills, such as eye contact, greeting, participation in dialogue and spontaneous speech. Notably, echolalia remained stable. Increased frequency of therapeutic horseback riding was beneficial for significant improvement in verbal and communication skills. More specifically, from the first to the last riding session there was a great increase in vocabulary, semantics, and formation of sentences. Pragmatics reached a lower level than semantics but the same level as the correct usage of the first person (for example, I make a hug) and echolalia used for that. A great increase in spontaneous speech was noticed. Eye contact was present at a lower level, and there was a slow but important rise in greeting as well as in participation in dialogue. Last but not least, this is the first study of therapeutic horseback riding to examine verbal communication and communication skills in autistic children. According to the literature, therapeutic horseback riding is a therapy with a variety of benefits; this research made clear that the improvement of verbal speech and communication should be included among them.

Keywords: Autism, communication skills, speech, therapeutic horseback riding

Procedia PDF Downloads 241
2154 The Role of Virtual Reality in Mediating the Vulnerability of Distant Suffering: Distance, Agency, and the Hierarchies of Human Life

Authors: Z. Xu

Abstract:

Immersive virtual reality (VR) has gained momentum in humanitarian communication due to its utopian promises of co-presence, immediacy, and transcendence. These potential benefits have led the United Nations (UN) to tirelessly produce and distribute VR series to evoke global empathy and encourage policymakers, philanthropic business tycoons and citizens around the world to actually do something (i.e. give a donation). However, it is unclear whether or not VR can cultivate cosmopolitans with a sense of social responsibility towards the geographically, socially/culturally and morally mediated misfortune of faraway others. Drawing upon existing works on the mediation of distant suffering, this article constructs an analytical framework to articulate the issue. Applying this framework on a case study of five of the UN’s VR pieces, the article identifies three paradoxes that exist between cyber-utopian and cyber-dystopian narratives. In the “paradox of distance”, VR relies on the notions of “presence” and “storyliving” to implicitly link audiences spatially and temporally to distant suffering, creating global connectivity and reducing perceived distances between audiences and others; yet it also enables audiences to fully occupy the point of view of distant sufferers (creating too close/absolute proximity), which may cause them to feel naive self-righteousness or narcissism with their pleasures and desire, thereby destroying the “proper distance”. In the “paradox of agency”, VR simulates a superficially “real” encounter for visual intimacy, thereby establishing an “audiences–beneficiary” relationship in humanitarian communication; yet in this case the mediated hyperreality is not an authentic reality, and its simulation does not fill the gap between reality and the virtual world. 
In the “paradox of the hierarchies of human life”, VR enables an audience to experience virtually fundamental “freedom”, epitomizing an attitude of cultural relativism that informs a great deal of contemporary multiculturalism, providing vast possibilities for a more egalitarian representation of distant sufferers; yet it also takes the spectator’s personally empathic feelings as the focus of intervention, rather than structural inequality and political exclusion (an economic and political power relations of viewing). Thus, the audience can potentially remain trapped within the minefield of hegemonic humanitarianism. This study is significant in two respects. First, it advances the turn of digitalization in studies of media and morality in the polymedia milieu; it is motivated by the necessary call for a move beyond traditional technological environments to arrive at a more novel understanding of the asymmetry of power between the safety of spectators and the vulnerability of mediated sufferers. Second, it not only reminds humanitarian journalists and NGOs that they should not rely entirely on the richer news experience or powerful response-ability enabled by VR to gain a “moral bond” with distant sufferers, but also argues that when fully-fledged VR technology is developed, it can serve as a kind of alchemy and should not be underestimated merely as a “bugaboo” of an alarmist philosophical and fictional dystopia.

Keywords: audience, cosmopolitan, distant suffering, virtual reality, humanitarian communication

Procedia PDF Downloads 108
2153 Low-Income African-American Fathers' Gendered Relationships with Their Children: A Study Examining the Impact of Child Gender on Father-Child Interactions

Authors: M. Lim Haslip

Abstract:

This quantitative study explores the correlation between child gender and father-child interactions. The author analyzes data from videotaped interactions between African-American fathers and their boy or girl toddler to explain how African-American fathers and toddlers interact with each other and whether these interactions differ by child gender. The purpose of this study is to investigate the research question: 'How, if at all, do fathers’ speech and gestures differ when interacting with their two-year-old sons versus daughters during free play?' The objectives of this study are to describe how child gender impacts African-American fathers’ verbal communication, examine how fathers gesture and speak to their toddler by gender, and to guide interventions for low-income African-American families and their children in early language development. This study involves a sample of 41 low-income African-American fathers and their 24-month-old toddlers. The videotape data will be used to observe 10-minute father-child interactions during free play. This study uses the already transcribed and coded data provided by Dr. Meredith Rowe, who did her study on the impact of African-American fathers’ verbal input on their children’s language development. The Child Language Data Exchange System (CHILDES program), created to study conversational interactions, was used for transcription and coding of the videotape data. The findings focus on the quantity of speech, diversity of speech, complexity of speech, and the quantity of gesture to inform the vocabulary usage, number of spoken words, length of speech, and the number of object pointings observed during father-toddler interactions in a free play setting. This study will help intervention and prevention scientists understand early language development in the African-American population. It will contribute to knowledge of the role of African-American fathers’ interactions on their children’s language development. 
It will guide interventions for the early language development of African-American children.
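
The measures named in the abstract can be illustrated on plain transcribed utterances. The sketch below assumes one common operationalization: word tokens for quantity of speech, distinct word types for diversity, and mean utterance length in words for complexity. These definitions and the function name are assumptions for illustration; the study itself relied on data already transcribed and coded with the CHILDES tools, which this sketch does not replicate.

```python
import re


def speech_measures(utterances):
    """Return word tokens, word types, and mean utterance length in words."""
    # Lowercase each utterance and keep alphabetic words (with apostrophes).
    tokenized = [re.findall(r"[a-z']+", u.lower()) for u in utterances]
    tokenized = [words for words in tokenized if words]  # drop empty utterances
    tokens = sum(len(words) for words in tokenized)
    types = len({w for words in tokenized for w in words})
    mean_length = tokens / len(tokenized) if tokenized else 0.0
    return {"tokens": tokens, "types": types, "mean_length": mean_length}
```

Computing these values separately for fathers of sons and fathers of daughters would give the per-group quantities that a comparison by child gender needs.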

Keywords: parental engagement, early language development, African-American families, quantity of speech, diversity of speech, complexity of speech and the quantity of gesture

Procedia PDF Downloads 83
2152 Recognition of Tifinagh Characters with Missing Parts Using Neural Network

Authors: El Mahdi Barrah, Said Safi, Abdessamad Malaoui

Abstract:

In this paper, we present an algorithm for the reconstruction of Tifinagh characters from incomplete 2D scans. The algorithm is based on the correlation between the lost block and its neighbors. The proposed system contains three main parts: pre-processing, feature extraction and recognition. In the first step, we construct a database of Tifinagh characters. In the second step, we apply a “shape analysis algorithm”. In the classification part, we use a neural network. The simulation results demonstrate that the proposed method gives good results.

Keywords: Tifinagh character recognition, neural networks, local cost computation, ANN

Procedia PDF Downloads 307
2151 Influence of Loudness Compression on Hearing with Bone Anchored Hearing Implants

Authors: Anja Kurz, Marc Flynn, Tobias Good, Marco Caversaccio, Martin Kompis

Abstract:

Bone Anchored Hearing Implants (BAHI) are routinely used in patients with conductive or mixed hearing loss, e.g. if conventional air conduction hearing aids cannot be used. New sound processors and new fitting software now allow parameters such as loudness compression ratios or maximum power output to be adjusted separately. It is currently unclear how the choice of these parameters influences aided speech understanding in BAHI users. In this prospective experimental study, the effect of varying the compression ratio and lowering the maximum power output in a BAHI was investigated. Twelve experienced adult subjects with mixed hearing loss participated in this study. Four different compression ratios (1.0; 1.3; 1.6; 2.0) were tested along with two different maximum power output settings, resulting in a total of eight different programs. Each participant tested each program for two weeks. A blinded Latin square design was used to minimize bias. For each of the eight programs, speech understanding in quiet and in noise was assessed. For speech in quiet, the Freiburg number test and the Freiburg monosyllabic word test at 50, 65, and 80 dB SPL were used. For speech in noise, the Oldenburg sentence test was administered. Speech understanding in quiet and in noise was improved significantly in the aided condition with every program when compared to the unaided condition. However, no significant differences were found between any of the eight programs. In contrast, on a subjective level there was a significant preference for medium compression ratios of 1.3 to 1.6 and higher maximum power output.
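
A Latin square assigns test orders so that every program appears exactly once in each position across the set of orders. The abstract does not describe how the square was constructed, so the sketch below uses the simplest cyclic construction; the program labels are hypothetical stand-ins for the four compression ratios crossed with the two maximum power output settings.

```python
def latin_square(items):
    """Cyclic Latin square: row r is `items` rotated left by r positions."""
    n = len(items)
    return [[items[(r + c) % n] for c in range(n)] for r in range(n)]


# Hypothetical labels: 4 compression ratios x 2 maximum power output settings.
programs = [f"CR{ratio}/MPO-{mpo}"
            for ratio in (1.0, 1.3, 1.6, 2.0) for mpo in ("high", "low")]

# orders[r] is the sequence of programs for the r-th test order.
orders = latin_square(programs)
```

With twelve participants and eight orders, some orders would have to be reused (for example, participant `k` taking `orders[k % 8]`); that assignment is an assumption, not something the abstract states.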

Keywords: Bone Anchored Hearing Implant, baha, compression, maximum power output, speech understanding

Procedia PDF Downloads 354
2150 Lightweight Hybrid Convolutional and Recurrent Neural Networks for Wearable Sensor Based Human Activity Recognition

Authors: Sonia Perez-Gamboa, Qingquan Sun, Yan Zhang

Abstract:

Non-intrusive sensor-based human activity recognition (HAR) is utilized in a spectrum of applications, including fitness tracking devices, gaming, health care monitoring, and smartphone applications. Deep learning models such as convolutional neural networks (CNNs) and long short-term memory (LSTM) recurrent neural networks (RNNs) provide a way to achieve HAR accurately and effectively. In this paper, we design a multi-layer hybrid architecture with CNN and LSTM and explore a variety of multi-layer combinations. Based on the exploration, we present a lightweight, hybrid, and multi-layer model, which can improve the recognition performance by integrating local and scale-invariant features with dependencies of activities. The experimental results demonstrate the efficacy of the proposed model, which can achieve a 94.7% activity recognition rate on a benchmark human activity dataset. This model outperforms traditional machine learning and other deep learning methods. Additionally, our implementation achieves a balance between recognition rate and training time consumption.

Keywords: deep learning, LSTM, CNN, human activity recognition, inertial sensor

Procedia PDF Downloads 117
2149 Developing a Secure Iris Recognition System by Using Advance Convolutional Neural Network

Authors: Kamyar Fakhr, Roozbeh Salmani

Abstract:

Alphonse Bertillon developed the first biometric security system in the 1800s. Today, many governments and giant companies are considering or have procured biometrically enabled security schemes. The iris is a kaleidoscope of patterns and colors. Each individual holds a set of irises more unique than their thumbprint. Every single day, giant companies like Google and Apple are experimenting with reliable biometric systems. Now, after almost 200 years of improvements, face ID does not work with masks, it gives access to fake 3D images, and there is no global usage of biometric recognition systems as a national identity (ID) card. The goal of this paper is to demonstrate the advantages of iris recognition over all other biometric recognition systems. It makes two extensions: first, we illustrate how a very large amount of internet fraud and cyber abuse is happening due to bugs in face recognition systems and in a very large dataset of 3.4M people; second, we discuss how establishing a secure global network of iris recognition devices connected to authoritative convolutional neural networks could be the safest solution to this dilemma. Another aim of this study is to provide a system that will prevent system infiltration caused by cyber-attacks and will block all wireframes to the data until the main user ceases the procedure.

Keywords: biometric system, convolutional neural network, cyber-attack, secure

Procedia PDF Downloads 189
2148 ANAC-id - Facial Recognition to Detect Fraud

Authors: Giovanna Borges Bottino, Luis Felipe Freitas do Nascimento Alves Teixeira

Abstract:

This article presents a case study of ANAC-id, a system of the National Civil Aviation Agency (ANAC) in Brazil. ANAC-id is an artificial intelligence algorithm developed for image analysis that recognizes standard images of unobstructed, upright faces without sunglasses, making it possible to identify potential inconsistencies. It combines the YOLO architecture and three Python libraries (face recognition, face comparison, and deep face), providing robust analysis with a high level of accuracy.

Keywords: artificial intelligence, deepface, face compare, face recognition, YOLO, computer vision

Procedia PDF Downloads 123
2147 Effects of Recognition of Customer Feedback on Relationships between Emotional Labor and Job Satisfaction: Focusing On Call Centers That Offer Professional Services

Authors: Kiyoko Yoshimura, Yasunobu Kino

Abstract:

Focusing on professional call centers where workers with expertise perform services, this study aims to clarify the relationships between emotional labor and job satisfaction and the effects of recognition of customer feedback. Since professional call center operators consist of professional license holders (qualification holders) and those without a license (non-holders), the following three points are analyzed in the two groups by using covariance structure analysis and simultaneous multi-population analysis: 1) the relationship between emotional labor and job satisfaction, 2) the relationship between customer feedback and job satisfaction, and 3) the mediating effect of customer feedback between emotional labor and job satisfaction. The following results are obtained: i) no direct effect is found between job satisfaction and emotional labor for qualification holders or non-holders, ii) for qualification holders and non-holders, recognition of positive feedback and recognition of negative feedback had positive and negative effects on job satisfaction, respectively, iii) for qualification holders and non-holders, "consideration for colleagues" influences job satisfaction through recognition of positive feedback, and iv) only for qualification holders, the factors "customer-oriented emotional expression" and "emotional disharmony" have a positive and negative effect on job satisfaction, respectively, through recognition of positive feedback and recognition of negative feedback.

Keywords: call center, emotional labor, professional service, job satisfaction, customer feedback

Procedia PDF Downloads 66
2146 Distorted Document Images Dataset for Text Detection and Recognition

Authors: Ilia Zharikov, Philipp Nikitin, Ilia Vasiliev, Vladimir Dokholyan

Abstract:

With the increasing popularity of document analysis and recognition systems, text detection (TD) and optical character recognition (OCR) in document images have become challenging tasks. However, to the best of our knowledge, no publicly available datasets for these particular problems exist. In this paper, we introduce the Distorted Document Images dataset (DDI-100) and provide a detailed analysis of DDI-100 in its current state. To create the dataset, we collected 7000 unique document pages and extended the collection by applying different types of distortions and geometric transformations. In total, DDI-100 contains more than 100,000 document images together with binary text masks and text and character locations in terms of bounding boxes. We also present an analysis of several state-of-the-art TD and OCR approaches on the presented dataset. Lastly, we demonstrate the usefulness of DDI-100 for improving the accuracy and stability of the considered TD and OCR models.

Keywords: document analysis, open dataset, optical character recognition, text detection

Procedia PDF Downloads 139
2145 Recognition and Enforcement of Foreign Decree Divorces in India with Special Reference to the Hindu Marriage Act, 1955

Authors: Poonamdeep Kaur

Abstract:

With the increase in the number of Non-Resident Indian marriages, there is also an increase in foreign decree divorces, which inevitably raises the problem of recognition and enforcement of foreign judgments in India. Hindus in India are governed by the Hindu Marriage Act, 1955. According to the said Act, the courts in India have jurisdiction to try a matrimonial dispute if the marriage was performed in India or the parties to the marriage are domiciled in India, irrespective of their nationality. Sometimes, however, one of the parties to a marriage solemnized in India obtains a divorce in a foreign court and prays for the recognition and enforcement of that divorce in India. In such a case, section 13 of the Indian Civil Procedure Code, 1908, comes into play for the recognition and enforcement of foreign divorces in India. The section makes a foreign judgment conclusive in India subject to the fulfilment of certain conditions. Even if a foreign decree divorce is granted on personal connecting factors of the parties to the matrimonial dispute, such as domicile, the divorce may still be refused recognition in India by virtue of section 13 of the Indian Civil Procedure Code, 1908. The municipal law of countries is not the same throughout the world. Comity plays an important role in recognizing and enforcing a foreign judgment, but in India the principle is no longer applied mechanically, as divorce matters are dealt with strictly according to Indian law.
This paper therefore analyzes in depth the Indian case law relating to recognition and enforcement of foreign divorces. Based on this analysis, a comparative study is made with the laws of Canada and England on the same subject to find out whether the Indian law on recognition and enforcement of foreign judgments is in line with the laws of Canada and England, and whether in recent years the Indian courts have evolved new principles of private international law to deal with limping marriages. Finally, conclusions are drawn from the comparative study, and suggestions are given to make the rules of recognition and enforcement of foreign judgments on divorce more certain.

Keywords: divorce, foreign decree, private international law, recognition and enforcement of foreign judgment

Procedia PDF Downloads 164
2144 Forensic Speaker Verification in Noisy Environmental by Enhancing the Speech Signal Using ICA Approach

Authors: Ahmed Kamil Hasan Al-Ali, Bouchra Senadji, Ganesh Naik

Abstract:

We propose a system to address real environmental noise and channel mismatch in forensic speaker verification. The method suppresses various types of real environmental noise by using the independent component analysis (ICA) algorithm. Mel frequency cepstral coefficients (MFCC) or MFCC feature warping are then applied to the enhanced speech signal to extract its essential characteristics. Channel effects are reduced using an intermediate vector (i-vector) and probabilistic linear discriminant analysis (PLDA) approach for classification. The proposed algorithm is evaluated using an Australian forensic voice comparison database, combined with car, street, and home noises from QUT-NOISE at signal-to-noise ratios (SNR) ranging from -10 dB to 10 dB. Experimental results indicate that MFCC feature warping with ICA reduces the equal error rate by about 48.22%, 44.66%, and 50.07% compared with MFCC feature warping alone when the test speech signals are corrupted with random sessions of street, car, and home noises, respectively, at -10 dB SNR.
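The evaluation above corrupts test speech with noise at fixed SNR levels. A minimal sketch of that mixing step follows, assuming both signals are plain lists of samples of equal length; the function names and the toy signals are illustrative and are not taken from the paper or from QUT-NOISE.

```python
import math

def mean_power(signal):
    """Average power of a sampled signal."""
    return sum(v * v for v in signal) / len(signal)

def snr_db(speech, noise):
    """Signal-to-noise ratio in decibels."""
    return 10.0 * math.log10(mean_power(speech) / mean_power(noise))

def mix_at_snr(speech, noise, target_snr_db):
    """Scale the noise so speech/noise power matches target_snr_db, then add it.

    Returns the noisy mixture and the scaled noise.
    """
    ratio = 10.0 ** (target_snr_db / 10.0)
    gain = math.sqrt(mean_power(speech) / (mean_power(noise) * ratio))
    scaled = [gain * n for n in noise]
    mixed = [s + n for s, n in zip(speech, scaled)]
    return mixed, scaled

# Toy signals (hypothetical): a sine wave as "speech", a deterministic pattern as "noise".
speech = [math.sin(2 * math.pi * 5 * t / 100) for t in range(100)]
noise = [((37 * t) % 11 - 5) / 5.0 for t in range(100)]
mixed, scaled_noise = mix_at_snr(speech, noise, -10.0)
```

At -10 dB the scaled noise carries ten times the power of the speech, which is the hardest condition reported in the abstract.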

Keywords: noisy forensic speaker verification, ICA algorithm, MFCC, MFCC feature warping

Procedia PDF Downloads 381
2143 Optimal Feature Extraction Dimension in Finger Vein Recognition Using Kernel Principal Component Analysis

Authors: Amir Hajian, Sepehr Damavandinejadmonfared

Abstract:

In this paper, the issue of dimensionality reduction is investigated in finger vein recognition systems using kernel Principal Component Analysis (KPCA). One aspect of KPCA is finding the most appropriate kernel function for finger vein recognition, as several kernel functions can be used within PCA-based algorithms. In this paper, however, another side of PCA-based algorithms, particularly KPCA, is investigated. The dimension of the feature vector in PCA-based algorithms is important, especially in real-world applications of such algorithms. A fixed feature vector dimension has to be set to reduce the dimension of the input and output data and extract the features from them. A classifier is then applied to classify the data and make the final decision. We analyze KPCA (polynomial, Gaussian, and Laplacian) in detail and investigate the optimal feature extraction dimension in finger vein recognition using KPCA.
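The three kernels named above, and the centered kernel matrix on which KPCA operates, can be sketched as follows. This is an illustration under assumptions: the parameter names, default values, and the L1-norm form of the Laplacian kernel are common textbook choices, not details confirmed by the paper. KPCA would then eigendecompose the centered matrix and keep the leading eigenvectors; the number kept is the feature dimension the abstract studies.

```python
import math

def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """(x . y + coef0) ** degree"""
    return (sum(a * b for a, b in zip(x, y)) + coef0) ** degree

def gaussian_kernel(x, y, sigma=1.0):
    """exp(-||x - y||_2^2 / (2 sigma^2))"""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))

def laplacian_kernel(x, y, sigma=1.0):
    """exp(-||x - y||_1 / sigma); one common form, assumed here."""
    l1 = sum(abs(a - b) for a, b in zip(x, y))
    return math.exp(-l1 / sigma)

def centered_gram(samples, kernel):
    """Kernel matrix centered in feature space: K - 1K - K1 + 1K1."""
    n = len(samples)
    k = [[kernel(a, b) for b in samples] for a in samples]
    row_mean = [sum(row) / n for row in k]
    total_mean = sum(row_mean) / n
    # K is symmetric, so column means equal row means.
    return [[k[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)] for i in range(n)]

# Hypothetical two-dimensional feature vectors.
samples = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0]]
gram = centered_gram(samples, gaussian_kernel)
```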

Keywords: biometrics, finger vein recognition, principal component analysis (PCA), kernel principal component analysis (KPCA)

Procedia PDF Downloads 339
2142 Arabic Handwriting Recognition Using Local Approach

Authors: Mohammed Arif, Abdessalam Kifouche

Abstract:

Optical character recognition (OCR) plays a major role today. It can solve many serious problems and simplify human activities. OCR dates back to the 1970s, and many solutions have been proposed since then, but unfortunately they supported little besides Latin-script languages. This work proposes a system for off-line Arabic handwriting recognition. The system is based on a structural segmentation method and uses support vector machines (SVM) in the classification phase. We present the state of the art in character segmentation methods, followed by an overview of the OCR field, and we address the normalization problems we encountered. After comparing Arabic handwritten characters and the segmentation methods, we introduce our contribution in the form of a segmentation algorithm.

Keywords: OCR, segmentation, Arabic characters, PAW, post-processing, SVM

Procedia PDF Downloads 19
2141 Cells Detection and Recognition in Bone Marrow Examination with Deep Learning Method

Authors: Shiyin He, Zheng Huang

Abstract:

In this paper, deep learning methods are applied in the biomedical field to detect and count different types of cells automatically, replacing manual work in medical practice, specifically in bone marrow examination. The process is mainly composed of two steps: detection and recognition. Mask Region-based Convolutional Neural Networks (Mask-RCNN) were used for detection and image segmentation to extract cells, and then Convolutional Neural Networks (CNN) as well as a Deep Residual Network (ResNet) were used to classify them. The results of the cell detection network show efficiency high enough to meet application requirements. For the cell recognition network, two networks are compared, and the final system is fully applicable.

Keywords: cell detection, cell recognition, deep learning, Mask-RCNN, ResNet

Procedia PDF Downloads 156
2140 Kannada Handwritten Character Recognition by Edge Hinge and Edge Distribution Techniques Using Manhattan and Minimum Distance Classifiers

Authors: C. V. Aravinda, H. N. Prakash

Abstract:

In this paper, we attempt to convey fusion approaches and the state of the art pertaining to South Indian language (SIL) character recognition systems. Recognition involves three stages: data acquisition and preprocessing, feature extraction, and classification. In the first stage, the text is preprocessed and normalized so that text identification can be performed correctly. The second stage extracts relevant and informative features. The third stage makes the classification decision. Here we concentrate on two techniques for obtaining features: feature extraction and feature selection. Edge-hinge distribution is a feature that characterizes the changes in direction of a script stroke in handwritten text. It is extracted by means of a window that is slid over an edge-detected binary handwriting image. Whenever the middle pixel of the window is on, the two edge fragments (i.e., connected sequences of pixels) emerging from this pixel are considered. Their directions are measured and stored as pairs. A joint probability distribution is obtained from a large sample of such pairs. Despite continuous effort, handwriting identification remains a challenging issue because different approaches use different varieties of features. Therefore, our study focuses on handwriting recognition based on feature selection to simplify the feature extraction task, reduce the complexity of the classification system, reduce running time, and improve classification accuracy.
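The edge-hinge procedure described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: instead of tracing connected edge fragments, it looks only at which pixels on the border of the window are on, and it treats each pair of distinct border directions as one hinge. The window radius, the number of direction bins, and all names are assumptions.

```python
import math
from itertools import combinations

def edge_hinge_distribution(edges, radius=2, n_bins=8):
    """Joint distribution of direction pairs around each edge pixel.

    edges: list of rows of 0/1 values (an edge-detected binary image).
    Returns an n_bins x n_bins matrix whose entries sum to 1 when at
    least one hinge is found (all zeros otherwise).
    """
    h, w = len(edges), len(edges[0])
    counts = [[0] * n_bins for _ in range(n_bins)]
    # Offsets of the pixels lying on the border of the square window.
    ring = [(dy, dx)
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)
            if max(abs(dy), abs(dx)) == radius]
    total = 0
    for y in range(radius, h - radius):
        for x in range(radius, w - radius):
            if not edges[y][x]:
                continue  # the middle pixel must be on
            bins = set()
            for dy, dx in ring:
                if edges[y + dy][x + dx]:
                    # Image rows grow downward, so flip dy for the angle.
                    angle = math.atan2(-dy, dx) % (2 * math.pi)
                    bins.add(int(angle / (2 * math.pi) * n_bins) % n_bins)
            # Store each pair of directions leaving the middle pixel.
            for b1, b2 in combinations(sorted(bins), 2):
                counts[b1][b2] += 1
                total += 1
    if total == 0:
        return counts
    return [[c / total for c in row] for row in counts]

# A single horizontal stroke: every hinge pairs direction 0 (right) with 4 (left).
stroke = [[0] * 7, [0] * 7, [1] * 7, [0] * 7, [0] * 7]
dist = edge_hinge_distribution(stroke)
```

The resulting matrix can be flattened into a feature vector and compared between writers or characters with a distance measure such as the Manhattan distance named in the title.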

Keywords: word segmentation and recognition, character recognition, optical character recognition, hand written character recognition, South Indian languages

Procedia PDF Downloads 471
2139 A Comparative Study on Vowel Articulation in Malayalam Speaking Children Using Cochlear Implant

Authors: Deepthy Ann Joy, N. Sreedevi

Abstract:

Identifying hearing impairment (HI) at an early age, before the onset of language development, can reduce its negative effect on children's speech and language development. Early rehabilitation is very important for improving speech production in children with HI. Besides conventional hearing aids, cochlear implants are used in the rehabilitation of children with HI. However, delays in the acquisition of speech and language milestones persist in children with a cochlear implant (CI). Delay in speech milestones is reflected in speech sound errors. These errors reflect the temporal and spectral characteristics of speech. Hence, acoustical analysis of speech sounds provides a better representation of speech production skills in children with CI. The present study aimed at investigating the acoustic characteristics of vowels in Malayalam-speaking children with a cochlear implant. The participants were 20 Malayalam-speaking children aged four to seven years. The experimental group consisted of 10 children with CI, and the control group consisted of 10 typically developing children. Acoustic analysis was carried out for 5 short (/a/, /i/, /u/, /e/, /o/) and 5 long vowels (/a:/, /i:/, /u:/, /e:/, /o:/) in word-initial position. The responses were recorded and analyzed for acoustic parameters such as vowel duration, the ratio of the duration of a short and a long vowel, formant frequencies (F₁ and F₂), and Formant Centralization Ratio (FCR), computed using the formula (F₂u+F₂a+F₁i+F₁u)/(F₂i+F₁a). Findings of the present study indicated that the values for vowel duration were higher in the experimental group than in the control group for all vowels except /u/. The ratio of the duration of short and long vowels was also found to be higher in the experimental group than in the control group, except for /i/. Further, F₁ for all vowels was found to be higher in the experimental group, with variability noticed in F₂ values. 
FCR was found to be higher in the experimental group, indicating vowel centralization. Further, the results of the independent t-test revealed no significant difference between the two groups across the parameters. It was found that the spectral and temporal measures in children with CI moved towards the normal range. The result emphasizes the significance of early rehabilitation in children with hearing impairment. Rehabilitation-related aspects are also discussed in detail, which can be clinically incorporated for the betterment of speech therapy services for children with CI.
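The FCR formula quoted in the abstract, (F₂u+F₂a+F₁i+F₁u)/(F₂i+F₁a), can be computed as in the sketch below. The function name and the formant values in hertz are hypothetical and chosen only to illustrate the calculation; they are not data from the study.

```python
def formant_centralization_ratio(f1, f2):
    """FCR = (F2u + F2a + F1i + F1u) / (F2i + F1a).

    f1, f2: dicts mapping the corner vowels "a", "i", "u" to their
    first and second formant frequencies in Hz.
    """
    numerator = f2["u"] + f2["a"] + f1["i"] + f1["u"]
    denominator = f2["i"] + f1["a"]
    return numerator / denominator

# Hypothetical, well-separated corner vowels.
distinct_f1 = {"a": 800.0, "i": 300.0, "u": 350.0}
distinct_f2 = {"a": 1300.0, "i": 2300.0, "u": 900.0}
fcr_distinct = formant_centralization_ratio(distinct_f1, distinct_f2)

# Hypothetical fully centralized vowels: every vowel shares the same formants.
central_f1 = {"a": 500.0, "i": 500.0, "u": 500.0}
central_f2 = {"a": 1500.0, "i": 1500.0, "u": 1500.0}
fcr_central = formant_centralization_ratio(central_f1, central_f2)
```

With these illustrative values the centralized vowel set yields the larger ratio, which matches the abstract's reading of a higher FCR as a sign of vowel centralization.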

Keywords: acoustics, cochlear implant, Malayalam, vowels

Procedia PDF Downloads 117