Search results for: automatic recognition of speech
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3072

Search results for: automatic recognition of speech

2802 Application of Signature Verification Models for Document Recognition

Authors: Boris M. Fedorov, Liudmila P. Goncharenko, Sergey A. Sybachin, Natalia A. Mamedova, Ekaterina V. Makarenkova, Saule Rakhimova

Abstract:

In modern economic conditions, the question of the possibility of correct recognition of a signature on digital documents in order to verify the expression of will or confirm a certain operation is relevant. The additional complexity of processing lies in the dynamic variability of the signature for each individual, as well as in the way information is processed because the signature refers to biometric data. The article discusses the issues of using artificial intelligence models in order to improve the quality of signature confirmation in document recognition. The analysis of several possible options for using the model is carried out. The results of the study are given, in which it is possible to correctly determine the authenticity of the signature on small samples.

Keywords: signature recognition, biometric data, artificial intelligence, neural networks

Procedia PDF Downloads 148
2801 Intelligent Software Architecture and Automatic Re-Architecting Based on Machine Learning

Authors: Gebremeskel Hagos Gebremedhin, Feng Chong, Heyan Huang

Abstract:

Software system is the combination of architecture and organized components to accomplish a specific function or set of functions. A good software architecture facilitates application system development, promotes achievement of functional requirements, and supports system reconfiguration. We describe three studies demonstrating the utility of our architecture in the subdomain of mobile office robots and identify software engineering principles embodied in the architecture. The main aim of this paper is to analyze prove architecture design and automatic re-architecting using machine learning. Intelligence software architecture and automatic re-architecting process is reorganizing in to more suitable one of the software organizational structure system using the user access dataset for creating relationship among the components of the system. The 3-step approach of data mining was used to analyze effective recovery, transformation and implantation with the use of clustering algorithm. Therefore, automatic re-architecting without changing the source code is possible to solve the software complexity problem and system software reuse.

Keywords: intelligence, software architecture, re-architecting, software reuse, High level design

Procedia PDF Downloads 119
2800 Practical Approach to Development Automated System of Record Research Results Architectural Cultural Heritage Objects Island-Town Sviyazhsk

Authors: Timur R. Azizov, Eugenia F. Shaykhutdinova, Ayrat G. Sitdikov

Abstract:

In this article, we consider problems of automatic research result analysis and current monitoring of cultural legacy objects in island-city Sviyazhsk. We make basic concept of creating Automatic system, including developing the knowledge library with all conditions of three historical objects. In addition, we made described process of developing Automatic system of research result analysis of cultural legacy objects in island-city Sviyazhsk.

Keywords: automated system, record, results of research, unity3D, ASP .NET

Procedia PDF Downloads 245
2799 An Exploratory Survey Questionnaire to Understand What Emotions Are Important and Difficult to Communicate for People with Dysarthria and Their Methodology of Communicating

Authors: Lubna Alhinti, Heidi Christensen, Stuart Cunningham

Abstract:

People with speech disorders may rely on augmentative and alternative communication (AAC) technologies to help them communicate. However, the limitations of the current AAC technologies act as barriers to the optimal use of these technologies in daily communication settings. The ability to communicate effectively relies on a number of factors that are not limited to the intelligibility of the spoken words. In fact, non-verbal cues play a critical role in the correct comprehension of messages and having to rely on verbal communication only, as is the case with current AAC technology, may contribute to problems in communication. This is especially true for people’s ability to express their feelings and emotions, which are communicated to a large part through non-verbal cues. This paper focuses on understanding more about the non-verbal communication ability of people with dysarthria, with the overarching aim of this research being to improve AAC technology by allowing people with dysarthria to better communicate emotions. Preliminary survey results are presented that gives an understanding of how people with dysarthria convey emotions, what emotions that are important for them to get across, what emotions that are difficult for them to convey, and whether there is a difference in communicating emotions when speaking to familiar versus unfamiliar people.

Keywords: alternative and augmentative communication technology, dysarthria, speech emotion recognition, VIVOCA

Procedia PDF Downloads 164
2798 A Recognition Method of Ancient Yi Script Based on Deep Learning

Authors: Shanxiong Chen, Xu Han, Xiaolong Wang, Hui Ma

Abstract:

Yi is an ethnic group mainly living in mainland China, with its own spoken and written language systems, after development of thousands of years. Ancient Yi is one of the six ancient languages in the world, which keeps a record of the history of the Yi people and offers documents valuable for research into human civilization. Recognition of the characters in ancient Yi helps to transform the documents into an electronic form, making their storage and spreading convenient. Due to historical and regional limitations, research on recognition of ancient characters is still inadequate. Thus, deep learning technology was applied to the recognition of such characters. Five models were developed on the basis of the four-layer convolutional neural network (CNN). Alpha-Beta divergence was taken as a penalty term to re-encode output neurons of the five models. Two fully connected layers fulfilled the compression of the features. Finally, at the softmax layer, the orthographic features of ancient Yi characters were re-evaluated, their probability distributions were obtained, and characters with features of the highest probability were recognized. Tests conducted show that the method has achieved higher precision compared with the traditional CNN model for handwriting recognition of the ancient Yi.

Keywords: recognition, CNN, Yi character, divergence

Procedia PDF Downloads 165
2797 Review of the Software Used for 3D Volumetric Reconstruction of the Liver

Authors: P. Strakos, M. Jaros, T. Karasek, T. Kozubek, P. Vavra, T. Jonszta

Abstract:

In medical imaging, segmentation of different areas of human body like bones, organs, tissues, etc. is an important issue. Image segmentation allows isolating the object of interest for further processing that can lead for example to 3D model reconstruction of whole organs. Difficulty of this procedure varies from trivial for bones to quite difficult for organs like liver. The liver is being considered as one of the most difficult human body organ to segment. It is mainly for its complexity, shape versatility and proximity of other organs and tissues. Due to this facts usually substantial user effort has to be applied to obtain satisfactory results of the image segmentation. Process of image segmentation then deteriorates from automatic or semi-automatic to fairly manual one. In this paper, overview of selected available software applications that can handle semi-automatic image segmentation with further 3D volume reconstruction of human liver is presented. The applications are being evaluated based on the segmentation results of several consecutive DICOM images covering the abdominal area of the human body.

Keywords: image segmentation, semi-automatic, software, 3D volumetric reconstruction

Procedia PDF Downloads 290
2796 Quantum Cum Synaptic-Neuronal Paradigm and Schema for Human Speech Output and Autism

Authors: Gobinathan Devathasan, Kezia Devathasan

Abstract:

Objective: To improve the current modified Broca-Wernicke-Lichtheim-Kussmaul speech schema and provide insight into autism. Methods: We reviewed the pertinent literature. Current findings, involving Brodmann areas 22, 46, 9,44,45,6,4 are based on neuropathology and functional MRI studies. However, in primary autism, there is no lucid explanation and changes described, whether neuropathology or functional MRI, appear consequential. Findings: We forward an enhanced model which may explain the enigma related to autism. Vowel output is subcortical and does need cortical representation whereas consonant speech is cortical in origin. Left lateralization is needed to commence the circuitry spin as our life have evolved with L-amino acids and left spin of electrons. A fundamental species difference is we are capable of three syllable-consonants and bi-syllable expression whereas cetaceans and songbirds are confined to single or dual consonants. The 4 key sites for speech are superior auditory cortex, Broca’s two areas, and the supplementary motor cortex. Using the Argand’s diagram and Reimann’s projection, we theorize that the Euclidean three dimensional synaptic neuronal circuits of speech are quantized to coherent waves, and then decoherence takes place at area 6 (spherical representation). In this quantum state complex, 3-consonant languages are instantaneously integrated and multiple languages can be learned, verbalized and differentiated. Conclusion: We postulate that evolutionary human speech is elevated to quantum interaction unlike cetaceans and birds to achieve the three consonants/bi-syllable speech. In classical primary autism, the sudden speech switches off and on noted in several cases could now be explained not by any anatomical lesion but failure of coherence. Area 6 projects directly into prefrontal saccadic area (8); and this further explains the second primary feature in autism: lack of eye contact. The third feature which is repetitive finger gestures, located adjacent to the speech/motor areas, are actual attempts to communicate with the autistic child akin to sign language for the deaf.

Keywords: quantum neuronal paradigm, cetaceans and human speech, autism and rapid magnetic stimulation, coherence and decoherence of speech

Procedia PDF Downloads 195
2795 Characterising the Processes Underlying Emotion Recognition Deficits in Adolescents with Conduct Disorder

Authors: Nayra Martin-Key, Erich Graf, Wendy Adams, Graeme Fairchild

Abstract:

Children and adolescents with Conduct Disorder (CD) have been shown to demonstrate impairments in emotion recognition, but it is currently unclear whether this deficit is related to specific emotions or whether it represents a global deficit in emotion recognition. An emotion recognition task with concurrent eye-tracking was employed to further explore this relationship in a sample of male and female adolescents with CD. Participants made emotion categorization judgements for presented dynamic and morphed static facial expressions. The results demonstrated that males with CD, and to a lesser extent, females with CD, displayed impaired facial expression recognition in general, whereas callous-unemotional (CU) traits were linked to specific problems in sadness recognition in females with CD. A region-of-interest analysis of the eye-tracking data indicated that males with CD exhibited reduced fixation times for the eye-region of the face compared to typically-developing (TD) females, but not TD males. Females with CD did not show reduced fixation to the eye-region of the face relative to TD females. In addition, CU traits did not influence CD subjects’ attention to the eye-region of the face. These findings suggest that the emotion recognition deficits found in CD males, the worst performing group in the behavioural tasks, are partly driven by reduced attention to the eyes.

Keywords: attention, callous-unemotional traits, conduct disorder, emotion recognition, eye-region, eye-tracking, sex differences

Procedia PDF Downloads 321
2794 A Motion Dictionary to Real-Time Recognition of Sign Language Alphabet Using Dynamic Time Warping and Artificial Neural Network

Authors: Marcio Leal, Marta Villamil

Abstract:

Computacional recognition of sign languages aims to allow a greater social and digital inclusion of deaf people through interpretation of their language by computer. This article presents a model of recognition of two of global parameters from sign languages; hand configurations and hand movements. Hand motion is captured through an infrared technology and its joints are built into a virtual three-dimensional space. A Multilayer Perceptron Neural Network (MLP) was used to classify hand configurations and Dynamic Time Warping (DWT) recognizes hand motion. Beyond of the method of sign recognition, we provide a dataset of hand configurations and motion capture built with help of fluent professionals in sign languages. Despite this technology can be used to translate any sign from any signs dictionary, Brazilian Sign Language (Libras) was used as case study. Finally, the model presented in this paper achieved a recognition rate of 80.4%.

Keywords: artificial neural network, computer vision, dynamic time warping, infrared, sign language recognition

Procedia PDF Downloads 217
2793 Investigation of New Gait Representations for Improving Gait Recognition

Authors: Chirawat Wattanapanich, Hong Wei

Abstract:

This study presents new gait representations for improving gait recognition accuracy on cross gait appearances, such as normal walking, wearing a coat and carrying a bag. Based on the Gait Energy Image (GEI), two ideas are implemented to generate new gait representations. One is to append lower knee regions to the original GEI, and the other is to apply convolutional operations to the GEI and its variants. A set of new gait representations are created and used for training multi-class Support Vector Machines (SVMs). Tests are conducted on the CASIA dataset B. Various combinations of the gait representations with different convolutional kernel size and different numbers of kernels used in the convolutional processes are examined. Both the entire images as features and reduced dimensional features by Principal Component Analysis (PCA) are tested in gait recognition. Interestingly, both new techniques, appending the lower knee regions to the original GEI and convolutional GEI, can significantly contribute to the performance improvement in the gait recognition. The experimental results have shown that the average recognition rate can be improved from 75.65% to 87.50%.

Keywords: convolutional image, lower knee, gait

Procedia PDF Downloads 202
2792 Offline Signature Verification in Punjabi Based On SURF Features and Critical Point Matching Using HMM

Authors: Rajpal Kaur, Pooja Choudhary

Abstract:

Biometrics, which refers to identifying an individual based on his or her physiological or behavioral characteristics, has the capabilities to the reliably distinguish between an authorized person and an imposter. The Signature recognition systems can categorized as offline (static) and online (dynamic). This paper presents Surf Feature based recognition of offline signatures system that is trained with low-resolution scanned signature images. The signature of a person is an important biometric attribute of a human being which can be used to authenticate human identity. However the signatures of human can be handled as an image and recognized using computer vision and HMM techniques. With modern computers, there is need to develop fast algorithms for signature recognition. There are multiple techniques are defined to signature recognition with a lot of scope of research. In this paper, (static signature) off-line signature recognition & verification using surf feature with HMM is proposed, where the signature is captured and presented to the user in an image format. Signatures are verified depended on parameters extracted from the signature using various image processing techniques. The Off-line Signature Verification and Recognition is implemented using Mat lab platform. This work has been analyzed or tested and found suitable for its purpose or result. The proposed method performs better than the other recently proposed methods.

Keywords: offline signature verification, offline signature recognition, signatures, SURF features, HMM

Procedia PDF Downloads 384
2791 Performance Analysis of VoIP Coders for Different Modulations Under Pervasive Environment

Authors: Jasbinder Singh, Harjit Pal Singh, S. A. Khan

Abstract:

The work, in this paper, presents the comparison of encoded speech signals by different VoIP narrow-band and wide-band codecs for different modulation schemes. The simulation results indicate that codec has an impact on the speech quality and also effected by modulation schemes.

Keywords: VoIP, coders, modulations, BER, MOS

Procedia PDF Downloads 516
2790 Convolutional Neural Networks-Optimized Text Recognition with Binary Embeddings for Arabic Expiry Date Recognition

Authors: Mohamed Lotfy, Ghada Soliman

Abstract:

Recognizing Arabic dot-matrix digits is a challenging problem due to the unique characteristics of dot-matrix fonts, such as irregular dot spacing and varying dot sizes. This paper presents an approach for recognizing Arabic digits printed in dot matrix format. The proposed model is based on Convolutional Neural Networks (CNN) that take the dot matrix as input and generate embeddings that are rounded to generate binary representations of the digits. The binary embeddings are then used to perform Optical Character Recognition (OCR) on the digit images. To overcome the challenge of the limited availability of dotted Arabic expiration date images, we developed a True Type Font (TTF) for generating synthetic images of Arabic dot-matrix characters. The model was trained on a synthetic dataset of 3287 images and 658 synthetic images for testing, representing realistic expiration dates from 2019 to 2027 in the format of yyyy/mm/dd. Our model achieved an accuracy of 98.94% on the expiry date recognition with Arabic dot matrix format using fewer parameters and less computational resources than traditional CNN-based models. By investigating and presenting our findings comprehensively, we aim to contribute substantially to the field of OCR and pave the way for advancements in Arabic dot-matrix character recognition. Our proposed approach is not limited to Arabic dot matrix digit recognition but can also be extended to text recognition tasks, such as text classification and sentiment analysis.

Keywords: computer vision, pattern recognition, optical character recognition, deep learning

Procedia PDF Downloads 94
2789 Analysis of Linguistic Disfluencies in Bilingual Children’s Discourse

Authors: Sheena Christabel Pravin, M. Palanivelan

Abstract:

Speech disfluencies are common in spontaneous speech. The primary purpose of this study was to distinguish linguistic disfluencies from stuttering disfluencies in bilingual Tamil–English (TE) speaking children. The secondary purpose was to determine whether their disfluencies are mediated by native language dominance and/or on an early onset of developmental stuttering at childhood. A detailed study was carried out to identify the prosodic and acoustic features that uniquely represent the disfluent regions of speech. This paper focuses on statistical modeling of repetitions, prolongations, pauses and interjections in the speech corpus encompassing bilingual spontaneous utterances from school going children – English and Tamil. Two classifiers including Hidden Markov Models (HMM) and the Multilayer Perceptron (MLP), which is a class of feed-forward artificial neural network, were compared in the classification of disfluencies. The results of the classifiers document the patterns of disfluency in spontaneous speech samples of school-aged children to distinguish between Children Who Stutter (CWS) and Children with Language Impairment CLI). The ability of the models in classifying the disfluencies was measured in terms of F-measure, Recall, and Precision.

Keywords: bi-lingual, children who stutter, children with language impairment, hidden markov models, multi-layer perceptron, linguistic disfluencies, stuttering disfluencies

Procedia PDF Downloads 217
2788 Recognition of Grocery Products in Images Captured by Cellular Phones

Authors: Farshideh Einsele, Hassan Foroosh

Abstract:

In this paper, we present a robust algorithm to recognize extracted text from grocery product images captured by mobile phone cameras. Recognition of such text is challenging since text in grocery product images varies in its size, orientation, style, illumination, and can suffer from perspective distortion. Pre-processing is performed to make the characters scale and rotation invariant. Since text degradations can not be appropriately defined using wellknown geometric transformations such as translation, rotation, affine transformation and shearing, we use the whole character black pixels as our feature vector. Classification is performed with minimum distance classifier using the maximum likelihood criterion, which delivers very promising Character Recognition Rate (CRR) of 89%. We achieve considerably higher Word Recognition Rate (WRR) of 99% when using lower level linguistic knowledge about product words during the recognition process.

Keywords: camera-based OCR, feature extraction, document, image processing, grocery products

Procedia PDF Downloads 406
2787 Vision-Based Hand Segmentation Techniques for Human-Computer Interaction

Authors: M. Jebali, M. Jemni

Abstract:

This work is the part of vision based hand gesture recognition system for Natural Human Computer Interface. Hand tracking and segmentation are the primary steps for any hand gesture recognition system. The aim of this paper is to develop robust and efficient hand segmentation algorithm such as an input to another system which attempt to bring the HCI performance nearby the human-human interaction, by modeling an intelligent sign language recognition system based on prediction in the context of dialogue between the system (avatar) and the interlocutor. For the purpose of hand segmentation, an overcoming occlusion approach has been proposed for superior results for detection of hand from an image.

Keywords: HCI, sign language recognition, object tracking, hand segmentation

Procedia PDF Downloads 412
2786 An Erudite Technique for Face Detection and Recognition Using Curvature Analysis

Authors: S. Jagadeesh Kumar

Abstract:

Face detection and recognition is an authoritative technology for image database management, video surveillance, and human computer interface (HCI). Face recognition is a rapidly nascent method, which has been extensively discarded in forensics such as felonious identification, tenable entree, and custodial security. This paper recommends an erudite technique using curvature analysis (CA) that has less false positives incidence, operative in different light environments and confiscates the artifacts that are introduced during image acquisition by ring correction in polar coordinate (RCP) method. This technique affronts mean and median filtering technique to remove the artifacts but it works in polar coordinate during image acquisition. Investigational fallouts for face detection and recognition confirms decent recitation even in diagonal orientation and stance variation.

Keywords: curvature analysis, ring correction in polar coordinate method, face detection, face recognition, human computer interaction

Procedia PDF Downloads 287
2785 Emotional and Physiological Reaction While Listening the Speech of Adults Who Stutter

Authors: Xharavina V., Gallopeni F., Ahmeti K.

Abstract:

Stuttered speech is filled with intermittent sound prolongations and/or rapid part word repetitions. Oftentimes, these aberrant acoustic behaviors are associated with intermittent physical tension and struggle behaviors such as head jerks, arm jerks, finger tapping, excessive eye-blinks, etc. Additionally, the jarring nature of acoustic and physical manifestations that often accompanies moderate-severe stuttering may induce negative emotional responses in listeners, which alters communication between the person who stutters and their listeners. However, researches for the influence of negative emotions in the communication and for physical reaction are limited. Therefore, to compare psycho-physiological responses of fluent adults, while listening the speech of adults who speak fluency and adults who stutter, are necessary. This study comprises the experimental method, with total of 104 participants (average age-20 years old, SD=2.1), divided into 3 groups. All participants self-reported no impairments in speech, language, or hearing. Exploring the responses of the participants, there were used two records speeches; a voice who speaks fluently and the voice who stutters. Heartbeats and the pulse were measured by the digital blood pressure monitor called 'Tensoval', as a physiological response to the fluent and stuttering sample. Meanwhile, the emotional responses of participants were measured by the self-reporting questionnaire (Steenbarger, 2001). Results showed an increase in heartbeats during the stuttering speech compared with the fluent sample (p < 0.5). The listeners also self-reported themselves as more alive, unhappy, nervous, repulsive, sad, tense, distracted and upset when listening the stuttering words versus the words of the fluent adult (where it was reported to experience positive emotions). These data support the notions that speech with stuttering can bring a psycho-physical reaction to the listeners. Speech pathologists should be aware that listeners show intolerable physiological reactions to stuttering that remain visible over time.

Keywords: emotional, physiological, stuttering, fluent speech

Procedia PDF Downloads 143
2784 An Evaluation of Neural Network Efficacies for Image Recognition on Edge-AI Computer Vision Platform

Authors: Jie Zhao, Meng Su

Abstract:

Image recognition, as one of the most critical technologies in computer vision, works to help machine-like robotics understand a scene, that is, if deployed appropriately, will trigger the revolution in remote sensing and industry automation. With the developments of AI technologies, there are many prevailing and sophisticated neural networks as technologies developed for image recognition. However, computer vision platforms as hardware, supporting neural networks for image recognition, as crucial as the neural network technologies, need to be more congruently addressed as the research subjects. In contrast, different computer vision platforms are deterministic to leverage the performance of different neural networks for recognition. In this paper, three different computer vision platforms – Jetson Nano(with 4GB), a standalone laptop(with RTX 3000s, using CUDA), and Google Colab (web-based, using GPU) are explored and four prominent neural network architectures (including AlexNet, VGG(16/19), GoogleNet, and ResNet(18/34/50)), are investigated. In the context of pairwise usage between different computer vision platforms and distinctive neural networks, with the merits of recognition accuracy and time efficiency, the performances are evaluated. In the case study using public imageNets, our findings provide a nuanced perspective on optimizing image recognition tasks across Edge-AI platforms, offering guidance on selecting appropriate neural network structures to maximize performance under hardware constraints.

Keywords: alexNet, VGG, googleNet, resNet, Jetson nano, CUDA, COCO-NET, cifar10, imageNet large scale visual recognition challenge (ILSVRC), google colab

Procedia PDF Downloads 90
2783 Deep Learning Based Unsupervised Sport Scene Recognition and Highlights Generation

Authors: Ksenia Meshkova

Abstract:

With increasing amount of multimedia data, it is very important to automate and speed up the process of obtaining meta. This process means not just recognition of some object or its movement, but recognition of the entire scene versus separate frames and having timeline segmentation as a final result. Labeling datasets is time consuming, besides, attributing characteristics to particular scenes is clearly difficult due to their nature. In this article, we will consider autoencoders application to unsupervised scene recognition and clusterization based on interpretable features. Further, we will focus on particular types of auto encoders that relevant to our study. We will take a look at the specificity of deep learning related to information theory and rate-distortion theory and describe the solutions empowering poor interpretability of deep learning in media content processing. As a conclusion, we will present the results of the work of custom framework, based on autoencoders, capable of scene recognition as was deeply studied above, with highlights generation resulted out of this recognition. We will not describe in detail the mathematical description of neural networks work but will clarify the necessary concepts and pay attention to important nuances.

Keywords: neural networks, computer vision, representation learning, autoencoders

Procedia PDF Downloads 127
2782 Speech Acts of Selected Classroom Encounters: Analyzing the Speech Acts of a Career Technology Lesson

Authors: Michael Amankwaa Adu

Abstract:

Effective communication in the classroom plays a vital role in ensuring successful teaching and learning. In particular, the types of language and speech acts teachers use shape classroom interactions and influence student engagement. This study aims to analyze the speech acts employed by a Career Technology teacher in a junior high school. While much research has focused on speech acts in language classrooms, less attention has been given to how these acts operate in non-language subject areas like technical education. The study explores how different types of speech acts—directives, assertives, expressives, and commissives—are used during three classroom encounters: lesson introduction, content delivery, and classroom management. This research seeks to fill the gap in understanding how teachers of non-language subjects use speech acts to manage classroom dynamics and facilitate learning. The study employs a mixed-methods design, combining qualitative and quantitative approaches. Data was collected through direct classroom observation and audio recordings of a one-hour Career Technology lesson. The transcriptions of the lesson were analyzed using John Searle’s taxonomy of speech acts, classifying the teacher’s utterances into directives, assertives, expressives, and commissives. Results show that directives were the most frequently used speech act, accounting for 59.3% of the teacher's utterances. These speech acts were essential in guiding student behavior, giving instructions, and maintaining classroom control. Assertives made up 20.4% of the speech acts, primarily used for stating facts and reinforcing content. Expressives, at 14.2%, expressed emotions such as approval or frustration, helping to manage the emotional atmosphere of the classroom. Commissives were the least used, representing 6.2% of the speech acts, often used to set expectations or outline future actions. No declarations were observed during the lesson. The findings of this study reveal the critical role that speech acts play in managing classroom behavior and delivering content in technical subjects. Directives were crucial for ensuring students followed instructions and completed tasks, while assertives helped in reinforcing lesson objectives. Expressives contributed to motivating or disciplining students, and commissives, though less frequent, helped set clear expectations for students’ future actions. The absence of declarations suggests that the teacher prioritized guiding students over making formal pronouncements. These insights can inform teaching strategies across various subject areas, demonstrating that a diverse use of speech acts can create a balanced and interactive learning environment. This study contributes to the growing field of pragmatics in education and offers practical recommendations for educators, particularly in non-language classrooms, on how to utilize speech acts to enhance both classroom management and student engagement.

Keywords: classroom interaction, pragmatics, speech acts, teacher communication, career technology

Procedia PDF Downloads 21
2781 A Weighted Approach to Unconstrained Iris Recognition

Authors: Yao-Hong Tsai

Abstract:

This paper presents a weighted approach to unconstrained iris recognition. Nowadays, commercial systems are usually characterized by strong acquisition constraints based on the subject’s cooperation. However, it is not always achievable for real scenarios in our daily life. Researchers have been focused on reducing these constraints and maintaining the performance of the system by new techniques at the same time. With large variation in the environment, there are two main improvements to develop the proposed iris recognition system. For solving extremely uneven lighting condition, statistic based illumination normalization is first used on eye region to increase the accuracy of iris feature. The detection of the iris image is based on Adaboost algorithm. Secondly, the weighted approach is designed by Gaussian functions according to the distance to the center of the iris. Furthermore, local binary pattern (LBP) histogram is then applied to texture classification with the weight. Experiment showed that the proposed system provided users a more flexible and feasible way to interact with the verification system through iris recognition.

Keywords: authentication, iris recognition, adaboost, local binary pattern

Procedia PDF Downloads 225
2780 Reviewing Image Recognition and Anomaly Detection Methods Utilizing GANs

Authors: Agastya Pratap Singh

Abstract:

This review paper examines the emerging applications of generative adversarial networks (GANs) in the fields of image recognition and anomaly detection. With the rapid growth of digital image data, the need for efficient and accurate methodologies to identify and classify images has become increasingly critical. GANs, known for their ability to generate realistic data, have gained significant attention for their potential to enhance traditional image recognition systems and improve anomaly detection performance. The paper systematically analyzes various GAN architectures and their modifications tailored for image recognition tasks, highlighting their strengths and limitations. Additionally, it delves into the effectiveness of GANs in detecting anomalies in diverse datasets, including medical imaging, industrial inspection, and surveillance. The review also discusses the challenges faced in training GANs, such as mode collapse and stability issues, and presents recent advancements aimed at overcoming these obstacles.

Keywords: generative adversarial networks, image recognition, anomaly detection, synthetic data generation, deep learning, computer vision, unsupervised learning, pattern recognition, model evaluation, machine learning applications

Procedia PDF Downloads 26
2779 The Importance of the Historical Approach in the Linguistic Research

Authors: Zoran Spasovski

Abstract:

The paper shortly discusses the significance and the benefits of the historical approach in the research of languages by presenting examples of it in the fields of phonetics and phonology, lexicology, morphology, syntax, and even in the onomastics (toponomy and anthroponomy). The examples from the field of phonetics/phonology include insights into animal speech and its evolution into human speech, the evolution of the sounds of human speech from vocals to glides and consonants and from velar consonants to palatal, etc., on well-known examples of former researchers. Those from the field of lexicology show shortly the formation of the lexemes and their evolution; the morphology and syntax are explained by examples of the development of grammar and syntax forms, and the importance of the historical approach in the research of place-names and personal names is briefly outlined through examples of place-names and personal names and surnames, and the conclusions that come from it, in different languages.

Keywords: animal speech, glotogenesis, grammar forms, lexicology, place-names, personal names, surnames, syntax categories

Procedia PDF Downloads 85
2778 Efficient Feature Fusion for Noise Iris in Unconstrained Environment

Authors: Yao-Hong Tsai

Abstract:

This paper presents an efficient fusion algorithm for iris images to generate stable feature for recognition in unconstrained environment. Recently, iris recognition systems are focused on real scenarios in our daily life without the subject’s cooperation. Under large variation in the environment, the objective of this paper is to combine information from multiple images of the same iris. The result of image fusion is a new image which is more stable for further iris recognition than each original noise iris image. A wavelet-based approach for multi-resolution image fusion is applied in the fusion process. The detection of the iris image is based on Adaboost algorithm and then local binary pattern (LBP) histogram is then applied to texture classification with the weighting scheme. Experiment showed that the generated features from the proposed fusion algorithm can improve the performance for verification system through iris recognition.

Keywords: image fusion, iris recognition, local binary pattern, wavelet

Procedia PDF Downloads 367
2777 An Automated System for the Detection of Citrus Greening Disease Based on Visual Descriptors

Authors: Sidra Naeem, Ayesha Naeem, Sahar Rahim, Nadia Nawaz Qadri

Abstract:

Citrus greening is a bacterial disease that causes considerable damage to citrus fruits worldwide. Efficient method for this disease detection must be carried out to minimize the production loss. This paper presents a pattern recognition system that comprises three stages for the detection of citrus greening from Orange leaves: segmentation, feature extraction and classification. Image segmentation is accomplished by adaptive thresholding. The feature extraction stage comprises of three visual descriptors i.e. shape, color and texture. From shape feature we have used asymmetry index, from color feature we have used histogram of Cb component from YCbCr domain and from texture feature we have used local binary pattern. Classification was done using support vector machines and k nearest neighbors. The best performances of the system is Accuracy = 88.02% and AUROC = 90.1% was achieved by automatic segmented images. Our experiments validate that: (1). Segmentation is an imperative preprocessing step for computer assisted diagnosis of citrus greening, and (2). The combination of shape, color and texture features form a complementary set towards the identification of citrus greening disease.

Keywords: citrus greening, pattern recognition, feature extraction, classification

Procedia PDF Downloads 184
2776 Online Handwritten Character Recognition for South Indian Scripts Using Support Vector Machines

Authors: Steffy Maria Joseph, Abdu Rahiman V, Abdul Hameed K. M.

Abstract:

Online handwritten character recognition is a challenging field in Artificial Intelligence. The classification success rate of current techniques decreases when the dataset involves similarity and complexity in stroke styles, number of strokes and stroke characteristics variations. Malayalam is a complex south indian language spoken by about 35 million people especially in Kerala and Lakshadweep islands. In this paper, we consider the significant feature extraction for the similar stroke styles of Malayalam. This extracted feature set are suitable for the recognition of other handwritten south indian languages like Tamil, Telugu and Kannada. A classification scheme based on support vector machines (SVM) is proposed to improve the accuracy in classification and recognition of online malayalam handwritten characters. SVM Classifiers are the best for real world applications. The contribution of various features towards the accuracy in recognition is analysed. Performance for different kernels of SVM are also studied. A graphical user interface has developed for reading and displaying the character. Different writing styles are taken for each of the 44 alphabets. Various features are extracted and used for classification after the preprocessing of input data samples. Highest recognition accuracy of 97% is obtained experimentally at the best feature combination with polynomial kernel in SVM.

Keywords: SVM, matlab, malayalam, South Indian scripts, onlinehandwritten character recognition

Procedia PDF Downloads 574
2775 Gender Recognition with Deep Belief Networks

Authors: Xiaoqi Jia, Qing Zhu, Hao Zhang, Su Yang

Abstract:

A gender recognition system is able to tell the gender of the given person through a few of frontal facial images. An effective gender recognition approach enables to improve the performance of many other applications, including security monitoring, human-computer interaction, image or video retrieval and so on. In this paper, we present an effective method for gender classification task in frontal facial images based on deep belief networks (DBNs), which can pre-train model and improve accuracy a little bit. Our experiments have shown that the pre-training method with DBNs for gender classification task is feasible and achieves a little improvement of accuracy on FERET and CAS-PEAL-R1 facial datasets.

Keywords: gender recognition, beep belief net-works, semi-supervised learning, greedy-layer wise RBMs

Procedia PDF Downloads 453
2774 Improving Activity Recognition Classification of Repetitious Beginner Swimming Using a 2-Step Peak/Valley Segmentation Method with Smoothing and Resampling for Machine Learning

Authors: Larry Powell, Seth Polsley, Drew Casey, Tracy Hammond

Abstract:

Human activity recognition (HAR) systems have shown positive performance when recognizing repetitive activities like walking, running, and sleeping. Water-based activities are a reasonably new area for activity recognition. However, water-based activity recognition has largely focused on supporting the elite and competitive swimming population, which already has amazing coordination and proper form. Beginner swimmers are not perfect, and activity recognition needs to support the individual motions to help beginners. Activity recognition algorithms are traditionally built around short segments of timed sensor data. Using a time window input can cause performance issues in the machine learning model. The window’s size can be too small or large, requiring careful tuning and precise data segmentation. In this work, we present a method that uses a time window as the initial segmentation, then separates the data based on the change in the sensor value. Our system uses a multi-phase segmentation method that pulls all peaks and valleys for each axis of an accelerometer placed on the swimmer’s lower back. This results in high recognition performance using leave-one-subject-out validation on our study with 20 beginner swimmers, with our model optimized from our final dataset resulting in an F-Score of 0.95.

Keywords: time window, peak/valley segmentation, feature extraction, beginner swimming, activity recognition

Procedia PDF Downloads 123
2773 Frequency of Consonant Production Errors in Children with Speech Sound Disorder: A Retrospective-Descriptive Study

Authors: Amulya P. Rao, Prathima S., Sreedevi N.

Abstract:

Speech sound disorders (SSD) encompass the major concern in younger population of India with highest prevalence rate among the speech disorders. Children with SSD if not identified and rehabilitated at the earliest, are at risk for academic difficulties. This necessitates early identification using screening tools assessing the frequently misarticulated speech sounds. The literature on frequently misarticulated speech sounds is ample in English and other western languages targeting individuals with various communication disorders. Articulation is language specific, and there are limited studies reporting the same in Kannada, a Dravidian Language. Hence, the present study aimed to identify the frequently misarticulated consonants in Kannada and also to examine the error type. A retrospective, descriptive study was carried out using secondary data analysis of 41 participants (34-phonetic type and 7-phonemic type) with SSD in the age range 3-to 12-years. All the consonants of Kannada were analyzed by considering three words for each speech sound from the Kannada Diagnostic Photo Articulation test (KDPAT). Picture naming task was carried out, and responses were audio recorded. The recorded data were transcribed using IPA 2018 broad transcription. A criterion of 2/3 or 3/3 error productions was set to consider the speech sound to be an error. Number of error productions was calculated for each consonant in each participant. Then, the percentage of participants meeting the criteria were documented for each consonant to identify the frequently misarticulated speech sound. Overall results indicated that velar /k/ (48.78%) and /g/ (43.90%) were frequently misarticulated followed by voiced retroflex /ɖ/ (36.58%) and trill /r/ (36.58%). The lateral retroflex /ɭ/ was misarticulated by 31.70% of the children with SSD. Dentals (/t/, /n/), bilabials (/p/, /b/, /m/) and labiodental /v/ were produced correctly by all the participants. The highly misarticulated velars /k/ and /g/ were frequently substituted by dentals /t/ and /d/ respectively or omitted. Participants with SSD-phonemic type had multiple substitutions for one speech sound whereas, SSD-phonetic type had consistent single sound substitutions. Intra- and inter-judge reliability for 10% of the data using Cronbach’s Alpha revealed good reliability (0.8 ≤ α < 0.9). Analyzing a larger sample by replicating such studies will validate the present study results.

Keywords: consonant, frequently misarticulated, Kannada, SSD

Procedia PDF Downloads 134