Search results for: speaker recognition
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1742

1232 Investigation of Flow Effects of Soundwaves Incident on an Airfoil

Authors: Thirsa Sherry, Utkarsh Shrivastav, Kannan B. T., Iynthezhuton K.

Abstract:

Aerodynamics and aeroacoustics remain among the most pertinent and well-researched fields today. The current paper aims to investigate the effects of noise of varying frequencies and waveforms on the airflow surrounding an airfoil. Using a single speaker placed beneath the airfoil at different positions, we aim to simulate the effects of sound directly impinging on an airfoil and study its direct effects on airflow. We study this using smoke visualization, with incense as the smoke-generating material, in a variable-speed subsonic wind tunnel. Using frequencies and wavelengths similar to those of common engine noise, we aim to simulate real-world conditions of engine noise interfering with airflow and document the arising trends. These results will allow us to look into the real-world effects of noise on airflow and how to minimize them, and to expand on the possible relation between waveforms and noise. The parameters varied in the study are frequency, Reynolds number, waveform, and angle of attack, and the effects on airflow of varying each of them are examined.

Keywords: engine noise, aeroacoustics, acoustic excitation, low speed

Procedia PDF Downloads 72
1231 Mirrors and Lenses: Multiple Views on Recognition in Holocaust Literature

Authors: Kirsten A. Bartels

Abstract:

There are a number of similarities between survivor literature and Holocaust fiction for children and young adults. The paper explores three facets of the parallels of recognition found specifically between Livia Bitton-Jackson’s memoir of her experience during the Holocaust as an inmate in Auschwitz, I Have Lived a Thousand Years (1999), and Morris Gleitzman’s series of Holocaust fiction. While Bitton-Jackson reflects on her past and Gleitzman designs a fictive character, both are judicious with what they are willing to impart, only providing information about their appearance or themselves when it impacts others or when it serves a necessary purpose to the story. Another similarity lies in a further critical aspect of many works of Holocaust literature: the idea of being ‘representatively Jewish’. The authors come to this idea from different angles, perhaps best explained as the difference between showing and telling, for Bitton-Jackson provides personal details, while Gleitzman arguably constructed Felix with this idea in mind. Interwoven through their journeys is a shift in perspectives on being recognized, from wanting to be seen as individuals to being seen as Jews. With this, being Jewish takes on a different meaning: both youths struggle with being labeled as something they do not truly understand, and may not have truly identified with, as that label shifts into a death warrant. Because survivor literature is viewed as the most credible and worthwhile type of Holocaust literature, and Holocaust fiction is often seen as the least (with children’s and young-adult fiction being the lowest form), the similarities in their approaches to telling the stories may go overlooked or be undervalued. This paper serves as an exploration of some of the parallel messages shared between the two.

Keywords: holocaust fiction, Holocaust literature, representatively Jewish, survivor literature

Procedia PDF Downloads 134
1230 Correlation between Speech Emotion Recognition Deep Learning Models and Noises

Authors: Leah Lee

Abstract:

This paper examines the correlation between deep learning models and emotions with noises to see whether or not noises mask emotions. The deep learning models used are plain convolutional neural networks (CNN), auto-encoder, long short-term memory (LSTM), and Visual Geometry Group-16 (VGG-16). The emotion datasets used are the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), the Crowd-sourced Emotional Multimodal Actors Dataset (CREMA-D), the Toronto Emotional Speech Set (TESS), and Surrey Audio-Visual Expressed Emotion (SAVEE). To make the combined dataset four times bigger, augmentations of the audio files, including stretch and pitch, are utilized. From the augmented datasets, five different features are extracted as inputs to the models. There are eight different emotions to be classified. The noise variations are white noise, dog barking, and cough sounds. The signal-to-noise ratio (SNR) is varied over 0, 20, and 40. In summary, for each deep learning model, nine different sets with noise and SNR variations, plus the augmented audio files without any noise, will be used in the experiment. To compare the results of the deep learning models, the accuracy and receiver operating characteristic (ROC) are checked.
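
The noise conditions described above can be illustrated with a minimal sketch of mixing a noise recording into clean speech at a target SNR. This is not the author's code: it assumes the SNR values are in decibels and that both signals are plain lists of samples at the same rate, and the names `mix_at_snr` and `measured_snr_db` are hypothetical. A synthetic sine wave and Gaussian white noise stand in for real recordings.

```python
import math
import random


def mean_power(signal):
    """Average power (mean of squared samples) of a sequence."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the mixture has the requested SNR (dB)."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    # SNR_dB = 10 * log10(P_speech / P_scaled_noise), solved for the noise gain.
    gain = math.sqrt(mean_power(speech) / (mean_power(noise) * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, mixture):
    """Recover the SNR of a mixture when the clean speech is known."""
    residual = [m - s for s, m in zip(speech, mixture)]
    return 10 * math.log10(mean_power(speech) / mean_power(residual))


# Stand-in signals (assumptions): one second of a 220 Hz tone and white noise.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
white = [rng.gauss(0.0, 1.0) for _ in range(16000)]

# One mixture per SNR level named in the abstract.
mixtures = {snr: mix_at_snr(speech, white, snr) for snr in (0, 20, 40)}
```

Repeating the same call with a dog-barking or cough recording in place of `white` would give the three noise types times three SNR levels, i.e. the nine noisy sets mentioned in the abstract.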

Keywords: auto-encoder, convolutional neural networks, long short-term memory, speech emotion recognition, visual geometry group-16

Procedia PDF Downloads 47
1229 Using Deep Learning Real-Time Object Detection Convolution Neural Networks for Fast Fruit Recognition in the Tree

Authors: K. Bresilla, L. Manfrini, B. Morandi, A. Boini, G. Perulli, L. C. Grappadelli

Abstract:

Image/video processing for fruit in the tree using hard-coded feature extraction algorithms has shown high accuracy in recent years. While accurate, these approaches, even with high-end hardware, are computationally intensive and too slow for real-time systems. This paper details the use of deep convolutional neural networks (CNNs), specifically an algorithm (YOLO - You Only Look Once) with 24+2 convolution layers. Using deep-learning techniques eliminated the need to hard-code specific features for specific fruit shapes, colors and/or other attributes. This CNN was trained on more than 5000 images of apple and pear fruits on a 960-core GPU (Graphics Processing Unit). The testing set showed an accuracy of 90%. After this, the trained model was transferred to an embedded device (Raspberry Pi gen. 3) with a camera for more portability. Based on the correlation between the number of visible or detected fruits in one frame and the real number of fruits on one tree, a model was created to accommodate this error rate. The processing and detection speed of the whole platform was higher than 40 frames per second. This speed is fast enough for any grasping/harvesting robotic arm or other real-time applications.

Keywords: artificial intelligence, computer vision, deep learning, fruit recognition, harvesting robot, precision agriculture

Procedia PDF Downloads 389
1228 Multilingual Females and Linguistic Change: A Quantitative and Qualitative Sociolinguistic Case Study of Minority Speaker in Southeast Asia

Authors: Stefanie Siebenhütter

Abstract:

Men and women use minority and majority languages differently and with varying confidence levels. This paper contrasts gendered differences in language use with socioeconomic status and age factors of minority language speakers in Southeast Asia. Language use and competence are conditioned by the variable of gender. Potential reasons for this variation are given by examining gendered language awareness and sociolinguistic attitudes. Moreover, the paper analyzes whether women in multilingual minority speakers’ societies function as 'leaders of linguistic change', as represented in Labov’s sociolinguistic model. It asks whether societal role expectations in collectivistic cultures influence the model of linguistic change. The findings reveal speaking preferences and suggest predictions on prospective language use, which is a stable situation of multilingualism. The study further exhibits differences between male and female identity-forming processes and shows why females are the leaders of (socio-)linguistic change.

Keywords: gender, identity construction, multilingual minorities, linguistic change, social networks

Procedia PDF Downloads 134
1227 Socioeconomic Status and Gender Influence on Linguistic Change: A Case Study on Language Competence and Confidence of Multilingual Minority Language Speakers

Authors: Stefanie Siebenhütter

Abstract:

Male and female speakers use language differently and with varying confidence levels. This paper contrasts gendered differences in language use with socioeconomic status and age factors. It specifically examines how Kui minority language use and competence are conditioned by the variable of gender and discusses potential reasons for this variation by examining gendered language awareness and sociolinguistic attitudes. Moreover, it discusses whether women in Kui society function as 'leaders of linguistic change', as represented in Labov’s sociolinguistic model, and whether societal role expectations in collectivistic cultures influence the model of linguistic change. The findings reveal current Kui speaking preferences and give predictions on prospective language use, which is a stable situation of multilingualism because the current Kui speakers will socialize and teach the prospective Kui speakers in the near future. The study further confirms that Lao is losing importance in the daily lives of (female) Kui speakers.

Keywords: gender, identity construction, language change, minority language, multilingualism, sociolinguistics, social networks

Procedia PDF Downloads 148
1226 ‘Non-Legitimate’ Voices as L2 Models: Towards Becoming a Legitimate L2 Speaker

Authors: M. Rilliard

Abstract:

Based on a Multiliteracies-inspired and sociolinguistically-informed advanced French composition class, this study employed autobiographical narratives from speakers traditionally considered non-legitimate models for L2 teaching, with the aim of inspiring students to develop an authentic L2 voice and to see themselves as legitimate L2 speakers. Students explored their L2 identities in French through a self-inspired fictional character. Two autobiographical narratives of identity quest by non-traditional French speakers guided them through this process: the novel Le Bleu des Abeilles (2013) and the film Qu’Allah Bénisse la France (2014). Written and oral productions in French for different genres, as well as metalinguistic reflections in English, were collected and analyzed. Results indicate that ideas and materials that were relatable to students, namely relatable experiences and relatable language, were most useful to them in developing their L2 voices and achieving authentic and legitimate L2 speakership. These results point towards the benefits of using non-traditional speakers as pedagogical models, as they serve to legitimize students’ sense of their own L2-speakership, which ultimately leads them towards a better, more informed mastery of the language.

Keywords: foreign language classroom, L2 identity, L2 learning and teaching, L2 writing, sociolinguistics

Procedia PDF Downloads 110
1225 Effective Nutrition Label Use on Smartphones

Authors: Vladimir Kulyukin, Tanwir Zaman, Sarat Kiran Andhavarapu

Abstract:

Research on nutrition label use identifies four factors that impede comprehension and retention of nutrition information by consumers: label’s location on the package, presentation of information within the label, label’s surface size, and surrounding visual clutter. In this paper, a system is presented that makes nutrition label use more effective for nutrition information comprehension and retention. The system’s front end is a smartphone application. The system’s back end is a four node Linux cluster for image recognition and data storage. Image frames captured on the smartphone are sent to the back end for skewed or aligned barcode recognition. When barcodes are recognized, corresponding nutrition labels are retrieved from a cloud database and presented to the user on the smartphone’s touchscreen. Each displayed nutrition label is positioned centrally on the touchscreen with no surrounding visual clutter. Wikipedia links to important nutrition terms are embedded to improve comprehension and retention of nutrition information. Standard touch gestures (e.g., zoom in/out) available on mainstream smartphones are used to manipulate the label’s surface size. The nutrition label database currently includes 200,000 nutrition labels compiled from public web sites by a custom crawler. Stress test experiments with the node cluster are presented. Implications for proactive nutrition management and food policy are discussed.

Keywords: mobile computing, cloud computing, nutrition label use, nutrition management, barcode scanning

Procedia PDF Downloads 345
1224 Performance Assessment of Multi-Level Ensemble for Multi-Class Problems

Authors: Rodolfo Lorbieski, Silvia Modesto Nassar

Abstract:

Many supervised machine learning tasks require decision making across numerous different classes. Multi-class classification has several applications, such as face recognition, text recognition and medical diagnostics. The objective of this article is to analyze an adapted method of Stacking in multi-class problems, which combines ensembles within the ensemble itself. For this purpose, a training procedure similar to Stacking was used, but with three levels, where the final decision-maker (level 2) performs its training by combining outputs from the tree-based pair of meta-classifiers (level 1) from Bayesian families. These are in turn trained by pairs of base classifiers (level 0) of the same family. This strategy seeks to promote diversity among the ensembles forming the level 2 meta-classifier. Three performance measures were used: (1) accuracy, (2) area under the ROC curve, and (3) time, for three factors: (a) datasets, (b) experiments and (c) levels. To compare the factors, a three-way ANOVA test was executed for each performance measure, considering 5 datasets by 25 experiments by 3 levels. A triple interaction between factors was observed only in time. The accuracy and area under the ROC curve presented similar results, showing a double interaction between level and experiment, as well as for the dataset factor. It was concluded that level 2 had an average performance above the other levels and that the proposed method is especially efficient for multi-class problems when compared to binary problems.

Keywords: stacking, multi-layers, ensemble, multi-class

Procedia PDF Downloads 245
1223 Entrepreneurial Leadership in Malaysian Public University: Competency and Behavior in the Face of Institutional Adversity

Authors: Noorlizawati Abd Rahim, Zainai Mohamed, Zaidatun Tasir, Astuty Amrin, Haliyana Khalid, Nina Diana Nawi

Abstract:

Entrepreneurial leaders have been sought as in-demand talents to lead profit-driven organizations during turbulent and unprecedented times. However, research regarding the pertinence of their roles in the public sector has been limited. This paper examined the characteristics of the challenging experiences encountered by senior leaders in public universities that require them to embrace entrepreneurialism in their leadership. Through a focus group interview with five Malaysian university top senior leaders with experience being Vice-Chancellor, we explored and developed a framework of institutional adversity characteristics and exemplary entrepreneurial leadership competency in the face of adversity. Complexity of diverse stakeholders, multiplicity of academic disciplines, unfamiliarity to lead different and broader roles, leading new directions, and creating change in high velocity and uncertain environment are among the dimensions that characterise institutional adversities. Our findings revealed that learning agility, opportunity recognition capacity, and bridging capability are among the characteristics of entrepreneurial university leaders. The findings reinforced that the presence of specific attributes in institutional adversity and experiences in overcoming those challenges may contribute to the development of entrepreneurial leadership capabilities.

Keywords: bridging capability, entrepreneurial leadership, leadership development, learning agility, opportunity recognition, university leaders

Procedia PDF Downloads 90
1222 Optimization of the Dental Direct Digital Imaging by Applying the Self-Recognition Technology

Authors: Mina Dabirinezhad, Mohsen Bayat Pour, Amin Dabirinejad

Abstract:

This paper introduces a technology to address some of the deficiencies of direct digital radiology. Digital radiology is the latest progression in dental imaging and has become an essential part of dentistry. Direct digital radiology comprises two main parts: an intraoral X-ray machine and a sensor (digital image receptor). Dentists and dental nurses experience difficulties while taking images with the direct digital X-ray machine. For instance, they sometimes need to readjust the sensor in the patient’s mouth and retake the X-ray image because of its low quality. Another problem is that the sensor may move in the patient’s mouth, which produces an inappropriate image for the dentist. This makes the process time-consuming for dentists and dental nurses. In addition, taking several X-ray images causes problems for the patient, such as harm to their health and pain in the mouth due to the pressure of the sensor on the jaw. The authors present a technology, called “Self-Recognition Direct Digital Radiology” (SDDR), to solve the above-mentioned issues. This technology is based on the principle that the intraoral X-ray machine is capable of detecting the location of the sensor in the patient’s mouth automatically. In addition to solving the aforementioned problems, SDDR technology has fewer environmental impacts in comparison to the previous version.

Keywords: dental direct digital imaging, digital image receptor, digital X-ray machine, environmental impacts

Procedia PDF Downloads 118
1221 Development of a New Characterization Method to Analyse Cypermethrin Penetration in Wood Material by Immunolabelling

Authors: Sandra Tapin-Lingua, Katia Ruel, Jean-Paul Joseleau, Daouia Messaoudi, Olivier Fahy, Michel Petit-Conil

Abstract:

The preservative efficacy of organic biocides is strongly related to their capacity for penetration and retention within wood tissues. The specific detection of the pyrethroid insecticide is currently obtained after extraction followed by chemical analysis by chromatography techniques. However, visualizing the insecticide molecule within the wood structure requires specific probes together with microscopy techniques. Therefore, the aim of the present work was to apply a new methodology based on antibody-antigen recognition and electron microscopy to visualize pyrethroids directly in the wood material. A polyclonal antibody directed against cypermethrin was developed and applied to Pinus sylvestris wood samples coated with technical cypermethrin. The antibody was tested on impregnated wood, and the specific recognition of the insecticide was visualized by transmission electron microscopy (TEM). The immunogold-TEM assay evidenced the capacity of the synthetic biocide to penetrate the wood. The depth of penetration was measured on sections taken at increasing distances from the coated surface of the wood. These results correlated with chemical analyses carried out by GC-ECD after extraction. In addition, the immuno-TEM investigation made it possible to visualize, for the first time at the ultrastructural scale of resolution, that cypermethrin was able to diffuse within the secondary wood cell walls.

Keywords: cypermethrin, insecticide, wood penetration, wood retention, immuno-transmission electron microscopy, polyclonal antibody

Procedia PDF Downloads 384
1220 Machine Learning and Deep Learning Approach for People Recognition and Tracking in Crowd for Safety Monitoring

Authors: A. Degale Desta, Cheng Jian

Abstract:

Deep learning applications in computer vision are rapidly advancing, giving them the ability to monitor the public and quickly identify potentially anomalous behaviour in crowd scenes. The purpose of the current work is therefore to improve the safety of people at crowd events in the face of panic behaviour by introducing the idea of Aggregation of Ensembles (AOE), which makes use of pre-trained ConvNets and a pool of classifiers to find anomalies in video data with packed scenes. Building on the premise that K-means, KNN, CNN, SVD, Faster-CNN, and YOLOv5 architectures learn different levels of semantic representation from crowd videos, the proposed approach leverages an ensemble of various fine-tuned convolutional neural networks (CNNs), allowing for the extraction of enriched feature sets. In addition to the above algorithms, a long short-term memory neural network is used to forecast future feature values, together with a handcrafted feature that takes into consideration the peculiarities of the crowd to understand human behavior. Experiments are run on well-known datasets of panic situations to assess the effectiveness and precision of the suggested method. Results reveal that, compared to state-of-the-art methodologies, the system produces better and more promising results in terms of accuracy and processing speed.

Keywords: action recognition, computer vision, crowd detecting and tracking, deep learning

Procedia PDF Downloads 128
1219 The Visible Third: Female Artists’ Participation in the Portuguese Contemporary Art World

Authors: Sonia Bernardo Correia

Abstract:

This paper is part of ongoing research that aims to understand the role of gender in the composition of the Portuguese contemporary art world and the possibilities and limits to the success of the professional paths of women and men artists. The field of visual arts is gender-sensitive as it differentiates the positions occupied by artists in terms of visibility and recognition. Women artists occupy a peripheral space, which may hinder the progression of their professional careers. Based on the collection of data on the participation of artists in Portuguese exhibitions, art fairs, auctions, and art awards between 2012 and 2019, the goal of this study is to portray female artists’ participation as a condition of professional, social, and cultural visibility. From the analysis of a significant sample of institutions from the artistic field, it was possible to observe that the works of female authors are under exhibited, never exceeding one-third of the total of exhibitions. Male artists also enjoy a comfortable majority as gallery artists (around 70%) and as part of institutional collections (around 80%). However, when analysing the younger age cohorts of artists by gender, it appears that there is representation parity, which may be a good sign of change. The data shows that there are persistent gender inequalities in accessing the artist profession. Women are not yet occupying positions of exposure, recognition, and legitimation in the market similar to those of their male counterparts, suggesting that they may face greater obstacles in experiencing successful professional trajectories.

Keywords: inequalities, invisibility of the woman artist, gender, visual arts

Procedia PDF Downloads 115
1218 Trusting Smart Speakers: Analysing the Different Levels of Trust between Technologies

Authors: Alec Wells, Aminu Bello Usman, Justin McKeown

Abstract:

The growing usage of smart speakers raises many privacy and trust concerns compared to other technologies such as smart phones and computers. In this study, a proxy measure of trust is used to gauge users’ opinions on three different technologies based on an empirical study, and to understand which technology most people are most likely to trust. The collected data were analysed using the Kruskal-Wallis H test to determine the statistical differences between the users’ trust level of the three technologies: smart speaker, computer and smart phone. The findings of the study revealed that despite the wide acceptance, ease of use and reputation of smart speakers, people find it difficult to trust smart speakers with their sensitive information via the Direct Voice Input (DVI) and would prefer to use a keyboard or touchscreen offered by computers and smart phones. Findings from this study can inform future work on users’ trust in technology based on perceived ease of use, reputation, perceived credibility and risk of using technologies via DVI.
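
For readers unfamiliar with the Kruskal-Wallis H test named above, the following is a minimal sketch of how the statistic is computed from three independent groups of ordinal scores. The scores are invented for illustration and are not data from the study, and the function names are hypothetical. The sketch uses the standard rank-based formula with average ranks for ties and the usual tie correction.

```python
def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(*groups):
    """Tie-corrected Kruskal-Wallis H statistic for two or more groups."""
    pooled = [x for group in groups for x in group]
    n_total = len(pooled)
    ranks = average_ranks(pooled)

    # Sum of (rank sum squared / group size) over the groups.
    weighted = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

    # Tie correction: 1 - sum(t^3 - t) / (N^3 - N) over groups of tied values.
    counts = {}
    for x in pooled:
        counts[x] = counts.get(x, 0) + 1
    ties = sum(c ** 3 - c for c in counts.values())
    correction = 1 - ties / (n_total ** 3 - n_total)
    if correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    return h / correction


# Invented 1-5 trust ratings, for illustration only.
smart_speaker = [2, 1, 3, 2, 2, 1]
computer = [4, 5, 3, 4, 4, 5]
smart_phone = [4, 3, 4, 5, 3, 4]
h_statistic = kruskal_wallis_h(smart_speaker, computer, smart_phone)
```

With three groups, H is compared against a chi-square distribution with two degrees of freedom; a value above roughly 5.991 indicates a difference at the 0.05 level. In practice a library routine such as `scipy.stats.kruskal` would normally be used instead of hand-written code.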

Keywords: direct voice input, risk, security, technology, trust

Procedia PDF Downloads 160
1217 Sign Language Recognition of Static Gestures Using Kinect™ and Convolutional Neural Networks

Authors: Rohit Semwal, Shivam Arora, Saurav, Sangita Roy

Abstract:

This work proposes a supervised framework with deep convolutional neural networks (CNNs) for vision-based sign language recognition of static gestures. Our approach addresses the acquisition and segmentation of correct inputs for the CNN-based classifier. The Microsoft Kinect™ sensor, despite complex environmental conditions, can track hands efficiently. Skin-colour-based segmentation is applied to cropped images of hands in different poses, which depict different sign language gestures. The segmented hand images are used as input for our classifier. The CNN classifier proposed in the paper is able to classify the input images with a high degree of accuracy. The system was trained and tested on 39 static sign language gestures, including 26 letters of the alphabet and 13 commonly used words. This paper includes a problem definition for building the proposed system, which acts as a sign language translator between deaf/mute people and the rest of society. This is followed by a review of existing knowledge in the area and work done by other researchers. The paper also briefly describes the working principles behind the different components of CNNs. The architecture and system design specifications of the proposed system are discussed in the subsequent sections of the paper to give the reader a clear picture of the system in terms of the capability required. The design then gives the top-level details of how the proposed system meets the requirements.

Keywords: sign language, CNN, HCI, segmentation

Procedia PDF Downloads 122
1216 Speech Enhancement Using Wavelet Coefficients Masking with Local Binary Patterns

Authors: Christian Arcos, Marley Vellasco, Abraham Alcaim

Abstract:

In this paper, we present a wavelet coefficients masking based on Local Binary Patterns (WLBP) approach to enhance the temporal spectra of the wavelet coefficients for speech enhancement. This technique exploits the wavelet denoising scheme, which splits the degraded speech into pyramidal subband components and extracts frequency information without losing temporal information. Speech enhancement in each high-frequency subband is performed by binary labels through the local binary pattern masking that encodes the ratio between the original value of each coefficient and the values of the neighbour coefficients. This approach enhances the high-frequency spectra of the wavelet transform instead of eliminating them through a threshold. A comparative analysis is carried out with conventional speech enhancement algorithms, demonstrating that the proposed technique achieves significant improvements in terms of PESQ, an international recommendation of objective measure for estimating subjective speech quality. Informal listening tests also show that the proposed method in an acoustic context improves the quality of speech, avoiding the annoying musical noise present in other speech enhancement techniques. Experimental results obtained with a DNN based speech recognizer in noisy environments corroborate the superiority of the proposed scheme in the robust speech recognition scenario.
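
The two building blocks named above, a wavelet split into subbands and a local binary pattern over the coefficients, can be sketched as follows. This is only an illustration under stated assumptions: it uses a single-level Haar split and the textbook 1-D local binary pattern (each neighbour contributes one bit, set when the neighbour is at least as large as the centre). The abstract does not specify the wavelet family, the neighbourhood size, or how the codes are turned into a mask, so those parts of the WLBP method are not reproduced here, and the function names are hypothetical.

```python
import math


def haar_step(samples):
    """One level of the Haar transform: (approximation, detail) subbands."""
    if len(samples) % 2:
        raise ValueError("expected an even number of samples")
    pairs = list(zip(samples[0::2], samples[1::2]))
    approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    detail = [(a - b) / math.sqrt(2) for a, b in pairs]
    return approx, detail


def lbp_codes(coeffs, radius=2):
    """1-D local binary pattern code for each coefficient with full neighbours."""
    codes = []
    for i in range(radius, len(coeffs) - radius):
        centre = coeffs[i]
        neighbours = coeffs[i - radius:i] + coeffs[i + 1:i + radius + 1]
        code = 0
        for bit, value in enumerate(neighbours):
            if value >= centre:  # neighbour at least as large: set this bit
                code |= 1 << bit
        codes.append(code)
    return codes


# Toy signal (assumption): a short noisy ramp standing in for degraded speech.
signal = [0.1, 0.4, 0.2, 0.9, 0.5, 0.3, 0.8, 0.7, 0.2, 0.6, 0.1, 0.5]
approx, detail = haar_step(signal)
detail_codes = lbp_codes(detail, radius=1)
```

Each code summarises the local shape of the high-frequency subband around one coefficient, which is the kind of binary label the abstract describes using in place of a hard threshold.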

Keywords: binary labels, local binary patterns, mask, wavelet coefficients, speech enhancement, speech recognition

Procedia PDF Downloads 201
1215 Real-Time Gesture Recognition System Using Microsoft Kinect

Authors: Ankita Wadhawan, Parteek Kumar, Umesh Kumar

Abstract:

A gesture is any body movement that expresses an attitude or sentiment. Gestures as a sign language are used by deaf people to convey messages, which helps eliminate the communication barrier between deaf and hearing people. Nowadays, mobile phones and computers are important gadgets in everyday life. However, for some physically challenged people who are blind or deaf, using a mobile phone or computer-like device is very difficult. There is therefore a great need for a system that works with body gestures or sign language as input. In this research, the Microsoft Kinect Sensor, SDK V2 and Hidden Markov Toolkit (HTK) are used to recognize objects, object motion and human body joints through a touchless NUI (Natural User Interface) in real time. The depth data collected from Microsoft Kinect are used to recognize gestures of Indian Sign Language (ISL). The recorded clips are analyzed using depth, IR and skeletal data at different angles and positions. The proposed system has an average accuracy of 85%. The developed touchless NUI provides an interface that recognizes gestures and controls the cursor and click operations on a computer through hand-waving gestures alone. This research will help deaf people make use of mobile phones and computers and socialize with other people in society.

Keywords: gesture recognition, Indian sign language, Microsoft Kinect, natural user interface, sign language

Procedia PDF Downloads 282
1214 Impact of Integrated Signals for Doing Human Activity Recognition Using Deep Learning Models

Authors: Milagros Jaén-Vargas, Javier García Martínez, Karla Miriam Reyes Leiva, María Fernanda Trujillo-Guerrero, Francisco Fernandes, Sérgio Barroso Gonçalves, Miguel Tavares Silva, Daniel Simões Lopes, José Javier Serrano Olmedo

Abstract:

Human Activity Recognition (HAR) is having a growing impact in creating new applications and is driving emerging technologies. The use of wearable sensors is also key to exploring the human body’s behavior when performing activities, since these devices are less invasive and more comfortable for the wearer. In this study, a database that includes three activities is used. The activities were acquired from inertial measurement unit (IMU) sensors and motion capture (MOCAP) systems. The main objective is to compare the performance of four Deep Learning (DL) models, namely Deep Neural Network (DNN), Convolutional Neural Network (CNN), Recurrent Neural Network (RNN) and a hybrid Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM) model, when considering acceleration, velocity and position, and to evaluate whether integrating the IMU acceleration to obtain velocity and position increases performance when these signals are used as input to the DL models. The results are also compared with the same type of data provided by the MOCAP system. Although the acceleration data are cleaned when integrated, the results show only a minimal increase in accuracy for the integrated signals.
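
The integration step mentioned above, obtaining velocity and position from IMU acceleration, can be sketched with cumulative trapezoidal integration. This is an assumption about the numerical scheme, since the abstract does not name one, and the sampling interval and signal below are made up for illustration.

```python
def cumulative_trapezoid(samples, dt, initial=0.0):
    """Running integral of uniformly sampled values using the trapezoidal rule."""
    out = [initial]
    for prev, curr in zip(samples, samples[1:]):
        out.append(out[-1] + 0.5 * (prev + curr) * dt)
    return out


# Illustrative signal (assumption): constant 2 m/s^2 for one second at 100 Hz.
dt = 0.01
acceleration = [2.0] * 101

velocity = cumulative_trapezoid(acceleration, dt)  # m/s, starting from rest
position = cumulative_trapezoid(velocity, dt)      # m, starting from the origin
```

For this constant-acceleration case the result matches the closed-form values v = a·t and x = a·t²/2. With real IMU data, any bias in the accelerometer is integrated as well, so drift correction or filtering is usually needed before the integrated signals are fed to a model.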

Keywords: HAR, IMU, MOCAP, acceleration, velocity, position, feature maps

Procedia PDF Downloads 67
1213 Correlation between Defect Suppression and Biosensing Capability of Hydrothermally Grown ZnO Nanorods

Authors: Mayoorika Shukla, Pramila Jakhar, Tejendra Dixit, I. A. Palani, Vipul Singh

Abstract:

Biosensors are analytical devices with a wide range of applications in biological, chemical, environmental and clinical analysis. A biosensor comprises a bio-recognition layer, which has biomolecules (enzymes, antibodies, DNA, etc.) immobilized over it for detection of the analyte, and a transducer, which converts the biological signal into an electrical signal. The performance of a biosensor depends primarily on the bio-recognition layer, which therefore has to be chosen wisely. In this regard, nanostructures of metal oxides such as ZnO, SnO2, V2O5, and TiO2 have been explored extensively as bio-recognition layers. Recently, ZnO has attracted the attention of researchers due to its unique properties, such as high iso-electric point, biocompatibility, stability, high electron mobility and high electron binding energy. Although there have been many reports on the use of ZnO as a bio-recognition layer, to the authors’ knowledge, none has observed a correlation between optical properties such as defect suppression and the biosensing capability of the sensor. Here, ZnO nanorods (ZNR) have been synthesized by a low-cost, simple and low-temperature hydrothermal growth process over a platinum (Pt) coated glass substrate. The ZNR were synthesized in two steps: first, a seed layer was coated over the substrate (Pt-coated glass), which was then immersed in a nutrient solution of zinc nitrate and hexamethylenetetramine (HMTA) with in situ addition of KMnO4. The addition of KMnO4 was observed to have a profound effect on the growth rate anisotropy of the ZnO nanostructures. Clustered and powdery growth of ZnO was observed without the addition of KMnO4, whereas with its addition during growth, uniform and crystalline ZNR were grown over the substrate. Moreover, this resulted in suppression of defects, as observed in the normalized photoluminescence (PL) spectra, since KMnO4 is a strong oxidizing agent that provides an oxygen-rich growth environment.
Further, to explore the correlation between defect suppression and biosensing capability of the ZNR Glucose oxidase (Gox) was immobilized over it, using physical adsorption technique followed by drop casting of nafion. Here the main objective of the work was to analyze effect of defect suppression over biosensing capability, and therefore Gox has been chosen as model enzyme, and electrochemical amperometric glucose detection was performed. The incorporation of KMnO4 during growth has resulted in variation of optical and charge transfer properties of ZNR which in turn were observed to have deep impact on biosensor figure of merits. The sensitivity of biosensor was found to increase by 12-18 times, due to variations introduced by addition of KMnO4 during growth. The amperometric detection of glucose in continuously stirred buffer solution was performed. Interestingly, defect suppression has been observed to contribute towards the improvement of biosensor performance. The detailed mechanism of growth of ZNR along with the overall influence of defect suppression on the sensing capabilities of the resulting enzymatic electrochemical biosensor and different figure of merits of the biosensor (Glass/Pt/ZNR/Gox/Nafion) will be discussed during the conference.

Keywords: biosensors, defects, KMnO4, ZnO nanorods

Procedia PDF Downloads 258
1212 Highly Accurate Target Motion Compensation Using Entropy Function Minimization

Authors: Amin Aghatabar Roodbary, Mohammad Hassan Bastani

Abstract:

One of the defects of stepped frequency radar systems is their sensitivity to target motion. In such systems, target motion causes range cell shift, false peaks, Signal to Noise Ratio (SNR) reduction and range profile spreading, because the power spectrum of each range cell interferes with adjacent range cells. This distorts the High Resolution Range Profile (HRRP) and disrupts the target recognition process, so the effects of the Target Motion Parameters (TMPs) should be compensated. In this paper, a method based on entropy minimization is proposed for estimating the TMPs (velocity and acceleration) and consequently eliminating or suppressing their unwanted effects on the HRRP. The method is carried out in two major steps. In the first step, a discrete search is performed over the whole acceleration-velocity lattice within a specific interval to find a coarse minimum point of the entropy function. In the second step, a 1-D search over velocity is performed in the neighborhood of that minimum along several constant-acceleration lines, in order to enhance the accuracy of the minimum point found in the first step. The provided simulation results demonstrate the effectiveness of the proposed method.
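The two-step search described in this abstract can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the point-scatterer signal model, the Shannon entropy of the normalized range-profile power, and the names `simulate_echo`, `hrrp_entropy` and `estimate_motion` are all hypothetical choices made for the example, as are the radar parameters in the usage note.

```python
import numpy as np

C = 3e8  # propagation speed in m/s

def simulate_echo(n_pulses, f0, df, pri, ranges, v, a):
    """Noise-free stepped-frequency echo of point scatterers (assumed model)."""
    n = np.arange(n_pulses)
    t = n * pri                      # transmit time of each pulse
    f = f0 + n * df                  # carrier frequency of each pulse
    motion = v * t + 0.5 * a * t**2  # radial displacement caused by target motion
    s = np.zeros(n_pulses, dtype=complex)
    for r0 in ranges:
        s += np.exp(-1j * 4 * np.pi * f * (r0 + motion) / C)
    return s, f, t

def hrrp_entropy(s, f, t, v, a):
    """Entropy of the range profile after compensating a candidate (v, a)."""
    comp = np.exp(1j * 4 * np.pi * f * (v * t + 0.5 * a * t**2) / C)
    power = np.abs(np.fft.ifft(s * comp)) ** 2
    p = power / power.sum()
    p = p[p > 0]                     # avoid log(0)
    return float(-(p * np.log(p)).sum())

def estimate_motion(s, f, t, v_grid, a_grid, n_fine=41):
    """Step 1: coarse lattice search. Step 2: 1-D velocity searches on
    a few constant-acceleration lines around the coarse minimum."""
    coarse = min((hrrp_entropy(s, f, t, v, a), v, a)
                 for v in v_grid for a in a_grid)
    dv = v_grid[1] - v_grid[0]
    da = a_grid[1] - a_grid[0]
    best = coarse
    for a in coarse[2] + da * np.array([-0.5, 0.0, 0.5]):
        for v in np.linspace(coarse[1] - dv, coarse[1] + dv, n_fine):
            h = hrrp_entropy(s, f, t, v, a)
            if h < best[0]:
                best = (h, v, a)
    return coarse, best              # each is (entropy, velocity, acceleration)
```

A possible usage, with made-up parameters: `s, f, t = simulate_echo(128, 10e9, 5e6, 1e-4, [1000.0, 1004.5, 1011.0], v=31.3, a=17.0)` followed by `estimate_motion(s, f, t, np.linspace(0, 60, 31), np.linspace(0, 40, 9))`. Because velocity and acceleration both contribute a quadratic phase term, the entropy surface has a ridge, which is one plausible reason for refining along constant-acceleration lines rather than in both dimensions at once.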

Keywords: automatic target recognition (ATR), high resolution range profile (HRRP), motion compensation, stepped frequency waveform technique (SFW), target motion parameters (TMPs)

Procedia PDF Downloads 131
1211 Preprocessing and Fusion of Multiple Representation of Finger Vein patterns using Conventional and Machine Learning techniques

Authors: Tomas Trainys, Algimantas Venckauskas

Abstract:

The application of biometric features to cryptography for human identification and authentication is a widely studied and promising area in the development of high-reliability cryptosystems. Biometric cryptosystems are typically designed for pattern recognition: they acquire biometric data from an individual, extract feature sets, compare each feature set against the set stored in the vault, and return the result of the comparison. Preprocessing and fusion of biometric data are the most important phases in generating a feature vector for key generation or authentication. Fusion of biometric features is critical for achieving a higher level of security and for preventing possible spoofing attacks. The paper focuses on the tasks of initial processing and fusion of multiple representations of finger vein modality patterns. These tasks are solved by applying conventional image preprocessing methods and machine learning techniques, Convolutional Neural Network (SVM) method for image segmentation and feature extraction. The article presents a method for generating sets of biometric features from a finger vein network using several instances of the same modality. The extracted feature sets were fused at the feature level. The proposed method was tested, and its performance and accuracy were compared with the results of other authors.
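Feature-level fusion, as mentioned in this abstract, can be illustrated with a minimal sketch. The abstract does not say how the feature sets are combined, so the approach below (z-score normalization of each feature vector followed by concatenation) is an assumption chosen because it is a common baseline; the function names are hypothetical.

```python
import numpy as np

def zscore(features):
    """Normalize one feature vector to zero mean and unit variance."""
    x = np.asarray(features, dtype=float)
    std = x.std()
    return (x - x.mean()) / std if std > 0 else x - x.mean()

def fuse_features(feature_sets):
    """Feature-level fusion (assumed scheme): normalize each vector so that
    no single representation dominates, then concatenate them."""
    return np.concatenate([zscore(f) for f in feature_sets])

def cosine_similarity(a, b):
    """Similarity score between two fused vectors, e.g. probe vs. enrolled."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

For example, `fuse_features([v1, v2, v3])` would take three feature vectors extracted from different images of the same finger and return one vector whose length is the sum of the three lengths.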

Keywords: bio-cryptography, biometrics, cryptographic key generation, data fusion, information security, SVM, pattern recognition, finger vein method.

Procedia PDF Downloads 123
1210 Human Gesture Recognition for Real-Time Control of Humanoid Robot

Authors: S. Aswath, Chinmaya Krishna Tilak, Amal Suresh, Ganesh Udupa

Abstract:

There are many technologies for controlling a humanoid robot, but the use of Electromyogram (EMG) electrodes has its own importance in setting up the control system. An EMG-based control system helps to control robotic devices with more fidelity and precision. In this paper, the development of an electromyogram-based interface for human gesture recognition for the control of a humanoid robot is presented. To recognize control signs in the gestures, a single-channel EMG sensor is positioned on the muscles of the human body. Instead of using a remote control unit, the humanoid robot is controlled by various gestures performed by the human. The EMG electrodes attached to the muscles generate an analog signal due to the nerve impulses produced when the muscles move. The analog signals picked up from the muscles are supplied to a differential muscle sensor, which processes them to generate a signal suitable for the microcontroller that controls the humanoid robot. The microcontroller converts the signal from the differential muscle sensor to digital form using its ADC and outputs its decision to the CM-530 humanoid robot controller through a Zigbee wireless interface. The output decision of the CM-530 processor is sent to a motor driver in order to drive the servo motors in the required direction for human-like actions. This method for gaining control of a humanoid robot could be used for performing actions with more accuracy and ease. In addition, a study has been conducted to investigate the controllability and ease of use of the interface and the employed gestures.
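The decision step on the digitized EMG signal can be sketched as below. The abstract does not describe the decision logic, so this is a hedged illustration only: rectification, a moving-average envelope and a fixed threshold are assumed, and the window length, threshold value and function names are hypothetical.

```python
import numpy as np

def emg_envelope(samples, window):
    """Rectify the zero-centered signal and smooth it with a moving average."""
    x = np.asarray(samples, dtype=float)
    rectified = np.abs(x - x.mean())
    kernel = np.ones(window) / window
    return np.convolve(rectified, kernel, mode="same")

def detect_gesture(samples, window=50, threshold=0.2):
    """Return True when muscle activity exceeds the (assumed) threshold."""
    return bool(emg_envelope(samples, window).max() > threshold)
```

In a setup like the one described, a `True` result would be the kind of decision forwarded to the robot controller, while `False` would leave the robot idle. The threshold would need to be calibrated per user and electrode placement.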

Keywords: electromyogram, gesture, muscle sensor, humanoid robot, microcontroller, Zigbee

Procedia PDF Downloads 383
1209 Variations of Metaphors: Wittgenstein's Contribution to Literary Studies

Authors: Dorit Lemberger

Abstract:

Wittgenstein directly used the term "metaphor" only infrequently and with reservations, but his writings include a number of metaphors that have become imprinted in the philosophical memory of Western thought: for example, the ladder in his book Tractatus, or, in Philosophical Investigations, the ancient city, the beetle in a box, the fly in the fly-bottle, and the duck-rabbit. In light of Wittgenstein's stressing, throughout his investigations, that the only language that exists is ordinary language, and that there is no "second-order" language, the question should be asked: how do these metaphors function specifically, and how, in general, are we to relate to language use that exceeds the normal? Wittgenstein did not disregard such phenomena, but he proposed viewing them in a different way, one that would enable understanding them as uses in ordinary language, without necessarily exceeding such language. Two important terms that he coined in this context are "secondary sense" and "experience of meaning". Each denotes language use as reflective of a subjective element characteristic of the speaker, such as intent, experience, or emphasis of a certain aspect. More recent Wittgenstein scholars added the term "quasi-metaphor", which refers to his discussion of the possibility of aesthetic judgment. This paper will examine how, according to Wittgenstein, these terms function without exceeding ordinary language, and will illustrate how they can be applied in an analysis of the poem "Butterfly" by Nelly Sachs.

Keywords: metaphor, quasi-metaphor, secondary sense, experience of meaning

Procedia PDF Downloads 412
1208 Italian Speech Vowels Landmark Detection through the Legacy Tool 'xkl' with Integration of Combined CNNs and RNNs

Authors: Kaleem Kashif, Tayyaba Anam, Yizhi Wu

Abstract:

This paper introduces a methodology for advancing Italian speech vowels landmark detection within the distinctive feature-based speech recognition domain. Leveraging the legacy tool 'xkl' by integrating combined convolutional neural networks (CNNs) and recurrent neural networks (RNNs), the study presents a comprehensive enhancement to the 'xkl' legacy software. This integration incorporates re-assigned spectrogram methodologies, enabling meticulous acoustic analysis. Simultaneously, our proposed model, integrating combined CNNs and RNNs, demonstrates unprecedented precision and robustness in landmark detection. The augmentation of re-assigned spectrogram fusion within the 'xkl' software signifies a meticulous advancement, particularly enhancing precision related to vowel formant estimation. This augmentation catalyzes unparalleled accuracy in landmark detection, resulting in a substantial performance leap compared to conventional methods. The proposed model emerges as a state-of-the-art solution in the distinctive feature-based speech recognition systems domain. In the realm of deep learning, a synergistic integration of combined CNNs and RNNs is introduced, endowed with specialized temporal embeddings, harnessing self-attention mechanisms, and positional embeddings. The proposed model allows it to excel in capturing intricate dependencies within Italian speech vowels, rendering it highly adaptable and sophisticated in the distinctive feature domain. Furthermore, our advanced temporal modeling approach employs Bayesian temporal encoding, refining the measurement of inter-landmark intervals. Comparative analysis against state-of-the-art models reveals a substantial improvement in accuracy, highlighting the robustness and efficacy of the proposed methodology. 
When tested on the LaMIT database of speech recorded in a silent room by four native Italian speakers, the landmark detector achieves a 95% true detection rate and a 10% false detection rate. A majority of missed landmarks were observed in proximity to reduced vowels. These promising results underscore the robust identifiability of landmarks within the speech waveform, establishing the feasibility of employing a landmark detector as a front end in a speech recognition system. The synergistic integration of re-assigned spectrogram fusion, CNNs, RNNs, and Bayesian temporal encoding not only represents a significant advancement in Italian speech vowels landmark detection but also positions the proposed model as a leader in the field. The model offers distinct advantages, including unparalleled accuracy, adaptability, and sophistication, marking a milestone in the intersection of deep learning and distinctive feature-based speech recognition. This work contributes to the broader scientific community by presenting a methodologically rigorous framework for enhancing landmark detection accuracy in Italian speech vowels. The integration of cutting-edge techniques establishes a foundation for future advancements in speech signal processing, emphasizing the potential of the proposed model in practical applications across various domains requiring robust speech recognition systems.

Keywords: landmark detection, acoustic analysis, convolutional neural network, recurrent neural network

Procedia PDF Downloads 27
1207 Visual Speech Perception of Arabic Emphatics

Authors: Maha Saliba Foster

Abstract:

Speech perception has been recognized as a bi-sensory process involving the auditory and visual channels. Compared to the auditory modality, the contribution of the visual signal to speech perception is not very well understood. Studying how the visual modality affects speech recognition can have pedagogical implications in second language learning, as well as clinical application in speech therapy. The current investigation explores the potential effect of speech visual cues on the perception of Arabic emphatics (AEs). The corpus consists of 36 minimal pairs each containing two contrasting consonants, an AE versus a non-emphatic (NE). Movies of four Lebanese speakers were edited to allow perceivers to have partial view of facial regions: lips only, lips-cheeks, lips-chin, lips-cheeks-chin, lips-cheeks-chin-neck. In the absence of any auditory information and relying solely on visual speech, perceivers were above chance at correctly identifying AEs or NEs across vowel contexts; moreover, the models were able to predict the probability of perceivers’ accuracy in identifying some of the COIs produced by certain speakers; additionally, results showed an overlap between the measurements selected by the computer and those selected by human perceivers. The lack of significant face effect on the perception of AEs seems to point to the lips, present in all of the videos, as the most important and often sufficient facial feature for emphasis recognition. Future investigations will aim at refining the analyses of visual cues used by perceivers by using Principal Component Analysis and including time evolution of facial feature measurements.

Keywords: Arabic emphatics, machine learning, speech perception, visual speech perception

Procedia PDF Downloads 280
1206 Spatial Object-Oriented Template Matching Algorithm Using Normalized Cross-Correlation Criterion for Tracking Aerial Image Scene

Authors: Jigg Pelayo, Ricardo Villar

Abstract:

Following the development of aerial laser scanning in the Philippine geospatial industry, research on remote sensing and machine vision technology has become a trend. Object detection via template matching is one of its applications and is characterized as fast and real-time. This paper applies a robust pattern matching algorithm based on the normalized cross-correlation (NCC) criterion function within object-based image analysis (OBIA), using high-resolution aerial imagery and low-density LiDAR data. The height information from laser scanning provides an effective partitioning order, improving the hierarchical class feature pattern and allowing unnecessary calculations to be skipped. Since detection is executed on the object-oriented platform, mathematical morphology and multi-level filter algorithms were established to effectively avoid the influence of noise, small distortion and fluctuating image saturation that affect the rate of recognition of features. Furthermore, the scheme is evaluated to determine its performance in different situations and to inspect the computational complexity of the algorithms. Its effectiveness is demonstrated in areas of Misamis Oriental province, achieving an overall accuracy above 91%. The results also show the potential and efficiency of the implemented algorithm under different lighting conditions.
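The normalized cross-correlation criterion at the core of this abstract can be illustrated with a brute-force sketch. It covers only the NCC score itself; the object-based partitioning, LiDAR height ordering and morphological filtering described by the authors are not reproduced, and the function name `ncc_match` is hypothetical.

```python
import numpy as np

def ncc_match(image, template):
    """Slide the template over the image and return the top-left position
    and score of the window with the highest normalized cross-correlation."""
    ih, iw = image.shape
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    best_score, best_pos = -1.0, (0, 0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            w = image[r:r + th, c:c + tw]
            w = w - w.mean()
            denom = np.sqrt((w ** 2).sum()) * t_norm
            if denom == 0:
                continue  # flat window or flat template: NCC is undefined
            score = float((w * t).sum() / denom)
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos, best_score
```

Because both the window and the template are mean-centered and scaled by their norms, the score is unchanged when the template's brightness or contrast is shifted linearly, which is one reason NCC is a natural choice under varying lighting conditions. The double loop is quadratic in image size; skipping windows that a prior partitioning rules out, as the abstract suggests, is one way to reduce that cost.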

Keywords: algorithm, LiDAR, object recognition, OBIA

Procedia PDF Downloads 224
1205 Machine Learning Strategies for Data Extraction from Unstructured Documents in Financial Services

Authors: Delphine Vendryes, Dushyanth Sekhar, Baojia Tong, Matthew Theisen, Chester Curme

Abstract:

Much of the data that inform the decisions of governments, corporations and individuals are harvested from unstructured documents. Data extraction is defined here as a process that turns non-machine-readable information into a machine-readable format that can be stored, for instance, in a database. In financial services, introducing more automation in data extraction pipelines is a major challenge. Information sought by financial data consumers is often buried within vast bodies of unstructured documents, which have historically required thorough manual extraction. Automated solutions provide faster access to non-machine-readable datasets, in a context where untimely information quickly becomes irrelevant. Data quality standards cannot be compromised, so automation requires high data integrity. This multifaceted task is broken down into smaller steps: ingestion, table parsing (detection and structure recognition), text analysis (entity detection and disambiguation), schema-based record extraction, user feedback incorporation. Selected intermediary steps are phrased as machine learning problems. Solutions leveraging cutting-edge approaches from the fields of computer vision (e.g. table detection) and natural language processing (e.g. entity detection and disambiguation) are proposed.

Keywords: computer vision, entity recognition, finance, information retrieval, machine learning, natural language processing

Procedia PDF Downloads 90
1204 Automatic Reporting System for Transcriptome Indel Identification and Annotation Based on Snapshot of Next-Generation Sequencing Reads Alignment

Authors: Shuo Mu, Guangzhi Jiang, Jinsa Chen

Abstract:

Indel analysis in RNA sequencing of clinical samples is easily affected by sequencing experiment errors and software selection. In order to improve the efficiency and accuracy of the analysis, we developed an automatic reporting system for Indel recognition and annotation based on image snapshots of transcriptome read alignments. The system includes sequence local assembly and realignment, target point snapshot, and image-based recognition processes. We integrated high-confidence Indel datasets from several known databases as a training set to improve the accuracy of image processing, and added a bioinformatics processing module to annotate and filter Indel artifacts. The system then automatically generates a report that includes data quality levels and image results. Sanger sequencing verification of the reference Indel mutations of cell line NA12878 showed that the process can achieve 83% sensitivity and 96% specificity. Analysis of the collected clinical samples showed that the interpretation accuracy of the process was equivalent to that of manual inspection, and the processing efficiency showed a significant improvement. This work shows the feasibility of accurate Indel analysis of clinical next-generation sequencing (NGS) transcriptome data. The result may be useful in the future for RNA studies of clinical samples with microsatellite instability in immunotherapy.

Keywords: automatic reporting, indel, next-generation sequencing, NGS, transcriptome

Procedia PDF Downloads 156
1203 Acoustic Characteristics of Ḫijaiyaḫ Letters Pronunciation by Indonesian Native Speaker

Authors: Romi Hardiyansyah, Raden Sugeng Joko Sarwono, Agus Samsi

Abstract:

Arabic is not the mother tongue of Indonesian people. Nevertheless, they must be able to pronounce Arabic because Islam is the largest religion in Indonesia. Arabic is composed of ḫijaiyaḫ letters, each of which has its own pronunciation. The sound production process in humans can be divided into three physiological processes: the formation of airflow from the lungs, the conversion of that airflow into sound, and articulation (the modulation of the sound into a specific sound). Each ḫijaiyaḫ letter has its own articulation, and some seem strange to most people in Indonesia. Those letters are produced in the middle and upper throat, so they have their own acoustic characteristics. The acoustic characteristics of the voice can be observed with the source-filter approach, whose parameters are pitch, formant, and formant bandwidth. Pitch is the basic tone of each human voice. A formant is a resonance frequency of the human voice. Formant bandwidth is the frequency width of a formant peak. After recording the sound from 21 subjects, the data were processed with the software Praat version 5.3.39. The analysis showed that each pronunciation, syakal (vowel changer), and place of discharge of the letters has the same timbre, which is determined by the third and fourth formants.
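Of the three source-filter parameters named in this abstract, pitch is the simplest to illustrate. The sketch below estimates it from the strongest autocorrelation peak; it is not Praat's algorithm and not the authors' procedure, and the search range of 75 to 500 Hz and the function name `estimate_pitch` are assumptions made for the example.

```python
import numpy as np

def estimate_pitch(signal, fs, fmin=75.0, fmax=500.0):
    """Estimate the fundamental frequency (Hz) of a voiced frame from the
    lag of the highest autocorrelation peak within [fmin, fmax]."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]  # non-negative lags only
    lo = int(fs / fmax)   # shortest period considered, in samples
    hi = int(fs / fmin)   # longest period considered, in samples
    lag = lo + int(np.argmax(ac[lo:hi + 1]))
    return fs / lag
```

Applied to a short voiced frame, such as a sustained vowel sampled at 8 kHz, this returns one pitch value per frame. Formants and their bandwidths require a model of the vocal-tract filter (for example linear prediction) and are not covered by this sketch.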

Keywords: ḫijaiyaḫ, articulation, pitch, formant, formant bandwidth, timbre

Procedia PDF Downloads 370