Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 1712

Search results for: icon recognition

1232 An Event-Related Potential Study of Individual Differences in Word Recognition: The Evidence from Morphological Knowledge of Sino-Korean Prefixes

Authors: Jinwon Kang, Seonghak Jo, Joohee Ahn, Junghye Choi, Sun-Young Lee

Abstract:

A morphological priming has proved its importance by showing that segmentation occurs in morphemes when visual words are recognized within a noticeably short time. Regarding Sino-Korean prefixes, this study conducted an experiment on visual masked priming tasks with 57 ms stimulus-onset asynchrony (SOA) to see how individual differences in the amount of morphological knowledge affect morphological priming. The relationship between the prime and target words were classified as morphological (e.g., 미개척 migaecheog [unexplored] – 미해결 mihaegyel [unresolved]), semantical (e.g., 친환경 chinhwangyeong [eco-friendly]) – 무공해 mugonghae [no-pollution]), and orthographical (e.g., 미용실 miyongsil [beauty shop] – 미확보 mihwagbo [uncertainty]) conditions. We then compared the priming by configuring irrelevant paired stimuli for each condition’s control group. As a result, in the behavioral data, we observed facilitatory priming from a group with high morphological knowledge only under the morphological condition. In contrast, a group with low morphological knowledge showed the priming only under the orthographic condition. In the event-related potential (ERP) data, the group with high morphological knowledge presented the N250 only under the morphological condition. The findings of this study imply that individual differences in morphological knowledge in Korean may have a significant influence on the segmental processing of Korean word recognition.

Keywords: ERP, individual differences, morphological priming, sino-Korean prefixes

Procedia PDF Downloads 215

1231 Semi-Supervised Learning for Spanish Speech Recognition Using Deep Neural Networks

Authors: B. R. Campomanes-Alvarez, P. Quiros, B. Fernandez

Abstract:

Automatic Speech Recognition (ASR) is a machine-based process of decoding and transcribing oral speech. A typical ASR system receives acoustic input from a speaker or an audio file, analyzes it using algorithms, and produces an output in the form of a text. Some speech recognition systems use Hidden Markov Models (HMMs) to deal with the temporal variability of speech and Gaussian Mixture Models (GMMs) to determine how well each state of each HMM fits a short window of frames of coefficients that represents the acoustic input. Another way to evaluate the fit is to use a feed-forward neural network that takes several frames of coefficients as input and produces posterior probabilities over HMM states as output. Deep neural networks (DNNs) that have many hidden layers and are trained using new methods have been shown to outperform GMMs on a variety of speech recognition systems. Acoustic models for state-of-the-art ASR systems are usually training on massive amounts of data. However, audio files with their corresponding transcriptions can be difficult to obtain, especially in the Spanish language. Hence, in the case of these low-resource scenarios, building an ASR model is considered as a complex task due to the lack of labeled data, resulting in an under-trained system. Semi-supervised learning approaches arise as necessary tasks given the high cost of transcribing audio data. The main goal of this proposal is to develop a procedure based on acoustic semi-supervised learning for Spanish ASR systems by using DNNs. This semi-supervised learning approach consists of: (a) Training a seed ASR model with a DNN using a set of audios and their respective transcriptions. A DNN with a one-hidden-layer network was initialized; increasing the number of hidden layers in training, to a five. A refinement, which consisted of the weight matrix plus bias term and a Stochastic Gradient Descent (SGD) training were also performed. The objective function was the cross-entropy criterion. (b) Decoding/testing a set of unlabeled data with the obtained seed model. (c) Selecting a suitable subset of the validated data to retrain the seed model, thereby improving its performance on the target test set. To choose the most precise transcriptions, three confidence scores or metrics, regarding the lattice concept (based on the graph cost, the acoustic cost and a combination of both), was performed as selection technique. The performance of the ASR system will be calculated by means of the Word Error Rate (WER). The test dataset was renewed in order to extract the new transcriptions added to the training dataset. Some experiments were carried out in order to select the best ASR results. A comparison between a GMM-based model without retraining and the DNN proposed system was also made under the same conditions. Results showed that the semi-supervised ASR-model based on DNNs outperformed the GMM-model, in terms of WER, in all tested cases. The best result obtained an improvement of 6% relative WER. Hence, these promising results suggest that the proposed technique could be suitable for building ASR models in low-resource environments.

Keywords: automatic speech recognition, deep neural networks, machine learning, semi-supervised learning

Procedia PDF Downloads 339

1230 An Ontological Approach to Existentialist Theatre and Theatre of the Absurd in the Works of Jean-Paul Sartre and Samuel Beckett

Authors: Gülten Silindir Keretli

Abstract:

The aim of this study is to analyse the works of playwrights within the framework of existential philosophy. It is to observe the ontological existence in the plays of No Exit and Endgame. Literary works will be discussed separately in each section of this study. The despair of post-war generation of Europe problematized the ‘human condition’ in every field of literature which is the very product of social upheaval. With this concern in his mind, Sartre’s creative works portrayed man as a lonely being, burdened with terrifying freedom to choose and create his own meaning in an apparently meaningless world. The traces of the existential thought are to be found throughout the history of philosophy and literature. On the other hand, the theatre of the absurd is a form of drama showing the absurdity of the human condition and it is heavily influenced by the existential philosophy. Beckett is the most influential playwright of the theatre of the absurd. The themes and thoughts in his plays share many tenets of the existential philosophy. The existential philosophy posits the meaninglessness of existence and it regards man as being thrown into the universe and into desolate isolation. To overcome loneliness and isolation, the human ego needs recognition from the other people. Sartre calls this need of recognition as the need for ‘the Look’ (Le regard) from the Other. In this paper, existentialist philosophy and existentialist angst will be elaborated and then the works of existentialist theatre and theatre of absurd will be discussed within the framework of existential philosophy.

Keywords: consciousness, existentialism, the notion of the absurd, the other

Procedia PDF Downloads 158

1229 Automatic Target Recognition in SAR Images Based on Sparse Representation Technique

Authors: Ahmet Karagoz, Irfan Karagoz

Abstract:

Synthetic Aperture Radar (SAR) is a radar mechanism that can be integrated into manned and unmanned aerial vehicles to create high-resolution images in all weather conditions, regardless of day and night. In this study, SAR images of military vehicles with different azimuth and descent angles are pre-processed at the first stage. The main purpose here is to reduce the high speckle noise found in SAR images. For this, the Wiener adaptive filter, the mean filter, and the median filters are used to reduce the amount of speckle noise in the images without causing loss of data. During the image segmentation phase, pixel values are ordered so that the target vehicle region is separated from other regions containing unnecessary information. The target image is parsed with the brightest 20% pixel value of 255 and the other pixel values of 0. In addition, by using appropriate parameters of statistical region merging algorithm, segmentation comparison is performed. In the step of feature extraction, the feature vectors belonging to the vehicles are obtained by using Gabor filters with different orientation, frequency and angle values. A number of Gabor filters are created by changing the orientation, frequency and angle parameters of the Gabor filters to extract important features of the images that form the distinctive parts. Finally, images are classified by sparse representation method. In the study, l₁ norm analysis of sparse representation is used. A joint database of the feature vectors generated by the target images of military vehicle types is obtained side by side and this database is transformed into the matrix form. In order to classify the vehicles in a similar way, the test images of each vehicle is converted to the vector form and l₁ norm analysis of the sparse representation method is applied through the existing database matrix form. As a result, correct recognition has been performed by matching the target images of military vehicles with the test images by means of the sparse representation method. 97% classification success of SAR images of different military vehicle types is obtained.

Keywords: automatic target recognition, sparse representation, image classification, SAR images

Procedia PDF Downloads 366

1228 The Development of Integrated Real-Life Video and Animation with Addie Based on Constructive for Improving Students’ Mastery Concept in Rotational Dynamics

Authors: Silka Abyadati, Dadi Rusdiana, Enjang Akhmad Juanda

Abstract:

This study aims to investigate the students’ mastery concepts enhancement between students who are studying by using Integrated Real-Life Video and Animation (IRVA) and students who are studying without using IRVA. The development of IRVA is conducted by five stages: Analyze, Design, Development, Implementation and Evaluation (ADDIE) based on constructivist for Rotational Dynamics material in Physics learning. A constructivist model-based learning used is Interpretation Construction (ICON), which has the following phases: 1) Observation, 2) Construction interpretation, 3) Contextualization prior knowledge, 4) Conflict cognitive, 5) Learning cognitive, 6) Collaboration, 7) Multiple interpretation, 8) Multiple manifestation. The IRVA is developed for the stages of observation, cognitive conflict and cognitive learning. The sample of this study consisted of 32 students experimental group and a control group of 32 students in class XI of the school year 2015/2016 in one of Senior High Schools Bandung. The study was conducted by giving the pretest and posttest in the form of 20 items of multiple choice questions to determine the enhancement of mastery concept of Rotational Dynamics. Hypothesis testing is done by using T-test on the value of N-gain average of mastery concepts. The results showed that there is a significant difference in an enhancement of students’ mastery concepts between students who are studying by using IRVA and students who are studying without IRVA. Students in the experimental group increased by 0.468 while students in the control group increased by 0.207.

Keywords: ADDIE, constructivist learning, Integrated Real-Life Video and Animation, mastery concepts, rotational dynamics

Procedia PDF Downloads 232

1227 Being Your Own First Responder: A Training to Identify and Respond to Mental Health

Authors: Joe Voshall, Leigha Shoup

Abstract:

In 2022, the Ohio Peace Officer Training Council and the Attorney General required officers to complete a minimum of 24 hours of continued professional training for the year. Much of the training was based on Mental Health or similarly related topics. This includes Officer Wellness and Officer Mental Health. It is becoming clearer that the stigma of Officer / First Responder Mental Health is a topic that is becoming more prevalently faced. To assist officers and first responders in facing mental health issues, we are developing new training. This training will aid in recognizing mental health-related issues in officers/first responders and citizens, as well as further using the same information to better respond and interact with one another and the public. In general, society has many varying views of mental health, much of which is largely over-sensationalized by television, movies, and other forms of entertainment. There has also been a stigma in law enforcement / first responders related to mental health and being weak as a result of on-the-job-related trauma-induced struggles. It is our hope this new training will assist officers and first responders in not only positively facing and addressing their mental health but using their own experience and education to recognize signs and symptoms of mental health within individuals in the community. Further, we hope that through this recognition, officers and first responders can use their experiences and more in-depth understanding to better interact within the field and with the public. Through recognition and better understanding of mental health issues and more positive interaction with the public, additional achievements are likely to result. This includes in the removal of bias and stigma for everyone.

Keywords: law enforcement, mental health, officer related mental health, trauma

Procedia PDF Downloads 164

1226 Mirrors and Lenses: Multiple Views on Recognition in Holocaust Literature

Authors: Kirsten A. Bartels

Abstract:

There are a number of similarities between survivor literature and Holocaust fiction for children and young adults. The paper explores three facets of the parallels of recognition found specifically between Livia Bitton-Jackson’s memoir of her experience during the Holocaust as an inmate in Auschwitz, I Have Lived a Thousand Years (1999) and Morris Glietzman series of Holocaust fiction. While Bitton-Jackson reflects on her past and Glietzman designs a fictive character, both are judicious with what they are willing to impart, only providing information about their appearance or themselves when it impacts others or when it serves a necessary purpose to the story. Another similarity lies in another critical aspect of many works of Holocaust literature – the idea of being ‘representatively Jewish’. The authors come to this idea from different angles, perhaps best explained as the difference between showing and telling, for Bitton-Jackson provides personal details, and Gleitzman constructed Felix arguably with this idea in mind. Interwoven through their journeys is a shift in perspectives on being recognized -- from wanting to be seen as individuals to being seen as Jew. With this, being Jewish takes on different meaning, both youths struggle with being labeled as something they do not truly understand, and may have not truly identified with, from a label, to a death warrant. With survivor literature viewed as the most credible and worthwhile type of Holocaust literature and Holocaust fiction is often seen as the least (with children’s and young-adult being the lowest form) the similarities in approaches to telling the stories may go overlooked or be undervalued. This paper serves as an exploration in the some of parallel messages shared between the two.

Keywords: holocaust fiction, Holocaust literature, representatively Jewish, survivor literature

Procedia PDF Downloads 169

1225 Correlation between Speech Emotion Recognition Deep Learning Models and Noises

Authors: Leah Lee

Abstract:

This paper examines the correlation between deep learning models and emotions with noises to see whether or not noises mask emotions. The deep learning models used are plain convolutional neural networks (CNN), auto-encoder, long short-term memory (LSTM), and Visual Geometry Group-16 (VGG-16). Emotion datasets used are Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), Crowd-sourced Emotional Multimodal Actors Dataset (CREMA-D), Toronto Emotional Speech Set (TESS), and Surrey Audio-Visual Expressed Emotion (SAVEE). To make it four times bigger, audio set files, stretch, and pitch augmentations are utilized. From the augmented datasets, five different features are extracted for inputs of the models. There are eight different emotions to be classified. Noise variations are white noise, dog barking, and cough sounds. The variation in the signal-to-noise ratio (SNR) is 0, 20, and 40. In summation, per a deep learning model, nine different sets with noise and SNR variations and just augmented audio files without any noises will be used in the experiment. To compare the results of the deep learning models, the accuracy and receiver operating characteristic (ROC) are checked.

Keywords: auto-encoder, convolutional neural networks, long short-term memory, speech emotion recognition, visual geometry group-16

Procedia PDF Downloads 75

1224 Using Deep Learning Real-Time Object Detection Convolution Neural Networks for Fast Fruit Recognition in the Tree

Authors: K. Bresilla, L. Manfrini, B. Morandi, A. Boini, G. Perulli, L. C. Grappadelli

Abstract:

Image/video processing for fruit in the tree using hard-coded feature extraction algorithms have shown high accuracy during recent years. While accurate, these approaches even with high-end hardware are computationally intensive and too slow for real-time systems. This paper details the use of deep convolution neural networks (CNNs), specifically an algorithm (YOLO - You Only Look Once) with 24+2 convolution layers. Using deep-learning techniques eliminated the need for hard-code specific features for specific fruit shapes, color and/or other attributes. This CNN is trained on more than 5000 images of apple and pear fruits on 960 cores GPU (Graphical Processing Unit). Testing set showed an accuracy of 90%. After this, trained data were transferred to an embedded device (Raspberry Pi gen.3) with camera for more portability. Based on correlation between number of visible fruits or detected fruits on one frame and the real number of fruits on one tree, a model was created to accommodate this error rate. Speed of processing and detection of the whole platform was higher than 40 frames per second. This speed is fast enough for any grasping/harvesting robotic arm or other real-time applications.

Keywords: artificial intelligence, computer vision, deep learning, fruit recognition, harvesting robot, precision agriculture

Procedia PDF Downloads 420

1223 Effective Nutrition Label Use on Smartphones

Authors: Vladimir Kulyukin, Tanwir Zaman, Sarat Kiran Andhavarapu

Abstract:

Research on nutrition label use identifies four factors that impede comprehension and retention of nutrition information by consumers: label’s location on the package, presentation of information within the label, label’s surface size, and surrounding visual clutter. In this paper, a system is presented that makes nutrition label use more effective for nutrition information comprehension and retention. The system’s front end is a smartphone application. The system’s back end is a four node Linux cluster for image recognition and data storage. Image frames captured on the smartphone are sent to the back end for skewed or aligned barcode recognition. When barcodes are recognized, corresponding nutrition labels are retrieved from a cloud database and presented to the user on the smartphone’s touchscreen. Each displayed nutrition label is positioned centrally on the touchscreen with no surrounding visual clutter. Wikipedia links to important nutrition terms are embedded to improve comprehension and retention of nutrition information. Standard touch gestures (e.g., zoom in/out) available on mainstream smartphones are used to manipulate the label’s surface size. The nutrition label database currently includes 200,000 nutrition labels compiled from public web sites by a custom crawler. Stress test experiments with the node cluster are presented. Implications for proactive nutrition management and food policy are discussed.

Keywords: mobile computing, cloud computing, nutrition label use, nutrition management, barcode scanning

Procedia PDF Downloads 373

1222 Efficient Residual Road Condition Segmentation Network Based on Reconstructed Images

Authors: Xiang Shijie, Zhou Dong, Tian Dan

Abstract:

This paper focuses on the application of real-time semantic segmentation technology in complex road condition recognition, aiming to address the critical issue of how to improve segmentation accuracy while ensuring real-time performance. Semantic segmentation technology has broad application prospects in fields such as autonomous vehicle navigation and remote sensing image recognition. However, current real-time semantic segmentation networks face significant technical challenges and optimization gaps in balancing speed and accuracy. To tackle this problem, this paper conducts an in-depth study and proposes an innovative Guided Image Reconstruction Module. By resampling high-resolution images into a set of low-resolution images, this module effectively reduces computational complexity, allowing the network to more efficiently extract features within limited resources, thereby improving the performance of real-time segmentation tasks. In addition, a dual-branch network structure is designed in this paper to fully leverage the advantages of different feature layers. A novel Hybrid Attention Mechanism is also introduced, which can dynamically capture multi-scale contextual information and effectively enhance the focus on important features, thus improving the segmentation accuracy of the network in complex road condition. Compared with traditional methods, the proposed model achieves a better balance between accuracy and real-time performance and demonstrates competitive results in road condition segmentation tasks, showcasing its superiority. Experimental results show that this method not only significantly improves segmentation accuracy while maintaining real-time performance, but also remains stable across diverse and complex road conditions, making it highly applicable in practical scenarios. By incorporating the Guided Image Reconstruction Module, dual-branch structure, and Hybrid Attention Mechanism, this paper presents a novel approach to real-time semantic segmentation tasks, which is expected to further advance the development of this field.

Keywords: hybrid attention mechanism, image reconstruction, real-time, road status recognition

Procedia PDF Downloads 24

1221 Re-identification Risk and Mitigation in Federated Learning: Human Activity Recognition Use Case

Authors: Besma Khalfoun

Abstract:

In many current Human Activity Recognition (HAR) applications, users' data is frequently shared and centrally stored by third parties, posing a significant privacy risk. This practice makes these entities attractive targets for extracting sensitive information about users, including their identity, health status, and location, thereby directly violating users' privacy. To tackle the issue of centralized data storage, a relatively recent paradigm known as federated learning has emerged. In this approach, users' raw data remains on their smartphones, where they train the HAR model locally. However, users still share updates of their local models originating from raw data. These updates are vulnerable to several attacks designed to extract sensitive information, such as determining whether a data sample is used in the training process, recovering the training data with inversion attacks, or inferring a specific attribute or property from the training data. In this paper, we first introduce PUR-Attack, a parameter-based user re-identification attack developed for HAR applications within a federated learning setting. It involves associating anonymous model updates (i.e., local models' weights or parameters) with the originating user's identity using background knowledge. PUR-Attack relies on a simple yet effective machine learning classifier and produces promising results. Specifically, we have found that by considering the weights of a given layer in a HAR model, we can uniquely re-identify users with an attack success rate of almost 100%. This result holds when considering a small attack training set and various data splitting strategies in the HAR model training. Thus, it is crucial to investigate protection methods to mitigate this privacy threat. Along this path, we propose SAFER, a privacy-preserving mechanism based on adaptive local differential privacy. Before sharing the model updates with the FL server, SAFER adds the optimal noise based on the re-identification risk assessment. Our approach can achieve a promising tradeoff between privacy, in terms of reducing re-identification risk, and utility, in terms of maintaining acceptable accuracy for the HAR model.

Keywords: federated learning, privacy risk assessment, re-identification risk, privacy preserving mechanisms, local differential privacy, human activity recognition

Procedia PDF Downloads 11

1220 Performance Assessment of Multi-Level Ensemble for Multi-Class Problems

Authors: Rodolfo Lorbieski, Silvia Modesto Nassar

Abstract:

Many supervised machine learning tasks require decision making across numerous different classes. Multi-class classification has several applications, such as face recognition, text recognition and medical diagnostics. The objective of this article is to analyze an adapted method of Stacking in multi-class problems, which combines ensembles within the ensemble itself. For this purpose, a training similar to Stacking was used, but with three levels, where the final decision-maker (level 2) performs its training by combining outputs from the tree-based pair of meta-classifiers (level 1) from Bayesian families. These are in turn trained by pairs of base classifiers (level 0) of the same family. This strategy seeks to promote diversity among the ensembles forming the meta-classifier level 2. Three performance measures were used: (1) accuracy, (2) area under the ROC curve, and (3) time for three factors: (a) datasets, (b) experiments and (c) levels. To compare the factors, ANOVA three-way test was executed for each performance measure, considering 5 datasets by 25 experiments by 3 levels. A triple interaction between factors was observed only in time. The accuracy and area under the ROC curve presented similar results, showing a double interaction between level and experiment, as well as for the dataset factor. It was concluded that level 2 had an average performance above the other levels and that the proposed method is especially efficient for multi-class problems when compared to binary problems.

Keywords: stacking, multi-layers, ensemble, multi-class

Procedia PDF Downloads 269

1219 Entrepreneurial Leadership in Malaysian Public University: Competency and Behavior in the Face of Institutional Adversity

Authors: Noorlizawati Abd Rahim, Zainai Mohamed, Zaidatun Tasir, Astuty Amrin, Haliyana Khalid, Nina Diana Nawi

Abstract:

Entrepreneurial leaders have been sought as in-demand talents to lead profit-driven organizations during turbulent and unprecedented times. However, research regarding the pertinence of their roles in the public sector has been limited. This paper examined the characteristics of the challenging experiences encountered by senior leaders in public universities that require them to embrace entrepreneurialism in their leadership. Through a focus group interview with five Malaysian university top senior leaders with experience being Vice-Chancellor, we explored and developed a framework of institutional adversity characteristics and exemplary entrepreneurial leadership competency in the face of adversity. Complexity of diverse stakeholders, multiplicity of academic disciplines, unfamiliarity to lead different and broader roles, leading new directions, and creating change in high velocity and uncertain environment are among the dimensions that characterise institutional adversities. Our findings revealed that learning agility, opportunity recognition capacity, and bridging capability are among the characteristics of entrepreneurial university leaders. The findings reinforced that the presence of specific attributes in institutional adversity and experiences in overcoming those challenges may contribute to the development of entrepreneurial leadership capabilities.

Keywords: bridging capability, entrepreneurial leadership, leadership development, learning agility, opportunity recognition, university leaders

Procedia PDF Downloads 110

1218 Optimization of the Dental Direct Digital Imaging by Applying the Self-Recognition Technology

Authors: Mina Dabirinezhad, Mohsen Bayat Pour, Amin Dabirinejad

Abstract:

This paper is intended to introduce the technology to solve some of the deficiencies of the direct digital radiology. Nowadays, digital radiology is the latest progression in dental imaging, which has become an essential part of dentistry. There are two main parts of the direct digital radiology comprised of an intraoral X-ray machine and a sensor (digital image receptor). The dentists and the dental nurses experience afflictions during the taking image process by the direct digital X-ray machine. For instance, sometimes they need to readjust the sensor in the mouth of the patient to take the X-ray image again due to the low quality of that. Another problem is, the position of the sensor may move in the mouth of the patient and it triggers off an inappropriate image for the dentists. It means that it is a time-consuming process for dentists or dental nurses. On the other hand, taking several the X-ray images brings some problems for the patient such as being harmful to their health and feeling pain in their mouth due to the pressure of the sensor to the jaw. The author provides a technology to solve the above-mentioned issues that is called “Self-Recognition Direct Digital Radiology” (SDDR). This technology is based on the principle that the intraoral X-ray machine is capable to diagnose the location of the sensor in the mouth of the patient automatically. In addition, to solve the aforementioned problems, SDDR technology brings out fewer environmental impacts in comparison to the previous version.

Keywords: Dental direct digital imaging, digital image receptor, digital x-ray machine, and environmental impacts

Procedia PDF Downloads 138

1217 Development of a New Characterization Method to Analyse Cypermethrin Penetration in Wood Material by Immunolabelling

Authors: Sandra Tapin-Lingua, Katia Ruel, Jean-Paul Joseleau, Daouia Messaoudi, Olivier Fahy, Michel Petit-Conil

Abstract:

The preservative efficacy of organic biocides is strongly related to their capacity of penetration and retention within wood tissues. The specific detection of the pyrethroid insecticide is currently obtained after extraction followed by chemical analysis by chromatography techniques. However visualizing the insecticide molecule within the wood structure requires specific probes together with microscopy techniques. Therefore, the aim of the present work was to apply a new methodology based on antibody-antigen recognition and electronic microscopy to visualize directly pyrethroids in the wood material. A polyclonal antibody directed against cypermethrin was developed and implement it on Pinus sylvestris wood samples coated with technical cypermethrin. The antibody was tested on impregnated wood and the specific recognition of the insecticide was visualized in transmission electron microscopy (TEM). The immunogold-TEM assay evidenced the capacity of the synthetic biocide to penetrate in the wood. The depth of penetration was measured on sections taken at increasing distances from the coated surface of the wood. Such results correlated with chemical analyzes carried out by GC-ECD after extraction. In addition, the immuno-TEM investigation allowed visualizing, for the first time at the ultrastructure scale of resolution, that cypermethrin was able to diffuse within the secondary wood cell walls.

Keywords: cypermethrin, insecticide, wood penetration, wood retention, immuno-transmission electron microscopy, polyclonal antibody

Procedia PDF Downloads 413

1216 Machine Learning and Deep Learning Approach for People Recognition and Tracking in Crowd for Safety Monitoring

Authors: A. Degale Desta, Cheng Jian

Abstract:

Deep learning application in computer vision is rapidly advancing, giving it the ability to monitor the public and quickly identify potentially anomalous behaviour from crowd scenes. Therefore, the purpose of the current work is to improve the performance of safety of people in crowd events from panic behaviour through introducing the innovative idea of Aggregation of Ensembles (AOE), which makes use of the pre-trained ConvNets and a pool of classifiers to find anomalies in video data with packed scenes. According to the theory of algorithms that applied K-means, KNN, CNN, SVD, and Faster-CNN, YOLOv5 architectures learn different levels of semantic representation from crowd videos; the proposed approach leverages an ensemble of various fine-tuned convolutional neural networks (CNN), allowing for the extraction of enriched feature sets. In addition to the above algorithms, a long short-term memory neural network to forecast future feature values and a handmade feature that takes into consideration the peculiarities of the crowd to understand human behavior. On well-known datasets of panic situations, experiments are run to assess the effectiveness and precision of the suggested method. Results reveal that, compared to state-of-the-art methodologies, the system produces better and more promising results in terms of accuracy and processing speed.

Keywords: action recognition, computer vision, crowd detecting and tracking, deep learning

Procedia PDF Downloads 161

1215 The Visible Third: Female Artists’ Participation in the Portuguese Contemporary Art World

Authors: Sonia Bernardo Correia

Abstract:

This paper is part of ongoing research that aims to understand the role of gender in the composition of the Portuguese contemporary art world and the possibilities and limits to the success of the professional paths of women and men artists. The field of visual arts is gender-sensitive as it differentiates the positions occupied by artists in terms of visibility and recognition. Women artists occupy a peripheral space, which may hinder the progression of their professional careers. Based on the collection of data on the participation of artists in Portuguese exhibitions, art fairs, auctions, and art awards between 2012 and 2019, the goal of this study is to portray female artists’ participation as a condition of professional, social, and cultural visibility. From the analysis of a significant sample of institutions from the artistic field, it was possible to observe that the works of female authors are under exhibited, never exceeding one-third of the total of exhibitions. Male artists also enjoy a comfortable majority as gallery artists (around 70%) and as part of institutional collections (around 80%). However, when analysing the younger age cohorts of artists by gender, it appears that there is representation parity, which may be a good sign of change. The data shows that there are persistent gender inequalities in accessing the artist profession. Women are not yet occupying positions of exposure, recognition, and legitimation in the market similar to those of their male counterparts, suggesting that they may face greater obstacles in experiencing successful professional trajectories.

Keywords: inequalities, invisibility of the woman artist, gender, visual arts

Procedia PDF Downloads 136

1214 Economic Important of Manta Ray Watching Tourism in Dampier Strait, Raja Ampat, West Papua, Indonesia

Authors: Maulita Sari Hani, Abraham B. Sianipar, Jamaluddin Jompa, Natsir Nessa, Alan T. White

Abstract:

Manta ray is an icon for tourism in Raja Ampat. The tourist volume has been increased for the past ten years which up to approximately 23,000 tourists in 2017. Since 2013, Conservation International Indonesia deployed satellite and acoustic tags on manta ray in Dampier strait to track the species and identify the aggregation areas. These findings encourage the government and the local community to boost conservation through the management of marine protected areas for tourism purposes. Community in Dampier strait including the village of Arborek, Kurkapa, Kapisawar, and Sawingray involved in variety of small scale tourism business including homestay, dive shop, tour operator, and crafts. Working groups of related local businesses were established to support the local community and to ensure the sustainability of the economic viability and environmental sustainability. In order to analyze the economic benefits of manta ray tourism, this study was conducted to identify the number of local business in Dampier Strait and the economic impacts in terms of local finance security, social, humanity, individual, and physical assets. The results of this study identify 30 homestays, 2 dive shops, 10 tour operators, 30 women involved in crafts, and about 50 villagers worked for dive resorts. In addition to community assets, we confirmed the welfare of community has been improved in terms of food security, households, education for children, savings, and health insurance.

Keywords: marine wildlife tourism, elasmobranch, conservation, ecotourism, co-management, economic viability, environmental sustainability

Procedia PDF Downloads 217

1213 Sign Language Recognition of Static Gestures Using Kinect™ and Convolutional Neural Networks

Authors: Rohit Semwal, Shivam Arora, Saurav, Sangita Roy

Abstract:

This work proposes a supervised framework with deep convolutional neural networks (CNNs) for vision-based sign language recognition of static gestures. Our approach addresses the acquisition and segmentation of correct inputs for the CNN-based classifier. Microsoft Kinect™ sensor, despite complex environmental conditions, can track hands efficiently. Skin Colour based segmentation is applied on cropped images of hands in different poses, used to depict different sign language gestures. The segmented hand images are used as an input for our classifier. The CNN classifier proposed in the paper is able to classify the input images with a high degree of accuracy. The system was trained and tested on 39 static sign language gestures, including 26 letters of the alphabet and 13 commonly used words. This paper includes a problem definition for building the proposed system, which acts as a sign language translator between deaf/mute and the rest of the society. It is then followed by a focus on reviewing existing knowledge in the area and work done by other researchers. It also describes the working principles behind different components of CNNs in brief. The architecture and system design specifications of the proposed system are discussed in the subsequent sections of the paper to give the reader a clear picture of the system in terms of the capability required. The design then gives the top-level details of how the proposed system meets the requirements.

Keywords: sign language, CNN, HCI, segmentation

Procedia PDF Downloads 157

1212 Speech Enhancement Using Wavelet Coefficients Masking with Local Binary Patterns

Authors: Christian Arcos, Marley Vellasco, Abraham Alcaim

Abstract:

In this paper, we present a wavelet coefficients masking based on Local Binary Patterns (WLBP) approach to enhance the temporal spectra of the wavelet coefficients for speech enhancement. This technique exploits the wavelet denoising scheme, which splits the degraded speech into pyramidal subband components and extracts frequency information without losing temporal information. Speech enhancement in each high-frequency subband is performed by binary labels through the local binary pattern masking that encodes the ratio between the original value of each coefficient and the values of the neighbour coefficients. This approach enhances the high-frequency spectra of the wavelet transform instead of eliminating them through a threshold. A comparative analysis is carried out with conventional speech enhancement algorithms, demonstrating that the proposed technique achieves significant improvements in terms of PESQ, an international recommendation of objective measure for estimating subjective speech quality. Informal listening tests also show that the proposed method in an acoustic context improves the quality of speech, avoiding the annoying musical noise present in other speech enhancement techniques. Experimental results obtained with a DNN based speech recognizer in noisy environments corroborate the superiority of the proposed scheme in the robust speech recognition scenario.

Keywords: binary labels, local binary patterns, mask, wavelet coefficients, speech enhancement, speech recognition

Procedia PDF Downloads 229

1211 Real-Time Gesture Recognition System Using Microsoft Kinect

Authors: Ankita Wadhawan, Parteek Kumar, Umesh Kumar

Abstract:

Gesture is any body movement that expresses some attitude or any sentiment. Gestures as a sign language are used by deaf people for conveying messages which helps in eliminating the communication barrier between deaf people and normal persons. Nowadays, everybody is using mobile phone and computer as a very important gadget in their life. But there are some physically challenged people who are blind/deaf and the use of mobile phone or computer like device is very difficult for them. So, there is an immense need of a system which works on body gesture or sign language as input. In this research, Microsoft Kinect Sensor, SDK V2 and Hidden Markov Toolkit (HTK) are used to recognize the object, motion of object and human body joints through Touch less NUI (Natural User Interface) in real-time. The depth data collected from Microsoft Kinect has been used to recognize gestures of Indian Sign Language (ISL). The recorded clips are analyzed using depth, IR and skeletal data at different angles and positions. The proposed system has an average accuracy of 85%. The developed Touch less NUI provides an interface to recognize gestures and controls the cursor and click operation in computer just by waving hand gesture. This research will help deaf people to make use of mobile phones, computers and socialize among other persons in the society.

Keywords: gesture recognition, Indian sign language, Microsoft Kinect, natural user interface, sign language

Procedia PDF Downloads 306

1210 Impact of Integrated Signals for Doing Human Activity Recognition Using Deep Learning Models

Authors: Milagros Jaén-Vargas, Javier García Martínez, Karla Miriam Reyes Leiva, María Fernanda Trujillo-Guerrero, Francisco Fernandes, Sérgio Barroso Gonçalves, Miguel Tavares Silva, Daniel Simões Lopes, José Javier Serrano Olmedo

Abstract:

Human Activity Recognition (HAR) is having a growing impact in creating new applications and is responsible for emerging new technologies. Also, the use of wearable sensors is an important key to exploring the human body's behavior when performing activities. Hence, the use of these dispositive is less invasive and the person is more comfortable. In this study, a database that includes three activities is used. The activities were acquired from inertial measurement unit sensors (IMU) and motion capture systems (MOCAP). The main objective is differentiating the performance from four Deep Learning (DL) models: Deep Neural Network (DNN), Convolutional Neural Network (CNN), Recurrent Neural Network (RNN) and hybrid model Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM), when considering acceleration, velocity and position and evaluate if integrating the IMU acceleration to obtain velocity and position represent an increment in performance when it works as input to the DL models. Moreover, compared with the same type of data provided by the MOCAP system. Despite the acceleration data is cleaned when integrating, results show a minimal increase in accuracy for the integrated signals.

Keywords: HAR, IMU, MOCAP, acceleration, velocity, position, feature maps

Procedia PDF Downloads 98

1209 Correlation between Defect Suppression and Biosensing Capability of Hydrothermally Grown ZnO Nanorods

Authors: Mayoorika Shukla, Pramila Jakhar, Tejendra Dixit, I. A. Palani, Vipul Singh

Abstract:

Biosensors are analytical devices with wide range of applications in biological, chemical, environmental and clinical analysis. It comprises of bio-recognition layer which has biomolecules (enzymes, antibodies, DNA, etc.) immobilized over it for detection of analyte and transducer which converts the biological signal into the electrical signal. The performance of biosensor primarily the depends on the bio-recognition layer and therefore it has to be chosen wisely. In this regard, nanostructures of metal oxides such as ZnO, SnO2, V2O5, and TiO2, etc. have been explored extensively as bio-recognition layer. Recently, ZnO has the attracted attention of researchers due to its unique properties like high iso-electric point, biocompatibility, stability, high electron mobility and high electron binding energy, etc. Although there have been many reports on usage of ZnO as bio-recognition layer but to the authors’ knowledge, none has ever observed correlation between optical properties like defect suppression and biosensing capability of the sensor. Here, ZnO nanorods (ZNR) have been synthesized by a low cost, simple and low-temperature hydrothermal growth process, over Platinum (Pt) coated glass substrate. The ZNR have been synthesized in two steps viz. initially a seed layer was coated over substrate (Pt coated glass) followed by immersion of it into nutrient solution of Zinc nitrate and Hexamethylenetetramine (HMTA) with in situ addition of KMnO4. The addition of KMnO4 was observed to have a profound effect over the growth rate anisotropy of ZnO nanostructures. Clustered and powdery growth of ZnO was observed without addition of KMnO4, although by addition of it during the growth, uniform and crystalline ZNR were found to be grown over the substrate. Moreover, the same has resulted in suppression of defects as observed by Normalized Photoluminescence (PL) spectra since KMnO4 is a strong oxidizing agent which provides an oxygen rich growth environment. Further, to explore the correlation between defect suppression and biosensing capability of the ZNR Glucose oxidase (Gox) was immobilized over it, using physical adsorption technique followed by drop casting of nafion. Here the main objective of the work was to analyze effect of defect suppression over biosensing capability, and therefore Gox has been chosen as model enzyme, and electrochemical amperometric glucose detection was performed. The incorporation of KMnO4 during growth has resulted in variation of optical and charge transfer properties of ZNR which in turn were observed to have deep impact on biosensor figure of merits. The sensitivity of biosensor was found to increase by 12-18 times, due to variations introduced by addition of KMnO4 during growth. The amperometric detection of glucose in continuously stirred buffer solution was performed. Interestingly, defect suppression has been observed to contribute towards the improvement of biosensor performance. The detailed mechanism of growth of ZNR along with the overall influence of defect suppression on the sensing capabilities of the resulting enzymatic electrochemical biosensor and different figure of merits of the biosensor (Glass/Pt/ZNR/Gox/Nafion) will be discussed during the conference.

Keywords: biosensors, defects, KMnO4, ZnO nanorods

Procedia PDF Downloads 282

1208 Highly Accurate Target Motion Compensation Using Entropy Function Minimization

Authors: Amin Aghatabar Roodbary, Mohammad Hassan Bastani

Abstract:

One of the defects of stepped frequency radar systems is their sensitivity to target motion. In such systems, target motion causes range cell shift, false peaks, Signal to Noise Ratio (SNR) reduction and range profile spreading because of power spectrum interference of each range cell in adjacent range cells which induces distortion in High Resolution Range Profile (HRRP) and disrupt target recognition process. Thus Target Motion Parameters (TMPs) effects compensation should be employed. In this paper, such a method for estimating TMPs (velocity and acceleration) and consequently eliminating or suppressing the unwanted effects on HRRP based on entropy minimization has been proposed. This method is carried out in two major steps: in the first step, a discrete search method has been utilized over the whole acceleration-velocity lattice network, in a specific interval seeking to find a less-accurate minimum point of the entropy function. Then in the second step, a 1-D search over velocity is done in locus of the minimum for several constant acceleration lines, in order to enhance the accuracy of the minimum point found in the first step. The provided simulation results demonstrate the effectiveness of the proposed method.

Keywords: automatic target recognition (ATR), high resolution range profile (HRRP), motion compensation, stepped frequency waveform technique (SFW), target motion parameters (TMPs)

Procedia PDF Downloads 152

1207 Preprocessing and Fusion of Multiple Representation of Finger Vein patterns using Conventional and Machine Learning techniques

Authors: Tomas Trainys, Algimantas Venckauskas

Abstract:

Application of biometric features to the cryptography for human identification and authentication is widely studied and promising area of the development of high-reliability cryptosystems. Biometric cryptosystems typically are designed for patterns recognition, which allows biometric data acquisition from an individual, extracts feature sets, compares the feature set against the set stored in the vault and gives a result of the comparison. Preprocessing and fusion of biometric data are the most important phases in generating a feature vector for key generation or authentication. Fusion of biometric features is critical for achieving a higher level of security and prevents from possible spoofing attacks. The paper focuses on the tasks of initial processing and fusion of multiple representations of finger vein modality patterns. These tasks are solved by applying conventional image preprocessing methods and machine learning techniques, Convolutional Neural Network (SVM) method for image segmentation and feature extraction. An article presents a method for generating sets of biometric features from a finger vein network using several instances of the same modality. Extracted features sets were fused at the feature level. The proposed method was tested and compared with the performance and accuracy results of other authors.

Keywords: bio-cryptography, biometrics, cryptographic key generation, data fusion, information security, SVM, pattern recognition, finger vein method.

Procedia PDF Downloads 150

1206 Human Gesture Recognition for Real-Time Control of Humanoid Robot

Authors: S. Aswath, Chinmaya Krishna Tilak, Amal Suresh, Ganesh Udupa

Abstract:

There are technologies to control a humanoid robot in many ways. But the use of Electromyogram (EMG) electrodes has its own importance in setting up the control system. The EMG based control system helps to control robotic devices with more fidelity and precision. In this paper, development of an electromyogram based interface for human gesture recognition for the control of a humanoid robot is presented. To recognize control signs in the gestures, a single channel EMG sensor is positioned on the muscles of the human body. Instead of using a remote control unit, the humanoid robot is controlled by various gestures performed by the human. The EMG electrodes attached to the muscles generates an analog signal due to the effect of nerve impulses generated on moving muscles of the human being. The analog signals taken up from the muscles are supplied to a differential muscle sensor that processes the given signal to generate a signal suitable for the microcontroller to get the control over a humanoid robot. The signal from the differential muscle sensor is converted to a digital form using the ADC of the microcontroller and outputs its decision to the CM-530 humanoid robot controller through a Zigbee wireless interface. The output decision of the CM-530 processor is sent to a motor driver in order to control the servo motors in required direction for human like actions. This method for gaining control of a humanoid robot could be used for performing actions with more accuracy and ease. In addition, a study has been conducted to investigate the controllability and ease of use of the interface and the employed gestures.

Keywords: electromyogram, gesture, muscle sensor, humanoid robot, microcontroller, Zigbee

Procedia PDF Downloads 407

1205 Italian Speech Vowels Landmark Detection through the Legacy Tool 'xkl' with Integration of Combined CNNs and RNNs

Authors: Kaleem Kashif, Tayyaba Anam, Yizhi Wu

Abstract:

This paper introduces a methodology for advancing Italian speech vowels landmark detection within the distinctive feature-based speech recognition domain. Leveraging the legacy tool 'xkl' by integrating combined convolutional neural networks (CNNs) and recurrent neural networks (RNNs), the study presents a comprehensive enhancement to the 'xkl' legacy software. This integration incorporates re-assigned spectrogram methodologies, enabling meticulous acoustic analysis. Simultaneously, our proposed model, integrating combined CNNs and RNNs, demonstrates unprecedented precision and robustness in landmark detection. The augmentation of re-assigned spectrogram fusion within the 'xkl' software signifies a meticulous advancement, particularly enhancing precision related to vowel formant estimation. This augmentation catalyzes unparalleled accuracy in landmark detection, resulting in a substantial performance leap compared to conventional methods. The proposed model emerges as a state-of-the-art solution in the distinctive feature-based speech recognition systems domain. In the realm of deep learning, a synergistic integration of combined CNNs and RNNs is introduced, endowed with specialized temporal embeddings, harnessing self-attention mechanisms, and positional embeddings. The proposed model allows it to excel in capturing intricate dependencies within Italian speech vowels, rendering it highly adaptable and sophisticated in the distinctive feature domain. Furthermore, our advanced temporal modeling approach employs Bayesian temporal encoding, refining the measurement of inter-landmark intervals. Comparative analysis against state-of-the-art models reveals a substantial improvement in accuracy, highlighting the robustness and efficacy of the proposed methodology. Upon rigorous testing on a database (LaMIT) speech recorded in a silent room by four Italian native speakers, the landmark detector demonstrates exceptional performance, achieving a 95% true detection rate and a 10% false detection rate. A majority of missed landmarks were observed in proximity to reduced vowels. These promising results underscore the robust identifiability of landmarks within the speech waveform, establishing the feasibility of employing a landmark detector as a front end in a speech recognition system. The synergistic integration of re-assigned spectrogram fusion, CNNs, RNNs, and Bayesian temporal encoding not only signifies a significant advancement in Italian speech vowels landmark detection but also positions the proposed model as a leader in the field. The model offers distinct advantages, including unparalleled accuracy, adaptability, and sophistication, marking a milestone in the intersection of deep learning and distinctive feature-based speech recognition. This work contributes to the broader scientific community by presenting a methodologically rigorous framework for enhancing landmark detection accuracy in Italian speech vowels. The integration of cutting-edge techniques establishes a foundation for future advancements in speech signal processing, emphasizing the potential of the proposed model in practical applications across various domains requiring robust speech recognition systems.

Keywords: landmark detection, acoustic analysis, convolutional neural network, recurrent neural network

Procedia PDF Downloads 63

1204 Visual Speech Perception of Arabic Emphatics

Authors: Maha Saliba Foster

Abstract:

Speech perception has been recognized as a bi-sensory process involving the auditory and visual channels. Compared to the auditory modality, the contribution of the visual signal to speech perception is not very well understood. Studying how the visual modality affects speech recognition can have pedagogical implications in second language learning, as well as clinical application in speech therapy. The current investigation explores the potential effect of speech visual cues on the perception of Arabic emphatics (AEs). The corpus consists of 36 minimal pairs each containing two contrasting consonants, an AE versus a non-emphatic (NE). Movies of four Lebanese speakers were edited to allow perceivers to have partial view of facial regions: lips only, lips-cheeks, lips-chin, lips-cheeks-chin, lips-cheeks-chin-neck. In the absence of any auditory information and relying solely on visual speech, perceivers were above chance at correctly identifying AEs or NEs across vowel contexts; moreover, the models were able to predict the probability of perceivers’ accuracy in identifying some of the COIs produced by certain speakers; additionally, results showed an overlap between the measurements selected by the computer and those selected by human perceivers. The lack of significant face effect on the perception of AEs seems to point to the lips, present in all of the videos, as the most important and often sufficient facial feature for emphasis recognition. Future investigations will aim at refining the analyses of visual cues used by perceivers by using Principal Component Analysis and including time evolution of facial feature measurements.

Keywords: Arabic emphatics, machine learning, speech perception, visual speech perception

Procedia PDF Downloads 306

1203 Spatial Object-Oriented Template Matching Algorithm Using Normalized Cross-Correlation Criterion for Tracking Aerial Image Scene

Authors: Jigg Pelayo, Ricardo Villar

Abstract:

Leaning on the development of aerial laser scanning in the Philippine geospatial industry, researches about remote sensing and machine vision technology became a trend. Object detection via template matching is one of its application which characterized to be fast and in real time. The paper purposely attempts to provide application for robust pattern matching algorithm based on the normalized cross correlation (NCC) criterion function subjected in Object-based image analysis (OBIA) utilizing high-resolution aerial imagery and low density LiDAR data. The height information from laser scanning provides effective partitioning order, thus improving the hierarchal class feature pattern which allows to skip unnecessary calculation. Since detection is executed in the object-oriented platform, mathematical morphology and multi-level filter algorithms were established to effectively avoid the influence of noise, small distortion and fluctuating image saturation that affect the rate of recognition of features. Furthermore, the scheme is evaluated to recognized the performance in different situations and inspect the computational complexities of the algorithms. Its effectiveness is demonstrated in areas of Misamis Oriental province, achieving an overall accuracy of 91% above. Also, the garnered results portray the potential and efficiency of the implemented algorithm under different lighting conditions.

Keywords: algorithm, LiDAR, object recognition, OBIA

Procedia PDF Downloads 245