Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 3422

Search results for: visual recognition

3362 The Wear Recognition on Guide Surface Based on the Feature of Radar Graph

Authors: Youhang Zhou, Weimin Zeng, Qi Xie

Abstract:

Abstract: In order to solve the wear recognition problem of the machine tool guide surface, a new machine tool guide surface recognition method based on the radar-graph barycentre feature is presented in this paper. Firstly, the gray mean value, skewness, projection variance, flat degrees and kurtosis features of the guide surface image data are defined as primary characteristics. Secondly, data Visualization technology based on radar graph is used. The visual barycentre graphical feature is demonstrated based on the radar plot of multi-dimensional data. Thirdly, a classifier based on the support vector machine technology is used, the radar-graph barycentre feature and wear original feature are put into the classifier separately for classification and comparative analysis of classification and experiment results. The calculation and experimental results show that the method based on the radar-graph barycentre feature can detect the guide surface effectively.

Keywords: guide surface, wear defects, feature extraction, data visualization

Procedia PDF Downloads 515

3361 Speaker Recognition Using LIRA Neural Networks

Authors: Nestor A. Garcia Fragoso, Tetyana Baydyk, Ernst Kussul

Abstract:

This article contains information from our investigation in the field of voice recognition. For this purpose, we created a voice database that contains different phrases in two languages, English and Spanish, for men and women. As a classifier, the LIRA (Limited Receptive Area) grayscale neural classifier was selected. The LIRA grayscale neural classifier was developed for image recognition tasks and demonstrated good results. Therefore, we decided to develop a recognition system using this classifier for voice recognition. From a specific set of speakers, we can recognize the speaker’s voice. For this purpose, the system uses spectrograms of the voice signals as input to the system, extracts the characteristics and identifies the speaker. The results are described and analyzed in this article. The classifier can be used for speaker identification in security system or smart buildings for different types of intelligent devices.

Keywords: extreme learning, LIRA neural classifier, speaker identification, voice recognition

Procedia PDF Downloads 170

3360 An Event-Related Potential Study of Individual Differences in Word Recognition: The Evidence from Morphological Knowledge of Sino-Korean Prefixes

Authors: Jinwon Kang, Seonghak Jo, Joohee Ahn, Junghye Choi, Sun-Young Lee

Abstract:

A morphological priming has proved its importance by showing that segmentation occurs in morphemes when visual words are recognized within a noticeably short time. Regarding Sino-Korean prefixes, this study conducted an experiment on visual masked priming tasks with 57 ms stimulus-onset asynchrony (SOA) to see how individual differences in the amount of morphological knowledge affect morphological priming. The relationship between the prime and target words were classified as morphological (e.g., 미개척 migaecheog [unexplored] – 미해결 mihaegyel [unresolved]), semantical (e.g., 친환경 chinhwangyeong [eco-friendly]) – 무공해 mugonghae [no-pollution]), and orthographical (e.g., 미용실 miyongsil [beauty shop] – 미확보 mihwagbo [uncertainty]) conditions. We then compared the priming by configuring irrelevant paired stimuli for each condition’s control group. As a result, in the behavioral data, we observed facilitatory priming from a group with high morphological knowledge only under the morphological condition. In contrast, a group with low morphological knowledge showed the priming only under the orthographic condition. In the event-related potential (ERP) data, the group with high morphological knowledge presented the N250 only under the morphological condition. The findings of this study imply that individual differences in morphological knowledge in Korean may have a significant influence on the segmental processing of Korean word recognition.

Keywords: ERP, individual differences, morphological priming, sino-Korean prefixes

Procedia PDF Downloads 204

3359 New Approaches for the Handwritten Digit Image Features Extraction for Recognition

Authors: U. Ravi Babu, Mohd Mastan

Abstract:

The present paper proposes a novel approach for handwritten digit recognition system. The present paper extract digit image features based on distance measure and derives an algorithm to classify the digit images. The distance measure can be performing on the thinned image. Thinning is the one of the preprocessing technique in image processing. The present paper mainly concentrated on an extraction of features from digit image for effective recognition of the numeral. To find the effectiveness of the proposed method tested on MNIST database, CENPARMI, CEDAR, and newly collected data. The proposed method is implemented on more than one lakh digit images and it gets good comparative recognition results. The percentage of the recognition is achieved about 97.32%.

Keywords: handwritten digit recognition, distance measure, MNIST database, image features

Procedia PDF Downloads 455

3358 Toward a Methodology of Visual Rhetoric with Constant Reference to Mikhail Bakhtin’s Concept of “Chronotope”: A Theoretical Proposal and Taiwan Case Study

Authors: Hsiao-Yung Wang

Abstract:

This paper aims to elaborate methodology of visual rhetoric with constant reference to Mikhail Bakhtin’s concept of “chronotope”. First, it attempts to outline Ronald Barthes, the most representative scholar of visual rhetoric and structuralism, perspective on visual rhetoric and its time-space category by referring to the concurrent word-image, the symbolic systematicity, the outer dialogicity. Second, an alternative approach is explored for grasping the dynamics and functions of visual rhetoric by articulating Mikhail Bakhtin’s concept of “chronotope.” Furthermore, that visual rhetorical consciousness could be identified as “the meaning parabola which projects from word to image,” “the symbolic system which proceeds from sequence to disorder,” “the ideological environment which struggles from the local to the global.” Last but not least, primary vision of the 2014 Taipei LGBT parade would be analyzed preliminarily to evaluate the effectiveness and persuasiveness embodied by specific visual rhetorical strategies. How Bakhtin’s concept of “chronotope” to explain the potential or possible ideological struggle deployed by visual rhetoric might be interpreted empirically and extensively.

Keywords: barthes, chronotope, Mikhail Bakhtin, Taipei LGBT parade, visual rhetoric

Procedia PDF Downloads 468

3357 Emotion Recognition in Video and Images in the Wild

Authors: Faizan Tariq, Moayid Ali Zaidi

Abstract:

Facial emotion recognition algorithms are expanding rapidly now a day. People are using different algorithms with different combinations to generate best results. There are six basic emotions which are being studied in this area. Author tried to recognize the facial expressions using object detector algorithms instead of traditional algorithms. Two object detection algorithms were chosen which are Faster R-CNN and YOLO. For pre-processing we used image rotation and batch normalization. The dataset I have chosen for the experiments is Static Facial Expression in Wild (SFEW). Our approach worked well but there is still a lot of room to improve it, which will be a future direction.

Keywords: face recognition, emotion recognition, deep learning, CNN

Procedia PDF Downloads 184

3356 Enhancing Visual Corporate Identity on Festive Money Packets Design with Cultural Symbolisms

Authors: Noranis Ismail, Shamsul H. A. Rahman

Abstract:

The objective of this research is to accentuate the importance of Visual Corporate Identity by utilizing Malay motifs amalgamated with Malay proverbs to enhance the corporate brand of The Design School (TDS) of Taylor’s University. The researchers aim to manipulate festive money packet as a mean to communicate to the audience by using non-verbal visual cues such as colour, languages, and symbols that reflect styles and cultural heritage. The paper concluded that it is possible to utilize Hari Raya packet as a medium for creative expressions by creating high-impact design through the symbolism of selected Malay proverbs and traditional Malay motifs to enhance TDS corporate visual identity. It also provides a vital contribution to other organizations to understand an integral part of corporate visual identity in heightening corporate brand by communicating indirectly to its stakeholders using visual mnemonic and cultural heritage.

Keywords: corporate branding, cultural cues, Malay culture, visual identity

Procedia PDF Downloads 424

3355 An Improved Face Recognition Algorithm Using Histogram-Based Features in Spatial and Frequency Domains

Authors: Qiu Chen, Koji Kotani, Feifei Lee, Tadahiro Ohmi

Abstract:

In this paper, we propose an improved face recognition algorithm using histogram-based features in spatial and frequency domains. For adding spatial information of the face to improve recognition performance, a region-division (RD) method is utilized. The facial area is firstly divided into several regions, then feature vectors of each facial part are generated by Binary Vector Quantization (BVQ) histogram using DCT coefficients in low frequency domains, as well as Local Binary Pattern (LBP) histogram in spatial domain. Recognition results with different regions are first obtained separately and then fused by weighted averaging. Publicly available ORL database is used for the evaluation of our proposed algorithm, which is consisted of 40 subjects with 10 images per subject containing variations in lighting, posing, and expressions. It is demonstrated that face recognition using RD method can achieve much higher recognition rate.

Keywords: binary vector quantization (BVQ), DCT coefficients, face recognition, local binary patterns (LBP)

Procedia PDF Downloads 344

3354 Deep-Learning Based Approach to Facial Emotion Recognition through Convolutional Neural Network

Authors: Nouha Khediri, Mohammed Ben Ammar, Monji Kherallah

Abstract:

Recently, facial emotion recognition (FER) has become increasingly essential to understand the state of the human mind. Accurately classifying emotion from the face is a challenging task. In this paper, we present a facial emotion recognition approach named CV-FER, benefiting from deep learning, especially CNN and VGG16. First, the data is pre-processed with data cleaning and data rotation. Then, we augment the data and proceed to our FER model, which contains five convolutions layers and five pooling layers. Finally, a softmax classifier is used in the output layer to recognize emotions. Based on the above contents, this paper reviews the works of facial emotion recognition based on deep learning. Experiments show that our model outperforms the other methods using the same FER2013 database and yields a recognition rate of 92%. We also put forward some suggestions for future work.

Keywords: CNN, deep-learning, facial emotion recognition, machine learning

Procedia PDF Downloads 89

3353 Visual Aid and Imagery Ramification on Decision Making: An Exploratory Study Applicable in Emergency Situations

Authors: Priyanka Bharti

Abstract:

Decades ago designs were based on common sense and tradition, but after an enhancement in visualization technology and research, we are now able to comprehend the cognitive ability involved in the decoding of the visual information. However, many fields in visuals need intense research to deliver an efficient explanation for the events. Visuals are an information representation mode through images, symbols and graphics. It plays an impactful role in decision making by facilitating quick recognition, comprehension, and analysis of a situation. They enhance problem-solving capabilities by enabling the processing of more data without overloading the decision maker. As research proves that, visuals offer an improved learning environment by a factor of 400 compared to textual information. Visual information engages learners at a cognitive level and triggers the imagination, which enables the user to process the information faster (visuals are processed 60,000 times faster in the brain than text). Appropriate information, visualization, and its presentation are known to aid and intensify the decision-making process for the users. However, most literature discusses the role of visual aids in comprehension and decision making during normal conditions alone. Unlike emergencies, in a normal situation (e.g. our day to day life) users are neither exposed to stringent time constraints nor face the anxiety of survival and have sufficient time to evaluate various alternatives before making any decision. An emergency is an unexpected probably fatal real-life situation which may inflict serious ramifications on both human life and material possessions unless corrective measures are taken instantly. The situation demands the exposed user to negotiate in a dynamic and unstable scenario in the absence or lack of any preparation, but still, take swift and appropriate decisions to save life/lives or possessions. But the resulting stress and anxiety restricts cue sampling, decreases vigilance, reduces the capacity of working memory, causes premature closure in evaluating alternative options, and results in task shedding. Limited time, uncertainty, high stakes and vague goals negatively affect cognitive abilities to take appropriate decisions. More so, theory of natural decision making by experts has been understood with far more depth than that of an ordinary user. Therefore, in this study, the author aims to understand the role of visual aids in supporting rapid comprehension to take appropriate decisions during an emergency situation.

Keywords: cognition, visual, decision making, graphics, recognition

Procedia PDF Downloads 265

3352 Facial Emotion Recognition Using Deep Learning

Authors: Ashutosh Mishra, Nikhil Goyal

Abstract:

A 3D facial emotion recognition model based on deep learning is proposed in this paper. Two convolution layers and a pooling layer are employed in the deep learning architecture. After the convolution process, the pooling is finished. The probabilities for various classes of human faces are calculated using the sigmoid activation function. To verify the efficiency of deep learning-based systems, a set of faces. The Kaggle dataset is used to verify the accuracy of a deep learning-based face recognition model. The model's accuracy is about 65 percent, which is lower than that of other facial expression recognition techniques. Despite significant gains in representation precision due to the nonlinearity of profound image representations.

Keywords: facial recognition, computational intelligence, convolutional neural network, depth map

Procedia PDF Downloads 221

3351 Contemporary Visual Art and Shariah: A Conceptual Framework

Authors: Ishak Ramli, Mohamad Noorman Masrek, Muhamad Abdul Aziz Ab Gani

Abstract:

Islam places restrictions and limitation to the creation and ownership of visual art. Not all forms of visual arts are permissible in Islam. However, guidance on the creation and ownership of visual arts is not made plain and clear not only to the Islamic followers but also to the art community. Given this gap, this study attempts to develop a conceptual framework that will guide artist and art collectors on what constitute to valid and acceptable through the Islamic perspective. Based on this framework, several research checklist are proposed. It is highly useful especially for the researchers who are interested to study the topic. Qualitative research is the best choice to test run the paper work to attempt all the checklist which are formed.

Keywords: contemporary visual art, Shariah, conceptual framework, Islam

Procedia PDF Downloads 372

3350 Effective Nutrition Label Use on Smartphones

Authors: Vladimir Kulyukin, Tanwir Zaman, Sarat Kiran Andhavarapu

Abstract:

Research on nutrition label use identifies four factors that impede comprehension and retention of nutrition information by consumers: label’s location on the package, presentation of information within the label, label’s surface size, and surrounding visual clutter. In this paper, a system is presented that makes nutrition label use more effective for nutrition information comprehension and retention. The system’s front end is a smartphone application. The system’s back end is a four node Linux cluster for image recognition and data storage. Image frames captured on the smartphone are sent to the back end for skewed or aligned barcode recognition. When barcodes are recognized, corresponding nutrition labels are retrieved from a cloud database and presented to the user on the smartphone’s touchscreen. Each displayed nutrition label is positioned centrally on the touchscreen with no surrounding visual clutter. Wikipedia links to important nutrition terms are embedded to improve comprehension and retention of nutrition information. Standard touch gestures (e.g., zoom in/out) available on mainstream smartphones are used to manipulate the label’s surface size. The nutrition label database currently includes 200,000 nutrition labels compiled from public web sites by a custom crawler. Stress test experiments with the node cluster are presented. Implications for proactive nutrition management and food policy are discussed.

Keywords: mobile computing, cloud computing, nutrition label use, nutrition management, barcode scanning

Procedia PDF Downloads 366

3349 Difficulty and Complexity in Dealing with Visual Pollution in the Historical Cities: The Historical City of Ibb-Yemen as a Case Study

Authors: Abdulfattah A. Q .Alwah, Wen Li, Mohammed A. Q. Alwah, Duc Thien Tran, Bing Xi Liu

Abstract:

The historical cities in the third world suffer from many environmental problems; one of them is the spread of visual pollution manifestations. These phenomena increase with low levels of public awareness and low per capita income. The historical city of Ibb is suffering from a variety of visual pollution of the urban environment, so it has been chosen as a case study. This study aims to identify the difficulty and complexity of dealing with visual pollutions manifestations in the historical city of Ibb, and to provide appropriate solutions, which suit with the complex and contradictory circumstances. The study relies on an inductive approach to achieve its aims through two methods; the first is a visual survey of the visual pollution phenomenon based on images and researcher notes. The Second method is the analyses of the opinions and impressions of the city's residents and visitors through interviews, in addition to interviews with the officials in the competent authorities, and some specialists in the field of urban environment. Through the results of the field study and discussion of the interview results, this study presents an analysis of the phenomenon of visual distortion of the historical city of Ibb regarding the appearances and the reasons. Furthermore, this study provides appropriate solutions, which suitable with the complex and contradictory circumstances. These solutions take two paths: the first one is to stop the spread of visual distortions, and the second path is to address the current visual pollutions.

Keywords: visual pollution, visual image, urban environment, difficulty, complexity, historical cities, the historical city of Ibb

Procedia PDF Downloads 140

3348 Visual Analytics in K 12 Education: Emerging Dimensions of Complexity

Authors: Linnea Stenliden

Abstract:

The aim of this paper is to understand emerging learning conditions, when a visual analytics is implemented and used in K 12 (education). To date, little attention has been paid to the role visual analytics (digital media and technology that highlight visual data communication in order to support analytical tasks) can play in education, and to the extent to which these tools can process actionable data for young students. This study was conducted in three public K 12 schools, in four social science classes with students aged 10 to 13 years, over a period of two to four weeks at each school. Empirical data were generated using video observations and analyzed with help of metaphors by Latour. The learning conditions are found to be distinguished by broad complexity characterized by four dimensions. These emerge from the actors’ deeply intertwined relations in the activities. The paper argues in relation to the found dimensions that novel approaches to teaching and learning could benefit students’ knowledge building as they work with visual analytics, analyzing visualized data.

Keywords: analytical reasoning, complexity, data use, problem space, visual analytics, visual storytelling, translation

Procedia PDF Downloads 364

3347 Hand Detection and Recognition for Malay Sign Language

Authors: Mohd Noah A. Rahman, Afzaal H. Seyal, Norhafilah Bara

Abstract:

Developing a software application using an interface with computers and peripheral devices using gestures of human body such as hand movements keeps growing in interest. A review on this hand gesture detection and recognition based on computer vision technique remains a very challenging task. This is to provide more natural, innovative and sophisticated way of non-verbal communication, such as sign language, in human computer interaction. Nevertheless, this paper explores hand detection and hand gesture recognition applying a vision based approach. The hand detection and recognition used skin color spaces such as HSV and YCrCb are applied. However, there are limitations that are needed to be considered. Almost all of skin color space models are sensitive to quickly changing or mixed lighting circumstances. There are certain restrictions in order for the hand recognition to give better results such as the distance of user’s hand to the webcam and the posture and size of the hand.

Keywords: hand detection, hand gesture, hand recognition, sign language

Procedia PDF Downloads 301

3346 Small Text Extraction from Documents and Chart Images

Authors: Rominkumar Busa, Shahira K. C., Lijiya A.

Abstract:

Text recognition is an important area in computer vision which deals with detecting and recognising text from an image. The Optical Character Recognition (OCR) is a saturated area these days and with very good text recognition accuracy. However the same OCR methods when applied on text with small font sizes like the text data of chart images, the recognition rate is less than 30%. In this work, aims to extract small text in images using the deep learning model, CRNN with CTC loss. The text recognition accuracy is found to improve by applying image enhancement by super resolution prior to CRNN model. We also observe the text recognition rate further increases by 18% by applying the proposed method, which involves super resolution and character segmentation followed by CRNN with CTC loss. The efficiency of the proposed method shows that further pre-processing on chart image text and other small text images will improve the accuracy further, thereby helping text extraction from chart images.

Keywords: small text extraction, OCR, scene text recognition, CRNN

Procedia PDF Downloads 120

3345 The Visible Third: Female Artists’ Participation in the Portuguese Contemporary Art World

Authors: Sonia Bernardo Correia

Abstract:

This paper is part of ongoing research that aims to understand the role of gender in the composition of the Portuguese contemporary art world and the possibilities and limits to the success of the professional paths of women and men artists. The field of visual arts is gender-sensitive as it differentiates the positions occupied by artists in terms of visibility and recognition. Women artists occupy a peripheral space, which may hinder the progression of their professional careers. Based on the collection of data on the participation of artists in Portuguese exhibitions, art fairs, auctions, and art awards between 2012 and 2019, the goal of this study is to portray female artists’ participation as a condition of professional, social, and cultural visibility. From the analysis of a significant sample of institutions from the artistic field, it was possible to observe that the works of female authors are under exhibited, never exceeding one-third of the total of exhibitions. Male artists also enjoy a comfortable majority as gallery artists (around 70%) and as part of institutional collections (around 80%). However, when analysing the younger age cohorts of artists by gender, it appears that there is representation parity, which may be a good sign of change. The data shows that there are persistent gender inequalities in accessing the artist profession. Women are not yet occupying positions of exposure, recognition, and legitimation in the market similar to those of their male counterparts, suggesting that they may face greater obstacles in experiencing successful professional trajectories.

Keywords: inequalities, invisibility of the woman artist, gender, visual arts

Procedia PDF Downloads 132

3344 Optimization Aluminium Design for the Facade Second Skin toward Visual Comfort: Case Studies & Dialux Daylighting Simulation Model

Authors: Yaseri Dahlia Apritasari

Abstract:

Visual comfort is important for the building occupants to need. Visual comfort can be fulfilled through natural lighting (daylighting) and artificial lighting. One strategy to optimize natural lighting can be achieved through the facade second skin design. This strategy can reduce glare, and fulfill visual comfort need. However, the design strategy cannot achieve light intensity for visual comfort. Because the materials, design and opening percentage of the facade of second skin blocked sunlight. This paper discusses aluminum material for the facade second skin design that can fulfill the optimal visual comfort with the case studies Multi Media Tower building. The methodology of the research is combination quantitative and qualitative through field study observed, lighting measurement and visual comfort questionnaire. Then it used too simulation modeling (DIALUX 4.13, 2016) for three facades second skin design model. Through following steps; (1) Measuring visual comfort factor: light intensity indoor and outdoor; (2) Taking visual comfort data from building occupants; (3) Making models with different facade second skin design; (3) Simulating and analyzing the light intensity value for each models that meet occupants visual comfort standard: 350 lux (Indonesia National Standard, 2010). The result shows that optimization of aluminum material for the facade second skin design can meet optimal visual comfort for building occupants. The result can give recommendation aluminum opening percentage of the facade second skin can meet optimal visual comfort for building occupants.

Keywords: aluminium material, Facade, second skin, visual comfort

Procedia PDF Downloads 349

3343 Multimodal Deep Learning for Human Activity Recognition

Authors: Ons Slimene, Aroua Taamallah, Maha Khemaja

Abstract:

In recent years, human activity recognition (HAR) has been a key area of research due to its diverse applications. It has garnered increasing attention in the field of computer vision. HAR plays an important role in people’s daily lives as it has the ability to learn advanced knowledge about human activities from data. In HAR, activities are usually represented by exploiting different types of sensors, such as embedded sensors or visual sensors. However, these sensors have limitations, such as local obstacles, image-related obstacles, sensor unreliability, and consumer concerns. Recently, several deep learning-based approaches have been proposed for HAR and these approaches are classified into two categories based on the type of data used: vision-based approaches and sensor-based approaches. This research paper highlights the importance of multimodal data fusion from skeleton data obtained from videos and data generated by embedded sensors using deep neural networks for achieving HAR. We propose a deep multimodal fusion network based on a twostream architecture. These two streams use the Convolutional Neural Network combined with the Bidirectional LSTM (CNN BILSTM) to process skeleton data and data generated by embedded sensors and the fusion at the feature level is considered. The proposed model was evaluated on a public OPPORTUNITY++ dataset and produced a accuracy of 96.77%.

Keywords: human activity recognition, action recognition, sensors, vision, human-centric sensing, deep learning, context-awareness

Procedia PDF Downloads 96

3342 Freedom of Expression and Its Restriction in Audiovisual Media

Authors: Sevil Yildiz

Abstract:

Audio visual communication is a type of collective expression. Collective expression activity informs the masses, gives direction to opinions and establishes public opinion. Due to these characteristics, audio visual communication must be subjected to special restrictions. This has been stipulated in both the Constitution and the European Human Rights Agreement. This paper aims to review freedom of expression and its restriction in audio visual media. For this purpose, the authorisation of the Radio and Television Supreme Council to impose sanctions as an independent administrative authority empowered to regulate the field of audio visual communication has been reviewed with regard to freedom of expression and its limits.

Keywords: audio visual media, freedom of expression, its limits, radio and television supreme council

Procedia PDF Downloads 321

3341 Recognition and Protection of Indigenous Society in Indonesia

Authors: Triyanto, Rima Vien Permata Hartanto

Abstract:

Indonesia is a legal state. The consequence of this status is the recognition and protection of the existence of indigenous peoples. This paper aims to describe the dynamics of legal recognition and protection for indigenous peoples within the framework of Indonesian law. This paper is library research based on literature. The result states that although the constitution has normatively recognized the existence of indigenous peoples and their traditional rights, in reality, not all rights were recognized and protected. The protection and recognition for indigenous people need to be strengthened.

Keywords: indigenous peoples, customary law, state law, state of law

Procedia PDF Downloads 320

3340 Detecting Characters as Objects Towards Character Recognition on Licence Plates

Authors: Alden Boby, Dane Brown, James Connan

Abstract:

Character recognition is a well-researched topic across disciplines. Regardless, creating a solution that can cater to multiple situations is still challenging. Vehicle licence plates lack an international standard, meaning that different countries and regions have their own licence plate format. A problem that arises from this is that the typefaces and designs from different regions make it difficult to create a solution that can cater to a wide range of licence plates. The main issue concerning detection is the character recognition stage. This paper aims to create an object detection-based character recognition model trained on a custom dataset that consists of typefaces of licence plates from various regions. Given that characters have featured consistently maintained across an array of fonts, YOLO can be trained to recognise characters based on these features, which may provide better performance than OCR methods such as Tesseract OCR.

Keywords: computer vision, character recognition, licence plate recognition, object detection

Procedia PDF Downloads 115

3339 Relevant LMA Features for Human Motion Recognition

Authors: Insaf Ajili, Malik Mallem, Jean-Yves Didier

Abstract:

Motion recognition from videos is actually a very complex task due to the high variability of motions. This paper describes the challenges of human motion recognition, especially motion representation step with relevant features. Our descriptor vector is inspired from Laban Movement Analysis method. We propose discriminative features using the Random Forest algorithm in order to remove redundant features and make learning algorithms operate faster and more effectively. We validate our method on MSRC-12 and UTKinect datasets.

Keywords: discriminative LMA features, features reduction, human motion recognition, random forest

Procedia PDF Downloads 189

3338 Effects of Reversible Watermarking on Iris Recognition Performance

Authors: Andrew Lock, Alastair Allen

Abstract:

Fragile watermarking has been proposed as a means of adding additional security or functionality to biometric systems, particularly for authentication and tamper detection. In this paper we describe an experimental study on the effect of watermarking iris images with a particular class of fragile algorithm, reversible algorithms, and the ability to correctly perform iris recognition. We investigate two scenarios, matching watermarked images to unmodiﬁed images, and matching watermarked images to watermarked images. We show that different watermarking schemes give very different results for a given capacity, highlighting the importance of investigation. At high embedding rates most algorithms cause significant reduction in recognition performance. However, in many cases, for low embedding rates, recognition accuracy is improved by the watermarking process.

Keywords: biometrics, iris recognition, reversible watermarking, vision engineering

Procedia PDF Downloads 451

3337 Visual and Verbal Imagination in a Bilingual Context

Authors: Erzsebet Gulyas

Abstract:

Our inner world, our imagination, and our way of thinking are invisible and inaudible to others, but they influence our behavior. To investigate the relationship between thinking and language use, we created a test in Hungarian using ideas from the literature. The test prompts participants to make decisions based on visual images derived from the written information presented. There is a correlation (r=0.5) between the test result and the self-assessment of the visual imagery vividness and the visual and verbal components of internal representations measured by self-report questionnaires, as well as with responses to language-use inquiries in the background questionnaire. 56 university students completed the tests, and SPSS was used to analyze the data.

Keywords: imagination, internal representations, verbalization, visualization

Procedia PDF Downloads 49

3336 Binocular Heterogeneity in Saccadic Suppression

Authors: Evgeny Kozubenko, Dmitry Shaposhnikov, Mikhail Petrushan

Abstract:

This work is focused on the study of the binocular characteristics of the phenomenon of perisaccadic suppression in humans when perceiving visual objects. This phenomenon manifests in a decrease in the subject's ability to perceive visual information during saccades, which play an important role in purpose-driven behavior and visual perception. It was shown that the impairment of perception of visual information in the post-saccadic time window is stronger (p < 0.05) in the ipsilateral eye (the eye towards which the saccade occurs). In addition, the observed heterogeneity of post-saccadic suppression in the contralateral and ipsilateral eyes may relate to depth perception. Taking the studied phenomenon into account is important when developing ergonomic control panels in modern operator systems.

Keywords: eye movement, natural vision, saccadic suppression, visual perception

Procedia PDF Downloads 150

3335 ICanny: CNN Modulation Recognition Algorithm

Authors: Jingpeng Gao, Xinrui Mao, Zhibin Deng

Abstract:

Aiming at the low recognition rate on the composite signal modulation in low signal to noise ratio (SNR), this paper proposes a modulation recognition algorithm based on ICanny-CNN. Firstly, the radar signal is transformed into the time-frequency image by Choi-Williams Distribution (CWD). Secondly, we propose an image processing algorithm using the Guided Filter and the threshold selection method, which is combined with the hole filling and the mask operation. Finally, the shallow convolutional neural network (CNN) is combined with the idea of the depth-wise convolution (Dw Conv) and the point-wise convolution (Pw Conv). The proposed CNN is designed to complete image classification and realize modulation recognition of radar signal. The simulation results show that the proposed algorithm can reach 90.83% at 0dB and 71.52% at -8dB. Therefore, the proposed algorithm has a good classification and anti-noise performance in radar signal modulation recognition and other fields.

Keywords: modulation recognition, image processing, composite signal, improved Canny algorithm

Procedia PDF Downloads 185

3334 Promoting Visual Literacy from Primary to Tertiary Levels through Literature

Authors: Mohd Nazri Latiff Azmi, Mairas Abd Rahman

Abstract:

Traditionally, literacy has been commonly defined as the ability to read and write at an adequate level of proficiency that is necessary for communication. However, as time goes by, literacy has started to refer to reading and writing at a level adequate for communication, or at a level that lets one understand and communicate ideas in a literate society, so as to take part in that society. Meanwhile, visual literacy is a set of abilities that enables an individual to effectively find, interpret, evaluate, use, and create images and visual media. This study aims to investigate the collaboration between visual literacy and literature, eventually to determine how visual literacy can enhance learner’s ability to comprehend literary texts such as poems and short stories and develop his intellectuality, especially critical and creative thinking skills, and also to find out the different impacts of literature in visual literacy at four levels of education: pre-school, primary and secondary schools and university. This study is based on Malaysian environment and involves a qualitative method consisting of observation and interviews. The initial findings show that people with different levels of education grasp visual literacy differently but all levels show outstanding impacts of using literature.

Keywords: visual literacy, literature, language studies, higher education

Procedia PDF Downloads 363

3333 Artificial Generation of Visual Evoked Potential to Enhance Visual Ability

Authors: A. Vani, M. N. Mamatha

Abstract:

Visual signal processing in human beings occurs in the occipital lobe of the brain. The signals that are generated in the brain are universal for all the human beings and they are called Visual Evoked Potential (VEP). Generally, the visually impaired people lose sight because of severe damage to only the eyes natural photo sensors, but the occipital lobe will still be functioning. In this paper, a technique of artificially generating VEP is proposed to enhance the visual ability of the subject. The system uses the electrical photoreceptors to capture image, process the image, to detect and recognize the subject or object. This voltage is further processed and can transmit wirelessly to a BIOMEMS implanted into occipital lobe of the patient’s brain. The proposed BIOMEMS consists of array of electrodes that generate the neuron potential which is similar to VEP of normal people. Thus, the neurons get the visual data from the BioMEMS which helps in generating partial vision or sight for the visually challenged patient.

Keywords: BioMEMS, neuro-prosthetic, openvibe, visual evoked potential

Procedia PDF Downloads 305