Search results for: Visual speech.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 802

622 Effective Image and Video Error Concealment using RST-Invariant Partial Patch Matching Model and Exemplar-based Inpainting

Authors: Shiraz Ahmad, Zhe-Ming Lu

Abstract:

An effective visual error concealment method is presented that employs a robust rotation, scale, and translation (RST) invariant partial patch matching model (RSTI-PPMM) and exemplar-based inpainting. While the proposed robust and inherently feature-enhanced texture synthesis approach ensures the generation of excellent and perceptually plausible visual error concealment results, the outlier pruning property guarantees significant quality improvements, both quantitatively and qualitatively. No intermediate user interaction is required for the pre-segmented media, and the presented method follows a bootstrapping approach for automatic visual loss recovery and image and video error concealment.

Keywords: Exemplar-based image and video inpainting, outlier pruning, RST-invariant partial patch matching model (RSTI-PPMM), visual error concealment.

Downloads: 1374
621 Automatically-generated Concept Maps as a Learning Tool

Authors: Xia Lin

Abstract:

Concept maps can be generated manually or automatically. It is important to recognize the differences between the two types of concept maps. Automatically generated concept maps are dynamic, interactive, and full of associations between the terms on the maps and the underlying documents. Through a specific concept mapping system, Visual Concept Explorer (VCE), this paper discusses how automatically generated concept maps differ from manually generated ones and how different applications and learning opportunities might be created with them. The paper presents several examples of learning strategies that take advantage of automatically generated concept maps for concept learning and exploration.

Keywords: Concept maps, Dynamic concept representation, learning strategies, visual interface, Visual Concept Explorer.

Downloads: 1468
620 Communication Design in Newspapers: A Comparative Study of Graphic Resources in Portuguese and Spanish Publications

Authors: Fátima Gonçalves, Joaquim Brigas, Jorge Gonçalves

Abstract:

As a way of managing the increasing volume and complexity of information that circulates in the present time, graphical representations are increasingly used, which add meaning to the information presented in communication media, through an efficient communication design. The visual culture itself, driven by technological evolution, has been redefining the forms of communication, so that contemporary visual communication represents a major impact on society. This article presents the results and respective comparative analysis of four publications in the Iberian press, focusing on the formal aspects of newspapers and the space they dedicate to the various communication elements. Two Portuguese newspapers and two Spanish newspapers were selected for this purpose. The findings indicated that the newspapers show a similarity in the use of graphic solutions, which corroborate a visual trend in communication design. The results also reveal that Spanish newspapers are more meticulous with graphic consistency. This study intended to contribute to improving knowledge of the Iberian generalist press.

Keywords: Communication design, graphic resources, Iberian Press, visual journalism.

Downloads: 1170
619 Image Indexing Using a Color Similarity Metric based on the Human Visual System

Authors: Angelo Nodari, Ignazio Gallo

Abstract:

The novelty proposed in this study is twofold and consists in the development of a new color similarity metric based on the human visual system and a new color indexing based on a textual approach. The new color similarity metric is based on the color perception of the human visual system. Consequently, the results returned by the indexing system can fulfill the user's expectations as much as possible. We developed a web application to collect users' judgments about the similarities between colors, whose results are used to estimate the metric proposed in this study. In order to index the image's colors, we used a text indexing engine to facilitate the integration of visual features in a database of text documents. The textual signature is built by weighting the image's colors according to their occurrence in the image. The use of a textual indexing engine provides us with a simple, fast and robust solution to index images. A typical usage of the system proposed in this study is the development of applications whose data type is both visual and textual. In order to evaluate the proposed method, we chose a price comparison engine as a case study, collecting a series of commercial offers containing the textual description and the image representing a specific commercial offer.

Keywords: Color Extraction, Content-Based Image Retrieval, Indexing

Downloads: 2985
618 Ontology for a Voice Transcription of OpenStreetMap Data: The Case of Space Apprehension by Visually Impaired Persons

Authors: Said Boularouk, Didier Josselin, Eitan Altman

Abstract:

In this paper, we present a vocal ontology of OpenStreetMap data for the apprehension of space by visually impaired people. The platform, being based on produsage, gives data producers the freedom to choose the descriptors of geocoded locations. Unfortunately, this freedom, also called folksonomy, complicates subsequent data searches. We try to solve this issue with a simple but usable method that extracts data from OSM databases in order to send them to visually impaired people using Text To Speech technology. We focus on how to help people with a visual disability plan their itinerary and comprehend a map by querying a computer and getting information about the surrounding environment in a mono-modal human-computer dialogue.

Keywords: Ontology, OpenStreetMap, visually impaired people, TTS, taxonomy.

Downloads: 840
617 The Effects of Immersion on Visual Attention and Detection of Signals Performance for Virtual Reality Training Systems

Authors: Shiau-Feng Lin, Chiuhsiang Joe Lin, Rou-Wen Wang, Wei-Jung Shiang

Abstract:

Virtual Reality (VR) is becoming increasingly important for business, education, and entertainment; VR technology has therefore been applied for training purposes in areas such as the military, safety training and flight simulators. In particular, a superior and highly reliable VR training system is very important for immersion. Manipulation training in immersive virtual environments is difficult partly because users must do without the haptic contact with real objects they rely on in the real world to orient themselves and the objects they manipulate. In this paper, we create a convincing questionnaire of immersion and an experiment to assess the influence of immersion on performance in a VR training system. The Immersion Questionnaire (IQ) included spatial immersion, psychological immersion, and sensory immersion. We show that users with a training system complete visual attention and detection of signals. Twenty subjects were allocated to a factorial design consisting of two different VR systems (Desktop VR and Projector VR). The results indicated that different VR representation methods significantly affected the participants' immersion dimensions.

Keywords: Virtual Reality, Training, Immersion, Visual Attention, Visual Detection

Downloads: 1782
616 Analysis of the Visual Preference of Patterns in Pedestrian Roads

Authors: Kang, Eun Sung, Song, Hyeong Wook, Kim, Hong Kyu

Abstract:

The purpose of this study is to analyze the visual preference of patterns in pedestrian roads. In this study, animation was applied for the estimation of dynamic streetscape. Six pedestrian road patterns were selected in order to analyze the visual preference. The shapes are straight, s-curve, and zigzag. The ratios of building height to road width are 2:1 and 1:1. Twelve adjective pairs used in the field investigation were selected from adjectives commonly used in the estimation of streetscape. They are interesting-boring, simple-complex, calm-noisy, open-enclosed, active-inactive, lightly-depressing, regular-irregular, unique-usual, rhythmic-not rhythmic, united-not united, stable-unstable, tidy-untidy. Dynamic streetscape must be considered important in pedestrian shopping malls and parks because it will be an attraction. Therefore, the s-curve pedestrian road, which was rated the most beautiful in this study, should be designed in such areas. Also, the ratio of building height to road width along pedestrian roads should be reduced.

Keywords: Visual preference, streetscape, animation, simulation, pedestrian.

Downloads: 1131
615 SySRA: A System of a Continuous Speech Recognition in Arab Language

Authors: Samir Abdelhamid, Noureddine Bouguechal

Abstract:

In this paper we report the model adopted by SySRA, our system for continuous speech recognition in the Arab language, and the results obtained so far. This system uses the Arabdic-10 database, a manually segmented corpus of words for the Arab language. Phonetic decoding is performed by an expert system whose knowledge base is expressed in the form of production rules. This expert system transforms a vocal signal into a phonetic lattice. The higher level of the system handles the recognition of the lattice thus obtained by rendering it in the form of written sentences (orthographical form). This level initially contains the lexical analyzer, which is none other than the recognition module. We subjected this analyzer to a set of spectrograms obtained by dictating a score of sentences in the Arab language. The recognition rate for these sentences is about 70%, which is, to our knowledge, the best result for the recognition of the Arab language. The test set consists of twenty sentences from four speakers who did not take part in the training.

Keywords: Continuous speech recognition, lexical analyzer, phonetic decoding, phonetic lattice, vocal signal.

Downloads: 1344
614 Image Adaptive Watermarking with Visual Model in Orthogonal Polynomials based Transformation Domain

Authors: Krishnamoorthi R., Sheba Kezia Malarchelvi P. D.

Abstract:

In this paper, an image adaptive, invisible digital watermarking algorithm with Orthogonal Polynomials based Transformation (OPT) is proposed, for copyright protection of digital images. The proposed algorithm utilizes a visual model to determine the watermarking strength necessary to invisibly embed the watermark in the mid frequency AC coefficients of the cover image, chosen with a secret key. The visual model is designed to generate a Just Noticeable Distortion mask (JND) by analyzing the low level image characteristics such as textures, edges and luminance of the cover image in the orthogonal polynomials based transformation domain. Since the secret key is required for both embedding and extraction of watermark, it is not possible for an unauthorized user to extract the embedded watermark. The proposed scheme is robust to common image processing distortions like filtering, JPEG compression and additive noise. Experimental results show that the quality of OPT domain watermarked images is better than its DCT counterpart.

Keywords: Orthogonal Polynomials based Transformation, Digital Watermarking, Copyright Protection, Visual model.

Downloads: 1655
613 Efficient System for Speech Recognition using General Regression Neural Network

Authors: Abderrahmane Amrouche, Jean Michel Rouvaen

Abstract:

In this paper we present an efficient system for speaker-independent speech recognition based on a neural network approach. The proposed architecture comprises two phases: a preprocessing phase, which consists of segmental normalization and feature extraction, and a classification phase, which uses neural networks based on nonparametric density estimation, namely the general regression neural network (GRNN). The relative performance of the proposed model is compared to similar recognition systems that we have also implemented, based on the Multilayer Perceptron (MLP), the Recurrent Neural Network (RNN) and the well-known Discrete Hidden Markov Model (HMM-VQ). Experimental results obtained with Arabic digits have shown that the use of nonparametric density estimation with an appropriate smoothing factor (spread) improves the generalization power of the neural network. The word error rate (WER) is reduced significantly over the baseline HMM method. GRNN computation is a successful alternative to the other neural networks and DHMM.
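The role of the smoothing factor can be illustrated with a minimal GRNN sketch in plain Python. It assumes the textbook formulation (a Gaussian-kernel-weighted average of training targets) with one-hot class targets; the function names and the toy two-dimensional feature vectors are hypothetical and are not taken from the paper.

```python
import math


def grnn_predict(train_x, train_y, query, spread):
    """Gaussian-kernel-weighted average of the training targets (textbook GRNN)."""
    sums = [0.0] * len(train_y[0])
    total_weight = 0.0
    for x, y in zip(train_x, train_y):
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, query))
        weight = math.exp(-sq_dist / (2.0 * spread ** 2))
        total_weight += weight
        for k, target in enumerate(y):
            sums[k] += weight * target
    if total_weight == 0.0:
        # Every kernel underflowed (spread far too small): return a uniform guess.
        return [1.0 / len(sums)] * len(sums)
    return [s / total_weight for s in sums]


def grnn_classify(train_x, train_y, query, spread):
    """Pick the class whose averaged one-hot target is largest."""
    scores = grnn_predict(train_x, train_y, query, spread)
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical toy data: two clusters standing in for two spoken digits.
TRAIN_X = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]]
TRAIN_Y = [[1, 0], [1, 0], [0, 1], [0, 1]]
```

A larger `spread` averages over more training patterns and gives smoother decision boundaries, while a smaller one approaches nearest-neighbour behaviour, which is why the choice of spread affects generalization.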

Keywords: Speech Recognition, General Regression Neural Network, Hidden Markov Model, Recurrent Neural Network, Arabic Digits.

Downloads: 2134
612 Web Pages Aesthetic Evaluation Using Low-Level Visual Features

Authors: Maryam Mirdehghani, S. Amirhassan Monadjemi

Abstract:

Web sites are rapidly becoming the preferred media choice for daily tasks such as information search, company presentation, shopping, and so on. At the same time, we live in a period where visual appearance plays an increasingly important role in daily life. In spite of designers' efforts to develop a web site that is both user-friendly and attractive, it is difficult to ensure the outcome's aesthetic quality, since visual appearance is a matter of individual perception and opinion. This study attempts to develop an automatic system for the aesthetic evaluation of web pages, which are the building blocks of web sites. Based on image processing techniques and artificial neural networks, the proposed method categorizes the input web page according to its visual appearance and aesthetic quality. The employed features are multiscale/multidirectional textural and perceptual color properties of the web pages, fed to a perceptron ANN which has been trained as the evaluator. The method was tested using university web sites, and the results suggest that it performs well in web page aesthetic evaluation tasks, with around 90% correct categorization.

Keywords: Web Page Design, Web Page Aesthetic, Color Spaces, Texture, Neural Networks

Downloads: 1572
611 A Supervised Text-Independent Speaker Recognition Approach

Authors: Tudor Barbu

Abstract:

We provide a supervised speech-independent voice recognition technique in this paper. In the feature extraction stage we propose a mel-cepstral based approach. Our feature vector classification method uses a special nonlinear metric, derived from the Hausdorff distance for sets, and a minimum mean distance classifier.
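As a rough illustration of the classification stage, the sketch below uses the classical Hausdorff distance between two sets of feature vectors together with a minimum mean distance rule. This is an assumption made for illustration: the paper's metric is a special nonlinear one derived from the Hausdorff distance and is not reproduced here. The function names, speaker labels and toy vectors are hypothetical.

```python
import math


def euclid(u, v):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def directed_hausdorff(set_a, set_b):
    """Largest distance from a point of set_a to its nearest point in set_b."""
    return max(min(euclid(a, b) for b in set_b) for a in set_a)


def hausdorff(set_a, set_b):
    """Classical (symmetric) Hausdorff distance between two finite sets."""
    return max(directed_hausdorff(set_a, set_b), directed_hausdorff(set_b, set_a))


def classify(sample, training):
    """Assign the sample to the speaker with the smallest mean distance.

    `training` maps a speaker label to a list of feature-vector sets,
    one set per training utterance.
    """
    def mean_distance(sets):
        return sum(hausdorff(sample, s) for s in sets) / len(sets)

    return min(training, key=lambda speaker: mean_distance(training[speaker]))


# Hypothetical toy training data: two utterances for one speaker, one for another.
TRAINING = {
    "spk_a": [[[0.0, 0.0], [1.0, 0.0]], [[0.0, 1.0], [1.0, 1.0]]],
    "spk_b": [[[5.0, 5.0], [6.0, 5.0]]],
}
```

Because the distance is defined between sets, utterances of different lengths can be compared without aligning their frames, which suits the text-independent setting.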

Keywords: Text-independent speaker recognition, mel cepstral analysis, speech feature vector, Hausdorff-based metric, supervised classification.

Downloads: 1784
610 Combined Automatic Speech Recognition and Machine Translation in Business Correspondence Domain for English-Croatian

Authors: Sanja Seljan, Ivan Dunđer

Abstract:

The paper presents combined automatic speech recognition (ASR) of English and machine translation (MT) for the English-Croatian and Croatian-English language pairs in the domain of business correspondence. The first part presents results of training a commercial ASR system on English data sets, enriched by error analysis. The second part presents results of machine translation performed by a free online tool for the English-Croatian and Croatian-English language pairs. Human evaluation in terms of usability is conducted and internal consistency is calculated by Cronbach's alpha coefficient, enriched by error analysis. Automatic evaluation is performed using the WER (Word Error Rate) and PER (Position-independent word Error Rate) metrics, followed by an investigation of Pearson’s correlation with human evaluation.
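The two automatic metrics can be sketched as follows. WER is computed here as the word-level Levenshtein distance divided by the reference length; PER uses one common bag-of-words formulation, (max(N_ref, N_hyp) - matches) / N_ref, which is assumed here and may differ from the exact variant used in the paper. The function names are hypothetical.

```python
from collections import Counter


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hyp) + 1))  # distances against an empty reference
    for i, ref_word in enumerate(ref, 1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,                           # deletion
                cur[j - 1] + 1,                        # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = cur
    return prev[-1] / len(ref)


def per(reference, hypothesis):
    """Position-independent error rate computed on bags of words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)
```

For a hypothesis that contains the right words in the wrong order, WER penalizes the reordering while PER does not, so comparing the two gives a rough indication of how much of the error is due to word order.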

Keywords: Automatic machine translation, integrated language technologies, quality evaluation, speech recognition.

Downloads: 2860
609 Retrieval of Relevant Visual Data in Selected Machine Vision Tasks: Examples of Hardware-based and Software-based Solutions

Authors: Andrzej Śluzek

Abstract:

To illustrate the diversity of methods used to extract relevant (where the concept of relevance can be differently defined for different applications) visual data, the paper discusses three groups of such methods. They have been selected from a range of alternatives to highlight how hardware and software tools can be complementarily used in order to achieve various functionalities in case of different specifications of "relevant data". First, principles of gated imaging are presented (where relevance is determined by the range). The second methodology is intended for intelligent intrusion detection, while the last one is used for content-based image matching and retrieval. All methods have been developed within projects supervised by the author.

Keywords: Relevant visual data, gated imaging, intrusion detection, image matching.

Downloads: 1346
608 Research on Hypermediated Images in Asian Films

Authors: Somi Nah, Timothy Yoonsuk Lee, Jinhwan Yu

Abstract:

In films, visual effects have played the role of expressing realities more realistically or describing imaginations as if they are real. Such images are immediated images representing realism, and the logic of immediation for the reality of images has been perceived as dominant in visual effects. In order for immediation to have an identity as immediation, there should be the opposite concept, hypermediation. In the mid 2000s, hypermediated images were settled as a code of mass culture in Asia. Thus, among Asian films highly popular in those days, this study selected five displaying hypermediated images (2 Korean, 2 Japanese, and 1 Thai) and examined the semiotic meanings of such images using Roland Barthes' directional and implicated meaning analysis and Metz's paradigmatic analysis method, focusing on how hypermediated images work in the general context of the films, how they are associated with spaces, and what meanings they try to carry.

Keywords: Asian Films, Hypermediated Images, Semiotics, Visual Effects

Downloads: 1658
607 Using Speech Emotion Recognition as a Longitudinal Biomarker for Alzheimer’s Disease

Authors: Yishu Gong, Liangliang Yang, Jianyu Zhang, Zhengyu Chen, Sihong He, Xusheng Zhang, Wei Zhang

Abstract:

Alzheimer’s disease (AD) is a progressive neurodegenerative disorder that affects millions of people worldwide and is characterized by cognitive decline and behavioral changes. People living with Alzheimer’s disease often find it hard to complete routine tasks. However, there are limited objective assessments that aim to quantify the difficulty of certain tasks for AD patients compared to non-AD people. In this study, we propose to use speech emotion recognition (SER), especially the frustration level as a potential biomarker for quantifying the difficulty patients experience when describing a picture. We build an SER model using data from the IEMOCAP dataset and apply the model to the DementiaBank data to detect the AD/non-AD group difference and perform longitudinal analysis to track the AD disease progression. Our results show that the frustration level detected from the SER model can possibly be used as a cost-effective tool for objective tracking of AD progression in addition to the Mini-Mental State Examination (MMSE) score.

Keywords: Alzheimer’s disease, Speech Emotion Recognition, longitudinal biomarker, machine learning.

Downloads: 162
606 Spectral Analysis of Speech: A New Technique

Authors: Neeta Awasthy, J.P.Saini, D.S.Chauhan

Abstract:

ICA, which is generally used for the blind source separation problem, has been tested for feature extraction in a speech recognition system to replace the phoneme-based approach of MFCC. Applying the generated cepstral coefficients to ICA as preprocessing yields a new signal processing approach. This gives much better results than MFCC and ICA separately, both for word and speaker recognition. The mixing matrix A is different before and after MFCC, as expected, since Mel is a nonlinear scale. However, cepstral coefficients generated from Linear Predictive Coefficients, being independent, prove to be the right candidate for ICA. Matlab is the tool used for all comparisons. The database used consists of samples from ISOLET.

Keywords: Cepstral Coefficient, Distance measures, Independent Component Analysis, Linear Predictive Coefficients.

Downloads: 1911
605 Multi Switched Split Vector Quantization of Narrowband Speech Signals

Authors: M. Satya Sai Ram, P. Siddaiah, M. Madhavi Latha

Abstract:

Vector quantization is a powerful tool for speech coding applications. This paper deals with LPC coding of speech signals using a new technique called Multi Switched Split Vector Quantization (MSSVQ), which is a hybrid of the multi-stage, switched, and split vector quantization techniques. The spectral distortion performance, computational complexity, and memory requirements of MSSVQ are compared to those of split vector quantization (SVQ), multi-stage vector quantization (MSVQ) and switched split vector quantization (SSVQ). The results show that MSSVQ has better spectral distortion performance, lower computational complexity and lower memory requirements than all of the above-mentioned product code vector quantization techniques. Computational complexity is measured in floating point operations (flops), and memory requirements are measured in floats.
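The "split" building block on its own can be sketched as follows: the input vector is cut into sub-vectors and each part is quantized against its own small codebook. This shows plain split vector quantization only, not the multi-stage or switched parts of MSSVQ; the codebook values and function names are hypothetical.

```python
def nearest(codebook, vec):
    """Index of the codeword with the smallest squared Euclidean distance to vec."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)),
    )


def split_vq_encode(vector, split_sizes, codebooks):
    """Quantize each sub-vector with its own codebook; return one index per split."""
    indices, start = [], 0
    for size, codebook in zip(split_sizes, codebooks):
        indices.append(nearest(codebook, vector[start:start + size]))
        start += size
    return indices


def split_vq_decode(indices, codebooks):
    """Rebuild the vector by concatenating the selected codewords."""
    decoded = []
    for index, codebook in zip(indices, codebooks):
        decoded.extend(codebook[index])
    return decoded


# Hypothetical toy codebooks for a 4-dimensional vector split as 2 + 2.
CODEBOOKS = [
    [[0.3, 0.5], [0.6, 0.9]],
    [[1.2, 2.0], [1.6, 2.4]],
]
```

Splitting keeps each search small: two codebooks of K entries each cost 2K distance evaluations, whereas a single codebook covering the same K × K combinations would cost K × K, which is the kind of complexity and memory trade-off the abstract compares.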

Keywords: Linear predictive Coding, Multi stage vector quantization, Switched Split vector quantization, Split vector quantization, Line Spectral Frequencies (LSF).

Downloads: 1619
604 A Visual Educational Modeling Language to Help Teachers in Learning Scenario Design

Authors: A. Retbi, M. Khalidi Idrissi, S. Bennani

Abstract:

The success of an e-learning system is highly dependent on the quality of its educational content and how effective, complete, and simple the design tool can be for teachers. Educational modeling languages (EMLs) are proposed as design languages intended for teachers to model diverse teaching-learning experiences, independently of the pedagogical approach and in different contexts. However, most existing EMLs are criticized for being too abstract and too complex to be understood and manipulated by teachers. In this paper, we present a visual EML that simplifies the process of designing learning scenarios for teachers with no programming background. Based on the conceptual framework of activity theory, our resulting visual EML focuses on using domain-specific modeling techniques to provide a pedagogical level of abstraction in the design process.

Keywords: Educational modeling language, Domain Specific Modeling, authoring systems, learning scenario.

Downloads: 2274
603 Formant Tracking Linear Prediction Model using HMMs for Noisy Speech Processing

Authors: Zaineb Ben Messaoud, Dorra Gargouri, Saida Zribi, Ahmed Ben Hamida

Abstract:

This paper presents a formant-tracking linear prediction (FTLP) model for speech processing in noise. The main focus of this work is the detection of formant trajectories based on Hidden Markov Models (HMM), for improved formant estimation in noise. The approach proposed in this paper provides a systematic framework for modelling and utilizing a time sequence of peaks which satisfies continuity constraints on parameters; the within peaks are modelled by the LP parameters. The formant tracking LP model estimation is composed of three stages: (1) a pre-cleaning multi-band spectral subtraction stage to reduce the effect of residual noise on formants; (2) an estimation stage where an initial estimate of the LP model of speech for each frame is obtained; (3) a formant classification stage using probability models of formants and Viterbi decoders. The evaluation results for the estimation of the formant tracking LP model, tested in a Gaussian white noise background, demonstrate that the proposed combination of the initial noise reduction stage with formant tracking and LPC variable order analysis results in a significant reduction in errors and distortions. The performance was evaluated with noisy natural vowels extracted from international French and English vocabulary speech signals at an SNR value of 10 dB. In each case, the estimated formants are compared to reference formants.

Keywords: Formants Estimation, HMM, Multi Band Spectral Subtraction, Variable order LPC coding, White Gaussian Noise.

Downloads: 1927
602 OILU Tag: A Projective Invariant Fiducial System

Authors: Youssef Chahir, Messaoud Mostefai, Salah Khodja

Abstract:

This paper presents the development of a 2D visual marker, derived from a recent patented work in the field of numbering systems. The proposed fiducial uses a group of projective invariant straight-line patterns, easily detectable and remotely recognizable. Based on an efficient data coding scheme, the developed marker enables producing a large panel of unique real-time identifiers with highly distinguishable patterns. The proposed marker incorporates decimal and binary information simultaneously, making it readable by both humans and machines. This important feature opens up new opportunities for the development of efficient visual human-machine communication and monitoring protocols. Extensive experimental tests validate the robustness of the marker against acquisition and geometric distortions.

Keywords: visual marker, projective invariants, distance map, level set

Downloads: 460
601 The Analysis of Deceptive and Truthful Speech: A Computational Linguistic Based Method

Authors: Seham El Kareh, Miramar Etman

Abstract:

Recently, detecting liars and extracting features which distinguish them from truth-tellers have been the focus of a wide range of disciplines. To the authors’ best knowledge, most of the work has been done on facial expressions and body gestures, but only a few works have addressed the language used by both liars and truth-tellers. This paper sheds light on four axes. The first axis deals with building an audio corpus of deceptive and truthful speech for Egyptian Arabic speakers. The second axis focuses on examining the human perception of lies and demonstrating the need for computational linguistic-based methods to extract features which characterize truthful and deceptive speech. The third axis is concerned with building a linguistic analysis program that can extract from the corpus the inter- and intra-linguistic cues for deceptive and truthful speech. The program built here is based on selected categories from the Linguistic Inquiry and Word Count program. Our results demonstrated that, on the one hand, Egyptian Arabic speakers preferred to use first-person pronouns and the present tense rather than the past tense when lying, and their lies lacked second-person pronouns; on the other hand, when telling the truth, they preferred to use verbs related to motion and nouns related to time. The results also showed that more data are needed to establish the significance of words related to emotions and numbers.

Keywords: Egyptian Arabic corpus, computational analysis, deceptive features, forensic linguistics, human perception, truthful features.

Downloads: 1136
600 A Robust Visual Tracking Algorithm with Low-Rank Region Covariance

Authors: Songtao Wu, Yuesheng Zhu, Ziqiang Sun

Abstract:

The region covariance (RC) descriptor is an effective and efficient feature for visual tracking. Current RC-based tracking algorithms use the whole RC matrix to track the target in video directly. However, these whole-RC-based algorithms have some issues. If some features are contaminated, the whole RC becomes unreliable, which results in lost object tracking. In addition, if some features are very discriminative against the background, other features are still processed, which reduces efficiency. In this paper a new robust tracking method is proposed, in which the whole RC matrix is decomposed into several low-rank matrices. Those matrices are dynamically chosen and processed so as to achieve a good tradeoff between discriminability and complexity. Experimental results have shown that, compared to other RC-based methods, our method is more robust to complex environment changes, especially when occlusion happens or when the background is similar to the target.
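For orientation, the whole-RC baseline that the abstract starts from can be sketched as the sample covariance matrix of the per-pixel feature vectors inside a region. The sketch below covers only that baseline, not the paper's low-rank decomposition; the choice of features is left to the caller, and the function name and toy values are hypothetical.

```python
def region_covariance(features):
    """Sample covariance matrix of per-pixel feature vectors in a region.

    `features` is a list with one feature vector per pixel, for example
    [x, y, intensity, |Ix|, |Iy|]; which features to use is up to the caller.
    """
    n = len(features)
    if n < 2:
        raise ValueError("need at least two pixels to estimate a covariance")
    d = len(features[0])
    mean = [sum(f[k] for f in features) / n for k in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for f in features:
        for i in range(d):
            for j in range(d):
                cov[i][j] += (f[i] - mean[i]) * (f[j] - mean[j])
    # Unbiased estimate: divide by n - 1.
    return [[value / (n - 1) for value in row] for row in cov]


# Hypothetical toy region: three pixels with two features each.
TOY_FEATURES = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
```

Because every entry of the matrix mixes two features, a single corrupted feature affects a whole row and column, which is consistent with the abstract's remark that contaminated features make the whole RC unreliable.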

Keywords: Visual tracking, region covariance descriptor, low-rank region covariance

Downloads: 1533
599 A Neural Model of Object Naming

Authors: Alessio Plebe

Abstract:

One astonishing capability of humans is to recognize thousands of different objects visually, and to learn the semantic association between those objects and the words referring to them. This work is an attempt to build a computational model of such capacity, simulating the process by which infants learn how to recognize objects and words through exposure to visual stimuli and vocal sounds. One of the main facts shaping the brain of a newborn is that lights and colors come from entities of the world. Gradually the visual system learns which light sensations belong to the same entities, despite large changes in appearance. This experience is common to humans and several other mammals, such as non-human primates. But only humans can recognize a huge variety of objects, most of them manufactured by themselves, and make use of sounds to identify and categorize them. The aim of this model is to reproduce these processes in a biologically plausible way, by reconstructing the essential hierarchy of cortical circuits on the visual and auditory neural paths.

Keywords: Auditory cortex, object recognition, self-organizing maps

Downloads: 1333
598 Operation Planning of Concrete Box Girder Bridge by 4D CAD Visualization Techniques

Authors: Mohammad Rohani, Gholamali Shafabakhsh, Abdolhosein Haddad, Ehsan Asnaashari

Abstract:

Visual simulation has emerged as a key planning tool in the built environment because it enables architects, engineers and project managers to visualize the evolution of the construction process before the project actually commences. This provides an efficient technology for reducing time and cost through planning and controlling resources, machines and materials. With the development of infrastructure projects and massive civil constructions such as bridges, urban tunnels and highways, as well as the sensitivity of their construction operations, it is necessary to apply proper planning methods. Implementing visual techniques in the management of construction projects can provide a foundation for projects with massive activities and duplicate items. The purpose of this paper is therefore to develop visual simulation management techniques for infrastructure projects such as highway bridges using Four-Dimensional Computer-Aided Design models. This project simulates an operational assembly line for box-girder concrete bridges, which makes it possible to optimize the sequence and interaction of project activities and to minimize unintended conflicts prior to project start. In this paper, after introducing the various planning methods based on building information models and concrete bridges in highways, an executive case study is demonstrated and a visual technique (4D CAD) is applied to the case. In the final step, user feedback on interacting with this system is evaluated according to six criteria.

Keywords: 4D application area, Box-Girder concrete bridges, CAD model, visual planning.

Downloads 1521
597 Part of Speech Tagging Using Statistical Approach for Nepali Text

Authors: Archit Yajnik

Abstract:

Part of Speech (POS) tagging has always been a challenging task in Natural Language Processing. This article presents POS tagging for Nepali text using a Hidden Markov Model (HMM) and the Viterbi algorithm. Training and testing data sets are randomly separated from the annotated Nepali text corpus. Both methods are employed on the data sets. The Viterbi algorithm is found to be computationally faster and more accurate than HMM. An accuracy of 95.43% is achieved using the Viterbi algorithm. The error analysis discusses in detail where the mismatches took place.

Keywords: Hidden Markov model, Viterbi algorithm, POS tagging, natural language processing.
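
As an illustration of the decoding step named in the abstract above, the following is a minimal sketch of Viterbi decoding over a first-order HMM. It is not the authors' implementation: the two-tag set, the English placeholder words, and all probabilities are hypothetical values chosen only to make the example runnable, and the `floor` parameter is an assumed smoothing shortcut for unseen events.

```python
import math

def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-12):
    """Return the most probable tag sequence for `words` under a first-order HMM."""
    if not words:
        return []

    def lp(p):
        # Log-probability with a small floor so unseen events do not give log(0).
        return math.log(max(p, floor))

    # best[i][t]: log-probability of the best tag path ending in tag t at word i.
    best = [{t: lp(start_p.get(t, 0)) + lp(emit_p[t].get(words[0], 0)) for t in tags}]
    back = [{}]
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            # Pick the previous tag that maximizes the path score into t.
            prev, score = max(
                ((p, best[i - 1][p] + lp(trans_p[p].get(t, 0))) for p in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = score + lp(emit_p[t].get(words[i], 0))
            back[i][t] = prev

    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return path

# Hypothetical toy model (not taken from the paper).
TAGS = ["N", "V"]
START_P = {"N": 0.7, "V": 0.3}
TRANS_P = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.8, "V": 0.2}}
EMIT_P = {
    "N": {"dogs": 0.5, "cats": 0.4, "chase": 0.1},
    "V": {"chase": 0.8, "dogs": 0.1, "cats": 0.1},
}

print(viterbi(["dogs", "chase", "cats"], TAGS, START_P, TRANS_P, EMIT_P))
```

Working in log space avoids numerical underflow on longer sentences; in practice the probabilities would be estimated from the annotated training corpus.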

Downloads 1652
596 Visual Odometry and Trajectory Reconstruction for UAVs

Authors: Sandro Bartolini, Alessandro Mecocci, Alessio Medaglini

Abstract:

The growing popularity of systems based on Unmanned Aerial Vehicles (UAVs) is highlighting their vulnerability, particularly in relation to the positioning system used. Typically, UAV architectures use civilian GPS, which is exposed to a number of different attacks, such as jamming or spoofing. This is why it is important to develop alternative methodologies to accurately estimate the actual UAV position without relying on GPS measurements only. In this paper we propose a position estimation method for UAVs based on monocular visual odometry. We have developed a flight control system capable of keeping track of the entire trajectory travelled, with a reduced dependency on the availability of the GPS signal. Moreover, the simplicity of the developed solution makes it applicable to a wide range of commercial drones. The final goal is to allow for safer flights in all conditions, even under cyber-attacks trying to deceive the drone.

Keywords: Visual odometry, autonomous UAV, position measurement, autonomous outdoor flight.
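
The trajectory-tracking idea in the abstract above can be sketched as dead reckoning: frame-to-frame motion estimates are chained into a global path. The sketch below is a simplified planar (2D) illustration, not the authors' flight control system. The function name and the input format are hypothetical, and it assumes the motion increments are already metrically scaled, since monocular visual odometry on its own recovers translation only up to an unknown scale.

```python
import math

def integrate_trajectory(relative_motions, start=(0.0, 0.0, 0.0)):
    """Chain frame-to-frame motions into world-frame poses (x, y, yaw).

    Each motion is (dx, dy, dyaw): a translation expressed in the vehicle
    frame of the previous pose, followed by a heading change in radians.
    """
    x, y, yaw = start
    trajectory = [(x, y, yaw)]
    for dx, dy, dyaw in relative_motions:
        # Rotate the body-frame translation into the world frame, then add it.
        x += dx * math.cos(yaw) - dy * math.sin(yaw)
        y += dx * math.sin(yaw) + dy * math.cos(yaw)
        yaw += dyaw
        trajectory.append((x, y, yaw))
    return trajectory

# Hypothetical increments: move 1 m forward, turn left 90 degrees, move 1 m forward.
path = integrate_trajectory([(1.0, 0.0, math.pi / 2), (1.0, 0.0, 0.0)])
for pose in path:
    print(pose)
```

Because each pose is built on the previous one, estimation errors accumulate over time, which is one reason such systems are usually described as reducing rather than removing the dependency on GPS.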

Downloads 519
595 Evaluation of Hand Grip Strength and EMG Signal on Visual Reaction

Authors: Sung-Wook Shin, Sung-Taek Chung

Abstract:

Hand grip strength has been utilized as an indicator to evaluate the motor ability of the hands, which are responsible for performing multiple body functions. It is, however, difficult to evaluate factors other than hand muscular strength using hand grip strength alone. In this study, we analyzed the motor ability of the hands using EMG and hand grip strength simultaneously in order to evaluate concentration, muscular strength reaction time, instantaneous muscular strength change, and agility in response to a visual stimulus. The average muscular strength reaction times (± standard deviation) of the EMG signal and of hand grip strength were found to be 209.6 ± 56.2 ms and 354.3 ± 54.6 ms, respectively. In addition, the onset time, which represents the acceleration time to reach 90% of maximum hand grip strength, was 382.9 ± 129.9 ms.

Keywords: Hand grip strength, EMG, visual reaction, endurance.
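
The onset-time measure described in the abstract above (time to reach 90% of maximum grip strength) can be illustrated with a short sketch. This is not the authors' analysis code: the function name, the sampling rate, and the sample values are hypothetical, and the sketch assumes that the first sample coincides with the start of the measurement window.

```python
def time_to_fraction_of_peak(samples, sample_rate_hz, fraction=0.9):
    """Return the time in ms at which the signal first reaches `fraction` of its peak.

    Time is measured from the first sample, which is assumed to mark the
    start of the measurement window.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    peak = max(samples)
    if peak <= 0:
        raise ValueError("signal has no positive peak")
    threshold = fraction * peak
    for index, value in enumerate(samples):
        if value >= threshold:
            return 1000.0 * index / sample_rate_hz

# Hypothetical grip-force trace sampled at 100 Hz (arbitrary force units).
grip_force = [0, 1, 2, 5, 8, 9, 10, 10]
print(time_to_fraction_of_peak(grip_force, sample_rate_hz=100))
```

With real recordings the trace would normally be smoothed first, since a single noisy sample can cross the threshold early.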

Downloads 2944
594 Daylightophil Approach towards High-Performance Architecture for Hybrid-Optimization of Visual Comfort and Daylight Factor in BSk

Authors: Mohammadjavad Mahdavinejad, Hadi Yazdi

Abstract:

The greatest influence we receive from the world is shaped through visual form; thus light is an inseparable element of human life. The use of daylight in visual perception and environment readability is an important issue for users. With regard to the hazards of greenhouse gas emissions from fossil fuels, and in line with efforts to reduce energy consumption, the correct use of daylight results in lower levels of energy consumed by artificial lighting, heating and cooling systems. Windows are usually the starting points for analysis and simulations to achieve visual comfort and energy optimization; therefore, attention should be paid to the orientation of buildings to minimize electrical energy use and maximize the use of daylight. In this paper, the effect of the orientation of an 18 m² (3 m × 6 m) room with a height of 3 m in the city of Tehran is investigated using the Design Builder software, considering the design constraints. In these simulations, the dimensions of the building have been changed with one degree and the window is located on the smaller face (3 m × 3 m) of the building with an 80% ratio. The results indicate that the orientation of the building strongly affects energy efficiency in meeting high-performance architecture and planning goals and objectives.

Keywords: Daylight, window, orientation, energy consumption, design builder.

Downloads 1036
593 A Vehicular Visual Tracking System Incorporating Global Positioning System

Authors: Hsien-Chou Liao, Yu-Shiang Wang

Abstract:

Surveillance systems are widely used in traffic monitoring. The deployment of cameras is moving toward a ubiquitous camera (UbiCam) environment. In our previous study, a novel service called GPS-VT was first proposed by incorporating global positioning system (GPS) and visual tracking techniques for the UbiCam environment. The first prototype is called GODTA (GPS-based Moving Object Detection and Tracking Approach). A moving person carrying a GPS-enabled mobile device can be tracked when he enters the field of view (FOV) of a camera according to his real-time GPS coordinates. In this paper, the GPS-VT service is applied to the tracking of vehicles. A vehicle moves much faster than a person, so the time it takes to pass through the FOV is much shorter. In addition, the GPS coordinate is updated once per second, which is asynchronous with the frame rate of the real-time image. This asynchrony is worsened by the network transmission delay. These factors are the main challenges in fulfilling the GPS-VT service on a vehicle. To overcome the influence of these factors, a back-propagation neural network (BPNN) is used to predict the possible lane before the vehicle enters the FOV of a camera. Then, a template matching technique is used for the visual tracking of a target vehicle. The experimental results show that the target vehicle can be located and tracked successfully. The successful location rate of the implemented prototype is higher than that of the previous GODTA.

Keywords: visual surveillance, visual tracking, global positioning system, intelligent transportation system
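
The abstract above names template matching as the visual tracking step without specifying the similarity measure. The following is a minimal sketch that assumes a sum-of-squared-differences (SSD) score over grayscale pixel values; the function name and the tiny image are hypothetical, and the exhaustive search shown here is for illustration only.

```python
def match_template(image, template):
    """Find where `template` best matches `image` by sum of squared differences.

    Both arguments are 2D lists of grayscale values. Returns ((row, col), ssd)
    for the top-left corner of the best-matching window; lower SSD is better.
    """
    image_h, image_w = len(image), len(image[0])
    tmpl_h, tmpl_w = len(template), len(template[0])
    best_pos, best_ssd = None, float("inf")
    # Slide the template over every position where it fits entirely.
    for r in range(image_h - tmpl_h + 1):
        for c in range(image_w - tmpl_w + 1):
            ssd = sum(
                (image[r + i][c + j] - template[i][j]) ** 2
                for i in range(tmpl_h)
                for j in range(tmpl_w)
            )
            if ssd < best_ssd:
                best_pos, best_ssd = (r, c), ssd
    return best_pos, best_ssd

# Hypothetical 4x5 frame containing the 2x2 target patch at row 1, column 2.
frame = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 8, 0],
    [0, 0, 7, 6, 0],
    [0, 0, 0, 0, 0],
]
target = [[9, 8], [7, 6]]
print(match_template(frame, target))
```

Restricting the search to the lane predicted before the vehicle enters the FOV would shrink the window range that has to be scanned in each frame.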

Downloads 1879