Search results for: multimodal text
612 Development of Multimodal e-Slide Presentation to Support Self-Learning for the Visually Impaired
Authors: Rustam Asnawi, Wan Fatimah Wan Ahmad
Abstract:
Currently electronic slide (e-slide) is one of the most common styles in educational presentation. Unfortunately, the utilization of e-slide for the visually impaired is uncommon since they are unable to see the content of such e-slides which are usually composed of text, images and animation. This paper proposes a model for presenting e-slide in multimodal presentation i.e. using conventional slide concurrent with voicing, in both languages Malay and English. At the design level, live multimedia presentation concept is used, while at the implementation level several components are used. The text content of each slide is extracted using COM component, Microsoft Speech API for voicing the text in English language and the text in Malay language is voiced using dictionary approach. To support the accessibility, an auditory user interface is provided as an additional feature. A prototype of such model named as VSlide has been developed and introduced.
Keywords: presentation, self-learning, slide, visually impaired
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1569611 The Optimization of Decision Rules in Multimodal Decision-Level Fusion Scheme
Authors: Andrey V. Timofeev, Dmitry V. Egorov
Abstract:
This paper introduces an original method of parametric optimization of the structure for multimodal decisionlevel fusion scheme which combines the results of the partial solution of the classification task obtained from assembly of the mono-modal classifiers. As a result, a multimodal fusion classifier which has the minimum value of the total error rate has been obtained.
Keywords: Сlassification accuracy, fusion solution, total error rate.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1975610 eMedI: Web-Based E-Training for Multimodal Breast Imaging
Authors: Ioannis Pratikakis, Anna Karahaliou, Katerina Vassiou, Vassilis Virvilis, Dimitrios Kosmopoulos, Stavros Perantonis
Abstract:
In this paper, a Web-based e-Training platform that is dedicated to multimodal breast imaging is presented. The assets of this platform are summarised in (i) the efficient representation of the curriculum flow that will permit efficient training; (ii) efficient tagging of multimodal content appropriate for the completion of realistic cases and (iii) ubiquitous accessibility and platform independence via a web-based approach.
Keywords: Breast imaging, e-Training, web-based learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1686609 OPEN_EmoRec_II- A Multimodal Corpus of Human-Computer Interaction
Authors: Stefanie Rukavina, Sascha Gruss, Steffen Walter, Holger Hoffmann, Harald C. Traue
Abstract:
OPEN_EmoRec_II is an open multimodal corpus with experimentally induced emotions. In the first half of the experiment, emotions were induced with standardized picture material and in the second half during a human-computer interaction (HCI), realized with a wizard-of-oz design. The induced emotions are based on the dimensional theory of emotions (valence, arousal and dominance). These emotional sequences - recorded with multimodal data (facial reactions, speech, audio and physiological reactions) during a naturalistic-like HCI-environment one can improve classification methods on a multimodal level. This database is the result of an HCI-experiment, for which 30 subjects in total agreed to a publication of their data including the video material for research purposes*. The now available open corpus contains sensory signal of: video, audio, physiology (SCL, respiration, BVP, EMG Corrugator supercilii, EMG Zygomaticus Major) and facial reactions annotations.Keywords: Open multimodal emotion corpus, annotated labels.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1820608 OPEN_EmoRec_II- A Multimodal Corpus of Human-Computer Interaction
Authors: Stefanie Rukavina, Sascha Gruss, Steffen Walter, Holger Hoffmann, Harald C. Traue
Abstract:
OPEN_EmoRec_II is an open multimodal corpus with experimentally induced emotions. In the first half of the experiment, emotions were induced with standardized picture material and in the second half during a human-computer interaction (HCI), realized with a wizard-of-oz design. The induced emotions are based on the dimensional theory of emotions (valence, arousal and dominance). These emotional sequences - recorded with multimodal data (facial reactions, speech, audio and physiological reactions) during a naturalistic-like HCI-environment one can improve classification methods on a multimodal level. This database is the result of an HCI-experiment, for which 30 subjects in total agreed to a publication of their data including the video material for research purposes*. The now available open corpus contains sensory signal of: video, audio, physiology (SCL, respiration, BVP, EMG Corrugator supercilii, EMG Zygomaticus Major) and facial reactions annotations.Keywords: Open multimodal emotion corpus, annotated labels.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 389607 New Approach for Constructing a Secure Biometric Database
Authors: A. Kebbeb, M. Mostefai, F. Benmerzoug, Y. Chahir
Abstract:
The multimodal biometric identification is the combination of several biometric systems; the challenge of this combination is to reduce some limitations of systems based on a single modality while significantly improving performance. In this paper, we propose a new approach to the construction and the protection of a multimodal biometric database dedicated to an identification system. We use a topological watermarking to hide the relation between face image and the registered descriptors extracted from other modalities of the same person for more secure user identification.
Keywords: Biometric databases, Multimodal biometrics, security authentication, Digital watermarking.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2090606 Continuous Text Translation Using Text Modeling in the Thetos System
Authors: Nina Suszczanska, Przemyslaw Szmal, Slawomir Kulikow
Abstract:
In the paper a method of modeling text for Polish is discussed. The method is aimed at transforming continuous input text into a text consisting of sentences in so called canonical form, whose characteristic is, among others, a complete structure as well as no anaphora or ellipses. The transformation is lossless as to the content of text being transformed. The modeling method has been worked out for the needs of the Thetos system, which translates Polish written texts into the Polish sign language. We believe that the method can be also used in various applications that deal with the natural language, e.g. in a text summary generator for Polish.Keywords: anaphora, machine translation, NLP, sign language, text syntax.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1656605 Analyzing Political Cartoons in Arabic-Language Media after Trump's Jerusalem Move: A Multimodal Discourse Perspective
Authors: Inas Hussein
Abstract:
Communication in the modern world is increasingly becoming multimodal due to globalization and the digital space we live in which have remarkably affected how people communicate. Accordingly, Multimodal Discourse Analysis (MDA) is an emerging paradigm in discourse studies with the underlying assumption that other semiotic resources such as images, colours, scientific symbolism, gestures, actions, music and sound, etc. combine with language in order to communicate meaning. One of the effective multimodal media that combines both verbal and non-verbal elements to create meaning is political cartoons. Furthermore, since political and social issues are mirrored in political cartoons, these are regarded as potential objects of discourse analysis since they not only reflect the thoughts of the public but they also have the power to influence them. The aim of this paper is to analyze some selected cartoons on the recognition of Jerusalem as Israel's capital by the American President, Donald Trump, adopting a multimodal approach. More specifically, the present research examines how the various semiotic tools and resources utilized by the cartoonists function in projecting the intended meaning. Ten political cartoons, among a surge of editorial cartoons highlighted by the Anti-Defamation League (ADL) - an international Jewish non-governmental organization based in the United States - as publications in different Arabic-language newspapers in Egypt, Saudi Arabia, UAE, Oman, Iran and UK, were purposively selected for semiotic analysis. These editorial cartoons, all published during 6th–18th December 2017, invariably suggest one theme: Jewish and Israeli domination of the United States. The data were analyzed using the framework of Visual Social Semiotics. In accordance with this methodological framework, the selected visual compositions were analyzed in terms of three aspects of meaning: representational, interactive and compositional. In analyzing the selected cartoons, an interpretative approach is being adopted. This approach prioritizes depth to breadth and enables insightful analyses of the chosen cartoons. The findings of the study reveal that semiotic resources are key elements of political cartoons due to the inherent political communication they convey. It is proved that adequate interpretation of the three aspects of meaning is a prerequisite for understanding the intended meaning of political cartoons. It is recommended that further research should be conducted to provide more insightful analyses of political cartoons from a multimodal perspective.
Keywords: Multimodal discourse analysis, multimodal text, political cartoons, visual modality.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1550604 A Brain Inspired Approach for Multi-View Patterns Identification
Authors: Yee Ling Boo, Damminda Alahakoon
Abstract:
Biologically human brain processes information in both unimodal and multimodal approaches. In fact, information is progressively abstracted and seamlessly fused. Subsequently, the fusion of multimodal inputs allows a holistic understanding of a problem. The proliferation of technology has exponentially produced various sources of data, which could be likened to being the state of multimodality in human brain. Therefore, this is an inspiration to develop a methodology for exploring multimodal data and further identifying multi-view patterns. Specifically, we propose a brain inspired conceptual model that allows exploration and identification of patterns at different levels of granularity, different types of hierarchies and different types of modalities. A structurally adaptive neural network is deployed to implement the proposed model. Furthermore, the acquisition of multi-view patterns with the proposed model is demonstrated and discussed with some experimental results.
Keywords: Multimodal, Granularity, Hierarchical Clustering, Growing Self Organising Maps, Data Mining
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1544603 A Grid-based Neural Network Framework for Multimodal Biometrics
Authors: Sitalakshmi Venkataraman
Abstract:
Recent scientific investigations indicate that multimodal biometrics overcome the technical limitations of unimodal biometrics, making them ideally suited for everyday life applications that require a reliable authentication system. However, for a successful adoption of multimodal biometrics, such systems would require large heterogeneous datasets with complex multimodal fusion and privacy schemes spanning various distributed environments. From experimental investigations of current multimodal systems, this paper reports the various issues related to speed, error-recovery and privacy that impede the diffusion of such systems in real-life. This calls for a robust mechanism that caters to the desired real-time performance, robust fusion schemes, interoperability and adaptable privacy policies. The main objective of this paper is to present a framework that addresses the abovementioned issues by leveraging on the heterogeneous resource sharing capacities of Grid services and the efficient machine learning capabilities of artificial neural networks (ANN). Hence, this paper proposes a Grid-based neural network framework for adopting multimodal biometrics with the view of overcoming the barriers of performance, privacy and risk issues that are associated with shared heterogeneous multimodal data centres. The framework combines the concept of Grid services for reliable brokering and privacy policy management of shared biometric resources along with a momentum back propagation ANN (MBPANN) model of machine learning for efficient multimodal fusion and authentication schemes. Real-life applications would be able to adopt the proposed framework to cater to the varying business requirements and user privacies for a successful diffusion of multimodal biometrics in various day-to-day transactions.Keywords: Back Propagation, Grid Services, MultimodalBiometrics, Neural Networks.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1917602 Soccer Video Edition Using a Multimodal Annotation
Authors: Fendri Emna, Ben-Abdallah Hanêne, Ben-Hamadou Abdelmajid
Abstract:
In this paper, we present an approach for soccer video edition using a multimodal annotation. We propose to associate with each video sequence of a soccer match a textual document to be used for further exploitation like search, browsing and abstract edition. The textual document contains video meta data, match meta data, and match data. This document, generated automatically while the video is analyzed, segmented and classified, can be enriched semi automatically according to the user type and/or a specialized recommendation system.Keywords: XML, Multimodal Annotation, recommendation system.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1439601 Multimodal Biometric Authentication Using Choquet Integral and Genetic Algorithm
Authors: Anouar Ben Khalifa, Sami Gazzah, Najoua Essoukri BenAmara
Abstract:
The Choquet integral is a tool for the information fusion that is very effective in the case where fuzzy measures associated with it are well chosen. In this paper, we propose a new approach for calculating fuzzy measures associated with the Choquet integral in a context of data fusion in multimodal biometrics. The proposed approach is based on genetic algorithms. It has been validated in two databases: the first base is relative to synthetic scores and the second one is biometrically relating to the face, fingerprint and palmprint. The results achieved attest the robustness of the proposed approach.
Keywords: Multimodal biometrics, data fusion, Choquet integral, fuzzy measures, genetic algorithm.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2516600 OCR/ICR Text Recognition Using ABBYY FineReader as an Example Text
Authors: A. R. Bagirzade, A. Sh. Najafova, S. M. Yessirkepova, E. S. Albert
Abstract:
This article describes a text recognition method based on Optical Character Recognition (OCR). The features of the OCR method were examined using the ABBYY FineReader program. It describes automatic text recognition in images. OCR is necessary because optical input devices can only transmit raster graphics as a result. Text recognition describes the task of recognizing letters shown as such, to identify and assign them an assigned numerical value in accordance with the usual text encoding (ASCII, Unicode). The peculiarity of this study conducted by the authors using the example of the ABBYY FineReader, was confirmed and shown in practice, the improvement of digital text recognition platforms developed by Electronic Publication.
Keywords: ABBYY FineReader system, algorithm symbol recognition, OCR/ICR techniques, recognition technologies.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 781599 On-Road Text Detection Platform for Driver Assistance Systems
Authors: Guezouli Larbi, Belkacem Soundes
Abstract:
The automation of the text detection process can help the human in his driving task. Its application can be very useful to help drivers to have more information about their environment by facilitating the reading of road signs such as directional signs, events, stores, etc. In this paper, a system consisting of two stages has been proposed. In the first one, we used pseudo-Zernike moments to pinpoint areas of the image that may contain text. The architecture of this part is based on three main steps, region of interest (ROI) detection, text localization, and non-text region filtering. Then, in the second step, we present a convolutional neural network architecture (On-Road Text Detection Network - ORTDN) which is considered as a classification phase. The results show that the proposed framework achieved ≈ 35 fps and an mAP of ≈ 90%, thus a low computational time with competitive accuracy.
Keywords: Text detection, CNN, PZM, deep learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 163598 Powerful Tool to Expand Business Intelligence: Text Mining
Authors: Li Gao, Elizabeth Chang, Song Han
Abstract:
With the extensive inclusion of document, especially text, in the business systems, data mining does not cover the full scope of Business Intelligence. Data mining cannot deliver its impact on extracting useful details from the large collection of unstructured and semi-structured written materials based on natural languages. The most pressing issue is to draw the potential business intelligence from text. In order to gain competitive advantages for the business, it is necessary to develop the new powerful tool, text mining, to expand the scope of business intelligence. In this paper, we will work out the strong points of text mining in extracting business intelligence from huge amount of textual information sources within business systems. We will apply text mining to each stage of Business Intelligence systems to prove that text mining is the powerful tool to expand the scope of BI. After reviewing basic definitions and some related technologies, we will discuss the relationship and the benefits of these to text mining. Some examples and applications of text mining will also be given. The motivation behind is to develop new approach to effective and efficient textual information analysis. Thus we can expand the scope of Business Intelligence using the powerful tool, text mining.Keywords: Business intelligence, document warehouse, text mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2660597 Optimal Classifying and Extracting Fuzzy Relationship from Query Using Text Mining Techniques
Authors: Faisal Alshuwaier, Ali Areshey
Abstract:
Text mining techniques are generally applied for classifying the text, finding fuzzy relations and structures in data sets. This research provides plenty text mining capabilities. One common application is text classification and event extraction, which encompass deducing specific knowledge concerning incidents referred to in texts. The main contribution of this paper is the clarification of a concept graph generation mechanism, which is based on a text classification and optimal fuzzy relationship extraction. Furthermore, the work presented in this paper explains the application of fuzzy relationship extraction and branch and bound (BB) method to simplify the texts.
Keywords: Extraction, Max-Prod, Fuzzy Relations, Text Mining, Memberships, Classification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2184596 A Web Text Mining Flexible Architecture
Authors: M. Castellano, G. Mastronardi, A. Aprile, G. Tarricone
Abstract:
Text Mining is an important step of Knowledge Discovery process. It is used to extract hidden information from notstructured o semi-structured data. This aspect is fundamental because much of the Web information is semi-structured due to the nested structure of HTML code, much of the Web information is linked, much of the Web information is redundant. Web Text Mining helps whole knowledge mining process to mining, extraction and integration of useful data, information and knowledge from Web page contents. In this paper, we present a Web Text Mining process able to discover knowledge in a distributed and heterogeneous multiorganization environment. The Web Text Mining process is based on flexible architecture and is implemented by four steps able to examine web content and to extract useful hidden information through mining techniques. Our Web Text Mining prototype starts from the recovery of Web job offers in which, through a Text Mining process, useful information for fast classification of the same are drawn out, these information are, essentially, job offer place and skills.Keywords: Web text mining, flexible architecture, knowledgediscovery.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2665595 Key Based Text Watermarking of E-Text Documents in an Object Based Environment Using Z-Axis for Watermark Embedding
Authors: Mussarat Abdullah, Fazal Wahab
Abstract:
Data hiding into text documents itself involves pretty complexities due to the nature of text documents. A robust text watermarking scheme targeting an object based environment is presented in this research. The heart of the proposed solution describes the concept of watermarking an object based text document where each and every text string is entertained as a separate object having its own set of properties. Taking advantage of the z-ordering of objects watermark is applied with the z-axis letting zero fidelity disturbances to the text. Watermark sequence of bits generated against user key is hashed with selected properties of given document, to determine the bit sequence to embed. Bits are embedded along z-axis and the document has no fidelity issues when printed, scanned or photocopied.Keywords: Digital Watermarking, Object Based Environment, Watermark, z-ordering.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1687594 Using Trip Planners in Developing Proper Transportation Behavior
Authors: Grzegorz Sierpiński, Ireneusz Celiński, Marcin Staniek
Abstract:
The article discusses multimodal mobility in contemporary societies as a main planning and organization issue in the functioning of administrative bodies, a problem which really exists in the space of contemporary cities in terms of shaping modern transport systems. The article presents classification of available resources and initiatives undertaken for developing multimodal mobility. Solutions can be divided into three groups of measures – physical measures in the form of changes of the transport network infrastructure, organizational ones (including transport policy) and information measures. The latter ones include in particular direct support for people travelling in the transport network by providing information about ways of using available means of transport. A special measure contributing to this end is a trip planner. The article compares several selected planners. It includes a short description of the Green Travelling Project, which aims at developing a planner supporting environmentally friendly solutions in terms of transport network operation. The article summarizes preliminary findings of the project.
Keywords: Mobility, modal split, multimodal trip, multimodal platforms, sustainable transport.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1890593 Emotions Triggered by Children’s Literature Images
Abstract:
The role of images/illustrations in communicating meanings and triggering emotions assumes an increasingly relevant role in contemporary texts, regardless of the age group for which they are intended or the nature of the texts that host them. It is no coincidence that children's books are full of illustrations and that the image/text ratio decreases as the age group grows. The vast majority of children's books can be considered as multimodal texts containing text and images/illustrations, interacting with each other, to provide the young reader with a broader and more creative understanding of the book's narrative. This interaction is very diverse, ranging from images/illustrations that are not essential for understanding the storytelling to those that contribute significantly to the meaning of the story. Usually, these books are also read by adults, namely by parents, educators, and teachers who act as mediators between the book and the children, explaining aspects that are or seem to be too complex for the child's context. It should be noted that there are books labeled as children's books, that are clearly intended for both children and adults. In this work, following a qualitative and interpretative methodology based on written productions, participant observation, and field notes, we will describe the perceptions of future teachers of the 1st cycle of basic education, attending a master’s degree at a Portuguese university, about the role of the image in literary and non-literary texts, namely in mathematical texts, and how these can constitute precious resources for emotional regulation and for the design of creative didactic situations. The analysis of the collected data allowed us to obtain evidence regarding the evolution of the participants' perception regarding the crucial role of images in children's literature, not only as an emotional regulator for young readers but also as a creative source for the design of meaningful didactical situations, crossing other scientific areas, other than the mother tongue, namely mathematics.
Keywords: Children’s literature, emotions, multimodal texts, soft skills.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 199592 Application of Smooth Ergodic Hidden Markov Model in Text to Speech Systems
Authors: Armin Ghayoori, Faramarz Hendessi, Asrar Sheikh
Abstract:
In developing a text-to-speech system, it is well known that the accuracy of information extracted from a text is crucial to produce high quality synthesized speech. In this paper, a new scheme for converting text into its equivalent phonetic spelling is introduced and developed. This method is applicable to many applications in text to speech converting systems and has many advantages over other methods. The proposed method can also complement the other methods with a purpose of improving their performance. The proposed method is a probabilistic model and is based on Smooth Ergodic Hidden Markov Model. This model can be considered as an extension to HMM. The proposed method is applied to Persian language and its accuracy in converting text to speech phonetics is evaluated using simulations.Keywords: Hidden Markov Models, text, synthesis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1549591 RB-Matcher: String Matching Technique
Authors: Rajender Singh Chillar, Barjesh Kochar
Abstract:
All Text processing systems allow their users to search a pattern of string from a given text. String matching is fundamental to database and text processing applications. Every text editor must contain a mechanism to search the current document for arbitrary strings. Spelling checkers scan an input text for words in the dictionary and reject any strings that do not match. We store our information in data bases so that later on we can retrieve the same and this retrieval can be done by using various string matching algorithms. This paper is describing a new string matching algorithm for various applications. A new algorithm has been designed with the help of Rabin Karp Matcher, to improve string matching process.Keywords: Algorithm, Complexity, Matching-patterns, Pattern, Rabin-Karp, String, text-processing.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1766590 An Semantic Algorithm for Text Categoritation
Authors: Xu Zhao
Abstract:
Text categorization techniques are widely used to many Information Retrieval (IR) applications. In this paper, we proposed a simple but efficient method that can automatically find the relationship between any pair of terms and documents, also an indexing matrix is established for text categorization. We call this method Indexing Matrix Categorization Machine (IMCM). Several experiments are conducted to show the efficiency and robust of our algorithm.
Keywords: Text categorization, Sub-space learning, Latent Semantic Space
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1467589 Using Data Fusion for Biometric Verification
Authors: Richard A. Wasniowski
Abstract:
A wide spectrum of systems require reliable personal recognition schemes to either confirm or determine the identity of an individual person. This paper considers multimodal biometric system and their applicability to access control, authentication and security applications. Strategies for feature extraction and sensor fusion are considered and contrasted. Issues related to performance assessment, deployment and standardization are discussed. Finally future directions of biometric systems development are discussed.Keywords: Multimodal, biometric, recognition, fusion.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1769588 Binarization of Text Region based on Fuzzy Clustering and Histogram Distribution in Signboards
Authors: Jonghyun Park, Toan Nguyen Dinh, Gueesang Lee
Abstract:
In this paper, we present a novel approach to accurately detect text regions including shop name in signboard images with complex background for mobile system applications. The proposed method is based on the combination of text detection using edge profile and region segmentation using fuzzy c-means method. In the first step, we perform an elaborate canny edge operator to extract all possible object edges. Then, edge profile analysis with vertical and horizontal direction is performed on these edge pixels to detect potential text region existing shop name in a signboard. The edge profile and geometrical characteristics of each object contour are carefully examined to construct candidate text regions and classify the main text region from background. Finally, the fuzzy c-means algorithm is performed to segment and detected binarize text region. Experimental results show that our proposed method is robust in text detection with respect to different character size and color and can provide reliable text binarization result.Keywords: Text detection, edge profile, signboard image, fuzzy clustering.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2226587 An Edge-based Text Region Extraction Algorithm for Indoor Mobile Robot Navigation
Authors: Jagath Samarabandu, Xiaoqing Liu
Abstract:
Using bottom-up image processing algorithms to predict human eye fixations and extract the relevant embedded information in images has been widely applied in the design of active machine vision systems. Scene text is an important feature to be extracted, especially in vision-based mobile robot navigation as many potential landmarks such as nameplates and information signs contain text. This paper proposes an edge-based text region extraction algorithm, which is robust with respect to font sizes, styles, color/intensity, orientations, and effects of illumination, reflections, shadows, perspective distortion, and the complexity of image backgrounds. Performance of the proposed algorithm is compared against a number of widely used text localization algorithms and the results show that this method can quickly and effectively localize and extract text regions from real scenes and can be used in mobile robot navigation under an indoor environment to detect text based landmarks.
Keywords: Landmarks, mobile robot navigation, scene text, text localization and extraction.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2924586 Text Retrieval Relevance Feedback Techniques for Bag of Words Model in CBIR
Authors: Nhu Van NGUYEN, Jean-Marc OGIER, Salvatore TABBONE, Alain BOUCHER
Abstract:
The state-of-the-art Bag of Words model in Content- Based Image Retrieval has been used for years but the relevance feedback strategies for this model are not fully investigated. Inspired from text retrieval, the Bag of Words model has the ability to use the wealth of knowledge and practices available in text retrieval. We study and experiment the relevance feedback model in text retrieval for adapting it to image retrieval. The experiments show that the techniques from text retrieval give good results for image retrieval and that further improvements is possible.Keywords: Relevance feedback, bag of words model, probabilistic model, vector space model, image retrieval
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2117585 A System to Adapt Techniques of Text Summarizing to Polish
Authors: Marcin Ciura, Damian Grund, S
Abstract:
This paper describes a system, in which various methods of text summarizing can be adapted to Polish. A structure of the system is presented. A modular construction of the system and access to the system via the Internet are signaled.
Keywords: Automatic summary generation, linguistic analysis, text generation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1548584 A Proposed Hybrid Approach for Feature Selection in Text Document Categorization
Authors: M. F. Zaiyadi, B. Baharudin
Abstract:
Text document categorization involves large amount of data or features. The high dimensionality of features is a troublesome and can affect the performance of the classification. Therefore, feature selection is strongly considered as one of the crucial part in text document categorization. Selecting the best features to represent documents can reduce the dimensionality of feature space hence increase the performance. There were many approaches has been implemented by various researchers to overcome this problem. This paper proposed a novel hybrid approach for feature selection in text document categorization based on Ant Colony Optimization (ACO) and Information Gain (IG). We also presented state-of-the-art algorithms by several other researchers.Keywords: Ant colony optimization, feature selection, information gain, text categorization, text representation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2069583 Graph-Based Text Similarity Measurement by Exploiting Wikipedia as Background Knowledge
Authors: Lu Zhang, Chunping Li, Jun Liu, Hui Wang
Abstract:
Text similarity measurement is a fundamental issue in many textual applications such as document clustering, classification, summarization and question answering. However, prevailing approaches based on Vector Space Model (VSM) more or less suffer from the limitation of Bag of Words (BOW), which ignores the semantic relationship among words. Enriching document representation with background knowledge from Wikipedia is proven to be an effective way to solve this problem, but most existing methods still cannot avoid similar flaws of BOW in a new vector space. In this paper, we propose a novel text similarity measurement which goes beyond VSM and can find semantic affinity between documents. Specifically, it is a unified graph model that exploits Wikipedia as background knowledge and synthesizes both document representation and similarity computation. The experimental results on two different datasets show that our approach significantly improves VSM-based methods in both text clustering and classification.Keywords: Text classification, Text clustering, Text similarity, Wikipedia
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2117