Search results for: Multimodal
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 54

Search results for: Multimodal

54 The Optimization of Decision Rules in Multimodal Decision-Level Fusion Scheme

Authors: Andrey V. Timofeev, Dmitry V. Egorov

Abstract:

This paper introduces an original method of parametric optimization of the structure for multimodal decisionlevel fusion scheme which combines the results of the partial solution of the classification task obtained from assembly of the mono-modal classifiers. As a result, a multimodal fusion classifier which has the minimum value of the total error rate has been obtained.

Keywords: Сlassification accuracy, fusion solution, total error rate.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1929
53 eMedI: Web-Based E-Training for Multimodal Breast Imaging

Authors: Ioannis Pratikakis, Anna Karahaliou, Katerina Vassiou, Vassilis Virvilis, Dimitrios Kosmopoulos, Stavros Perantonis

Abstract:

In this paper, a Web-based e-Training platform that is dedicated to multimodal breast imaging is presented. The assets of this platform are summarised in (i) the efficient representation of the curriculum flow that will permit efficient training; (ii) efficient tagging of multimodal content appropriate for the completion of realistic cases and (iii) ubiquitous accessibility and platform independence via a web-based approach.

Keywords: Breast imaging, e-Training, web-based learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1635
52 OPEN_EmoRec_II- A Multimodal Corpus of Human-Computer Interaction

Authors: Stefanie Rukavina, Sascha Gruss, Steffen Walter, Holger Hoffmann, Harald C. Traue

Abstract:

OPEN_EmoRec_II is an open multimodal corpus with experimentally induced emotions. In the first half of the experiment, emotions were induced with standardized picture material and in the second half during a human-computer interaction (HCI), realized with a wizard-of-oz design. The induced emotions are based on the dimensional theory of emotions (valence, arousal and dominance). These emotional sequences - recorded with multimodal data (facial reactions, speech, audio and physiological reactions) during a naturalistic-like HCI-environment one can improve classification methods on a multimodal level. This database is the result of an HCI-experiment, for which 30 subjects in total agreed to a publication of their data including the video material for research purposes*. The now available open corpus contains sensory signal of: video, audio, physiology (SCL, respiration, BVP, EMG Corrugator supercilii, EMG Zygomaticus Major) and facial reactions annotations.

Keywords: Open multimodal emotion corpus, annotated labels.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1772
51 OPEN_EmoRec_II- A Multimodal Corpus of Human-Computer Interaction

Authors: Stefanie Rukavina, Sascha Gruss, Steffen Walter, Holger Hoffmann, Harald C. Traue

Abstract:

OPEN_EmoRec_II is an open multimodal corpus with experimentally induced emotions. In the first half of the experiment, emotions were induced with standardized picture material and in the second half during a human-computer interaction (HCI), realized with a wizard-of-oz design. The induced emotions are based on the dimensional theory of emotions (valence, arousal and dominance). These emotional sequences - recorded with multimodal data (facial reactions, speech, audio and physiological reactions) during a naturalistic-like HCI-environment one can improve classification methods on a multimodal level. This database is the result of an HCI-experiment, for which 30 subjects in total agreed to a publication of their data including the video material for research purposes*. The now available open corpus contains sensory signal of: video, audio, physiology (SCL, respiration, BVP, EMG Corrugator supercilii, EMG Zygomaticus Major) and facial reactions annotations.

Keywords: Open multimodal emotion corpus, annotated labels.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 317
50 New Approach for Constructing a Secure Biometric Database

Authors: A. Kebbeb, M. Mostefai, F. Benmerzoug, Y. Chahir

Abstract:

The multimodal biometric identification is the combination of several biometric systems; the challenge of this combination is to reduce some limitations of systems based on a single modality while significantly improving performance. In this paper, we propose a new approach to the construction and the protection of a multimodal biometric database dedicated to an identification system. We use a topological watermarking to hide the relation between face image and the registered descriptors extracted from other modalities of the same person for more secure user identification.

Keywords: Biometric databases, Multimodal biometrics, security authentication, Digital watermarking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2022
49 A Brain Inspired Approach for Multi-View Patterns Identification

Authors: Yee Ling Boo, Damminda Alahakoon

Abstract:

Biologically human brain processes information in both unimodal and multimodal approaches. In fact, information is progressively abstracted and seamlessly fused. Subsequently, the fusion of multimodal inputs allows a holistic understanding of a problem. The proliferation of technology has exponentially produced various sources of data, which could be likened to being the state of multimodality in human brain. Therefore, this is an inspiration to develop a methodology for exploring multimodal data and further identifying multi-view patterns. Specifically, we propose a brain inspired conceptual model that allows exploration and identification of patterns at different levels of granularity, different types of hierarchies and different types of modalities. A structurally adaptive neural network is deployed to implement the proposed model. Furthermore, the acquisition of multi-view patterns with the proposed model is demonstrated and discussed with some experimental results.

Keywords: Multimodal, Granularity, Hierarchical Clustering, Growing Self Organising Maps, Data Mining

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1498
48 A Grid-based Neural Network Framework for Multimodal Biometrics

Authors: Sitalakshmi Venkataraman

Abstract:

Recent scientific investigations indicate that multimodal biometrics overcome the technical limitations of unimodal biometrics, making them ideally suited for everyday life applications that require a reliable authentication system. However, for a successful adoption of multimodal biometrics, such systems would require large heterogeneous datasets with complex multimodal fusion and privacy schemes spanning various distributed environments. From experimental investigations of current multimodal systems, this paper reports the various issues related to speed, error-recovery and privacy that impede the diffusion of such systems in real-life. This calls for a robust mechanism that caters to the desired real-time performance, robust fusion schemes, interoperability and adaptable privacy policies. The main objective of this paper is to present a framework that addresses the abovementioned issues by leveraging on the heterogeneous resource sharing capacities of Grid services and the efficient machine learning capabilities of artificial neural networks (ANN). Hence, this paper proposes a Grid-based neural network framework for adopting multimodal biometrics with the view of overcoming the barriers of performance, privacy and risk issues that are associated with shared heterogeneous multimodal data centres. The framework combines the concept of Grid services for reliable brokering and privacy policy management of shared biometric resources along with a momentum back propagation ANN (MBPANN) model of machine learning for efficient multimodal fusion and authentication schemes. Real-life applications would be able to adopt the proposed framework to cater to the varying business requirements and user privacies for a successful diffusion of multimodal biometrics in various day-to-day transactions.

Keywords: Back Propagation, Grid Services, MultimodalBiometrics, Neural Networks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1870
47 Soccer Video Edition Using a Multimodal Annotation

Authors: Fendri Emna, Ben-Abdallah Hanêne, Ben-Hamadou Abdelmajid

Abstract:

In this paper, we present an approach for soccer video edition using a multimodal annotation. We propose to associate with each video sequence of a soccer match a textual document to be used for further exploitation like search, browsing and abstract edition. The textual document contains video meta data, match meta data, and match data. This document, generated automatically while the video is analyzed, segmented and classified, can be enriched semi automatically according to the user type and/or a specialized recommendation system.

Keywords: XML, Multimodal Annotation, recommendation system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1369
46 Multimodal Biometric Authentication Using Choquet Integral and Genetic Algorithm

Authors: Anouar Ben Khalifa, Sami Gazzah, Najoua Essoukri BenAmara

Abstract:

The Choquet integral is a tool for the information fusion that is very effective in the case where fuzzy measures associated with it are well chosen. In this paper, we propose a new approach for calculating fuzzy measures associated with the Choquet integral in a context of data fusion in multimodal biometrics. The proposed approach is based on genetic algorithms. It has been validated in two databases: the first base is relative to synthetic scores and the second one is biometrically relating to the face, fingerprint and palmprint. The results achieved attest the robustness of the proposed approach.

Keywords: Multimodal biometrics, data fusion, Choquet integral, fuzzy measures, genetic algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2474
45 Using Trip Planners in Developing Proper Transportation Behavior

Authors: Grzegorz Sierpiński, Ireneusz Celiński, Marcin Staniek

Abstract:

The article discusses multimodal mobility in contemporary societies as a main planning and organization issue in the functioning of administrative bodies, a problem which really exists in the space of contemporary cities in terms of shaping modern transport systems. The article presents classification of available resources and initiatives undertaken for developing multimodal mobility. Solutions can be divided into three groups of measures – physical measures in the form of changes of the transport network infrastructure, organizational ones (including transport policy) and information measures. The latter ones include in particular direct support for people travelling in the transport network by providing information about ways of using available means of transport. A special measure contributing to this end is a trip planner. The article compares several selected planners. It includes a short description of the Green Travelling Project, which aims at developing a planner supporting environmentally friendly solutions in terms of transport network operation. The article summarizes preliminary findings of the project.

Keywords: Mobility, modal split, multimodal trip, multimodal platforms, sustainable transport.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1842
44 Using Data Fusion for Biometric Verification

Authors: Richard A. Wasniowski

Abstract:

A wide spectrum of systems require reliable personal recognition schemes to either confirm or determine the identity of an individual person. This paper considers multimodal biometric system and their applicability to access control, authentication and security applications. Strategies for feature extraction and sensor fusion are considered and contrasted. Issues related to performance assessment, deployment and standardization are discussed. Finally future directions of biometric systems development are discussed.

Keywords: Multimodal, biometric, recognition, fusion.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1711
43 Adaptive Score Normalization: A Novel Approach for Multimodal Biometric Systems

Authors: Anouar Ben Khalifa, Sami Gazzah, Najoua Essoukri BenAmara

Abstract:

Multimodal biometric systems integrate the data presented by multiple biometric sources, hence offering a better performance than the systems based on a single biometric modality. Although the coupling of biometric systems can be done at different levels, the fusion at the scores level is the most common since it has been proven effective than the rest of the fusion levels. However, the scores from different modalities are generally heterogeneous. A step of normalizing the scores is needed to transform these scores into a common domain before combining them. In this paper, we study the performance of several normalization techniques with various fusion methods in a context relating to the merger of three unimodal systems based on the face, the palmprint and the fingerprint. We also propose a new adaptive normalization method that takes into account the distribution of client scores and impostor scores. Experiments conducted on a database of 100 people show that the performances of a multimodal system depend on the choice of the normalization method and the fusion technique. The proposed normalization method has given the best results.

Keywords: Multibiometrics, Fusion, Score level, Score normalization, Adaptive normalization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3498
42 Analyzing Political Cartoons in Arabic-Language Media after Trump's Jerusalem Move: A Multimodal Discourse Perspective

Authors: Inas Hussein

Abstract:

Communication in the modern world is increasingly becoming multimodal due to globalization and the digital space we live in which have remarkably affected how people communicate. Accordingly, Multimodal Discourse Analysis (MDA) is an emerging paradigm in discourse studies with the underlying assumption that other semiotic resources such as images, colours, scientific symbolism, gestures, actions, music and sound, etc. combine with language in order to  communicate meaning. One of the effective multimodal media that combines both verbal and non-verbal elements to create meaning is political cartoons. Furthermore, since political and social issues are mirrored in political cartoons, these are regarded as potential objects of discourse analysis since they not only reflect the thoughts of the public but they also have the power to influence them. The aim of this paper is to analyze some selected cartoons on the recognition of Jerusalem as Israel's capital by the American President, Donald Trump, adopting a multimodal approach. More specifically, the present research examines how the various semiotic tools and resources utilized by the cartoonists function in projecting the intended meaning. Ten political cartoons, among a surge of editorial cartoons highlighted by the Anti-Defamation League (ADL) - an international Jewish non-governmental organization based in the United States - as publications in different Arabic-language newspapers in Egypt, Saudi Arabia, UAE, Oman, Iran and UK, were purposively selected for semiotic analysis. These editorial cartoons, all published during 6th–18th December 2017, invariably suggest one theme: Jewish and Israeli domination of the United States. The data were analyzed using the framework of Visual Social Semiotics. In accordance with this methodological framework, the selected visual compositions were analyzed in terms of three aspects of meaning: representational, interactive and compositional. In analyzing the selected cartoons, an interpretative approach is being adopted. This approach prioritizes depth to breadth and enables insightful analyses of the chosen cartoons. The findings of the study reveal that semiotic resources are key elements of political cartoons due to the inherent political communication they convey. It is proved that adequate interpretation of the three aspects of meaning is a prerequisite for understanding the intended meaning of political cartoons. It is recommended that further research should be conducted to provide more insightful analyses of political cartoons from a multimodal perspective.

Keywords: Multimodal discourse analysis, multimodal text, political cartoons, visual modality.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1491
41 Modelling of Designing a Conceptual Schema for Multimodal Freight Transportation Information System

Authors: Gia Surguladze, Lily Petriashvili, Nino Topuria, Giorgi Surguladze

Abstract:

Modelling of building processes of a multimodal freight transportation support information system is discussed based on modern CASE technologies. Functional efficiencies of ports in the eastern part of the Black Sea are analyzed taking into account their ecological, seasonal, resource usage parameters. By resources, we mean capacities of berths, cranes, automotive transport, as well as work crews and neighbouring airports. For the purpose of designing database of computer support system for Managerial (Logistics) function, using Object-Role Modeling (ORM) tool (NORMA–Natural ORM Architecture) is proposed, after which Entity Relationship Model (ERM) is generated in automated process. Software is developed based on Process-Oriented and Service-Oriented architecture, in Visual Studio.NET environment.

Keywords: Seaport resources, business-processes, multimodal transportation, CASE technology, object-role model, entity relationship model, SOA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1944
40 A Multimodal Approach for Biometric Authentication with Multiple Classifiers

Authors: Sorin Soviany, Cristina Soviany, Mariana Jurian

Abstract:

The paper presents a multimodal approach for biometric authentication, based on multiple classifiers. The proposed solution uses a post-classification biometric fusion method in which the biometric data classifiers outputs are combined in order to improve the overall biometric system performance by decreasing the classification error rates. The paper shows also the biometric recognition task improvement by means of a carefully feature selection, as much as not all of the feature vectors components support the accuracy improvement.

Keywords: biometric fusion, multiple classifiers

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2037
39 Multimodal Reasoning in a Knowledge Engineering Framework for Product Support

Authors: Rossitza M. Setchi, Nikolaos Lagos

Abstract:

Problem solving has traditionally been one of the principal research areas for artificial intelligence. Yet, although artificial intelligence reasoning techniques have been employed in several product support systems, the benefit of integrating product support, knowledge engineering, and problem solving, is still unclear. This paper studies the synergy of these areas and proposes a knowledge engineering framework that integrates product support systems and artificial intelligence techniques. The framework includes four spaces; the data, problem, hypothesis, and solution ones. The data space incorporates the knowledge needed for structured reasoning to take place, the problem space contains representations of problems, and the hypothesis space utilizes a multimodal reasoning approach to produce appropriate solutions in the form of virtual documents. The solution space is used as the gateway between the system and the user. The proposed framework enables the development of product support systems in terms of smaller, more manageable steps while the combination of different reasoning techniques provides a way to overcome the lack of documentation resources.

Keywords: Knowledge engineering framework, product support, case-based reasoning, model-based reasoning, multimodal reasoning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1750
38 Development of Heterogeneous Parallel Genetic Simulated Annealing Using Multi-Niche Crowding

Authors: Z. G. Wang, M. Rahman, Y. S. Wong, K. S. Neo

Abstract:

In this paper, a new hybrid of genetic algorithm (GA) and simulated annealing (SA), referred to as GSA, is presented. In this algorithm, SA is incorporated into GA to escape from local optima. The concept of hierarchical parallel GA is employed to parallelize GSA for the optimization of multimodal functions. In addition, multi-niche crowding is used to maintain the diversity in the population of the parallel GSA (PGSA). The performance of the proposed algorithms is evaluated against a standard set of multimodal benchmark functions. The multi-niche crowding PGSA and normal PGSA show some remarkable improvement in comparison with the conventional parallel genetic algorithm and the breeder genetic algorithm (BGA).

Keywords: Crowding, genetic algorithm, parallel geneticalgorithm, simulated annealing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1534
37 Implementation of a Multimodal Biometrics Recognition System with Combined Palm Print and Iris Features

Authors: Rabab M. Ramadan, Elaraby A. Elgallad

Abstract:

With extensive application, the performance of unimodal biometrics systems has to face a diversity of problems such as signal and background noise, distortion, and environment differences. Therefore, multimodal biometric systems are proposed to solve the above stated problems. This paper introduces a bimodal biometric recognition system based on the extracted features of the human palm print and iris. Palm print biometric is fairly a new evolving technology that is used to identify people by their palm features. The iris is a strong competitor together with face and fingerprints for presence in multimodal recognition systems. In this research, we introduced an algorithm to the combination of the palm and iris-extracted features using a texture-based descriptor, the Scale Invariant Feature Transform (SIFT). Since the feature sets are non-homogeneous as features of different biometric modalities are used, these features will be concatenated to form a single feature vector. Particle swarm optimization (PSO) is used as a feature selection technique to reduce the dimensionality of the feature. The proposed algorithm will be applied to the Institute of Technology of Delhi (IITD) database and its performance will be compared with various iris recognition algorithms found in the literature.

Keywords: Iris recognition, particle swarm optimization, feature extraction, feature selection, palm print, scale invariant feature transform.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 829
36 Development of Multimodal e-Slide Presentation to Support Self-Learning for the Visually Impaired

Authors: Rustam Asnawi, Wan Fatimah Wan Ahmad

Abstract:

Currently electronic slide (e-slide) is one of the most common styles in educational presentation. Unfortunately, the utilization of e-slide for the visually impaired is uncommon since they are unable to see the content of such e-slides which are usually composed of text, images and animation. This paper proposes a model for presenting e-slide in multimodal presentation i.e. using conventional slide concurrent with voicing, in both languages Malay and English. At the design level, live multimedia presentation concept is used, while at the implementation level several components are used. The text content of each slide is extracted using COM component, Microsoft Speech API for voicing the text in English language and the text in Malay language is voiced using dictionary approach. To support the accessibility, an auditory user interface is provided as an additional feature. A prototype of such model named as VSlide has been developed and introduced.

Keywords: presentation, self-learning, slide, visually impaired

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1514
35 Pictorial Multimodal Analysis of Selected Paintings of Salvador Dali

Authors: Shaza Melies, Abeer Refky, Nihad Mansoor

Abstract:

Multimodality involves the communication between verbal and visual components in various discourses. A painting represents a form of communication between the artist and the viewer in terms of colors, shades, objects, and the title. This paper aims to present how multimodality can be used to decode the verbal and visual dimensions a painting holds. For that purpose, this study uses Kress and van Leeuwen’s theoretical framework of visual grammar for the analysis of the multimodal semiotic resources of selected paintings of Salvador Dali. This study investigates the visual decoding of the selected paintings of Salvador Dali and analyzing their social and political meanings using Kress and van Leeuwen’s framework of visual grammar. The paper attempts to answer the following questions: 1. How far can multimodality decode the verbal and non-verbal meanings of surrealistic art? 2. How can Kress and van Leeuwen’s theoretical framework of visual grammar be applied to analyze Dali’s paintings? 3. To what extent is Kress and van Leeuwen’s theoretical framework of visual grammar apt to deliver political and social messages of Dali? The paper reached the following findings: the framework’s descriptive tools (representational, interactive, and compositional meanings) can be used to analyze the paintings’ title and their visual elements. Social and political messages were delivered by appropriate usage of color, gesture, vectors, modality, and the way social actors were represented.

Keywords: Multimodality, multimodal analysis, paintings analysis, Salvador Dali, visual grammar.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 677
34 Complexity of Mathematical Expressions in Adaptive Multimodal Multimedia System Ensuring Access to Mathematics for Visually Impaired Users

Authors: Ali Awde, Yacine Bellik, Chakib Tadj

Abstract:

Our adaptive multimodal system aims at correctly presenting a mathematical expression to visually impaired users. Given an interaction context (i.e. combination of user, environment and system resources) as well as the complexity of the expression itself and the user-s preferences, the suitability scores of different presentation format are calculated. Unlike the current state-of-the art solutions, our approach takes into account the user-s situation and not imposes a solution that is not suitable to his context and capacity. In this wok, we present our methodology for calculating the mathematical expression complexity and the results of our experiment. Finally, this paper discusses the concepts and principles applied on our system as well as their validation through cases studies. This work is our original contribution to an ongoing research to make informatics more accessible to handicapped users.

Keywords: Adaptive system, intelligent multi-agent system, mathematics for visually-impaired users.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1533
33 The Whale Optimization Algorithm and Its Implementation in MATLAB

Authors: S. Adhirai, R. P. Mahapatra, Paramjit Singh

Abstract:

Optimization is an important tool in making decisions and in analysing physical systems. In mathematical terms, an optimization problem is the problem of finding the best solution from among the set of all feasible solutions. The paper discusses the Whale Optimization Algorithm (WOA), and its applications in different fields. The algorithm is tested using MATLAB because of its unique and powerful features. The benchmark functions used in WOA algorithm are grouped as: unimodal (F1-F7), multimodal (F8-F13), and fixed-dimension multimodal (F14-F23). Out of these benchmark functions, we show the experimental results for F7, F11, and F19 for different number of iterations. The search space and objective space for the selected function are drawn, and finally, the best solution as well as the best optimal value of the objective function found by WOA is presented. The algorithmic results demonstrate that the WOA performs better than the state-of-the-art meta-heuristic and conventional algorithms.

Keywords: Optimization, optimal value, objective function, optimization problems, meta-heuristic optimization algorithms, Whale Optimization Algorithm, Implementation, MATLAB.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2821
32 A Bionic Approach to Dynamic, Multimodal Scene Perception and Interpretation in Buildings

Authors: Rosemarie Velik, Dietmar Bruckner

Abstract:

Today, building automation is advancing from simple monitoring and control tasks of lightning and heating towards more and more complex applications that require a dynamic perception and interpretation of different scenes occurring in a building. Current approaches cannot handle these newly upcoming demands. In this article, a bionically inspired approach for multimodal, dynamic scene perception and interpretation is presented, which is based on neuroscientific and neuro-psychological research findings about the perceptual system of the human brain. This approach bases on data from diverse sensory modalities being processed in a so-called neuro-symbolic network. With its parallel structure and with its basic elements being information processing and storing units at the same time, a very efficient method for scene perception is provided overcoming the problems and bottlenecks of classical dynamic scene interpretation systems.

Keywords: building automation, biomimetrics, dynamic scene interpretation, human-like perception, neuro-symbolic networks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1564
31 Multimodal Database of Emotional Speech, Video and Gestures

Authors: Tomasz Sapiński, Dorota Kamińska, Adam Pelikant, Egils Avots, Cagri Ozcinar, Gholamreza Anbarjafari

Abstract:

People express emotions through different modalities. Integration of verbal and non-verbal communication channels creates a system in which the message is easier to understand. Expanding the focus to several expression forms can facilitate research on emotion recognition as well as human-machine interaction. In this article, the authors present a Polish emotional database composed of three modalities: facial expressions, body movement and gestures, and speech. The corpora contains recordings registered in studio conditions, acted out by 16 professional actors (8 male and 8 female). The data is labeled with six basic emotions categories, according to Ekman’s emotion categories. To check the quality of performance, all recordings are evaluated by experts and volunteers. The database is available to academic community and might be useful in the study on audio-visual emotion recognition.

Keywords: Body movement, emotion recognition, emotional corpus, facial expressions, gestures, multimodal database, speech.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1066
30 A Survey of Sentiment Analysis Based on Deep Learning

Authors: Pingping Lin, Xudong Luo, Yifan Fan

Abstract:

Sentiment analysis is a very active research topic. Every day, Facebook, Twitter, Weibo, and other social media, as well as significant e-commerce websites, generate a massive amount of comments, which can be used to analyse peoples opinions or emotions. The existing methods for sentiment analysis are based mainly on sentiment dictionaries, machine learning, and deep learning. The first two kinds of methods rely on heavily sentiment dictionaries or large amounts of labelled data. The third one overcomes these two problems. So, in this paper, we focus on the third one. Specifically, we survey various sentiment analysis methods based on convolutional neural network, recurrent neural network, long short-term memory, deep neural network, deep belief network, and memory network. We compare their futures, advantages, and disadvantages. Also, we point out the main problems of these methods, which may be worthy of careful studies in the future. Finally, we also examine the application of deep learning in multimodal sentiment analysis and aspect-level sentiment analysis.

Keywords: Natural language processing, sentiment analysis, document analysis, multimodal sentiment analysis, deep learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1875
29 Spreading Dynamics of a Viral Infection in a Complex Network

Authors: Khemanand Moheeput, Smita S. D. Goorah, Satish K. Ramchurn

Abstract:

We report a computational study of the spreading dynamics of a viral infection in a complex (scale-free) network. The final epidemic size distribution (FESD) was found to be unimodal or bimodal depending on the value of the basic reproductive number R0 . The FESDs occurred on time-scales long enough for intermediate-time epidemic size distributions (IESDs) to be important for control measures. The usefulness of R0 for deciding on the timeliness and intensity of control measures was found to be limited by the multimodal nature of the IESDs and by its inability to inform on the speed at which the infection spreads through the population. A reduction of the transmission probability at the hubs of the scale-free network decreased the occurrence of the larger-sized epidemic events of the multimodal distributions. For effective epidemic control, an early reduction in transmission at the index cell and its neighbors was essential.

Keywords: Basic reproductive number, epidemic control, scalefree network, viral infection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1672
28 Multimodal Biometric System Based on Near- Infra-Red Dorsal Hand Geometry and Fingerprints for Single and Whole Hands

Authors: Mohamed K. Shahin, Ahmed M. Badawi, Mohamed E. M. Rasmy

Abstract:

Prior research evidenced that unimodal biometric systems have several tradeoffs like noisy data, intra-class variations, restricted degrees of freedom, non-universality, spoof attacks, and unacceptable error rates. In order for the biometric system to be more secure and to provide high performance accuracy, more than one form of biometrics are required. Hence, the need arise for multimodal biometrics using combinations of different biometric modalities. This paper introduces a multimodal biometric system (MMBS) based on fusion of whole dorsal hand geometry and fingerprints that acquires right and left (Rt/Lt) near-infra-red (NIR) dorsal hand geometry (HG) shape and (Rt/Lt) index and ring fingerprints (FP). Database of 100 volunteers were acquired using the designed prototype. The acquired images were found to have good quality for all features and patterns extraction to all modalities. HG features based on the hand shape anatomical landmarks were extracted. Robust and fast algorithms for FP minutia points feature extraction and matching were used. Feature vectors that belong to similar biometric traits were fused using feature fusion methodologies. Scores obtained from different biometric trait matchers were fused using the Min-Max transformation-based score fusion technique. Final normalized scores were merged using the sum of scores method to obtain a single decision about the personal identity based on multiple independent sources. High individuality of the fused traits and user acceptability of the designed system along with its experimental high performance biometric measures showed that this MMBS can be considered for med-high security levels biometric identification purposes.

Keywords: Unimodal, Multi-Modal, Biometric System, NIR Imaging, Dorsal Hand Geometry, Fingerprint, Whole Hands, Feature Extraction, Feature Fusion, Score Fusion

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2166
27 Modeling Engagement with Multimodal Multisensor Data: The Continuous Performance Test as an Objective Tool to Track Flow

Authors: Mohammad H. Taheri, David J. Brown, Nasser Sherkat

Abstract:

Engagement is one of the most important factors in determining successful outcomes and deep learning in students. Existing approaches to detect student engagement involve periodic human observations that are subject to inter-rater reliability. Our solution uses real-time multimodal multisensor data labeled by objective performance outcomes to infer the engagement of students. The study involves four students with a combined diagnosis of cerebral palsy and a learning disability who took part in a 3-month trial over 59 sessions. Multimodal multisensor data were collected while they participated in a continuous performance test. Eye gaze, electroencephalogram, body pose, and interaction data were used to create a model of student engagement through objective labeling from the continuous performance test outcomes. In order to achieve this, a type of continuous performance test is introduced, the Seek-X type. Nine features were extracted including high-level handpicked compound features. Using leave-one-out cross-validation, a series of different machine learning approaches were evaluated. Overall, the random forest classification approach achieved the best classification results. Using random forest, 93.3% classification for engagement and 42.9% accuracy for disengagement were achieved. We compared these results to outcomes from different models: AdaBoost, decision tree, k-Nearest Neighbor, naïve Bayes, neural network, and support vector machine. We showed that using a multisensor approach achieved higher accuracy than using features from any reduced set of sensors. We found that using high-level handpicked features can improve the classification accuracy in every sensor mode. Our approach is robust to both sensor fallout and occlusions. The single most important sensor feature to the classification of engagement and distraction was shown to be eye gaze. It has been shown that we can accurately predict the level of engagement of students with learning disabilities in a real-time approach that is not subject to inter-rater reliability, human observation or reliant on a single mode of sensor input. This will help teachers design interventions for a heterogeneous group of students, where teachers cannot possibly attend to each of their individual needs. Our approach can be used to identify those with the greatest learning challenges so that all students are supported to reach their full potential.

Keywords: Affective computing in education, affect detection, continuous performance test, engagement, flow, HCI, interaction, learning disabilities, machine learning, multimodal, multisensor, physiological sensors, Signal Detection Theory, student engagement.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1189
26 NANCY: Combining Adversarial Networks with Cycle-Consistency for Robust Multi-Modal Image Registration

Authors: Mirjana Ruppel, Rajendra Persad, Amit Bahl, Sanja Dogramadzi, Chris Melhuish, Lyndon Smith

Abstract:

Multimodal image registration is a profoundly complex task which is why deep learning has been used widely to address it in recent years. However, two main challenges remain: Firstly, the lack of ground truth data calls for an unsupervised learning approach, which leads to the second challenge of defining a feasible loss function that can compare two images of different modalities to judge their level of alignment. To avoid this issue altogether we implement a generative adversarial network consisting of two registration networks GAB, GBA and two discrimination networks DA, DB connected by spatial transformation layers. GAB learns to generate a deformation field which registers an image of the modality B to an image of the modality A. To do that, it uses the feedback of the discriminator DB which is learning to judge the quality of alignment of the registered image B. GBA and DA learn a mapping from modality A to modality B. Additionally, a cycle-consistency loss is implemented. For this, both registration networks are employed twice, therefore resulting in images ˆA, ˆB which were registered to ˜B, ˜A which were registered to the initial image pair A, B. Thus the resulting and initial images of the same modality can be easily compared. A dataset of liver CT and MRI was used to evaluate the quality of our approach and to compare it against learning and non-learning based registration algorithms. Our approach leads to dice scores of up to 0.80 ± 0.01 and is therefore comparable to and slightly more successful than algorithms like SimpleElastix and VoxelMorph.

Keywords: Multimodal image registration, GAN, cycle consistency, deep learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 729
25 Early Depression Detection for Young Adults with a Psychiatric and AI Interdisciplinary Multimodal Framework

Authors: Raymond Xu, Ashley Hua, Andrew Wang, Yuru Lin

Abstract:

During COVID-19, the depression rate has increased dramatically. Young adults are most vulnerable to the mental health effects of the pandemic. Lower-income families have a higher ratio to be diagnosed with depression than the general population, but less access to clinics. This research aims to achieve early depression detection at low cost, large scale, and high accuracy with an interdisciplinary approach by incorporating clinical practices defined by American Psychiatric Association (APA) as well as multimodal AI framework. The proposed approach detected the nine depression symptoms with Natural Language Processing sentiment analysis and a symptom-based Lexicon uniquely designed for young adults. The experiments were conducted on the multimedia survey results from adolescents and young adults and unbiased Twitter communications. The result was further aggregated with the facial emotional cues analyzed by the Convolutional Neural Network on the multimedia survey videos. Five experiments each conducted on 10k data entries reached consistent results with an average accuracy of 88.31%, higher than the existing natural language analysis models. This approach can reach 300+ million daily active Twitter users and is highly accessible by low-income populations to promote early depression detection to raise awareness in adolescents and young adults and reveal complementary cues to assist clinical depression diagnosis.

Keywords: Artificial intelligence, depression detection, facial emotion recognition, natural language processing, mental disorder.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1076