Search results for: Multimodal Annotation
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 98

68 Grammatically Coded Corpus of Spoken Lithuanian: Methodology and Development

Authors: L. Kamandulytė-Merfeldienė

Abstract:

The paper deals with the main methodological issues of the Corpus of Spoken Lithuanian, whose development began in 2006. At present, the corpus consists of 300,000 grammatically annotated word forms. The creation of the corpus consists of three main stages: collecting the data, transcribing the recorded data, and grammatical annotation. Data collection was based on the principles of balance and naturalness. The recorded speech was transcribed according to the CHAT requirements of CHILDES. The transcripts were double-checked and annotated grammatically using CHILDES. The development of the Corpus of Spoken Lithuanian has led to a steady increase in studies on spontaneous communication, and various papers have dealt with the distribution of parts of speech, the use of different grammatical forms, variation of inflectional paradigms, the distribution of fillers, syntactic functions of adjectives, and the mean length of utterances.

Keywords: CHILDES, Corpus of Spoken Lithuanian, grammatical annotation, grammatical disambiguation, lexicon, Lithuanian.

Downloads: 906
67 An Experiment on Personal Archiving and Retrieving Image System (PARIS)

Authors: Pei-Jeng Kuo, Terumasa Aoki, Hiroshi Yasuda

Abstract:

PARIS (Personal Archiving and Retrieving Image System) is an experimental personal photograph library, which includes more than 80,000 consumer photographs accumulated over approximately five years, metadata based on our proposed MPEG-7 annotation architecture, Dozen Dimensional Digital Content (DDDC), and a relational database structure. The DDDC architecture is specially designed to facilitate the managing, browsing and retrieving of personal digital photograph collections. In the annotation process, we also utilize a proposed Spatial and Temporal Ontology (STO) designed around the general characteristics of personal photograph collections. This paper explains the PARIS system.

Keywords: Ontology, Databases and Information Retrieval, MPEG-7, Spatial-Temporal, Digital Library Designs, metadata, Semantic Web, semi-automatic annotation

Downloads: 1074
66 Implementation of a Multimodal Biometrics Recognition System with Combined Palm Print and Iris Features

Authors: Rabab M. Ramadan, Elaraby A. Elgallad

Abstract:

With extensive application, unimodal biometric systems face a variety of problems such as signal and background noise, distortion, and environmental differences. Multimodal biometric systems have therefore been proposed to solve these problems. This paper introduces a bimodal biometric recognition system based on features extracted from the human palm print and iris. Palm print biometrics is a fairly new, evolving technology that identifies people by their palm features. The iris is a strong competitor, together with face and fingerprints, for inclusion in multimodal recognition systems. In this research, we introduce an algorithm for combining the palm- and iris-extracted features using a texture-based descriptor, the Scale Invariant Feature Transform (SIFT). Since the feature sets are non-homogeneous, as features of different biometric modalities are used, these features will be concatenated to form a single feature vector. Particle swarm optimization (PSO) is used as a feature selection technique to reduce the dimensionality of the feature vector. The proposed algorithm will be applied to the Indian Institute of Technology Delhi (IITD) database, and its performance will be compared with various iris recognition algorithms found in the literature.

Keywords: Iris recognition, particle swarm optimization, feature extraction, feature selection, palm print, scale invariant feature transform.

Downloads: 834
65 Development of Multimodal e-Slide Presentation to Support Self-Learning for the Visually Impaired

Authors: Rustam Asnawi, Wan Fatimah Wan Ahmad

Abstract:

Currently, the electronic slide (e-slide) is one of the most common formats for educational presentations. Unfortunately, the use of e-slides by the visually impaired is uncommon, since they are unable to see the content of such e-slides, which usually consist of text, images and animation. This paper proposes a model for presenting e-slides in a multimodal presentation, i.e. using conventional slides concurrently with voicing, in both Malay and English. At the design level, a live multimedia presentation concept is used, while at the implementation level several components are used. The text content of each slide is extracted using a COM component; the Microsoft Speech API voices text in English, while text in Malay is voiced using a dictionary approach. To support accessibility, an auditory user interface is provided as an additional feature. A prototype of this model, named VSlide, has been developed and introduced.

Keywords: presentation, self-learning, slide, visually impaired

Downloads: 1519
64 Dynamic Capitalization and Visualization Strategy in Collaborative Knowledge Management System for EI Process

Authors: Bolanle F. Oladejo, Victor T. Odumuyiwa, Amos A. David

Abstract:

Knowledge is attributed to humans, whose problem-solving behavior is subjective and complex. In today's knowledge economy, the need to manage knowledge produced by a community of actors cannot be overemphasized. This is due to the fact that actors possess some level of tacit knowledge which is generally difficult to articulate. Problem-solving requires searching and sharing of knowledge among a group of actors in a particular context. Knowledge expressed within the context of a problem resolution must be capitalized for future reuse. In this paper, an approach that permits dynamic capitalization of relevant and reliable actors' knowledge in solving decision problems following the Economic Intelligence process is proposed. A knowledge annotation method and temporal attributes are used for handling the complexity in the communication among actors and in contextualizing expressed knowledge. A prototype is built to demonstrate the functionalities of a collaborative Knowledge Management system based on this approach. It is tested with sample cases, and the results showed that dynamic capitalization leads to knowledge validation, hence increasing the reliability of captured knowledge for reuse. The system can be adapted to various domains.

Keywords: Actors' communication, knowledge annotation, recursive knowledge capitalization, visualization.

Downloads: 1332
63 Pictorial Multimodal Analysis of Selected Paintings of Salvador Dali

Authors: Shaza Melies, Abeer Refky, Nihad Mansoor

Abstract:

Multimodality involves the communication between verbal and visual components in various discourses. A painting represents a form of communication between the artist and the viewer in terms of colors, shades, objects, and the title. This paper aims to present how multimodality can be used to decode the verbal and visual dimensions a painting holds. For that purpose, this study uses Kress and van Leeuwen’s theoretical framework of visual grammar for the analysis of the multimodal semiotic resources of selected paintings of Salvador Dali. This study investigates the visual decoding of the selected paintings of Salvador Dali and analyzes their social and political meanings using Kress and van Leeuwen’s framework of visual grammar. The paper attempts to answer the following questions: 1. How far can multimodality decode the verbal and non-verbal meanings of surrealistic art? 2. How can Kress and van Leeuwen’s theoretical framework of visual grammar be applied to analyze Dali’s paintings? 3. To what extent is Kress and van Leeuwen’s theoretical framework of visual grammar apt to deliver political and social messages of Dali? The paper reached the following findings: the framework’s descriptive tools (representational, interactive, and compositional meanings) can be used to analyze the paintings’ titles and their visual elements. Social and political messages were delivered by appropriate usage of color, gesture, vectors, modality, and the way social actors were represented.

Keywords: Multimodality, multimodal analysis, paintings analysis, Salvador Dali, visual grammar.

Downloads: 682
62 Complexity of Mathematical Expressions in Adaptive Multimodal Multimedia System Ensuring Access to Mathematics for Visually Impaired Users

Authors: Ali Awde, Yacine Bellik, Chakib Tadj

Abstract:

Our adaptive multimodal system aims at correctly presenting a mathematical expression to visually impaired users. Given an interaction context (i.e. combination of user, environment and system resources) as well as the complexity of the expression itself and the user's preferences, the suitability scores of different presentation formats are calculated. Unlike current state-of-the-art solutions, our approach takes into account the user's situation and does not impose a solution that is unsuitable for their context and capacity. In this work, we present our methodology for calculating the mathematical expression complexity and the results of our experiment. Finally, this paper discusses the concepts and principles applied in our system as well as their validation through case studies. This work is our original contribution to ongoing research to make informatics more accessible to handicapped users.

Keywords: Adaptive system, intelligent multi-agent system, mathematics for visually-impaired users.

Downloads: 1536
61 The Whale Optimization Algorithm and Its Implementation in MATLAB

Authors: S. Adhirai, R. P. Mahapatra, Paramjit Singh

Abstract:

Optimization is an important tool in making decisions and in analysing physical systems. In mathematical terms, an optimization problem is the problem of finding the best solution from among the set of all feasible solutions. The paper discusses the Whale Optimization Algorithm (WOA) and its applications in different fields. The algorithm is tested using MATLAB because of its unique and powerful features. The benchmark functions used with the WOA are grouped as: unimodal (F1-F7), multimodal (F8-F13), and fixed-dimension multimodal (F14-F23). Out of these benchmark functions, we show the experimental results for F7, F11, and F19 for different numbers of iterations. The search space and objective space for the selected functions are drawn, and finally, the best solution as well as the best optimal value of the objective function found by WOA is presented. The algorithmic results demonstrate that the WOA performs better than the state-of-the-art meta-heuristic and conventional algorithms.

Keywords: Optimization, optimal value, objective function, optimization problems, meta-heuristic optimization algorithms, Whale Optimization Algorithm, Implementation, MATLAB.
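
The search procedure named in this abstract can be sketched in a few lines. The following is a minimal, illustrative Python version of the commonly described WOA update rules (encircling the best solution, random exploration, and the spiral move), not the authors' MATLAB implementation. The function names `woa` and `sphere`, the scalar coefficients per whale, the population size, and the sphere objective used as a stand-in benchmark are all assumptions made for this sketch.

```python
import math
import random
from typing import Callable, List, Tuple


def sphere(x: List[float]) -> float:
    """Stand-in unimodal objective (sum of squares); an assumption for this sketch."""
    return sum(v * v for v in x)


def woa(objective: Callable[[List[float]], float], dim: int,
        bounds: Tuple[float, float], n_whales: int = 20,
        n_iter: int = 200, b: float = 1.0, seed: int = 0):
    """Simplified Whale Optimization Algorithm; returns (best_position, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    whales = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_whales)]
    best = min(whales, key=objective)[:]
    best_val = objective(best)

    for t in range(n_iter):
        a = 2.0 - 2.0 * t / n_iter  # decreases linearly from 2 towards 0
        for i in range(n_whales):
            x = whales[i]
            A = 2.0 * a * rng.random() - a  # scalar per whale (simplification)
            C = 2.0 * rng.random()
            if rng.random() < 0.5:
                # |A| < 1: move towards the best whale; otherwise follow a random whale.
                ref = best if abs(A) < 1.0 else whales[rng.randrange(n_whales)]
                new = [ref[j] - A * abs(C * ref[j] - x[j]) for j in range(dim)]
            else:
                # Spiral move around the best whale.
                l = rng.uniform(-1.0, 1.0)
                new = [abs(best[j] - x[j]) * math.exp(b * l) * math.cos(2.0 * math.pi * l)
                       + best[j] for j in range(dim)]
            whales[i] = [min(max(v, lo), hi) for v in new]  # clip to the search bounds
        # Update the leader once all whales have moved.
        for x in whales:
            val = objective(x)
            if val < best_val:
                best, best_val = x[:], val
    return best, best_val


best, best_val = woa(sphere, dim=5, bounds=(-10.0, 10.0))
```

Because the leader is only replaced by a strictly better position, the reported best value never gets worse as iterations are added; swapping `sphere` for a multimodal function is how one would probe the behaviour the abstract reports for the multimodal group.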

Downloads: 2828
60 A Bionic Approach to Dynamic, Multimodal Scene Perception and Interpretation in Buildings

Authors: Rosemarie Velik, Dietmar Bruckner

Abstract:

Today, building automation is advancing from simple monitoring and control tasks for lighting and heating towards increasingly complex applications that require dynamic perception and interpretation of different scenes occurring in a building. Current approaches cannot handle these emerging demands. In this article, a bionically inspired approach for multimodal, dynamic scene perception and interpretation is presented, which is based on neuroscientific and neuro-psychological research findings about the perceptual system of the human brain. This approach is based on data from diverse sensory modalities being processed in a so-called neuro-symbolic network. With its parallel structure, and with its basic elements being information processing and storing units at the same time, it provides a very efficient method for scene perception that overcomes the problems and bottlenecks of classical dynamic scene interpretation systems.

Keywords: building automation, biomimetrics, dynamic scene interpretation, human-like perception, neuro-symbolic networks.

Downloads: 1568
59 Using the Semantic Web in Ubiquitous and Mobile Computing: the Morfeo Experience

Authors: José M. Cantera, Miguel Jiménez, Genoveva López, Javier Soriano

Abstract:

With the advent of emerging personal computing paradigms such as ubiquitous and mobile computing, Web contents are becoming accessible from a wide range of mobile devices. Since these devices do not have the same rendering capabilities, Web contents need to be adapted for transparent access from a variety of client agents. Such content adaptation results in better rendering and faster delivery to the client device. Nevertheless, Web content adaptation sets new challenges for semantic markup. This paper presents an advanced components platform, called MorfeoSMC, enabling the development of mobility applications and services according to a channel model based on Services Oriented Architecture (SOA) principles. It then goes on to describe the potential for integration with the Semantic Web through a novel framework of external semantic annotation of mobile Web contents. The role of semantic annotation in this framework is to describe the contents of individual documents themselves, assuring the preservation of the semantics during the process of adapting content rendering, as well as to exploit these semantic annotations in a novel user profile-aware content adaptation process. Semantic Web content adaptation is a way of adding value to Web contents and facilitating their repurposing (enhanced browsing, Web Services location and access, etc.).

Keywords: Semantic web, ubiquitous and mobile computing, web content transcoding, semantic markup, mobile computing middleware and services.

Downloads: 1577
58 Machine Learning for Music Aesthetic Annotation Using MIDI Format: A Harmony-Based Classification Approach

Authors: Lin Yang, Zhian Mi, Jiacheng Xiao, Rong Li

Abstract:

Swimming with the tide of deep learning, the field of music information retrieval (MIR) has experienced parallel development, and a wide variety of feature-learning models has been applied to music classification and tagging tasks. Among those learning techniques, deep convolutional neural networks (CNNs) have been widely used, with better performance than traditional approaches, especially in music genre classification and prediction. However, regarding music recommendation, there is a large semantic gap between the corresponding audio genres and the various aspects of a song that influence user preference. In our study, aiming to bridge the gap, we strive to construct an automatic music aesthetic annotation model with MIDI format for better comparison and measurement of the similarity between music pieces by way of harmonic analysis. We use the matrix of qualification converted from MIDI files as input to train two different classifiers, support vector machine (SVM) and Decision Tree (DT). Experimental results on a tag prediction task have shown that both learning algorithms are capable of extracting high-level properties in an end-to-end manner from music information. The proposed model helps to learn audience taste, so the resulting recommendations are likely to appeal to a niche consumer.

Keywords: Harmonic analysis, machine learning, music classification and tagging, MIDI.

Downloads: 678
57 Multimodal Database of Emotional Speech, Video and Gestures

Authors: Tomasz Sapiński, Dorota Kamińska, Adam Pelikant, Egils Avots, Cagri Ozcinar, Gholamreza Anbarjafari

Abstract:

People express emotions through different modalities. Integration of verbal and non-verbal communication channels creates a system in which the message is easier to understand. Expanding the focus to several expression forms can facilitate research on emotion recognition as well as human-machine interaction. In this article, the authors present a Polish emotional database composed of three modalities: facial expressions, body movement and gestures, and speech. The corpus contains recordings registered in studio conditions, acted out by 16 professional actors (8 male and 8 female). The data is labeled with six basic emotion categories, following Ekman’s emotion categories. To check the quality of performance, all recordings are evaluated by experts and volunteers. The database is available to the academic community and might be useful in the study of audio-visual emotion recognition.

Keywords: Body movement, emotion recognition, emotional corpus, facial expressions, gestures, multimodal database, speech.

Downloads: 1070
56 A Survey of Sentiment Analysis Based on Deep Learning

Authors: Pingping Lin, Xudong Luo, Yifan Fan

Abstract:

Sentiment analysis is a very active research topic. Every day, Facebook, Twitter, Weibo, and other social media, as well as significant e-commerce websites, generate a massive amount of comments, which can be used to analyse people's opinions or emotions. The existing methods for sentiment analysis are based mainly on sentiment dictionaries, machine learning, and deep learning. The first two kinds of methods rely heavily on sentiment dictionaries or large amounts of labelled data. The third one overcomes these two problems. So, in this paper, we focus on the third one. Specifically, we survey various sentiment analysis methods based on convolutional neural network, recurrent neural network, long short-term memory, deep neural network, deep belief network, and memory network. We compare their features, advantages, and disadvantages. Also, we point out the main problems of these methods, which may be worthy of careful study in the future. Finally, we also examine the application of deep learning in multimodal sentiment analysis and aspect-level sentiment analysis.

Keywords: Natural language processing, sentiment analysis, document analysis, multimodal sentiment analysis, deep learning.

Downloads: 1886
55 Virulent-GO: Prediction of Virulent Proteins in Bacterial Pathogens Utilizing Gene Ontology Terms

Authors: Chia-Ta Tsai, Wen-Lin Huang, Shinn-Jang Ho, Li-Sun Shu, Shinn-Ying Ho

Abstract:

Prediction of bacterial virulent protein sequences can assist in the identification and characterization of novel virulence-associated factors and in the discovery of drug/vaccine targets against proteins indispensable to pathogenicity. Gene Ontology (GO) annotation, which describes functions of genes and gene products as a controlled vocabulary of terms, has been shown to be effective for a variety of tasks such as gene expression study, GO annotation prediction, protein subcellular localization, etc. In this study, we propose a sequence-based method, Virulent-GO, that mines informative GO terms as features for predicting bacterial virulent proteins. Each protein in the datasets used by the existing method VirulentPred is annotated by using BLAST to obtain its homologs with known accession numbers for retrieving GO terms. After investigating various popular classifiers using the same five-fold cross-validation scheme, Virulent-GO, using the single kind of GO term features, achieves an accuracy of 82.5%, slightly better than VirulentPred with 81.8% using five kinds of sequence-based features. In the independent test, Virulent-GO also yields better results (82.0%) than VirulentPred (80.7%). When evaluating a single kind of feature with SVM, the GO term feature performs much better than each of the five kinds of features.

Keywords: Bacterial virulence factors, GO terms, prediction, protein sequence.

Downloads: 2143
54 Spreading Dynamics of a Viral Infection in a Complex Network

Authors: Khemanand Moheeput, Smita S. D. Goorah, Satish K. Ramchurn

Abstract:

We report a computational study of the spreading dynamics of a viral infection in a complex (scale-free) network. The final epidemic size distribution (FESD) was found to be unimodal or bimodal depending on the value of the basic reproductive number R0. The FESDs occurred on time-scales long enough for intermediate-time epidemic size distributions (IESDs) to be important for control measures. The usefulness of R0 for deciding on the timeliness and intensity of control measures was found to be limited by the multimodal nature of the IESDs and by its inability to inform on the speed at which the infection spreads through the population. A reduction of the transmission probability at the hubs of the scale-free network decreased the occurrence of the larger-sized epidemic events of the multimodal distributions. For effective epidemic control, an early reduction in transmission at the index cell and its neighbors was essential.

Keywords: Basic reproductive number, epidemic control, scale-free network, viral infection.
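
To illustrate how a final epidemic size distribution can be collected on a scale-free network, here is a small Python sketch. The abstract does not state the compartmental model or the network generator, so both choices here are assumptions: a discrete-time SIR process and a Barabási-Albert style preferential-attachment graph. The helper names `barabasi_albert` and `sir_final_size`, and the values of `beta`, `gamma`, `n` and `m`, are hypothetical.

```python
import random
from typing import Dict, List, Set


def barabasi_albert(n: int, m: int, rng: random.Random) -> Dict[int, Set[int]]:
    """Grow a scale-free graph by preferential attachment (m edges per new node)."""
    adj: Dict[int, Set[int]] = {i: set() for i in range(n)}
    pool: List[int] = []  # each node appears once per edge endpoint
    for i in range(m + 1):  # seed: complete graph on m + 1 nodes
        for j in range(i + 1, m + 1):
            adj[i].add(j)
            adj[j].add(i)
            pool += [i, j]
    for new in range(m + 1, n):
        chosen: Set[int] = set()
        while len(chosen) < m:  # sampling the pool favours high-degree nodes
            chosen.add(rng.choice(pool))
        for target in sorted(chosen):
            adj[new].add(target)
            adj[target].add(new)
            pool += [new, target]
    return adj


def sir_final_size(adj: Dict[int, Set[int]], beta: float, gamma: float,
                   index_node: int, rng: random.Random) -> int:
    """Run one discrete-time SIR outbreak and return the number ever infected."""
    infected: Set[int] = {index_node}
    recovered: Set[int] = set()
    while infected:
        newly: Set[int] = set()
        for u in sorted(infected):
            for v in sorted(adj[u]):
                susceptible = v not in infected and v not in recovered and v not in newly
                if susceptible and rng.random() < beta:
                    newly.add(v)
        recovering = {u for u in sorted(infected) if rng.random() < gamma}
        recovered |= recovering
        infected = (infected - recovering) | newly
    return len(recovered)


rng = random.Random(1)
graph = barabasi_albert(n=200, m=2, rng=rng)
# One final size per simulated outbreak; a histogram of these is the FESD.
sizes = [sir_final_size(graph, beta=0.1, gamma=0.2, index_node=rng.randrange(200), rng=rng)
         for _ in range(50)]
```

Plotting a histogram of `sizes` for several values of `beta` is one way to look for the unimodal versus bimodal shapes described above; lowering `beta` only for high-degree nodes would mimic the hub intervention.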

Downloads: 1677
53 Semantic Mobility Channel (SMC): Ubiquitous and Mobile Computing Meets the Semantic Web

Authors: José M. Cantera, Miguel Jiménez, Genoveva López, Javier Soriano

Abstract:

With the advent of emerging personal computing paradigms such as ubiquitous and mobile computing, Web contents are becoming accessible from a wide range of mobile devices. Since these devices do not have the same rendering capabilities, Web contents need to be adapted for transparent access from a variety of client agents. Such content adaptation is exploited for either an individual element or a set of consecutive elements in a Web document and results in better rendering and faster delivery to the client device. Nevertheless, Web content adaptation sets new challenges for semantic markup. This paper presents an advanced components platform, called SMC, enabling the development of mobility applications and services according to a channel model based on the principles of Services Oriented Architecture (SOA). It then goes on to describe the potential for integration with the Semantic Web through a novel framework of external semantic annotation that prescribes a scheme for representing semantic markup files and a way of associating Web documents with these external annotations. The role of semantic annotation in this framework is to describe the contents of individual documents themselves, assuring the preservation of the semantics during the process of adapting content rendering. Semantic Web content adaptation is a way of adding value to Web contents and facilitating their repurposing (enhanced browsing, Web Services location and access, etc.).

Keywords: Semantic web, ubiquitous and mobile computing, web content transcoding, semantic mark-up, mobile computing, middleware and services.

Downloads: 1761
52 Multimodal Biometric System Based on Near-Infra-Red Dorsal Hand Geometry and Fingerprints for Single and Whole Hands

Authors: Mohamed K. Shahin, Ahmed M. Badawi, Mohamed E. M. Rasmy

Abstract:

Prior research has shown that unimodal biometric systems have several drawbacks, such as noisy data, intra-class variations, restricted degrees of freedom, non-universality, spoof attacks, and unacceptable error rates. For a biometric system to be more secure and to provide high accuracy, more than one form of biometrics is required. Hence the need arises for multimodal biometrics using combinations of different biometric modalities. This paper introduces a multimodal biometric system (MMBS) based on fusion of whole dorsal hand geometry and fingerprints that acquires right and left (Rt/Lt) near-infra-red (NIR) dorsal hand geometry (HG) shape and (Rt/Lt) index and ring fingerprints (FP). A database of 100 volunteers was acquired using the designed prototype. The acquired images were found to be of good quality for feature and pattern extraction in all modalities. HG features based on the hand shape anatomical landmarks were extracted. Robust and fast algorithms for FP minutia points feature extraction and matching were used. Feature vectors that belong to similar biometric traits were fused using feature fusion methodologies. Scores obtained from different biometric trait matchers were fused using the Min-Max transformation-based score fusion technique. Final normalized scores were merged using the sum of scores method to obtain a single decision about the personal identity based on multiple independent sources. The high individuality of the fused traits and the user acceptability of the designed system, along with its experimentally high-performance biometric measures, showed that this MMBS can be considered for medium-to-high security level biometric identification purposes.

Keywords: Unimodal, Multi-Modal, Biometric System, NIR Imaging, Dorsal Hand Geometry, Fingerprint, Whole Hands, Feature Extraction, Feature Fusion, Score Fusion
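
The score-level fusion step described in this abstract (Min-Max normalization followed by the sum-of-scores rule) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the matcher names, and the score values are hypothetical, and scores are assumed to be similarities where higher means a better match.

```python
from typing import Dict, List


def min_max_normalize(scores: List[float]) -> List[float]:
    """Map raw matcher scores onto [0, 1] using the Min-Max transformation."""
    lo, hi = min(scores), max(scores)
    if hi == lo:  # degenerate case: all scores identical
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def sum_fusion(matcher_scores: Dict[str, List[float]]) -> List[float]:
    """Normalize each matcher's scores, then add them per candidate identity."""
    normalized = [min_max_normalize(s) for s in matcher_scores.values()]
    return [sum(per_candidate) for per_candidate in zip(*normalized)]


# Hypothetical similarity scores of one probe against three enrolled identities.
scores = {
    "hand_geometry": [0.20, 0.90, 0.40],  # already roughly in [0, 1]
    "fingerprint": [35.0, 80.0, 20.0],    # a different, larger score range
}
fused = sum_fusion(scores)
best_identity = max(range(len(fused)), key=fused.__getitem__)
```

Normalizing first matters because the two matchers report scores on different scales; without it, the fingerprint matcher's larger raw values would dominate the sum.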

Downloads: 2174
51 Modeling Engagement with Multimodal Multisensor Data: The Continuous Performance Test as an Objective Tool to Track Flow

Authors: Mohammad H. Taheri, David J. Brown, Nasser Sherkat

Abstract:

Engagement is one of the most important factors in determining successful outcomes and deep learning in students. Existing approaches to detect student engagement involve periodic human observations that are subject to inter-rater reliability. Our solution uses real-time multimodal multisensor data labeled by objective performance outcomes to infer the engagement of students. The study involves four students with a combined diagnosis of cerebral palsy and a learning disability who took part in a 3-month trial over 59 sessions. Multimodal multisensor data were collected while they participated in a continuous performance test. Eye gaze, electroencephalogram, body pose, and interaction data were used to create a model of student engagement through objective labeling from the continuous performance test outcomes. In order to achieve this, a type of continuous performance test is introduced, the Seek-X type. Nine features were extracted including high-level handpicked compound features. Using leave-one-out cross-validation, a series of different machine learning approaches were evaluated. Overall, the random forest classification approach achieved the best classification results. Using random forest, 93.3% classification accuracy for engagement and 42.9% accuracy for disengagement were achieved. We compared these results to outcomes from different models: AdaBoost, decision tree, k-Nearest Neighbor, naïve Bayes, neural network, and support vector machine. We showed that using a multisensor approach achieved higher accuracy than using features from any reduced set of sensors. We found that using high-level handpicked features can improve the classification accuracy in every sensor mode. Our approach is robust to both sensor fallout and occlusions. The single most important sensor feature to the classification of engagement and distraction was shown to be eye gaze.
It has been shown that we can accurately predict the level of engagement of students with learning disabilities in a real-time approach that is not subject to inter-rater reliability, human observation or reliant on a single mode of sensor input. This will help teachers design interventions for a heterogeneous group of students, where teachers cannot possibly attend to each of their individual needs. Our approach can be used to identify those with the greatest learning challenges so that all students are supported to reach their full potential.

Keywords: Affective computing in education, affect detection, continuous performance test, engagement, flow, HCI, interaction, learning disabilities, machine learning, multimodal, multisensor, physiological sensors, Signal Detection Theory, student engagement.

Downloads: 1204
50 On the Move to Semantic Web Services

Authors: Jorge Cardoso

Abstract:

Semantic Web services will enable the semiautomatic and automatic annotation, advertisement, discovery, selection, composition, and execution of inter-organization business logic, making the Internet a common global platform where organizations and individuals communicate with each other to carry out various commercial activities and to provide value-added services. There is a growing consensus that Web services alone will not be sufficient to develop valuable solutions due to the degree of heterogeneity, autonomy, and distribution of the Web. This paper deals with two of the hottest R&D and technology areas currently associated with the Web: Web services and the Semantic Web. It presents the synergies that can be created between Web Services and Semantic Web technologies to provide a new generation of e-services.

Keywords: Semantic Web, Web service, Web process, WWW.

Downloads: 1244
49 NANCY: Combining Adversarial Networks with Cycle-Consistency for Robust Multi-Modal Image Registration

Authors: Mirjana Ruppel, Rajendra Persad, Amit Bahl, Sanja Dogramadzi, Chris Melhuish, Lyndon Smith

Abstract:

Multimodal image registration is a profoundly complex task, which is why deep learning has been widely used to address it in recent years. However, two main challenges remain: first, the lack of ground truth data calls for an unsupervised learning approach, which leads to the second challenge of defining a feasible loss function that can compare two images of different modalities to judge their level of alignment. To avoid this issue altogether, we implement a generative adversarial network consisting of two registration networks $G_{AB}$, $G_{BA}$ and two discrimination networks $D_A$, $D_B$ connected by spatial transformation layers. $G_{AB}$ learns to generate a deformation field which registers an image of modality B to an image of modality A. To do that, it uses the feedback of the discriminator $D_B$, which learns to judge the quality of alignment of the registered image B. $G_{BA}$ and $D_A$ learn a mapping from modality A to modality B. Additionally, a cycle-consistency loss is implemented. For this, both registration networks are employed twice, resulting in images $\hat{A}$, $\hat{B}$, which were registered to $\tilde{B}$, $\tilde{A}$, which in turn were registered to the initial image pair A, B. Thus the resulting and initial images of the same modality can be easily compared. A dataset of liver CT and MRI was used to evaluate the quality of our approach and to compare it against learning and non-learning based registration algorithms. Our approach leads to Dice scores of up to 0.80 ± 0.01 and is therefore comparable to and slightly more successful than algorithms like SimpleElastix and VoxelMorph.

Keywords: Multimodal image registration, GAN, cycle consistency, deep learning.

Downloads: 738
48 Early Depression Detection for Young Adults with a Psychiatric and AI Interdisciplinary Multimodal Framework

Authors: Raymond Xu, Ashley Hua, Andrew Wang, Yuru Lin

Abstract:

During COVID-19, the depression rate has increased dramatically. Young adults are most vulnerable to the mental health effects of the pandemic. Members of lower-income families are diagnosed with depression at a higher rate than the general population but have less access to clinics. This research aims to achieve early depression detection at low cost, large scale, and high accuracy with an interdisciplinary approach that incorporates clinical practices defined by the American Psychiatric Association (APA) as well as a multimodal AI framework. The proposed approach detected the nine depression symptoms with Natural Language Processing sentiment analysis and a symptom-based lexicon uniquely designed for young adults. The experiments were conducted on multimedia survey results from adolescents and young adults and on unbiased Twitter communications. The result was further aggregated with the facial emotional cues analyzed by a Convolutional Neural Network on the multimedia survey videos. Five experiments, each conducted on 10k data entries, reached consistent results with an average accuracy of 88.31%, higher than the existing natural language analysis models. This approach can reach 300+ million daily active Twitter users and is highly accessible to low-income populations. It can promote early depression detection, raise awareness among adolescents and young adults, and reveal complementary cues to assist clinical depression diagnosis.

Keywords: Artificial intelligence, depression detection, facial emotion recognition, natural language processing, mental disorder.

Downloads: 1086
47 Security Analysis of Password Hardened Multimodal Biometric Fuzzy Vault

Authors: V. S. Meenakshi, G. Padmavathi

Abstract:

Biometric techniques are gaining importance for personal authentication and identification compared to traditional authentication methods. Biometric templates are vulnerable to a variety of attacks due to their inherent nature. When a person's biometric is compromised, his identity is lost. In contrast to a password, a biometric is not revocable. Therefore, securing the stored biometric template is crucial. Crypto biometric systems are authentication systems which blend the ideas of cryptography and biometrics. The fuzzy vault is a proven crypto biometric construct which is used to secure biometric templates. However, the fuzzy vault suffers from certain limitations such as non-revocability and cross-matching. The security of the fuzzy vault is affected by the non-uniform nature of the biometric data. A fuzzy vault hardened with a password overcomes these limitations. The password provides an additional layer of security and enhances user privacy. The retina has certain advantages over other biometric traits. Retinal scans are used in high-end security applications such as access control to areas or rooms in military installations, power plants, and other high-risk security areas. This work applies the idea of the fuzzy vault to the retinal biometric template. Multimodal biometric systems perform well compared to single-modal biometric systems. The proposed multimodal biometric fuzzy vault includes combined feature points from retina and fingerprint. The combined vault is hardened with a user password to achieve a high level of security. The security of the combined vault is measured using min-entropy. The proposed password-hardened multi-biometric fuzzy vault is robust against stored biometric template attacks.
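The abstract above measures vault security with min-entropy but does not say how the underlying probabilities are estimated. As a hedged illustration of the general definition only, H_min = -log2(max_i p_i), the sketch below computes min-entropy for a discrete distribution; the function name `min_entropy` and the example distributions are assumptions made for illustration.

```python
import math


def min_entropy(probabilities):
    """Min-entropy in bits: -log2 of the most likely outcome's probability.

    `probabilities` is a sequence of non-negative values summing to 1.
    How such a distribution would be estimated for a fuzzy vault is
    not specified in the abstract; this shows only the definition.
    """
    if not probabilities:
        raise ValueError("distribution must not be empty")
    if any(p < 0 for p in probabilities):
        raise ValueError("probabilities must be non-negative")
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return -math.log2(max(probabilities))


# A uniform distribution over 4 outcomes gives 2 bits; a skewed one gives fewer.
uniform_bits = min_entropy([0.25, 0.25, 0.25, 0.25])
skewed_bits = min_entropy([0.5, 0.25, 0.25])
```

Because min-entropy depends only on the most probable outcome, non-uniform biometric data lowers it, which is consistent with the abstract's remark that non-uniformity affects vault security.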

Keywords: Biometric Template Security, Crypto Biometric Systems, Hardening Fuzzy Vault, Min-Entropy.

Downloads: 2114
46 Saudi Twitter Corpus for Sentiment Analysis

Authors: Adel Assiri, Ahmed Emam, Hmood Al-Dossari

Abstract:

Sentiment analysis (SA) has received growing attention in Arabic language research. However, few studies have directly applied SA to Arabic, owing to the lack of a publicly available dataset for this language. This paper partially bridges this gap by focusing on one Arabic dialect, the Saudi dialect. This paper presents an annotated data set of 4700 for Saudi dialect sentiment analysis (K = 0.807). Our future work is to extend this corpus and create a large-scale lexicon for the Saudi dialect from the corpus.
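The abstract above reports K = 0.807 without naming the statistic. Assuming it denotes a kappa-style inter-annotator agreement score, the sketch below shows how Cohen's kappa for two annotators could be computed; the choice of Cohen's variant, the function name `cohen_kappa`, and the toy labels are all assumptions, not details from the paper.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each annotator's
    label frequencies.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    # Chance agreement: product of marginal label frequencies, summed over labels.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label for every item.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy example with hypothetical sentiment labels.
annotator_1 = ["pos", "pos", "neg", "neg"]
annotator_2 = ["pos", "neg", "neg", "neg"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

With more than two annotators, a multi-rater statistic such as Fleiss' kappa would be the usual alternative.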

Keywords: Arabic, Sentiment Analysis, Twitter, annotation.

Downloads: 3991
45 Approaches to Developing Semantic Web Services

Authors: Jorge Cardoso

Abstract:

It has been recognized that, due to the autonomy and heterogeneity of Web services and the Web itself, new approaches should be developed to describe and advertise Web services. The most notable approaches rely on the description of Web services using semantics. This new breed of Web services, termed semantic Web services, will enable the automatic annotation, advertisement, discovery, selection, composition, and execution of inter-organization business logic, making the Internet a common global platform where organizations and individuals communicate with each other to carry out various commercial activities and to provide value-added services. This paper deals with two of the hottest R&D and technology areas currently associated with the Web: Web services and the semantic Web. It describes how semantic Web services extend Web services as the semantic Web improves the current Web, and presents three different conceptual approaches to deploying semantic Web services, namely, WSDL-S, OWL-S, and WSMO.

Keywords: Semantic Web, Web service, Web process, WWW

Downloads: 1388
44 Emotions Triggered by Children’s Literature Images

Authors: A. Breda, C. Cruz

Abstract:

Images/illustrations play an increasingly relevant role in communicating meanings and triggering emotions in contemporary texts, regardless of the age group for which they are intended or the nature of the texts that host them. It is no coincidence that children's books are full of illustrations and that the image/text ratio decreases as the age of the target group increases. The vast majority of children's books can be considered multimodal texts containing text and images/illustrations that interact with each other to provide the young reader with a broader and more creative understanding of the book's narrative. This interaction is very diverse, ranging from images/illustrations that are not essential for understanding the story to those that contribute significantly to its meaning. Usually, these books are also read by adults, namely by parents, educators, and teachers who act as mediators between the book and the children, explaining aspects that are or seem to be too complex for the child's context. It should be noted that there are books labeled as children's books that are clearly intended for both children and adults. In this work, following a qualitative and interpretative methodology based on written productions, participant observation, and field notes, we describe the perceptions of future teachers of the 1st cycle of basic education, attending a master's degree at a Portuguese university, about the role of the image in literary and non-literary texts, namely in mathematical texts, and how these can constitute precious resources for emotional regulation and for the design of creative didactic situations.
The analysis of the collected data provided evidence of the evolution of the participants' perception of the crucial role of images in children's literature, not only as an emotional regulator for young readers but also as a creative source for the design of meaningful didactic situations that span scientific areas beyond the mother tongue, namely mathematics.

Keywords: Children’s literature, emotions, multimodal texts, soft skills.

Downloads: 110
43 Academic Program Administration via Semantic Web – A Case Study

Authors: Qurban A Memon, Shakeel A. Khoja

Abstract:

Generally, administrative systems in an academic environment are disjoint and support independent queries. The objective of this work is to semantically connect these independent systems so that queries can be run on the integrated platform. The proposed framework, by enriching educational material in the legacy systems, provides a value-added semantics layer where activities such as annotation, query, and reasoning can be carried out to support management requirements. We discuss the development of this ontology framework with a case study of UAE University program administration to show how semantic web technologies can be used by the administration to develop student profiles for better academic program management.

Keywords: Academic Program Administration, Semantic Web, Web Technology

Downloads: 1584
42 Development System for Emotion Detection Based on Brain Signals and Facial Images

Authors: Suprijanto, Linda Sari, Vebi Nadhira, IGN. Merthayasa, Farida I.M

Abstract:

Detection of human emotions has many potential applications. One application is to quantify audience attentiveness in order to evaluate acoustic quality in a concert hall, based on the subjective audio preference of the audience. To obtain a fair evaluation of acoustic quality, this research proposes a system for multimodal emotion detection: one modality is based on brain signals measured using an electroencephalogram (EEG), and the second modality is sequences of facial images. In the experiment, an audio signal consisting of normal and disorder sounds was customized. The audio signal was then played in order to stimulate positive/negative emotion feedback from volunteers. The EEG signal from the temporal lobes, i.e., T3 and T4, was used to measure brain response, and the sequence of facial images was used to monitor facial expression while the volunteer listened to the audio signal. From the EEG signal, features were extracted from changes in the brain waves, particularly the alpha and beta waves. Features of facial expression were extracted based on analysis of motion images. We implemented an advanced optical flow method to detect the most active facial muscles in the transition from a normal expression to another emotional expression, represented in vector flow maps. To reduce the problem of detecting the emotion state, the vector flow maps are transformed into a compass mapping that represents the major directions and velocities of facial movement. The results showed that the power of the beta wave increased when disorder sound stimulation was given; however, each volunteer gave different emotion feedback. Based on features derived from facial images, optical flow compass mapping is promising as additional information for deciding on emotion feedback.
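The abstract above does not specify how the compass mapping is built from the vector flow maps. One plausible reading, sketched below purely as an assumption, is to bin each flow vector into one of eight compass sectors and accumulate its magnitude, so that the dominant directions and speeds of movement stand out. The names `DIRECTIONS` and `compass_histogram`, the eight-sector layout, and the y-axis-up convention are all hypothetical.

```python
import math

# Eight 45-degree sectors, counter-clockwise starting from east.
# Assumes a mathematical coordinate system (y axis pointing up);
# image coordinates with y pointing down would swap north and south.
DIRECTIONS = ("E", "NE", "N", "NW", "W", "SW", "S", "SE")


def compass_histogram(vectors):
    """Accumulate flow magnitude per compass sector.

    `vectors` is an iterable of (dx, dy) displacement pairs, e.g. the
    entries of a dense optical flow field flattened into a list.
    """
    totals = dict.fromkeys(DIRECTIONS, 0.0)
    for dx, dy in vectors:
        magnitude = math.hypot(dx, dy)
        if magnitude == 0.0:
            continue  # a zero vector has no direction
        angle = math.degrees(math.atan2(dy, dx)) % 360.0
        # Shift by half a sector so that each sector is centred on its direction.
        sector = int((angle + 22.5) // 45) % 8
        totals[DIRECTIONS[sector]] += magnitude
    return totals


# Toy flow field: movement mostly to the west, some to the north.
histogram = compass_histogram([(-3.0, 0.0), (-1.0, 0.0), (0.0, 2.0), (0.0, 0.0)])
dominant = max(histogram, key=histogram.get)
```

Reducing a dense flow field to a handful of sector totals is one way such a mapping could simplify the later emotion-state decision.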

Keywords: Multimodal Emotion Detection, EEG, Facial Image, Optical Flow, compass mapping, Brain Wave

Downloads: 2249
41 A Recommender Agent to Support Virtual Learning Activities

Authors: P. Valdiviezo, G. Riofrio, R. Reategui

Abstract:

This article describes the implementation of an intelligent agent that provides recommendations for educational resources in a virtual learning environment (VLE). It aims to support pending (undeveloped) student learning activities. It begins by analyzing the entities of the proposed VLE data model that take part in the recommender process. The pending student activities are then identified; these constitute the input information for the agent. Using the attribute-based recommender technique, the information is processed and resource recommendations are obtained. These support the development of pending activities in the course. To integrate this technique, we used an ontology, which supported the semantic annotation of attributes and the retrieval of recommended files.
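The abstract above names an attribute-based recommender technique without giving its details. As a rough sketch under that caveat, the function below ranks resources by how many attributes they share with a pending activity; the name `recommend`, the overlap-count scoring, and the sample attributes are illustrative assumptions rather than the authors' method.

```python
def recommend(activity_attributes, resources, top_n=3):
    """Rank resources by the number of attributes shared with an activity.

    `activity_attributes` is an iterable of attribute strings describing
    a pending activity; `resources` maps a resource name to its attributes.
    Resources with no shared attribute are left out. Ties are broken
    alphabetically so the result is deterministic.
    """
    wanted = set(activity_attributes)
    scored = []
    for name, attributes in resources.items():
        overlap = len(wanted & set(attributes))
        if overlap > 0:
            scored.append((overlap, name))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_n]]


# Hypothetical catalogue of educational resources and their attributes.
catalogue = {
    "intro_video.mp4": ["algebra", "video", "beginner"],
    "exercise_set.pdf": ["algebra", "exercise", "beginner"],
    "history_notes.pdf": ["history", "reading"],
}
suggestions = recommend(["algebra", "exercise"], catalogue)
```

In an ontology-backed system the attribute strings would presumably come from semantic annotations rather than hand-written lists.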

Keywords: Learning activities, educational resource, recommender agent, recommendation technique, ontology.

Downloads: 1608
40 Automatic Real-Patient Medical Data De-Identification for Research Purposes

Authors: Petr Vcelak, Jana Kleckova

Abstract:

Our medicine-oriented research is based on a medical data set of real patients. Sharing private patient data with people other than clinicians or hospital staff is a security problem, so personally identifying information has to be removed from the medical data. After a de-identification process, the medical data without private data are available for any research purpose. In this paper, we introduce a universal, automatic, rule-based de-identification application that performs this process on heterogeneous medical data. A patient's private identification is replaced by a unique identification number, even in burned-in annotations in pixel data. The identical identification is used for all of a patient's medical data, so relationships within the data are preserved. Hospitals can benefit from research feedback based on the results.
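The key property described above is that the same patient always receives the same replacement number, so records stay linked after private fields are removed. A minimal sketch of that idea follows; the class `Pseudonymizer`, the function `deidentify`, and the field names are hypothetical, and the handling of DICOM files or burned-in pixel annotations is not covered here.

```python
import itertools


class Pseudonymizer:
    """Assign each patient identifier a stable, unique research number."""

    def __init__(self, start=1):
        self._counter = itertools.count(start)
        self._mapping = {}

    def pseudonym(self, patient_id):
        # Reuse the existing number so all records of one patient stay linked.
        if patient_id not in self._mapping:
            self._mapping[patient_id] = next(self._counter)
        return self._mapping[patient_id]


def deidentify(record, pseudonymizer, id_field="patient_id",
               private_fields=("name", "birth_date")):
    """Return a copy of `record` without private fields or the original ID.

    The field names are assumptions chosen for this example only.
    """
    cleaned = {
        key: value
        for key, value in record.items()
        if key != id_field and key not in private_fields
    }
    cleaned["research_id"] = pseudonymizer.pseudonym(record[id_field])
    return cleaned


# Two hypothetical records of the same patient and one of another patient.
mapper = Pseudonymizer()
scan = deidentify({"patient_id": "P-17", "name": "Jane Doe", "modality": "CT"}, mapper)
report = deidentify({"patient_id": "P-17", "name": "Jane Doe", "finding": "none"}, mapper)
other = deidentify({"patient_id": "P-42", "name": "John Roe", "modality": "MR"}, mapper)
```

In practice the mapping table itself is sensitive and would need to stay inside the hospital, since it allows re-identification.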

Keywords: DASTA, De-identification, DICOM, Health Level Seven, Medical data, OCR, Personal data

Downloads: 1592
39 Research on Strategy for Automated Scaleless-Map Compilation

Authors: Yin Jie, Qin Qiming, Yin Yong

Abstract:

As a tool for human spatial cognition and thinking, the map has long played an important role. Maps are perhaps as fundamental to society as language and the written word. Economic and social development requires people to have an extensive and in-depth understanding of their living environment, from the global scale down to urban housing. This has brought unprecedented opportunities and challenges for traditional cartography. This paper first proposes the concept of the scaleless map and its basic characteristics, through an analysis of the existing multi-scale representation techniques. Some strategies are then presented for automated map compilation. Taking into account the demands of automated map compilation, we propose that the software, the WJ workstation, must have four technical features: generalization operators, symbol primitives, dynamic annotation, and mapping process templates. This paper provides a more systematic new idea and solution for improving the intelligence and automation of scaleless cartography.

Keywords: scaleless-map, strategy, map generalization, automated compilation, WJ workstation.

Downloads: 1396