Search results for: multimodal data
25187 Biosignal Recognition for Personal Identification
Authors: Hadri Hussain, M. Nasir Ibrahim, Chee-Ming Ting, Mariani Idroas, Fuad Numan, Alias Mohd Noor
Abstract:
Biometric security systems have become an important application in client identification and verification. A conventional biometric system is normally unimodal, depending on either behavioural or physiological information for authentication purposes. The behavioural biometric depends on human body biometric signals (such as speech) and biosignal biometrics (such as the electrocardiogram (ECG) and the phonocardiogram or heart sound (HS)). The speech signal is commonly used in biometric recognition systems, while the ECG and the HS have been used to identify a person’s diseases uniquely related to its cluster. However, the conventional biometric system is liable to spoof attacks that affect the performance of the system. Therefore, a multimodal biometric security system is developed, based on the biometric signals of ECG, HS, and speech. The biosignal data are first segmented, and the Mel Frequency Cepstral Coefficients (MFCC) method is used to extract features from each segment. The Hidden Markov Model (HMM) is used to model the client and to classify the unknown input with respect to the model. The recognition system involves training and testing sessions, known as client identification (CID). In this project, twenty clients are tested with the developed system. The best overall performance at 44 kHz was 93.92% for ECG and the worst overall performance was ECG at 88.47%. The results were compared to the best overall performance at 44 kHz for (20 clients) to increment of clients, which was 90.00% for HS and the worst overall performance falls at ECG at 79.91%. 
It can be concluded that the different multimodal biometrics have a substantial effect on the performance of the biometric system and that, with the increment of data, even with higher-frequency sampling, the performance still decreased slightly, as predicted.
Keywords: electrocardiogram, phonocardiogram, hidden Markov model, mel frequency cepstral coefficients, client identification
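The classification step described in the abstract above, scoring an unknown input against each client's HMM and picking the best match, can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes discrete observation symbols in place of MFCC feature vectors, and the client names and all probabilities are invented for the example.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) using the scaled forward algorithm."""
    n = len(start)
    # Initialise with the start distribution and the first emission.
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_like = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transitions, then emit obs[t].
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the sequence is impossible under this model
        log_like += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_like


def identify_client(obs, models):
    """Pick the client whose HMM assigns the highest likelihood to obs."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))


# Hypothetical two-state models: (start, transition, emission) per client.
models = {
    "client_a": ([0.6, 0.4], [[0.7, 0.3], [0.4, 0.6]], [[0.9, 0.1], [0.2, 0.8]]),
    "client_b": ([0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]], [[0.1, 0.9], [0.1, 0.9]]),
}

print(identify_client([1, 1, 1, 1], models))  # prints client_b
print(identify_client([0, 0, 0, 0], models))  # prints client_a
```

A real system along the lines of the abstract would replace the discrete emission tables with continuous densities over MFCC vectors, but the decision rule stays the same.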
Procedia PDF Downloads 279
25186 On Increase and Development Prospects of Competitiveness of Georgia’s Transport-Logistical System on the Contemporary Stage
Authors: Ketevan Goletiani
Abstract:
Multimodal transport is a rational Europe-Asia solution for the twenty-first century. The prerequisite for success in this form of cargo carriage is not a technological solution but a comprehensive attitude towards it. Integration of the transport industry must cover both the technical and the organizational-economic fields. Support for multimodal transport must be a transport policy priority for the various organizations of Europe and Asia. The approach to transport as a unified system has changed to a certain extent under market conditions. Nowadays, competition between different kinds of transport should be regarded not as a contest of one kind of transport against another but as a stimulus for transport development. Basically, transport logistics, as the recent methodology and organization of the rational flow of cargo processed at specialized logistics centres, raises the efficiency of such flows, decreases non-operating expenses and gives transport companies the opportunity to keep up with the times and meet market clients’ requirements. The advanced transport-forwarding and logistics firms are analyzed.
Keywords: transport systems, multimodal transport, competition, transport logistics
Procedia PDF Downloads 436
25185 Metaphors of Love and Passion in Lithuanian Comics
Authors: Saulutė Juzelėnienė, Skirmantė Šarkauskienė
Abstract:
This paper analyses the multimodal representations of the concepts of LOVE and PASSION in the Lithuanian graphic novel “Gertrūda” by Gerda Jord. The research is based on the earlier findings of Forceville (2005) and Eerden (2009) as well as insights from Shihara and Matsunaka (2009) and Kövecses (2000). The target and source domains of LOVE and PASSION metaphors in comics are expressed by verbal and non-verbal cues. The analysis of non-verbal cues adopts the concepts of runes and indexes. A pictorial rune is a graphic representation in comics of an object that does not exist in reality, such as lines, dashes, and text "balloons"; a pictorial index is a graphically represented object of reality, a real symptom expressing a certain emotion, such as a wide smile or furrowed eyebrows. Indexes are often hyperbolized in comics. The research revealed that the most frequent source domains are CLOSENESS/UNITY, NATURAL/PHYSICAL FORCE, VALUABLE OBJECT, and PRESSURE. The target is the emotion of LOVE/PASSION, which belongs to the more abstract domain of psychological experience. In this kind of metaphor, the picture can be interpreted as representing the emotion of happiness. Data are taken from Lithuanian comic books and Internet sites where comics have been presented. The data and analysis provided in this article aim to show that there are pictorial metaphors that manifest conceptual metaphors that are also expressed verbally, and that the methodological framework constructed for the analysis in the papers by Forceville et al. is applicable to other emotions and culture-specific pictorial manifestations.
Keywords: multimodal metaphor, conceptual metaphor, comics, graphic novel, concept of love/passion
Procedia PDF Downloads 64
25184 Development of a Sequential Multimodal Biometric System for Web-Based Physical Access Control into a Security Safe
Authors: Babatunde Olumide Olawale, Oyebode Olumide Oyediran
Abstract:
A security safe is a place or building where classified documents and precious items are kept. Many technologies have been used to prevent unauthorised persons from gaining access to such safes. However, frequent reports of unauthorised persons gaining access to security safes in order to remove documents and items indicate that there is still a security gap in the technologies recently used for access control. In this paper, we try to solve this problem by developing a multimodal biometric system for physical access control into a security safe using face and voice recognition. The safe is accessed by a combination of face and speech pattern recognition, in that sequential order. User authentication is achieved through a camera/sensor unit and a microphone unit, both attached to the door of the safe. The user’s face is captured by the camera/sensor, while the speech is captured by the microphone unit. The Scale-Invariant Feature Transform (SIFT) algorithm was used to train images to form templates for the face recognition system, while the Mel-Frequency Cepstral Coefficients (MFCC) algorithm was used to train the speech recognition system to recognise authorised users’ speech. Both algorithms were hosted on two separate web-based servers, and, for automatic analysis of our work, the developed system was simulated in a MATLAB environment. The results show that the developed system was able to give access to authorised users while denying unauthorised persons access to the security safe.
Keywords: access control, multimodal biometrics, pattern recognition, security safe
Procedia PDF Downloads 332
25183 Leveraging Multimodal Neuroimaging Techniques to in vivo Address Compensatory and Disintegration Patterns in Neurodegenerative Disorders: Evidence from Cortico-Cerebellar Connections in Multiple Sclerosis
Authors: Efstratios Karavasilis, Foteini Christidi, Georgios Velonakis, Agapi Plousi, Kalliopi Platoni, Nikolaos Kelekis, Ioannis Evdokimidis, Efstathios Efstathopoulos
Abstract:
Introduction: Advanced structural and functional neuroimaging techniques contribute to the study of anatomical and functional brain connectivity and its role in the pathophysiology and symptoms’ heterogeneity in several neurodegenerative disorders, including multiple sclerosis (MS). Aim: In the present study, we applied multiparametric neuroimaging techniques to investigate the structural and functional cortico-cerebellar changes in MS patients. Material: We included 51 MS patients (28 with clinically isolated syndrome [CIS], 31 with relapsing-remitting MS [RRMS]) and 51 age- and gender-matched healthy controls (HC) who underwent MRI in a 3.0T MRI scanner. Methodology: The acquisition protocol included high-resolution 3D T1 weighted, diffusion-weighted imaging and echo planar imaging sequences for the analysis of volumetric, tractography and functional resting state data, respectively. We performed between-group comparisons (CIS, RRMS, HC) using CAT12 and CONN16 MATLAB toolboxes for the analysis of volumetric (cerebellar gray matter density) and functional (cortico-cerebellar resting-state functional connectivity) data, respectively. Brainance suite was used for the analysis of tractography data (cortico-cerebellar white matter integrity; fractional anisotropy [FA]; axial and radial diffusivity [AD; RD]) to reconstruct the cerebellum tracts. Results: Patients with CIS did not show significant gray matter (GM) density differences compared with HC. However, they showed decreased FA and increased diffusivity measures in cortico-cerebellar tracts, and increased cortico-cerebellar functional connectivity. Patients with RRMS showed decreased GM density in cerebellar regions, decreased FA and increased diffusivity measures in cortico-cerebellar WM tracts, as well as a pattern of increased and mostly decreased functional cortico-cerebellar connectivity compared to HC. 
The comparison between CIS and RRMS patients revealed a significant GM density difference, reduced FA and increased diffusivity measures in WM cortico-cerebellar tracts, and increased/decreased functional connectivity. The identification of decreased WM integrity and increased functional cortico-cerebellar connectivity without GM changes in CIS, and the pattern of decreased GM density, decreased WM integrity and mostly decreased functional connectivity in RRMS patients, emphasize the role of compensatory mechanisms in early disease stages and the disintegration of structural and functional networks with disease progression. Conclusions: Our study highlights the added value of multimodal neuroimaging techniques for the in vivo investigation of cortico-cerebellar brain changes in neurodegenerative disorders. An extension and future opportunity to leverage multimodal neuroimaging data remains the integration of such data into recently applied machine learning algorithms to classify and predict patients’ disease course more accurately.
Keywords: advanced neuroimaging techniques, cerebellum, MRI, multiple sclerosis
Procedia PDF Downloads 139
25182 A Multimodal Discourse Analysis of Gender Representation on Health and Fitness Magazine Cover Pages
Authors: Nashwa Elyamany
Abstract:
In visual cultures, namely that of the United States, media representations are such influential and pervasive reflections of societal norms and expectations to the extent that they impact the manner in which both genders view themselves. Health and fitness magazines fall within the realm of visual culture. Since the main goal of communication is to ensure proper dissemination of information in order for the target audience to grasp the intended messages, it becomes imperative that magazine publishers, editors, advertisers and image producers use different modes of communication within their reach to convey messages to their readers and viewers. A rapid waxing flow of multimodality floods popular discourse, particularly health and fitness magazine cover pages. The use of well-crafted cover lines and visual images is imbued with agendas, consumerist ideologies and properties capable of effectively conveying implicit and explicit meaning to potential readers and viewers. In essence, the primary goal of this thesis is to interrogate the multi-semiotic operations and manifestations of hegemonic masculinity and femininity in male and female body culture, particularly on the cover pages of the twin American magazines Men's Health and Women's Health using corpora that spanned from 2011 to the mid of 2016. The researcher explores the semiotic resources that contribute to shaping and legitimizing a new form of postmodern, consumerist, gendered discourse that positions the reader-viewer ideologically. Methodologically, the researcher carries out analysis on the macro and micro levels. On the macro level, the researcher takes on a critical stance to illuminate the ideological nature of the multimodal ensemble of the cover pages, and, on the micro level, seeks to put forward new theoretical and methodological routes through which the semiotic choices well invested on the media texts can be more objectively scrutinized. 
On the macro level, a 'themes' analysis is initially conducted to isolate the overarching themes that dominate the fitness discourse on the cover pages under study. It is argued that variation in the frequencies of such themes is indicative, broadly speaking, of which facets of hegemonic masculinity and femininity are infused in the fitness discourse on the cover pages. On the micro level, this research work encompasses three sub-levels of analysis. The researcher follows an SF-MMDA approach, drawing on a trio of analytical frameworks: Halliday's SFG for the verbal analysis; Kress & van Leeuwen's VG for the visual analysis; and CMT in relation to Sperber & Wilson's RT for the pragma-cognitive analysis of multimodal metaphors and metonymies. The data are presented as detailed descriptions in conjunction with frequency tables, ANOVA with alpha=0.05 and MANOVA in the multiple phases of analysis. Insights and findings from this multi-faceted, social-semiotic analysis are interpreted in light of Cultivation Theory, Self-objectification Theory and the literature to date. Implications for future research include the implementation of a multi-dimensional approach whereby linguistic and visual analytical models are deployed with special regard to cultural variation.
Keywords: gender, hegemony, magazine cover page, multimodal discourse analysis, multimodal metaphor, multimodal metonymy, systemic functional grammar, visual grammar
Procedia PDF Downloads 348
25181 Transmedia and Platformized Political Discourse in a Growing Democracy: A Study of Nigeria’s 2023 General Elections
Authors: Tunde Ope-Davies
Abstract:
Transmediality and platformization as online content-sharing protocols have continued to accentuate the growing impact of the unprecedented digital revolution across the world. The rapid transformation across all sectors as a result of this revolution has continued to spotlight the increasing importance of new media technologies in redefining and reshaping the rhythm and dynamics of our private and public discursive practices. Equally, social and political activities are being impacted daily through the creation and transmission of political discourse content through multi-channel platforms such as mobile telephone communication, social media networks and the internet. It has been observed that digital platforms have become central to the production, processing, and distribution of multimodal social data and cultural content. The platformization paradigm thus underpins our understanding of how digital platforms enhance the production and heterogenous distribution of media and cultural content through these platforms and how this process facilitates socioeconomic and political activities. The use of multiple digital platforms to share and transmit political discourse material synchronously and asynchronously has gained some exciting momentum in the last few years. Nigeria’s 2023 general elections amplified the usage of social media and other online platforms as tools for electioneering campaigns, socio-political mobilizations and civic engagement. The study, therefore, focuses on transmedia and platformed political discourse as a new strategy to promote political candidates and their manifesto in order to mobilize support and woo voters. This innovative transmedia digital discourse model involves a constellation of online texts and images transmitted through different online platforms almost simultaneously. 
The data for the study were extracted from the 2023 general election campaigns in Nigeria between January and March 2023 through media monitoring, manual download and the use of software to harvest the online electioneering campaign material. I adopted a discursive-analytic qualitative technique with toolkits drawn from a computer-mediated multimodal discourse paradigm. The study maps the progressive development of digital political discourse in this young democracy. The findings also demonstrate the inevitable transformation of modern democratic practice through platform-dependent and transmedia political discourse. Political actors and media practitioners now deploy layers of social media network platforms to convey messages and mobilize supporters in order to aggregate and maximize the impact of their media campaign projects and audience reach.
Keywords: social media, digital humanities, political discourse, platformized discourse, multimodal discourse
Procedia PDF Downloads 83
25180 Assessing the Physical Conditions of Motorcycle Taxi Stands and Comfort Conditions of the Drivers in the Central Business District of Bangkok
Authors: Nissa Phloimontri
Abstract:
This research explores the current physical conditions of motorcycle taxi stands located near the BTS stations in the central business district (CBD) and the comfort conditions of motorcycle taxi drivers. The criteria for the physical stand survey and assessment integrate multimodal access design guidelines. After the survey, stands that share similar characteristics are classified into a series of typologies. Based on the environmental comfort model, questionnaires and in-depth interviews are conducted to evaluate the comfort levels of drivers, including physical, functional, and psychological comfort. The results indicate that a number of motorcycle taxi stands are not up to standard and are not conducive to the work-related activities of drivers. The study concludes by recommending public policy for integrated paratransit stops that support multimodal transportation and seamless mobility concepts within the specific context of Bangkok and promote the quality of work life of motorcycle taxi drivers.
Keywords: motorcycle taxi, paratransit stops, environmental comfort, quality of work life
Procedia PDF Downloads 111
25179 A Multimodal Measurement Approach Using Narratives and Eye Tracking to Investigate Visual Behaviour in Perceiving Naturalistic and Urban Environments
Authors: Khizar Z. Choudhrya, Richard Coles, Salman Qureshi, Robert Ashford, Salim Khan, Rabia R. Mir
Abstract:
The majority of existing landscape research has been derived from heuristic evaluations, without empirical insight into real participants’ visual responses. In this research, a modern multimodal measurement approach (using narratives and eye tracking) was applied to investigate visual behaviour in perceiving naturalistic and urban environments. This research is unique in exploring gaze behaviour on environmental images possessing different levels of saliency. Eye behaviour is predominantly attracted by salient locations. The methodological concept of this research on naturalistic and urban environments is drawn from approaches in market research. Borrowing methodologies from market research that examine visual responses and qualities provided a critical and hitherto unexplored approach. This research was conducted using mixed quantitative and qualitative approaches. On the whole, the results corroborated existing landscape research findings, but they also identified potential refinements. The research contributes both methodologically and empirically to human-environment interaction (HEI). This study focused on initial impressions of environmental images with the help of eye tracking. Taking into consideration the importance of the image, this study explored the factors that influence initial fixations in relation to expectations and preferences. A key finding is that each participant has a unique navigation style while surfing through the different elements of landscape images. This individual navigation style is given the name ‘visual signature’. This study adds the necessary clarity that would complete the picture and bring insight for future landscape researchers.
Keywords: human-environment interaction (HEI), multimodal measurement, narratives, eye tracking
Procedia PDF Downloads 337
25178 The Social Aspects of Code-Switching in Online Interaction: The Case of Saudi Bilinguals
Authors: Shirin Alabdulqader
Abstract:
This research aims to investigate the concept of code-switching (CS) between English and Arabic and the CS practices of Saudi online users through a Translanguaging (TL) lens, for a more inclusive view of the nature of the study data. It employs Digitally Mediated Communication (DMC), specifically the WhatsApp and Twitter platforms, in order to understand how users employ online resources to communicate with others on a daily basis. This project looks beyond language and considers the multimodal affordances (visual and audio means) that interlocutors utilise in their online communicative practices to shape their online social existence. This exploratory study is based on a data-driven interpretivist epistemology, as it aims to understand how meaning (reality) is created by individuals within different contexts. This project used a mixed-method approach, combining a qualitative and a quantitative approach. In the former, data were collected from online chats and interview responses, while in the latter a questionnaire was employed to understand the frequency of and relations between the participants’ linguistic and non-linguistic practices and their social behaviours. The participants were eight bilingual Saudi nationals (both men and women, aged between 20 and 50 years old) who interacted with others online. These participants provided their online interactions, participated in an interview and responded to a questionnaire. The study data were gathered from 194 WhatsApp chats and 122 Tweets. These data were analysed and interpreted according to three levels: conversational turn taking and CS; the linguistic description of the data; and CS and persona. This project contributes to the emerging field of analysing online Arabic data systematically, and to the field of multimodality and bilingual sociolinguistics. The findings are reported for each of the three levels. 
For conversational turn taking, the CS analysis revealed that CS was used to accomplish negotiation and develop meaning in the conversation. With regard to the linguistic practices of the CS data, the majority of the code-switched words were content morphemes. The third level of data interpretation is CS and its relationship with identity; two types of identity were indexed: absolute identity and contextual identity. This study contributes to the DMC literature and bridges some of the existing gaps. Most of the findings, if not all, support the TL notion that multiliteracy is one’s ability to decode multimodal communication and that this multimodality contributes to the meaning. Whether or not this applies to the online affordances used by monolinguals or multilinguals, and perceived not only by specific generations but also by any online multiliterates, the study provides the linguistic features of CS utilised by Saudi bilinguals and determines the relationship between these features and the contexts in which they appear.
Keywords: social media, code-switching, translanguaging, online interaction, Saudi bilinguals
Procedia PDF Downloads 130
25177 Early Depression Detection for Young Adults with a Psychiatric and AI Interdisciplinary Multimodal Framework
Authors: Raymond Xu, Ashley Hua, Andrew Wang, Yuru Lin
Abstract:
During COVID-19, the depression rate has increased dramatically. Young adults are most vulnerable to the mental health effects of the pandemic. Lower-income families are diagnosed with depression at a higher rate than the general population but have less access to clinics. This research aims to achieve early depression detection at low cost, large scale, and high accuracy with an interdisciplinary approach that incorporates clinical practices defined by the American Psychiatric Association (APA) as well as a multimodal AI framework. The proposed approach detected the nine depression symptoms with Natural Language Processing sentiment analysis and a symptom-based lexicon uniquely designed for young adults. The experiments were conducted on multimedia survey results from adolescents and young adults and on unbiased Twitter communications. The result was further aggregated with the facial emotional cues analyzed by a Convolutional Neural Network on the multimedia survey videos. Five experiments, each conducted on 10k data entries, reached consistent results with an average accuracy of 88.31%, higher than the existing natural language analysis models. This approach can reach 300+ million daily active Twitter users and is highly accessible to low-income populations, promoting early depression detection, raising awareness among adolescents and young adults, and revealing complementary cues to assist clinical depression diagnosis.
Keywords: artificial intelligence, COVID-19, depression detection, psychiatric disorder
Procedia PDF Downloads 130
25176 NANCY: Combining Adversarial Networks with Cycle-Consistency for Robust Multi-Modal Image Registration
Authors: Mirjana Ruppel, Rajendra Persad, Amit Bahl, Sanja Dogramadzi, Chris Melhuish, Lyndon Smith
Abstract:
Multimodal image registration is a profoundly complex task, which is why deep learning has been used widely to address it in recent years. However, two main challenges remain. Firstly, the lack of ground truth data calls for an unsupervised learning approach, which leads to the second challenge of defining a feasible loss function that can compare two images of different modalities to judge their level of alignment. To avoid this issue altogether, we implement a generative adversarial network consisting of two registration networks G_AB, G_BA and two discrimination networks D_A, D_B connected by spatial transformation layers. G_AB learns to generate a deformation field which registers an image of modality B to an image of modality A. To do that, it uses the feedback of the discriminator D_B, which is learning to judge the quality of alignment of the registered image B. G_BA and D_A learn a mapping from modality A to modality B. Additionally, a cycle-consistency loss is implemented. For this, both registration networks are employed twice, resulting in images Â, B̂ which were registered to B̃, Ã, which were in turn registered to the initial image pair A, B. Thus the resulting and initial images of the same modality can be easily compared. A dataset of liver CT and MRI was used to evaluate the quality of our approach and to compare it against learning-based and non-learning-based registration algorithms. Our approach leads to Dice scores of up to 0.80 ± 0.01 and is therefore comparable to and slightly more successful than algorithms like SimpleElastix and VoxelMorph.
Keywords: cycle consistency, deformable multimodal image registration, deep learning, GAN
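The Dice score used to report registration quality in the abstract above measures the overlap between two segmentation masks, Dice = 2|A ∩ B| / (|A| + |B|). A minimal sketch follows, assuming flattened binary masks. The mask values are hypothetical, and treating two empty masks as perfect agreement is a convention chosen here, not something stated in the abstract.

```python
def dice_score(mask_a, mask_b):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two flat binary masks of equal length."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    # Count voxels that are foreground in both masks.
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    # Total foreground voxels across the two masks.
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * overlap / total


# Hypothetical flattened liver masks: fixed image vs. registered moving image.
fixed_mask = [0, 1, 1, 1, 0, 0]
warped_mask = [0, 0, 1, 1, 1, 0]
print(round(dice_score(fixed_mask, warped_mask), 3))  # prints 0.667
```

A score of 1.0 means the registered mask coincides exactly with the fixed one, and 0.0 means no overlap at all.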
Procedia PDF Downloads 131
25175 Using Trip Planners in Developing Proper Transportation Behavior
Authors: Grzegorz Sierpiński, Ireneusz Celiński, Marcin Staniek
Abstract:
The article discusses multimodal mobility in contemporary societies as a main planning and organizational issue in the functioning of administrative bodies, a problem that genuinely exists in contemporary cities in terms of shaping modern transport systems. The article presents a classification of the available resources and initiatives undertaken to develop multimodal mobility. Solutions can be divided into three groups of measures: physical measures in the form of changes to the transport network infrastructure, organizational measures (including transport policy), and information measures. The last group includes, in particular, direct support for people travelling in the transport network by providing information about ways of using the available means of transport. A special measure contributing to this end is a trip planner. The article compares several selected planners. It includes a short description of the Green Travelling Project, which aims at developing a planner supporting environmentally friendly solutions in terms of transport network operation. The article summarizes preliminary findings of the project.
Keywords: mobility, modal split, multimodal trip, multimodal platforms, sustainable transport
Procedia PDF Downloads 410
25174 Fu Hao From the East: Between Chinese Traditions and Western Pop Cultures
Abstract:
Having studied and worked in North America and Europe, we, two Chinese art educators, have been enormously influenced by eastern and western cultures. We therefore aim to enhance students’ learning experiences by exploring and amalgamating both cultures in art creation. This text draws on our action research study of students’ visual literacy practices in a foundation sketching course at a major Chinese university, exploring art forms by cross-utilizing various cultural aspects. Instead of relying on the predominantly western observational drawing skills in our classroom, we taught students about ancient Chinese art in the provincial museum, using the Fu Hao owl-shaped vessel, a Shang Dynasty national treasure, as the final sketch project of the course. We took up multimodal literacy, which emphasizes students’ critical use of creativity to exploit the semiotic potential of communicative modes to address diverse cultural issues through their multimodal design. We used the Hong Kong-based artist Tik Ka’s artworks to demonstrate the cultural amalgamation of Chinese traditions and western pop cultures. Collectively, these approaches create a dialogical space for students to experience, analyze, and negotiate with complex modes and potentially transform their understanding of both cultures by redesigning Fu Hao.
Keywords: Chinese traditions, western pop cultures, Fu Hao, arts education, design sketch
Procedia PDF Downloads 116
25173 Multimodal Biometric Cryptography Based Authentication in Cloud Environment to Enhance Information Security
Authors: D. Pugazhenthi, B. Sree Vidya
Abstract:
Cloud computing is an emerging technology that enables end users to use cloud services on a ‘pay per usage’ basis. The technology is growing at a fast pace, and so are its security threats. One of the services provided by the cloud is storage. In this service, security is vital both for authenticating legitimate users and for protecting information. This paper presents efficient ways of authenticating users and securing information on the cloud. The initial phase proposed in this paper is an authentication technique using a multi-factor, multi-dimensional authentication system with multi-level security. Unique identification and low intrusiveness make user-behaviour-based biometrics more reliable than conventional password authentication. With biometric systems, accounts are accessed only by a legitimate user and not by anyone else. The biometric templates employed here include not a single trait but multiple traits, viz., iris and fingerprints. The coordinating stage of the authentication system uses an ensemble Support Vector Machine (SVM), with the weights of the base SVMs optimized for the ensemble after each individual SVM has been trained by the Artificial Fish Swarm Algorithm (AFSA). This helps generate a user-specific secure cryptographic key from the multimodal biometric template through a fusion process. To avert data security problems, an enhanced security architecture is proposed that uses an encryption and decryption system with double-key cryptography based on a Fuzzy Neural Network (FNN) for data storage and retrieval in cloud computing. The proposed scheme aims to protect records from hackers by preventing cipher text from being broken back to the original text. This improves authentication performance: the proposed double cryptographic key scheme provides better user authentication and better security, distinguishing between genuine and fake users. 
Thus, the proposed work has three important modules: 1) feature extraction, 2) multimodal biometric template generation, and 3) cryptographic key generation. First, feature and texture properties are extracted from the fingerprint and iris images, respectively. Finally, a double-key encryption technique is developed with the help of a fuzzy neural network and a symmetric cryptography algorithm. Because the proposed approach is based on neural networks, it has the advantage that the data cannot be decrypted by a hacker even if they have already been stolen. The results indicate that the authentication process is optimal and that stored information is secure.
Keywords: artificial fish swarm algorithm (AFSA), biometric authentication, decryption, encryption, fingerprint, fusion, fuzzy neural network (FNN), iris, multi-modal, support vector machine classification
Procedia PDF Downloads 259
25172 The Integration of Digital Humanities into the Sociology of Knowledge Approach to Discourse Analysis
Authors: Gertraud Koch, Teresa Stumpf, Alejandra Tijerina García
Abstract:
Discourse analysis research approaches belong to the central research strategies applied throughout the humanities; they focus on the countless forms and ways digital texts and images shape present-day notions of the world. Despite the constantly growing number of relevant digital, multimodal discourse resources, digital humanities (DH) methods are thus far not systematically developed and accessible for discourse analysis approaches. Specifically, the significance of multimodality and meaning plurality modelling are yet to be sufficiently addressed. In order to address this research gap, the D-WISE project aims to develop a prototypical working environment as digital support for the sociology of knowledge approach to discourse analysis and new IT-analysis approaches for the use of context-oriented embedding representations. Playing an essential role throughout our research endeavor is the constant optimization of hermeneutical methodology in the use of (semi)automated processes and their corresponding epistemological reflection. Among the discourse analyses, the sociology of knowledge approach to discourse analysis is characterised by the reconstructive and accompanying research into the formation of knowledge systems in social negotiation processes. The approach analyses how dominant understandings of a phenomenon develop, i.e., the way they are expressed and consolidated by various actors in specific arenas of discourse until a specific understanding of the phenomenon and its socially accepted structure are established. This article presents insights and initial findings from D-WISE, a joint research project running since 2021 between the Institute of Anthropological Studies in Culture and History and the Language Technology Group of the Department of Informatics at the University of Hamburg. 
As an interdisciplinary team, we develop central innovations with regard to the availability of relevant DH applications by building up a uniform working environment, which supports the procedure of the sociology of knowledge approach to discourse analysis within open corpora and heterogeneous, multimodal data sources for researchers in the humanities. We are hereby expanding the existing range of DH methods by developing contextualized embeddings for improved modelling of the plurality of meaning and the integrated processing of multimodal data. The alignment of this methodological and technical innovation is based on the epistemological working methods according to grounded theory as a hermeneutic methodology. In order to systematically relate, compare, and reflect the approaches of structural-IT and hermeneutic-interpretative analysis, the discourse analysis is carried out both manually and digitally. Using the example of current discourses on digitization in the healthcare sector and the associated issues regarding data protection, we have manually built an initial data corpus of which the relevant actors and discourse positions are analysed in conventional qualitative discourse analysis. At the same time, we are building an extensive digital corpus on the same topic based on the use and further development of entity-centered research tools such as topic crawlers and automated newsreaders. In addition to the text material, this consists of multimodal sources such as images, video sequences, and apps. In a blended reading process, the data material is filtered, annotated, and finally coded with the help of NLP tools such as dependency parsing, named entity recognition, co-reference resolution, entity linking, sentiment analysis, and other project-specific tools that are being adapted and developed. The coding process is carried out (semi-)automated by programs that propose coding paradigms based on the calculated entities and their relationships. 
Simultaneously, these programs can be specifically trained by manual coding in a close reading process and refined according to the content issues. Overall, this approach enables purely qualitative, fully automated, and semi-automated analyses to be compared and reflected upon.
Keywords: entanglement of structural IT and hermeneutic-interpretative analysis, multimodality, plurality of meaning, sociology of knowledge approach to discourse analysis
Procedia PDF Downloads 226
25171 A Multimodal Dialogue Management System for Achieving Natural Interaction with Embodied Conversational Agents
Authors: Ozge Nilay Yalcin
Abstract:
Dialogue has been proposed as the natural basis for human-computer interaction, which is behaviorally rich and includes different modalities such as gestures, posture changes, gaze, para-linguistic parameters, and linguistic context. However, equipping a system with these capabilities can affect its usability. One issue is finding a good balance between rich behavior and fluent behavior, as planning and generating these behaviors is computationally expensive. In this work, we propose a multi-modal dialogue management system that automates the conversational flow from text-based dialogue examples and uses synchronized verbal and non-verbal conversational cues to achieve a fluent interaction. Our system is integrated with the Smartbody behavior realizer to provide real-time interaction with an embodied agent. The nonverbal behaviors are used according to turn-taking behavior, emotions, and personality of the user and linguistic analysis of the dialogue. The verbal behaviors are responsive to the emotional value of the utterance and the feedback from the user. Our system aims at online planning of these affective multi-modal components in order to achieve an enhanced user experience with richer and more natural interaction.
Keywords: affect, embodied conversational agents, human-agent interaction, multimodal interaction, natural interfaces
Procedia PDF Downloads 174
25170 Emotions Triggered by Children’s Literature Images
Authors: Ana Maria Reis d'Azevedo Breda, Catarina Maria Neto da Cruz
Abstract:
The role of images/illustrations in communicating meanings and triggering emotions assumes an increasingly relevant role in contemporary texts, regardless of the age group for which they are intended or the nature of the texts that host them. It is no coincidence that children's books are full of illustrations and that the image/text ratio decreases as the age group grows. The vast majority of children's books can be considered multimodal texts containing text and images/illustrations interacting with each other to provide the young reader with a broader and more creative understanding of the book's narrative. This interaction is very diverse, ranging from images/illustrations that are not essential for understanding the storytelling to those that contribute significantly to the meaning of the story. Usually, these books are also read by adults, namely by parents, educators, and teachers who act as mediators between the book and the children, explaining aspects that are or seem to be too complex for the child's context. It should be noted that there are books labeled as children's books that are clearly intended for both children and adults. In this work, following a qualitative and interpretative methodology based on written productions, participant observation, and field notes, we will describe the perceptions of future teachers of the 1st cycle of basic education, attending a master's degree at a Portuguese university, about the role of the image in literary and non-literary texts, namely in mathematical texts, and how these can constitute precious resources for emotional regulation and for the design of creative didactic situations. 
The analysis of the collected data provided evidence of how the participants' perception of the crucial role of images in children's literature evolved, not only as an emotional regulator for young readers but also as a creative source for the design of meaningful didactical situations that cross into scientific areas beyond the mother tongue, namely mathematics.
Keywords: children’s literature, emotions, multimodal texts, soft skills
Procedia PDF Downloads 94
25169 Embodied Communication - Examining Multimodal Actions in a Digital Primary School Project
Authors: Anne Öman
Abstract:
Today in Sweden and in other countries, a variety of digital artefacts, such as laptops, tablets, and interactive whiteboards, are being used at all school levels. From an educational perspective, digital artefacts challenge traditional teaching because they provide a range of modes for expression and communication and are not limited to the traditional medium of paper. Digital technologies offer new opportunities for representations and physical interactions with objects, which foreground the role of the body in interaction and learning. From a multimodal perspective, the emphasis is on the use of multiple semiotic resources for meaning-making, and the study presented here examined the differential use of semiotic resources by pupils interacting in a digitally designed task in a primary school context. The instances analyzed in this paper come from a case study where the learning task was to create an advertising film in film software. The study involves the analysis of a single case, with emphasis on the examination of the classroom setting. The research design was based on a micro-ethnographic perspective, and the empirical material was collected through video recordings of small-group work in order to explore pupils’ communication within the group activity. The designed task allowed students to build, share, collaborate upon, and publish the redesigned products. The analysis illustrates the variety of communicative modes, such as body position, gestures, visualizations, and speech, and the interaction between these modes and the representations made by the pupils. The findings point to the importance of embodied communication during the small-group processes from a learning perspective, as well as to a pedagogical understanding of pupils’ representations, which were similar from a cultural literacy perspective.
These findings open up discussions with further implications for school practice concerning the small-group processes as well as the redesigned products. More broadly, the findings could show how multimodal interactions shape the learning experience in meaning-making processes, taking into account that language in a globalized society is more than reading and writing skills.
Keywords: communicative learning, interactive learning environments, pedagogical issues, primary school education
Procedia PDF Downloads 406
25168 Multimodal Direct Neural Network Positron Emission Tomography Reconstruction
Authors: William Whiteley, Jens Gregor
Abstract:
In recent developments of direct neural network based positron emission tomography (PET) reconstruction, two prominent architectures have emerged for converting measurement data into images: 1) networks that contain fully-connected layers; and 2) networks that primarily use a convolutional encoder-decoder architecture. In this paper, we present a multi-modal direct PET reconstruction method called MDPET, which is a hybrid approach that combines the advantages of both types of networks. MDPET processes raw data in the form of sinograms and histo-images in concert with attenuation maps to produce high quality multi-slice PET images (e.g., 8x440x440). MDPET is trained on a large whole-body patient data set and evaluated both quantitatively and qualitatively against target images reconstructed with the standard PET reconstruction benchmark of iterative ordered subsets expectation maximization. The results show that MDPET outperforms the best previously published direct neural network methods in measures of bias, signal-to-noise ratio, mean absolute error, and structural similarity.
Keywords: deep learning, image reconstruction, machine learning, neural network, positron emission tomography
Procedia PDF Downloads 109
25167 Data Collection Techniques for Robotics to Identify the Facial Expressions of Traumatic Brain Injured Patients
Authors: Chaudhary Muhammad Aqdus Ilyas, Matthias Rehm, Kamal Nasrollahi, Thomas B. Moeslund
Abstract:
This paper investigates data collection procedures for robots placed with traumatic brain injured (TBI) patients for rehabilitation purposes, based on facial expression and mood analysis. Rehabilitation after TBI is crucial due to the nature of the injury and the variation in recovery time. It is advantageous to analyze these emotional signals in a contactless manner because of patients' non-supportive behavior, limited muscle movements, and increase in negative emotional expressions. This work aims to develop a framework in which robots can recognize the emotions of TBI patients through facial expressions in order to perform rehabilitation tasks through physical, cognitive, or interactive activities. The results of these studies show that, with customized data collection strategies, the proposed framework identifies facial and emotional expressions more accurately, which can be used to enhance recovery treatment and social interaction in a robotic context.
Keywords: computer vision, convolutional neural network-long short-term memory network (CNN-LSTM), facial expression and mood recognition, multimodal (RGB-thermal) analysis, rehabilitation, robots, traumatic brain injured patients
Procedia PDF Downloads 152
25166 Investigating Self-Confidence Influence on English as a Foreign Language Student English Language Proficiency Level
Authors: Ali A. Alshahrani
Abstract:
This study aims to identify Saudi English as a Foreign Language (EFL) students' perspectives towards using the English language in their studies. The study explores students' self-confidence and its association with their actual performance in English courses in their different academic programs. A multimodal methodology was used to fulfill the research purpose and answer the research questions. A 25-item survey questionnaire and final examination grades were used to collect data. Two hundred forty-one students agreed to participate in the study. They completed the questionnaire and agreed to release their final grades to be a part of the collected data. The data were coded and analyzed with SPSS software. The findings indicated a significant difference in students' performance in English courses between participants' academic programs. Students' self-confidence in their English language skills, on the other hand, was not significantly different between participants' academic programs. Data analysis also revealed no correlation between students' self-confidence level and either their language skills or their performance. The study raises more questions about other vital factors, such as the views of course instructors on the materials, of faculty members of the target department, and of potential employers, as well as family belief in the usefulness of the program. These views and beliefs shape the student's preparation process and, therefore, should be explored further.
Keywords: English language intensive program, language proficiency, performance, self-confidence
Procedia PDF Downloads 134
25165 Analgesic Efficacy of IPACK Block in Primary Total Knee Arthroplasty (90 CASES)
Authors: Fedili Benamar, Beloulou Mohamed Lamine, Ouahes Hassane, Ghattas Samir
Abstract:
Background and aims: Peripheral regional anesthesia has been integrated into most analgesia protocols for total knee arthroplasty, which is considered among the most painful surgeries, with a high potential for pain to become chronic. The adductor canal block (ACB) has gained popularity. Similarly, the IPACK block has been described to provide analgesia of the posterior knee capsule. This study aimed to evaluate the analgesic efficacy of this block in patients undergoing primary total knee arthroplasty. Methods: Ninety patients were randomized to receive an IPACK block, an anterior sciatic block, or a sham block (30 patients per group), in addition to multimodal analgesia and an adductor canal catheter (KCA). Group 1: KCA; Group 2: KCA + BSA (anterior sciatic block); Group 3: KCA + IPACK. The analgesic blocks were performed preoperatively under ultrasound guidance, respecting the safety rules; the dose administered was 20 cc of ropivacaine 0.25%. The primary endpoint was posterior knee pain 6 hours after surgery. Other endpoints included quality of recovery after surgery, pain scores, and opioid requirements (PCA morphine). Data were analyzed with Epi Info 7.2. Results: The groups were matched. Women predominated (4 F to 1 M). The average age was 68 +/- 7 years and the average BMI was 31.75 +/- 4 kg/m2. 70% of patients were ASA 2 and 20% ASA 3. The average duration of the intervention was 89 +/- 19 minutes. Morphine consumption (PCA) was significantly higher in group 1 (16 mg) and group 2 (8 mg) than in group 3 (4 mg). There was a correlation between the use of the IPACK block and postoperative pain. Conclusions: In a multimodal analgesic protocol, the addition of an IPACK block decreased pain scores and morphine consumption.
Keywords: regional anesthesia, analgesia, total knee arthroplasty, the adductor canal block (acb), the ipack block, pain
Procedia PDF Downloads 72
25164 Adaptation of the Scenario Test for Greek-speaking People with Aphasia: Reliability and Validity Study
Authors: Marina Charalambous, Phivos Phylactou, Thekla Elriz, Loukia Psychogios, Jean-Marie Annoni
Abstract:
Background: Evidence-based practices for the evaluation and treatment of people with aphasia (PWA) in Greek are mainly impairment-based. Functional and multimodal communication is usually under-assessed and neglected by clinicians. This study explores the adaptation and psychometric testing of the Greek (GR) version of The Scenario Test. The Scenario Test assesses the everyday functional communication of PWA in an interactive multimodal communication setting with the support of an active communication facilitator. Aims: To define the reliability and validity of The Scenario Test-GR and discuss its clinical value. Methods & Procedures: The Scenario Test-GR was administered to 54 people with chronic stroke (6+ months post-stroke): 32 PWA and 22 people with stroke without aphasia. Participants were recruited from Greece and Cyprus. All measures were performed in an interview format. Standard psychometric criteria were applied to evaluate the reliability (internal consistency, test-retest, and interrater reliability) and validity (construct and known-groups validity) of The Scenario Test-GR. Video analysis was performed for the qualitative examination of the communication modes used. Outcomes & Results: The Scenario Test-GR shows high levels of reliability and validity. High scores of internal consistency (Cronbach’s α = .95), test-retest reliability (ICC = .99), and interrater reliability (ICC = .99) were found. Interrater agreement in scores on individual items fell between good and excellent levels of agreement. Correlations with a tool measuring language function in aphasia (the Aphasia Severity Rating Scale of the Boston Diagnostic Aphasia Examination), a measure of functional communication (the Communicative Effectiveness Index), and two instruments examining the psychosocial impact of aphasia (the Stroke and Aphasia Quality of Life questionnaire and the Aphasia Impact Questionnaire) revealed good convergent validity (all ps < .05).
Results showed good known-groups validity (Mann-Whitney U = 96.5, p < .001), with significantly higher scores for participants without aphasia compared to those with aphasia. Conclusions: The psychometric qualities of The Scenario Test-GR support the reliability and validity of the tool for the assessment of functional communication for Greek-speaking PWA. The Scenario Test-GR can be used to assess multimodal functional communication, to orient aphasia rehabilitation goal setting towards the activity and participation level, and as an outcome measure of everyday communication. Future studies will focus on the measurement of sensitivity to change in PWA with severe non-fluent aphasia.
Keywords: the scenario test GR, functional communication assessment, people with aphasia (PWA), tool validation
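As an illustration of the internal-consistency statistic reported in this abstract, the following is a minimal Python sketch of Cronbach's α, computed as α = k/(k-1) · (1 - Σ item variances / variance of total scores), where k is the number of items. The function name `cronbach_alpha` and the toy score matrix are hypothetical; they are not taken from the study's data or analysis scripts, which the abstract does not describe.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(scores[0])  # number of items
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: 4 respondents x 3 items (not from the study).
# The items move in lockstep, so alpha comes out as 1 up to rounding.
toy = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
alpha = cronbach_alpha(toy)
```

The sketch uses sample variances throughout; a dataset in which every respondent has the same total score would make the denominator zero and needs separate handling.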
Procedia PDF Downloads 128
25163 Self-Organizing Maps for Credit Card Fraud Detection
Authors: ChunYi Peng, Wei Hsuan CHeng, Shyh Kuang Ueng
Abstract:
This study focuses on the application of self-organizing map (SOM) technology in analyzing credit card transaction data, aiming to enhance the accuracy and efficiency of fraud detection. SOM, as an artificial neural network, is particularly suited for pattern recognition and data classification, making it highly effective for the complex and variable nature of credit card transaction data. By analyzing transaction characteristics with SOM, the research identifies abnormal transaction patterns that could indicate potentially fraudulent activities. Moreover, this study has developed a specialized visualization tool to intuitively present the relationships between SOM analysis outcomes and transaction data, aiding financial institution personnel in quickly identifying and responding to potential fraud, thereby reducing financial losses. Additionally, the research explores the integration of SOM technology with composite intelligent system technologies (including finite state machines, fuzzy logic, and decision trees) to further improve fraud detection accuracy. This multimodal approach provides a comprehensive perspective for identifying and understanding various types of fraud within credit card transactions. In summary, by integrating SOM technology with visualization tools and composite intelligent system technologies, this research offers a more effective method of fraud detection for the financial industry, not only enhancing detection accuracy but also deepening the overall understanding of fraudulent activities.
Keywords: self-organizing map technology, fraud detection, information visualization, data analysis, composite intelligent system technologies, decision support technologies
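To make the idea of flagging abnormal transactions with a SOM concrete, here is a minimal pure-Python sketch. The abstract does not describe the authors' implementation, so everything below is an assumption for illustration: the 3x3 grid, the linearly decaying learning rate and neighbourhood width, the use of quantization error (distance to the best-matching unit) as an anomaly score, and the names `train_som`, `best_matching_unit`, and `quantization_error`. The two-feature toy data stands in for normalised transaction features.

```python
import math
import random


def best_matching_unit(weights, x):
    """Grid position of the unit whose weight vector is closest to x."""
    return min(weights, key=lambda pos: math.dist(weights[pos], x))


def train_som(data, rows=3, cols=3, epochs=20, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small rectangular SOM; returns {grid position: weight vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialise each unit from a randomly chosen training sample.
    weights = {(r, c): list(rng.choice(data))
               for r in range(rows) for c in range(cols)}
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        samples = data[:]
        rng.shuffle(samples)
        for x in samples:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.3)  # shrinking neighbourhood
            bmu = best_matching_unit(weights, x)
            for pos, w in weights.items():
                grid_d2 = (pos[0] - bmu[0]) ** 2 + (pos[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2 * sigma * sigma))
                # Pull the unit towards the sample, scaled by grid proximity.
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


def quantization_error(weights, x):
    """Distance from x to its best-matching unit; larger means more unusual."""
    return math.dist(weights[best_matching_unit(weights, x)], x)


# Hypothetical "normal" transactions on a grid inside [0, 0.9] x [0, 0.9].
normal = [[(i % 10) / 10, (i // 10) / 10] for i in range(100)]
som = train_som(normal)
typical_score = quantization_error(som, [0.5, 0.5])
outlier_score = quantization_error(som, [5.0, 5.0])
```

In this sketch a transaction far from anything seen during training gets a larger score than a typical one; choosing the threshold above which a score counts as suspicious is left open and would depend on the data.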
Procedia PDF Downloads 56
25162 Self-Organizing Maps for Credit Card Fraud Detection and Visualization
Authors: Peng Chun-Yi, Chen Wei-Hsuan, Ueng Shyh-Kuang
Abstract:
This study focuses on the application of self-organizing map (SOM) technology in analyzing credit card transaction data, aiming to enhance the accuracy and efficiency of fraud detection. SOM, as an artificial neural network, is particularly suited for pattern recognition and data classification, making it highly effective for the complex and variable nature of credit card transaction data. By analyzing transaction characteristics with SOM, the research identifies abnormal transaction patterns that could indicate potentially fraudulent activities. Moreover, this study has developed a specialized visualization tool to intuitively present the relationships between SOM analysis outcomes and transaction data, aiding financial institution personnel in quickly identifying and responding to potential fraud, thereby reducing financial losses. Additionally, the research explores the integration of SOM technology with composite intelligent system technologies (including finite state machines, fuzzy logic, and decision trees) to further improve fraud detection accuracy. This multimodal approach provides a comprehensive perspective for identifying and understanding various types of fraud within credit card transactions. In summary, by integrating SOM technology with visualization tools and composite intelligent system technologies, this research offers a more effective method of fraud detection for the financial industry, not only enhancing detection accuracy but also deepening the overall understanding of fraudulent activities.
Keywords: self-organizing map technology, fraud detection, information visualization, data analysis, composite intelligent system technologies, decision support technologies
Procedia PDF Downloads 59
25161 Speech Perception by Video Hosting Services Actors: Urban Planning Conflicts
Authors: M. Pilgun
Abstract:
The report presents the results of a study of the specifics of speech perception by actors of video hosting services, based on material from urban planning conflicts. To analyze the content, a multimodal approach using neural network technologies is employed. Analysis of word associations and associative networks of relevant stimuli revealed the evaluative reactions of the actors. Analysis of the data identified key topics that generated negative and positive perceptions among the participants. The calculation of social stress and social well-being indices based on user-generated content made it possible to rank road transport construction objects according to the degree of negative and positive perception by actors.
Keywords: social media, speech perception, video hosting, networks
Procedia PDF Downloads 147
25160 Text Emotion Recognition by Multi-Head Attention based Bidirectional LSTM Utilizing Multi-Level Classification
Authors: Vishwanath Pethri Kamath, Jayantha Gowda Sarapanahalli, Vishal Mishra, Siddhesh Balwant Bandgar
Abstract:
Recognition of emotional information is essential in any form of communication. The recent growth of HCI (Human-Computer Interaction) shows the importance of understanding the emotions expressed, which becomes crucial for improving the system or the interaction itself. This research work uses textual data for emotion recognition. Text, being the least expressive of the multimodal resources, poses various challenges, such as capturing contextual information and the sequential nature of language construction. We propose a neural architecture that resolves at least 8 emotions from textual data sources derived from multiple datasets, using Google's pre-trained word2vec word embeddings and a multi-head attention-based bidirectional LSTM model with a one-vs-all multi-level classification. The emotions targeted in this research are Anger, Disgust, Fear, Guilt, Joy, Sadness, Shame, and Surprise. Textual data from multiple datasets, such as the ISEAR, Go Emotions, and Affect datasets, were used to create the emotions’ dataset. Overlapping or conflicting data samples were handled with careful preprocessing. Our results show a significant improvement with the modeling architecture, with an improvement of as much as 10 points in recognizing some emotions.
Keywords: text emotion recognition, bidirectional LSTM, multi-head attention, multi-level classification, google word2vec word embeddings
Procedia PDF Downloads 173
25159 CookIT: A Web Portal for the Preservation and Dissemination of Traditional Italian Recipes
Authors: M. T. Artese, G. Ciocca, I. Gagliardi
Abstract:
Food is a social and cultural aspect of every individual. Food products, processing, and traditions have been identified as cultural objects carrying the history and identity of social groups. Traditional recipes are passed down from one generation to the next, often to strengthen the link with the territory. The paper presents CookIT, a web portal developed to collect traditional Italian recipes related to regional cuisine, with the purpose of disseminating knowledge of typical Italian recipes and of the Mediterranean diet, which is a significant part of Italian cuisine. The system is complemented by multimodal means of browsing and data retrieval. Stored recipes can be retrieved by integrating and combining a number of different methods and keys, while the results are displayed using classical styles, such as list and mosaic, and also using maps and graphs, with which users can interact using the available keys.
Keywords: collaborative portal, Italian cuisine, intangible cultural heritage, traditional recipes, searching and browsing
Procedia PDF Downloads 149
25158 Multimodal Analysis of News Magazines' Front-Page Portrayals of the US, Germany, China, and Russia
Authors: Alena Radina
Abstract:
On the global stage, national image is shaped by historical memory of wars and alliances, government ideology and particularly media stereotypes which represent countries in positive or negative ways. News magazine covers are a key site for national representation. The object of analysis in this paper is the portrayals of the US, Germany, China, and Russia in the front pages and cover stories of “Time”, “Der Spiegel”, “Beijing Review”, and “Expert”. Political comedy helps people learn about current affairs even if politics is not their area of interest, and thus satire indirectly sets the public agenda. Coupled with satirical messages, cover images and the linguistic messages embedded in the covers become persuasive visual and verbal factors, known to drive about 80% of magazine sales. Preliminary analysis identified satirical elements in magazine covers, which are known to influence and frame understandings and attract younger audiences. Multimodal and transnational comparative framing analyses lay the groundwork to investigate why journalists, editors and designers deploy certain frames rather than others. This research investigates to what degree frames used in covers correlate with frames within the cover stories and what these framings can tell us about media professionals’ representations of their own and other nations. The study sample includes 32 covers consisting of two covers representing each of the four chosen countries from the four magazines. The sampling framework considers two time periods to compare countries’ representation with two different presidents, and between men and women when present. The countries selected for analysis represent each category of the international news flows model: the core nations are the US and Germany; China is a semi-peripheral country; and Russia is peripheral. 
Examining textual and visual design elements on the covers and images in the cover stories reveals not only what editors believe visually attracts the reader’s attention to the magazine but also how the magazines frame and construct national images and national leaders. The cover is the most powerful editorial and design page in a magazine because images incorporate less intrusive framing tools. Thus, covers require less cognitive effort of audiences who may therefore be more likely to accept the visual frame without question. Analysis of design and linguistic elements in magazine covers helps to understand how media outlets shape their audience’s perceptions and how magazines frame global issues. While previous multimodal research of covers has focused mostly on lifestyle magazines or newspapers, this paper examines the power of current affairs magazines’ covers to shape audience perception of national image.
Keywords: framing analysis, magazine covers, multimodality, national image, satire
Procedia PDF Downloads 100