Search results for: multimodal indexing
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 267

Search results for: multimodal indexing

207 Optimizing Multimodal Teaching Strategies for Enhanced Engagement and Performance

Authors: Victor Milanes, Martha Hubertz

Abstract:

In the wake of COVID-19, all aspects of life have been estranged, and humanity has been forced to shift toward a more technologically integrated mode of operation. Essential work such as Healthcare, business, and public policy are a few notable industries that were initially dependent upon face-to-face modality but have completely reimagined their operation style. Unique to these fields, education was particularly strained because academics, teachers, and professors alike were obligated to shift their curriculums online over the course of a few weeks while also maintaining the expectation that they were educating their students to a similar level accomplished pre-pandemic. This was notable as research indicates two key concepts: Students prefer face-to-face modality, and due to the disruption in academic continuity/style, there was a negative impact on student's overall education and performance. With these two principles in mind, this study aims to inquire what online strategies could be best employed by teachers to educate their students, as well as what strategies could be adopted in a multimodal setting if deemed necessary by the instructor or outside convoluting factors (Such as the case of COVID-19, or a personal matter that demands the teacher's attention away from the classroom). Strategies and methods will be cross-analyzed via a ranking system derived from various recognized teaching assessments, in which engagement, retention, flexibility, interest, and performance are specifically accounted for. We expect to see an emphasis on positive social pressure as a dominant factor in the improved propensity for education, as well as a preference for visual aids across platforms, as research indicates most individuals are visual learners.

Keywords: technological integration, multimodal teaching, education, student engagement

Procedia PDF Downloads 25
206 Multimodal Rhetoric in the Wildlife Documentary, “My Octopus Teacher”

Authors: Visvaganthie Moodley

Abstract:

While rhetoric goes back as far as Aristotle who focalised its meaning as the “art of persuasion”, most scholars have focused on elocutio and dispositio canons, neglecting the rhetorical impact of multimodal texts, such as documentaries. Film documentaries are being increasingly rhetoric, often used by wildlife conservationists for influencing people to become more mindful about humanity’s connection with nature. This paper examines the award-winning film documentary, “My Octopus Teacher”, which depicts naturalist, Craig Foster’s unique discovery and relationship with a female octopus in the southern tip of Africa, the Cape of Storms in South Africa. It is anchored in Leech and Short’s (2007) framework of linguistic and stylistic categories – comprising lexical items, grammatical features, figures of speech and other rhetoric features, and cohesiveness – with particular foci on diction, anthropomorphic language, metaphors and symbolism. It also draws on Kress and van Leeuwen’s (2006) multimodal analysis to show how verbal cues (the narrator’s commentary), visual images in motion, visual images as metaphors and symbolism, and aural sensory images such as music and sound synergise for rhetoric effect. In addition, the analysis of “My Octopus Teacher” is guided by Nichol’s (2010) narrative theory; features of a documentary which foregrounds the credibility of the narrative as a text that represents real events with real people; and its modes of construction, viz., the poetic mode, the expository mode, observational mode and participatory mode, and their integration – forging documentaries as multimodal texts. This paper presents a multimodal rhetoric discussion on the sequence of salient episodes captured in the slow moving one-and-a-half-hour documentary. These are: (i) The prologue: on the brink of something extraordinary; (ii) The day it all started; (iii) The narrator’s turmoil: getting back into the ocean; (iv) The incredible encounter with the octopus; (v) Establishing a relationship; (vi) Outwitting the predatory pyjama shark; (vii) The cycle of life; and (viii) The conclusion: lessons from an octopus. The paper argues that wildlife documentaries, characterized by plausibility and which provide researchers the lens to examine the ideologies about animals and humans, offer an assimilation of the various senses – vocal, visual and audial – for engaging viewers in stylized compelling way; they have the ability to persuade people to think and act in particular ways. As multimodal texts, with its use of lexical items; diction; anthropomorphic language; linguistic, visual and aural metaphors and symbolism; and depictions of anthropocentrism, wildlife documentaries are powerful resources for promoting wildlife conservation and conscientizing people of the need for establishing a harmonious relationship with nature and humans alike.

Keywords: documentaries, multimodality, rhetoric, style, wildlife, conservation

Procedia PDF Downloads 59
205 On Increase and Development Prospects of Competitiveness of Georgia’s Transport-Logistical System on the Contemporary Stage

Authors: Ketevan Goletiani

Abstract:

MMultimodal transport is Europe-Asia’s rational decision of the XXI century. Success prerequisite of this form of cargo carriage is not technologic decision, but the comprehensive attitude towards it. Integration of the transport industry must refer to both technical and organizational-economic fields. Support of the multimodal’s must be the priority of the transport policy in different organizations of Europe and Asia. The method of approach to the transport as a unified system has been changed to a certain extent in the market conditions. Nowadays the competition between the different kinds of transport is not to be considered as a competition of one kind of transport towards another one, but is to be considered as a stimulator of the transport development. Basically, transport logistic, as the recent methodology and organization of the rationally flow of cargos at the specialized logistic centres during their procession provides effective rise of such flow of cargos, decreases non-operating expenses and gives the opportunity to the transport companies to come along with the time, to meet market clients’ requirements. It is apparent that the advanced transport-forwarding and logistic firms are being analized.

Keywords: transport systems, multimodal transport, competition, transport logistics

Procedia PDF Downloads 402
204 TMIF: Transformer-Based Multi-Modal Interactive Fusion for Rumor Detection

Authors: Jiandong Lv, Xingang Wang, Cuiling Shao

Abstract:

The rapid development of social media platforms has made it one of the important news sources. While it provides people with convenient real-time communication channels, fake news and rumors are also spread rapidly through social media platforms, misleading the public and even causing bad social impact in view of the slow speed and poor consistency of artificial rumor detection. We propose an end-to-end rumor detection model-TIMF, which captures the dependencies between multimodal data based on the interactive attention mechanism, uses a transformer for cross-modal feature sequence mapping and combines hybrid fusion strategies to obtain decision results. This paper verifies two multi-modal rumor detection datasets and proves the superior performance and early detection performance of the proposed model.

Keywords: hybrid fusion, multimodal fusion, rumor detection, social media, transformer

Procedia PDF Downloads 188
203 Development of a Sequential Multimodal Biometric System for Web-Based Physical Access Control into a Security Safe

Authors: Babatunde Olumide Olawale, Oyebode Olumide Oyediran

Abstract:

The security safe is a place or building where classified document and precious items are kept. To prevent unauthorised persons from gaining access to this safe a lot of technologies had been used. But frequent reports of an unauthorised person gaining access into security safes with the aim of removing document and items from the safes are pointers to the fact that there is still security gap in the recent technologies used as access control for the security safe. In this paper we try to solve this problem by developing a multimodal biometric system for physical access control into a security safe using face and voice recognition. The safe is accessed by the combination of face and speech pattern recognition and also in that sequential order. User authentication is achieved through the use of camera/sensor unit and a microphone unit both attached to the door of the safe. The user face was captured by the camera/sensor while the speech was captured by the use of the microphone unit. The Scale Invariance Feature Transform (SIFT) algorithm was used to train images to form templates for the face recognition system while the Mel-Frequency Cepitral Coefficients (MFCC) algorithm was used to train the speech recognition system to recognise authorise user’s speech. Both algorithms were hosted in two separate web based servers and for automatic analysis of our work; our developed system was simulated in a MATLAB environment. The results obtained shows that the developed system was able to give access to authorise users while declining unauthorised person access to the security safe.

Keywords: access control, multimodal biometrics, pattern recognition, security safe

Procedia PDF Downloads 295
202 Multimodal Database of Emotional Speech, Video and Gestures

Authors: Tomasz Sapiński, Dorota Kamińska, Adam Pelikant, Egils Avots, Cagri Ozcinar, Gholamreza Anbarjafari

Abstract:

People express emotions through different modalities. Integration of verbal and non-verbal communication channels creates a system in which the message is easier to understand. Expanding the focus to several expression forms can facilitate research on emotion recognition as well as human-machine interaction. In this article, the authors present a Polish emotional database composed of three modalities: facial expressions, body movement and gestures, and speech. The corpora contains recordings registered in studio conditions, acted out by 16 professional actors (8 male and 8 female). The data is labeled with six basic emotions categories, according to Ekman’s emotion categories. To check the quality of performance, all recordings are evaluated by experts and volunteers. The database is available to academic community and might be useful in the study on audio-visual emotion recognition.

Keywords: body movement, emotion recognition, emotional corpus, facial expressions, gestures, multimodal database, speech

Procedia PDF Downloads 317
201 A Survey of Sentiment Analysis Based on Deep Learning

Authors: Pingping Lin, Xudong Luo, Yifan Fan

Abstract:

Sentiment analysis is a very active research topic. Every day, Facebook, Twitter, Weibo, and other social media, as well as significant e-commerce websites, generate a massive amount of comments, which can be used to analyse peoples opinions or emotions. The existing methods for sentiment analysis are based mainly on sentiment dictionaries, machine learning, and deep learning. The first two kinds of methods rely on heavily sentiment dictionaries or large amounts of labelled data. The third one overcomes these two problems. So, in this paper, we focus on the third one. Specifically, we survey various sentiment analysis methods based on convolutional neural network, recurrent neural network, long short-term memory, deep neural network, deep belief network, and memory network. We compare their futures, advantages, and disadvantages. Also, we point out the main problems of these methods, which may be worthy of careful studies in the future. Finally, we also examine the application of deep learning in multimodal sentiment analysis and aspect-level sentiment analysis.

Keywords: document analysis, deep learning, multimodal sentiment analysis, natural language processing

Procedia PDF Downloads 126
200 The Effect of an Occupational Therapy Programme on Sewing Machine Operators

Authors: N. Dunleavy, E. Lovemore, K. Siljeur, D. Jackson, M. Hendricks, M. Hoosain, N. Plastow, S. Marais

Abstract:

Background: The work requirements of sewing machine operators cause physical and emotional strain. Past ergonomic interventions have been provided to alleviate physical concerns; however, a holistic, multimodal intervention was needed to improve these factors. Aim: The study aimed to examine the effect of an occupational therapy programme on sewing machine operators’ pain, mental health, and productivity within a factory in the South African context. Methods: A pilot randomised control trial was conducted with 22 sewing machine operators within a single factory. Stratified randomisation was used to determine the experimental (EG) and control groups (CG), using measures for pain intensity, level of depression (mental health), and productivity rates as stratification variables. The EG received the multimodal intervention, incorporating education, seating adaptations, and mental health intervention. In three months, the CG will receive the same intervention. Pre- and post-intervention testing have occurred with upcoming three- and six-month follow-ups. Results: Immediate results indicate a statistically significant decrease in pain in both experimental and control groups; no change in productivity scores and depression between the two groups. This may be attributed to external factors. The values for depression further showed no statistical significance between the two groups and within pre-and post-test results. The Statistical Program for Social Sciences (SPSS) version-24 was used as the data analysis testing, where all the tests will be evaluated at a 5% significance level. Contribution of research: The research adds to the body of knowledge informing the Occupational Therapy role in work settings, providing evidence on the effectiveness of workplace-based multimodal interventions. Conclusion: The study provides initial data on the effectiveness of a pilot randomised control trial on pain and mental health in South Africa. Results indicated no quantitative change between the experimental and control groups; however, qualitative data suggest a clinical significance of the findings.

Keywords: ergonomics programme, occupational therapy, sewing machine operators, workplace-based multimodal interventions

Procedia PDF Downloads 47
199 The Use of Videoconferencing in a Task-Based Beginners' Chinese Class

Authors: Sijia Guo

Abstract:

The development of new technologies and the falling cost of high-speed Internet access have made it easier for institutes and language teachers to opt different ways to communicate with students at distance. The emergence of web-conferencing applications, which integrate text, chat, audio / video and graphic facilities, offers great opportunities for language learning to through the multimodal environment. This paper reports on data elicited from a Ph.D. study of using web-conferencing in the teaching of first-year Chinese class in order to promote learners’ collaborative learning. Firstly, a comparison of four desktop videoconferencing (DVC) tools was conducted to determine the pedagogical value of the videoconferencing tool-Blackboard Collaborate. Secondly, the evaluation of 14 campus-based Chinese learners who conducted five one-hour online sessions via the multimodal environment reveals the users’ choice of modes and their learning preference. The findings show that the tasks designed for the web-conferencing environment contributed to the learners’ collaborative learning and second language acquisition.

Keywords: computer-mediated communication (CMC), CALL evaluation, TBLT, web-conferencing, online Chinese teaching

Procedia PDF Downloads 277
198 Ascribing Identities and Othering: A Multimodal Discourse Analysis of a BBC Documentary on YouTube

Authors: Shomaila Sadaf, Margarethe Olbertz-Siitonen

Abstract:

This study looks at identity and othering in discourses around sensitive issues in social media. More specifically, the study explores the multimodal resources and narratives through which the other is formed, and identities are ascribed in online spaces. As an integral part of social life, media spaces have become an important site for negotiating and ascribing identities. In line with recent research, identity is seen hereas constructions of belonging which go hand in hand with processes of in- and out-group formations that in some cases may lead to othering. Previous findings underline that identities are neither fixed nor limited but rather contextual, intersectional, and interactively achieved. The goal of this study is to explore and develop an understanding of how people co-construct the ‘other’ and ascribe certain identities in social media using multiple modes. In the beginning of the year 2018, the British government decided to include relationships, sexual orientation, and sex education into the curriculum of state funded primary schools. However, the addition of information related to LGBTQ+in the curriculum has been met with resistance, particularly from religious parents.For example, the British Muslim community has voiced their concerns and protested against the actions taken by the British government. YouTube has been used by news companies to air video stories covering the protest and narratives of the protestors along with the position ofschool officials. The analysis centers on a YouTube video dealing with the protest ofa local group of parents against the addition of information about LGBTQ+ in the curriculum in the UK. The video was posted in 2019. By the time of this study, the videos had approximately 169,000 views andaround 6000 comments. In deference to multimodal nature of YouTube videos, this study utilizes multimodal discourse analysis as a method of choice. The study is still ongoing and therefore has not yet yielded any final results. However, the initial analysis indicates a hierarchy of ascribing identities in the data. Drawing on multimodal resources, the media works with social categorizations throughout the documentary, presenting and classifying involved conflicting parties in the light of their own visible and audible identifications. The protesters can be seen to construct a strong group identity as Muslim parents (e.g., clothing and reference to shared values). While the video appears to be designed as a documentary that puts forward facts, the media does not seem to succeed in taking a neutral position consistently throughout the video. At times, the use of images, soundsand language contributes to the formation of “us” vs. “them”, where the audience is implicitly encouraged to pick a side. Only towards the end of the documentary this problematic opposition is addressed and critically reflected through an expert interview that is – interestingly – visually located outside the previously presented ‘battlefield’. This study contributes to the growing understanding of the discursive construction of the ‘other’ in social media. Videos available online are a rich source for examining how the different social actors ascribe multiple identities and form the other.

Keywords: identity, multimodal discourse analysis, othering, youtube

Procedia PDF Downloads 84
197 Assessing the Physical Conditions of Motorcycle Taxi Stands and Comfort Conditions of the Drivers in the Central Business District of Bangkok

Authors: Nissa Phloimontri

Abstract:

This research explores the current physical conditions of motorcycle taxi stands located near the BTS stations in the central business district (CBD) and the comfort conditions of motorcycle taxi drivers. The criteria set up for physical stand survey and assessment are the integration of multimodal access design guidelines. After the survey, stands that share similar characteristics are classified into a series of typologies. Based on the environmental comfort model, questionnaires and in-depth interviews are conducted to evaluate the comfort levels of drivers including physical, functional, and psychological comfort. The results indicate that there are a number of motorcycle taxi stands that are not up to standard and are not conducive to the work-related activities of drivers. The study concludes by recommending public policy for integrated paratransit stops that support the multimodal transportation and seamless mobility concepts within the specific context of Bangkok as well as promote the quality of work life of motorcycle taxi drivers.

Keywords: motorcycle taxi, paratransit stops, environmental comfort, quality of work life

Procedia PDF Downloads 63
196 A Multimodal Measurement Approach Using Narratives and Eye Tracking to Investigate Visual Behaviour in Perceiving Naturalistic and Urban Environments

Authors: Khizar Z. Choudhrya, Richard Coles, Salman Qureshi, Robert Ashford, Salim Khan, Rabia R. Mir

Abstract:

Abstract: The majority of existing landscape research has been derived by conducting heuristic evaluations, without having empirical insight of real participant visual response. In this research, a modern multimodal measurement approach (using narratives and eye tracking) was applied to investigate visual behaviour in perceiving naturalistic and urban environments. This research is unique in exploring gaze behaviour on environmental images possessing different levels of saliency. Eye behaviour is predominantly attracted by salient locations. The concept of methodology of this research on naturalistic and urban environments is drawn from the approaches in market research. Borrowing methodologies from market research that examine visual responses and qualities provided a critical and hitherto unexplored approach. This research has been conducted by using mixed methodological quantitative and qualitative approaches. On the whole, the results of this research corroborated existing landscape research findings, but they also identified potential refinements. The research contributes both methodologically and empirically to human-environment interaction (HEI). This study focused on initial impressions of environmental images with the help of eye tracking. Taking under consideration the importance of the image, this study explored the factors that influence initial fixations in relation to expectations and preferences. In terms of key findings of this research it is noticed that each participant has his own unique navigation style while surfing through different elements of landscape images. This individual navigation style is given the name of ‘visual signature’. This study adds the necessary clarity that would complete the picture and bring an insight for future landscape researchers.

Keywords: human-environment interaction (HEI), multimodal measurement, narratives, eye tracking

Procedia PDF Downloads 309
195 Using Trip Planners in Developing Proper Transportation Behavior

Authors: Grzegorz Sierpiński, Ireneusz Celiński, Marcin Staniek

Abstract:

The article discusses multi modal mobility in contemporary societies as a main planning and organization issue in the functioning of administrative bodies, a problem which really exists in the space of contemporary cities in terms of shaping modern transport systems. The article presents classification of available resources and initiatives undertaken for developing multi modal mobility. Solutions can be divided into three groups of measures–physical measures in the form of changes of the transport network infrastructure, organizational ones (including transport policy) and information measures. The latter ones include in particular direct support for people travelling in the transport network by providing information about ways of using available means of transport. A special measure contributing to this end is a trip planner. The article compares several selected planners. It includes a short description of the Green Travelling Project, which aims at developing a planner supporting environmentally friendly solutions in terms of transport network operation. The article summarizes preliminary findings of the project.

Keywords: mobility, modal split, multimodal trip, multimodal platforms, sustainable transport

Procedia PDF Downloads 378
194 Multi Biomertric Personal Identification System Based On Hybird Intellegence Method

Authors: Laheeb M. Ibrahim, Ibrahim A. Salih

Abstract:

Biometrics is a technology that has been widely used in many official and commercial identification applications. The increased concerns in security during recent years (especially during the last decades) have essentially resulted in more attention being given to biometric-based verification techniques. Here, a novel fusion approach of palmprint, dental traits has been suggested. These traits which are authentication techniques have been employed in a range of biometric applications that can identify any postmortem PM person and antemortem AM. Besides improving the accuracy, the fusion of biometrics has several advantages such as increasing, deterring spoofing activities and reducing enrolment failure. In this paper, a first unimodel biometric system has been made by using (palmprint and dental) traits, for each one classification applying an artificial neural network and a hybrid technique that combines swarm intelligence and neural network together, then attempt has been made to combine palmprint and dental biometrics. Principally, the fusion of palmprint and dental biometrics and their potential application has been explored as biometric identifiers. To address this issue, investigations have been carried out about the relative performance of several statistical data fusion techniques for integrating the information in both unimodal and multimodal biometrics. Also the results of the multimodal approach have been compared with each one of these two traits authentication approaches. This paper studies the features and decision fusion levels in multimodal biometrics. To determine the accuracy of GAR to parallel system decision-fusion including (AND, OR, Majority fating) has been used. The backpropagation method has been used for classification and has come out with result (92%, 99%, 97%) respectively for GAR, while the GAR) for this algorithm using hybrid technique for classification (95%, 99%, 98%) respectively. To determine the accuracy of the multibiometric system for feature level fusion has been used, while the same preceding methods have been used for classification. The results have been (98%, 99%) respectively while to determine the GAR of feature level different methods have been used and have come out with (98%).

Keywords: back propagation neural network BP ANN, multibiometric system, parallel system decision-fusion, practical swarm intelligent PSO

Procedia PDF Downloads 503
193 Fu Hao From the East: Between Chinese Traditions and Western Pop Cultures

Authors: Yi Meng, YunGao

Abstract:

Having been studied and worked in North America and Europe, we, two Chinese art educators, have been enormously influenced by eastern and western cultures. Thus, we aim to enhance students’ learning experiences by exploring and amalgamating both cultures for art creating. This text draws on our action research study of students’ visual literacy practices in a foundation sketching course in a major Chinese university, exploring art forms by cross-utilizing various cultural aspects. Instead of relying on the predominant western observational drawing skills in our classroom, we taught students about ancient Chinese art in the provincial museum, using Fu Hao owl-shaped vessel, a Shang Dynasty national treasure, as the final sketch project of this course. We took up multimodal literacy, which emphasized students’ critical use of creativity to exploit the semiotic potentials of communicative modes to address diverse cultural issues through their multimodal design. We used the Hong Kong-based artist Tik Ka’s artworks to demonstrate the cultural amalgamation of Chinese traditions and western pop cultures. Collectively, these approaches create a dialogical space for students to experience, analyze, and negotiate with complex modes and potentially transform their understanding of both cultures by redesigning Fu Hao.

Keywords: Chinese traditions, western pop cultures, Fu Hao, arts education, design sketch

Procedia PDF Downloads 66
192 Selecting the Best Sub-Region Indexing the Images in the Case of Weak Segmentation Based on Local Color Histograms

Authors: Mawloud Mosbah, Bachir Boucheham

Abstract:

Color Histogram is considered as the oldest method used by CBIR systems for indexing images. In turn, the global histograms do not include the spatial information; this is why the other techniques coming later have attempted to encounter this limitation by involving the segmentation task as a preprocessing step. The weak segmentation is employed by the local histograms while other methods as CCV (Color Coherent Vector) are based on strong segmentation. The indexation based on local histograms consists of splitting the image into N overlapping blocks or sub-regions, and then the histogram of each block is computed. The dissimilarity between two images is reduced, as consequence, to compute the distance between the N local histograms of the both images resulting then in N*N values; generally, the lowest value is taken into account to rank images, that means that the lowest value is that which helps to designate which sub-region utilized to index images of the collection being asked. In this paper, we make under light the local histogram indexation method in the hope to compare the results obtained against those given by the global histogram. We address also another noteworthy issue when Relying on local histograms namely which value, among N*N values, to trust on when comparing images, in other words, which sub-region among the N*N sub-regions on which we base to index images. Based on the results achieved here, it seems that relying on the local histograms, which needs to pose an extra overhead on the system by involving another preprocessing step naming segmentation, does not necessary mean that it produces better results. In addition to that, we have proposed here some ideas to select the local histogram on which we rely on to encode the image rather than relying on the local histogram having lowest distance with the query histograms.

Keywords: CBIR, color global histogram, color local histogram, weak segmentation, Euclidean distance

Procedia PDF Downloads 332
191 Autism Spectrum Disorder Classification Algorithm Using Multimodal Data Based on Graph Convolutional Network

Authors: Yuntao Liu, Lei Wang, Haoran Xia

Abstract:

Machine learning has shown extensive applications in the development of classification models for autism spectrum disorder (ASD) using neural image data. This paper proposes a fusion multi-modal classification network based on a graph neural network. First, the brain is segmented into 116 regions of interest using a medical segmentation template (AAL, Anatomical Automatic Labeling). The image features of sMRI and the signal features of fMRI are extracted, which build the node and edge embedding representations of the brain map. Then, we construct a dynamically updated brain map neural network and propose a method based on a dynamic brain map adjacency matrix update mechanism and learnable graph to further improve the accuracy of autism diagnosis and recognition results. Based on the Autism Brain Imaging Data Exchange I dataset(ABIDE I), we reached a prediction accuracy of 74% between ASD and TD subjects. Besides, to study the biomarkers that can help doctors analyze diseases and interpretability, we used the features by extracting the top five maximum and minimum ROI weights. This work provides a meaningful way for brain disorder identification.

Keywords: autism spectrum disorder, brain map, supervised machine learning, graph network, multimodal data, model interpretability

Procedia PDF Downloads 15
190 A Multimodal Discourse Analysis of Gender Representation on Health and Fitness Magazine Cover Pages

Authors: Nashwa Elyamany

Abstract:

In visual cultures, namely that of the United States, media representations are such influential and pervasive reflections of societal norms and expectations to the extent that they impact the manner in which both genders view themselves. Health and fitness magazines fall within the realm of visual culture. Since the main goal of communication is to ensure proper dissemination of information in order for the target audience to grasp the intended messages, it becomes imperative that magazine publishers, editors, advertisers and image producers use different modes of communication within their reach to convey messages to their readers and viewers. A rapid waxing flow of multimodality floods popular discourse, particularly health and fitness magazine cover pages. The use of well-crafted cover lines and visual images is imbued with agendas, consumerist ideologies and properties capable of effectively conveying implicit and explicit meaning to potential readers and viewers. In essence, the primary goal of this thesis is to interrogate the multi-semiotic operations and manifestations of hegemonic masculinity and femininity in male and female body culture, particularly on the cover pages of the twin American magazines Men's Health and Women's Health using corpora that spanned from 2011 to the mid of 2016. The researcher explores the semiotic resources that contribute to shaping and legitimizing a new form of postmodern, consumerist, gendered discourse that positions the reader-viewer ideologically. Methodologically, the researcher carries out analysis on the macro and micro levels. On the macro level, the researcher takes on a critical stance to illuminate the ideological nature of the multimodal ensemble of the cover pages, and, on the micro level, seeks to put forward new theoretical and methodological routes through which the semiotic choices well invested on the media texts can be more objectively scrutinized. On the macro level, a 'themes' analysis is initially conducted to isolate the overarching themes that dominate the fitness discourse on the cover pages under study. It is argued that variation in terms of frequencies of such themes is indicative, broadly speaking, of which facets of hegemonic masculinity and femininity are infused in the fitness discourse on the cover pages. On the micro level, this research work encompasses three sub-levels of analysis. The researcher follows an SF-MMDA approach, drawing on a trio of analytical frameworks: Halliday's SFG for the verbal analysis; Kress & van Leeuween's VG for the visual analysis; and CMT in relation to Sperber & Wilson's RT for the pragma-cognitive analysis of multimodal metaphors and metonymies. The data is presented in terms of detailed descriptions in conjunction with frequency tables, ANOVA with alpha=0.05 and MANOVA in the multiple phases of analysis. Insights and findings from this multi-faceted, social-semiotic analysis are interpreted in light of Cultivation Theory, Self-objectification Theory and the literature to date. Implications for future research include the implementation of a multi-dimensional approach whereby linguistic and visual analytical models are deployed with special regards to cultural variation.

Keywords: gender, hegemony, magazine cover page, multimodal discourse analysis, multimodal metaphor, multimodal metonymy, systemic functional grammar, visual grammar

Procedia PDF Downloads 308
189 Modeling Engagement with Multimodal Multisensor Data: The Continuous Performance Test as an Objective Tool to Track Flow

Authors: Mohammad H. Taheri, David J. Brown, Nasser Sherkat

Abstract:

Engagement is one of the most important factors in determining successful outcomes and deep learning in students. Existing approaches to detect student engagement involve periodic human observations that are subject to inter-rater reliability. Our solution uses real-time multimodal multisensor data labeled by objective performance outcomes to infer the engagement of students. The study involves four students with a combined diagnosis of cerebral palsy and a learning disability who took part in a 3-month trial over 59 sessions. Multimodal multisensor data were collected while they participated in a continuous performance test. Eye gaze, electroencephalogram, body pose, and interaction data were used to create a model of student engagement through objective labeling from the continuous performance test outcomes. In order to achieve this, a type of continuous performance test is introduced, the Seek-X type. Nine features were extracted including high-level handpicked compound features. Using leave-one-out cross-validation, a series of different machine learning approaches were evaluated. Overall, the random forest classification approach achieved the best classification results. Using random forest, 93.3% classification for engagement and 42.9% accuracy for disengagement were achieved. We compared these results to outcomes from different models: AdaBoost, decision tree, k-Nearest Neighbor, naïve Bayes, neural network, and support vector machine. We showed that using a multisensor approach achieved higher accuracy than using features from any reduced set of sensors. We found that using high-level handpicked features can improve the classification accuracy in every sensor mode. Our approach is robust to both sensor fallout and occlusions. The single most important sensor feature to the classification of engagement and distraction was shown to be eye gaze. It has been shown that we can accurately predict the level of engagement of students with learning disabilities in a real-time approach that is not subject to inter-rater reliability, human observation or reliant on a single mode of sensor input. This will help teachers design interventions for a heterogeneous group of students, where teachers cannot possibly attend to each of their individual needs. Our approach can be used to identify those with the greatest learning challenges so that all students are supported to reach their full potential.

Keywords: affective computing in education, affect detection, continuous performance test, engagement, flow, HCI, interaction, learning disabilities, machine learning, multimodal, multisensor, physiological sensors, student engagement

Procedia PDF Downloads 62
188 A Multimodal Dialogue Management System for Achieving Natural Interaction with Embodied Conversational Agents

Authors: Ozge Nilay Yalcin

Abstract:

Dialogue has been proposed to be the natural basis for the human-computer interaction, which is behaviorally rich and includes different modalities such as gestures, posture changes, gaze, para-linguistic parameters and linguistic context. However, equipping the system with these capabilities might have consequences on the usability of the system. One issue is to be able to find a good balance between rich behavior and fluent behavior, as planning and generating these behaviors is computationally expensive. In this work, we propose a multi-modal dialogue management system that automates the conversational flow from text-based dialogue examples and uses synchronized verbal and non-verbal conversational cues to achieve a fluent interaction. Our system is integrated with Smartbody behavior realizer to provide real-time interaction with embodied agent. The nonverbal behaviors are used according to turn-taking behavior, emotions, and personality of the user and linguistic analysis of the dialogue. The verbal behaviors are responsive to the emotional value of the utterance and the feedback from the user. Our system is aimed for online planning of these affective multi-modal components, in order to achieve enhanced user experience with richer and more natural interaction.

Keywords: affect, embodied conversational agents, human-agent interaction, multimodal interaction, natural interfaces

Procedia PDF Downloads 141
187 VIAN-DH: Computational Multimodal Conversation Analysis Software and Infrastructure

Authors: Teodora Vukovic, Christoph Hottiger, Noah Bubenhofer

Abstract:

The development of VIAN-DH aims at bridging two linguistic approaches: conversation analysis/interactional linguistics (IL), so far a dominantly qualitative field, and computational/corpus linguistics and its quantitative and automated methods. Contemporary IL investigates the systematic organization of conversations and interactions composed of speech, gaze, gestures, and body positioning, among others. These highly integrated multimodal behaviour is analysed based on video data aimed at uncovering so called “multimodal gestalts”, patterns of linguistic and embodied conduct that reoccur in specific sequential positions employed for specific purposes. Multimodal analyses (and other disciplines using videos) are so far dependent on time and resource intensive processes of manual transcription of each component from video materials. Automating these tasks requires advanced programming skills, which is often not in the scope of IL. Moreover, the use of different tools makes the integration and analysis of different formats challenging. Consequently, IL research often deals with relatively small samples of annotated data which are suitable for qualitative analysis but not enough for making generalized empirical claims derived quantitatively. VIAN-DH aims to create a workspace where many annotation layers required for the multimodal analysis of videos can be created, processed, and correlated in one platform. VIAN-DH will provide a graphical interface that operates state-of-the-art tools for automating parts of the data processing. The integration of tools that already exist in computational linguistics and computer vision, facilitates data processing for researchers lacking programming skills, speeds up the overall research process, and enables the processing of large amounts of data. The main features to be introduced are automatic speech recognition for the transcription of language, automatic image recognition for extraction of gestures and other visual cues, as well as grammatical annotation for adding morphological and syntactic information to the verbal content. In the ongoing instance of VIAN-DH, we focus on gesture extraction (pointing gestures, in particular), making use of existing models created for sign language and adapting them for this specific purpose. In order to view and search the data, VIAN-DH will provide a unified format and enable the import of the main existing formats of annotated video data and the export to other formats used in the field, while integrating different data source formats in a way that they can be combined in research. VIAN-DH will adapt querying methods from corpus linguistics to enable parallel search of many annotation levels, combining token-level and chronological search for various types of data. VIAN-DH strives to bring crucial and potentially revolutionary innovation to the field of IL, (that can also extend to other fields using video materials). It will allow the processing of large amounts of data automatically and, the implementation of quantitative analyses, combining it with the qualitative approach. It will facilitate the investigation of correlations between linguistic patterns (lexical or grammatical) with conversational aspects (turn-taking or gestures). Users will be able to automatically transcribe and annotate visual, spoken and grammatical information from videos, and to correlate those different levels and perform queries and analyses.

Keywords: multimodal analysis, corpus linguistics, computational linguistics, image recognition, speech recognition

Procedia PDF Downloads 69
186 Common Orthodontic Indices and Classification in the United Kingdom

Authors: Ashwini Mohan, Haris Batley

Abstract:

An orthodontic index is used to rate or categorise an individual’s occlusion using a numeric or alphanumeric score. Indexing of malocclusions and their correction is important in epidemiology, diagnosis, communication between clinicians as well as their patients and assessing treatment outcomes. Many useful indices have been put forward, but to the author’s best knowledge, no one method to this day appears to be equally suitable for the use of epidemiologists, public health program planners and clinicians. This article describes the common clinical orthodontic indices and classifications used in United Kingdom.

Keywords: classification, indices, orthodontics, validity

Procedia PDF Downloads 111
185 Embodied Communication - Examining Multimodal Actions in a Digital Primary School Project

Authors: Anne Öman

Abstract:

Today in Sweden and in other countries, a variety of digital artefacts, such as laptops, tablets, interactive whiteboards, are being used at all school levels. From an educational perspective, digital artefacts challenge traditional teaching because they provide a range of modes for expression and communication and are not limited to the traditional medium of paper. Digital technologies offer new opportunities for representations and physical interactions with objects, which put forward the role of the body in interaction and learning. From a multimodal perspective the emphasis is on the use of multiple semiotic resources for meaning- making and the study presented here has examined the differential use of semiotic resources by pupils interacting in a digitally designed task in a primary school context. The instances analyzed in this paper come from a case study where the learning task was to create an advertising film in a film-software. The study in focus involves the analysis of a single case with the emphasis on the examination of the classroom setting. The research design used in this paper was based on a micro ethnographic perspective and the empirical material was collected through video recordings of small-group work in order to explore pupils’ communication within the group activity. The designed task described here allowed students to build, share, collaborate upon and publish the redesigned products. The analysis illustrates the variety of communicative modes such as body position, gestures, visualizations, speech and the interaction between these modes and the representations made by the pupils. The findings pointed out the importance of embodied communication during the small- group processes from a learning perspective as well as a pedagogical understanding of pupils’ representations, which were similar from a cultural literacy perspective. These findings open up for discussions with further implications for the school practice concerning the small- group processes as well as the redesigned products. Wider, the findings could point out how multimodal interactions shape the learning experience in the meaning-making processes taking into account that language in a globalized society is more than reading and writing skills.

Keywords: communicative learning, interactive learning environments, pedagogical issues, primary school education

Procedia PDF Downloads 389
184 Multimodal Database of Retina Images for Africa: The First Open Access Digital Repository for Retina Images in Sub Saharan Africa

Authors: Simon Arunga, Teddy Kwaga, Rita Kageni, Michael Gichangi, Nyawira Mwangi, Fred Kagwa, Rogers Mwavu, Amos Baryashaba, Luis F. Nakayama, Katharine Morley, Michael Morley, Leo A. Celi, Jessica Haberer, Celestino Obua

Abstract:

Purpose: The main aim for creating the Multimodal Database of Retinal Images for Africa (MoDRIA) was to provide a publicly available repository of retinal images for responsible researchers to conduct algorithm development in a bid to curb the challenges of ophthalmic artificial intelligence (AI) in Africa. Methods: Data and retina images were ethically sourced from sites in Uganda and Kenya. Data on medical history, visual acuity, ocular examination, blood pressure, and blood sugar were collected. Retina images were captured using fundus cameras (Foru3-nethra and Canon CR-Mark-1). Images were stored on a secure online database. Results: The database consists of 7,859 retinal images in portable network graphics format from 1,988 participants. Images from patients with human immunodeficiency virus were 18.9%, 18.2% of images were from hypertensive patients, 12.8% from diabetic patients, and the rest from normal’ participants. Conclusion: Publicly available data repositories are a valuable asset in the development of AI technology. Therefore, is a need for the expansion of MoDRIA so as to provide larger datasets that are more representative of Sub-Saharan data.

Keywords: retina images, MoDRIA, image repository, African database

Procedia PDF Downloads 79
183 Malaysian ESL Writing Process: A Comparison with England’s

Authors: Henry Nicholas Lee, George Thomas, Juliana Johari, Carmilla Freddie, Caroline Val Madin

Abstract:

Research in comparative and international education often provides value-laden views of an education system within and in between other countries. These views are frequently used by policy makers or educators to explore similarities and differences for, among others, benchmarking purposes. In this study, a comparison is made between Malaysia and England, focusing on the process of writing children went through to create a text, using a multimodal theoretical framework to analyse this comparison. The main purpose is political in nature as it served as an answer to Malaysia’s call for benchmarking of best practices for language learning. Furthermore, the focus on writing in this study adds into more empirical findings about early writers’ writing development and writing improvement, especially for children at the ages of 5-9. In research, comparative studies in English as a Second Language (ESL) writing pedagogy – particularly in Malaysia since the introduction of the Standard- based English Language Curriculum (KSSR) in 2011 as a draft and its full implementation in 2017; reviewed 2018 KSSR-CEFR aligned – has not been done comparatively. In theory, a multimodal theoretical framework somehow allows a logical comparison between first language and ESL which would provide useful insights to illuminate the writing process between Malaysia and England. The comparisons are not representative because of the different school systems in both countries. So far, the literature informs us that the curriculum for language learning is very much emphasised on children’s linguistic abilities, which include their proficiency and mastery of the language, its conventions, and technicalities. However, recent empirical findings suggested that literacy in its concepts and characters need change. In view of this suggestion, the comparison will look at how the process of writing is implemented through the five modes of communication: linguistic, visual, aural, spatial, and gestural. This project draws on data from Malaysia and England, involving 10 teachers, 26 classroom observations, 20 lesson plans, 20 interviews, and 20 brief conversations with teachers. The research focused upon 20 primary children of different genders aged 5-9, and in addition to primary data descriptions, 40 children’s works, 40 brief classroom conversations, 30 classroom photographs, and 30 school compound photographs were undertaken to investigate teachers and children’s use of modes and semiotic resources to design a text. The data were analysed by means of within-case analysis, cross-case analysis, and constant comparative analysis, with an initial stage of data categorisation, followed by general and specific coding, which clustered the data into thematic groups. The study highlights the importance of teachers’ and children’s engagement and interaction with various modes of communication, an adaptation from the English approaches to teaching writing within the KSSR framework and providing ‘voice’ to ESL writers to ensure that both have access to the knowledge and skills required to make decisions in developing multimodal texts and artefacts.

Keywords: comparative education, early writers, KSSR, multimodal theoretical framework, writing development

Procedia PDF Downloads 33
182 Early Depression Detection for Young Adults with a Psychiatric and AI Interdisciplinary Multimodal Framework

Authors: Raymond Xu, Ashley Hua, Andrew Wang, Yuru Lin

Abstract:

During COVID-19, the depression rate has increased dramatically. Young adults are most vulnerable to the mental health effects of the pandemic. Lower-income families have a higher ratio to be diagnosed with depression than the general population, but less access to clinics. This research aims to achieve early depression detection at low cost, large scale, and high accuracy with an interdisciplinary approach by incorporating clinical practices defined by American Psychiatric Association (APA) as well as multimodal AI framework. The proposed approach detected the nine depression symptoms with Natural Language Processing sentiment analysis and a symptom-based Lexicon uniquely designed for young adults. The experiments were conducted on the multimedia survey results from adolescents and young adults and unbiased Twitter communications. The result was further aggregated with the facial emotional cues analyzed by the Convolutional Neural Network on the multimedia survey videos. Five experiments each conducted on 10k data entries reached consistent results with an average accuracy of 88.31%, higher than the existing natural language analysis models. This approach can reach 300+ million daily active Twitter users and is highly accessible by low-income populations to promote early depression detection to raise awareness in adolescents and young adults and reveal complementary cues to assist clinical depression diagnosis.

Keywords: artificial intelligence, COVID-19, depression detection, psychiatric disorder

Procedia PDF Downloads 95
181 NANCY: Combining Adversarial Networks with Cycle-Consistency for Robust Multi-Modal Image Registration

Authors: Mirjana Ruppel, Rajendra Persad, Amit Bahl, Sanja Dogramadzi, Chris Melhuish, Lyndon Smith

Abstract:

Multimodal image registration is a profoundly complex task which is why deep learning has been used widely to address it in recent years. However, two main challenges remain: Firstly, the lack of ground truth data calls for an unsupervised learning approach, which leads to the second challenge of defining a feasible loss function that can compare two images of different modalities to judge their level of alignment. To avoid this issue altogether we implement a generative adversarial network consisting of two registration networks GAB, GBA and two discrimination networks DA, DB connected by spatial transformation layers. GAB learns to generate a deformation field which registers an image of the modality B to an image of the modality A. To do that, it uses the feedback of the discriminator DB which is learning to judge the quality of alignment of the registered image B. GBA and DA learn a mapping from modality A to modality B. Additionally, a cycle-consistency loss is implemented. For this, both registration networks are employed twice, therefore resulting in images ˆA, ˆB which were registered to ˜B, ˜A which were registered to the initial image pair A, B. Thus the resulting and initial images of the same modality can be easily compared. A dataset of liver CT and MRI was used to evaluate the quality of our approach and to compare it against learning and non-learning based registration algorithms. Our approach leads to dice scores of up to 0.80 ± 0.01 and is therefore comparable to and slightly more successful than algorithms like SimpleElastix and VoxelMorph.

Keywords: cycle consistency, deformable multimodal image registration, deep learning, GAN

Procedia PDF Downloads 93
180 Video Summarization: Techniques and Applications

Authors: Zaynab El Khattabi, Youness Tabii, Abdelhamid Benkaddour

Abstract:

Nowadays, huge amount of multimedia repositories make the browsing, retrieval and delivery of video contents very slow and even difficult tasks. Video summarization has been proposed to improve faster browsing of large video collections and more efficient content indexing and access. In this paper, we focus on approaches to video summarization. The video summaries can be generated in many different forms. However, two fundamentals ways to generate summaries are static and dynamic. We present different techniques for each mode in the literature and describe some features used for generating video summaries. We conclude with perspective for further research.

Keywords: video summarization, static summarization, video skimming, semantic features

Procedia PDF Downloads 367
179 Analgesic Efficacy of IPACK Block in Primary Total Knee Arthroplasty (90 CASES)

Authors: Fedili Benamar, Beloulou Mohamed Lamine, Ouahes Hassane, Ghattas Samir

Abstract:

 Background and aims: Peripheral regional anesthesia has been integrated into most analgesia protocols for total knee arthroplasty which considered among the most painful surgeries with a huge potential for chronicization. The adductor canal block (ACB) has gained popularity. Similarly, the IPACK block has been described to provide analgesia of the posterior knee capsule. This study aimed to evaluate the analgesic efficacy of this block in patients undergoing primary PTG. Methods: 90 patients were randomized to receive either an IPACK, an anterior sciatic block, or a sham block (30 patients in each group + multimodal analgesia and a catheter in the KCA adductor canal). GROUP 1 KCA GROUP 2 KCA+BSA GROUP 3 KCA+IPACK The analgesic blocks were done under echo-guidance preoperatively respecting the safety rules, the dose administered was 20 cc of ropivacaine 0.25% was used. We were to assess posterior knee pain 6 hours after surgery. Other endpoints included quality of recovery after surgery, pain scores, opioid requirements (PCA morphine)(EPI info 7.2 analysis). Results: -groups were matched -A predominance of women (4F/1H). -average age: 68 +/-7 years -the average BMI =31.75 kg/m2 +/- 4. -70% of patients ASA2 ,20% ASA3. -The average duration of the intervention: 89 +/- 19 minutes. -Morphine consumption (PCA) significantly higher in group 1 (16mg) & group 2 (8mg) group 3 (4mg) - The groups were matched . -There was a correlation between the use of the ipack block and postoperative pain Conclusions :In a multimodal analgesic protocol, the addition of IPACK block decreased pain scores and morphine consumption ,

Keywords: regional anesthesia, analgesia, total knee arthroplasty, the adductor canal block (acb), the ipack block, pain

Procedia PDF Downloads 31
178 Adaptation of the Scenario Test for Greek-speaking People with Aphasia: Reliability and Validity Study

Authors: Marina Charalambous, Phivos Phylactou, Thekla Elriz, Loukia Psychogios, Jean-Marie Annoni

Abstract:

Background: Evidence-based practices for the evaluation and treatment of people with aphasia (PWA) in Greek are mainly impairment-based. Functional and multimodal communication is usually under assessed and neglected by clinicians. This study explores the adaptation and psychometric testing of the Greek (GR) version of The Scenario Test. The Scenario Test assesses the everyday functional communication of PWA in an interactive multimodal communication setting with the support of an active communication facilitator. Aims: To define the reliability and validity of The Scenario Test GR and discuss its clinical value. Methods & Procedures: The Scenario Test-GR was administered to 54 people with chronic stroke (6+ months post-stroke): 32 PWA and 22 people with stroke without aphasia. Participants were recruited from Greece and Cyprus. All measures were performed in an interview format. Standard psychometric criteria were applied to evaluate reliability (internal consistency, test-retest, and interrater reliability) and validity (construct and known – groups validity) of the Scenario Test GR. Video analysis was performed for the qualitative examination of the communication modes used. Outcomes & Results: The Scenario Test-GR shows high levels of reliability and validity. High scores of internal consistency (Cronbach’s α = .95), test-retest reliability (ICC = .99), and interrater reliability (ICC = .99) were found. Interrater agreement in scores on individual items fell between good and excellent levels of agreement. Correlations with a tool measuring language function in aphasia (the Aphasia Severity Rating Scale of the Boston Diagnostic Aphasia Examination), a measure of functional communication (the Communicative Effectiveness Index), and two instruments examining the psychosocial impact of aphasia (the Stroke and Aphasia Quality of Life questionnaire and the Aphasia Impact Questionnaire) revealed good convergent validity (all ps< .05). Results showed good known – groups validity (Mann-Whitney U = 96.5, p < .001), with significantly higher scores for participants without aphasia compared to those with aphasia. Conclusions: The psychometric qualities of The Scenario Test-GR support the reliability and validity of the tool for the assessment of functional communication for Greek-speaking PWA. The Scenario Test-GR can be used to assess multimodal functional communication, orient aphasia rehabilitation goal setting towards the activity and participation level, and be used as an outcome measure of everyday communication. Future studies will focus on the measurement of sensitivity to change in PWA with severe non-fluent aphasia.

Keywords: the scenario test GR, functional communication assessment, people with aphasia (PWA), tool validation

Procedia PDF Downloads 100