Search results for: multimodal fusion

658 The Optimization of Decision Rules in Multimodal Decision-Level Fusion Scheme

Authors: Andrey V. Timofeev, Dmitry V. Egorov

Abstract:

This paper introduces an original method for parametric optimization of the structure of a multimodal decision-level fusion scheme, which combines the partial classification results obtained from an ensemble of mono-modal classifiers. As a result, a multimodal fusion classifier with the minimum total error rate is obtained.
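
As a rough illustration of decision-level fusion tuned for a minimum total error rate, the sketch below fuses the scores of two mono-modal classifiers with a weighted sum and grid-searches the weights on a small validation set. The abstract does not describe the authors' optimization procedure, so the weighted-sum rule, the grid search, the function names, and the sample scores are all assumptions made for illustration.

```python
from itertools import product


def fuse(scores, weights, threshold=0.5):
    """Weighted sum of per-classifier scores for class 1, thresholded into a decision."""
    total = sum(w * s for w, s in zip(weights, scores))
    return 1 if total >= threshold else 0


def total_error_rate(samples, labels, weights):
    """Fraction of samples whose fused decision disagrees with the true label."""
    wrong = sum(fuse(s, weights) != y for s, y in zip(samples, labels))
    return wrong / len(labels)


def best_weights(samples, labels, steps=10):
    """Grid-search non-negative weights that sum to 1 and minimise the error rate."""
    n = len(samples[0])
    best = None
    for combo in product(range(steps + 1), repeat=n):
        if sum(combo) != steps:
            continue  # keep only weight vectors on the simplex
        weights = tuple(c / steps for c in combo)
        err = total_error_rate(samples, labels, weights)
        if best is None or err < best[1]:
            best = (weights, err)
    return best


# Hypothetical validation scores from two mono-modal classifiers (e.g. audio, video).
samples = [(0.9, 0.4), (0.2, 0.7), (0.6, 0.8), (0.3, 0.1), (0.8, 0.3), (0.1, 0.6)]
labels = [1, 0, 1, 0, 1, 0]
weights, err = best_weights(samples, labels)
print(weights, err)
```

An exhaustive grid is only practical for a handful of classifiers; it is used here because it makes the objective (total error rate as a function of the fusion parameters) easy to see.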

Keywords: classification accuracy, fusion solution, total error rate, multimodal fusion classifier

Procedia PDF Downloads 428

657 Multimodal Data Fusion Techniques in Audiovisual Speech Recognition

Authors: Hadeer M. Sayed, Hesham E. El Deeb, Shereen A. Taie

Abstract:

In the big data era, we are facing a diversity of datasets from different sources in different domains that describe a single life event. These datasets consist of multiple modalities, each of which has a different representation, distribution, scale, and density. Multimodal fusion is the concept of integrating information from multiple modalities in a joint representation with the goal of predicting an outcome through a classification task or regression task. In this paper, multimodal fusion techniques are classified into two main classes: model-agnostic techniques and model-based approaches. It provides a comprehensive study of recent research in each class and outlines the benefits and limitations of each of them. Furthermore, the audiovisual speech recognition task is expressed as a case study of multimodal data fusion approaches, and the open issues through the limitations of the current studies are presented. This paper can be considered a powerful guide for interested researchers in the field of multimodal data fusion and audiovisual speech recognition particularly.

Keywords: multimodal data, data fusion, audio-visual speech recognition, neural networks

Procedia PDF Downloads 73

656 Identity Verification Based on Multimodal Machine Learning on Red Green Blue (RGB) Red Green Blue-Depth (RGB-D) Voice Data

Authors: LuoJiaoyang, Yu Hongyang

Abstract:

In this paper, we experiment with a new approach to multimodal identification using RGB, RGB-D, and voice data. The multimodal combination of RGB and voice data has been applied in tasks such as emotion recognition, where it has shown good results and stability, and the same holds for identity recognition tasks. We believe that data from different modalities can improve the model through mutual reinforcement. We extend the dual-modality setup to three modalities and try to improve the effectiveness of the network by increasing the number of modalities. We also implemented single-modal identification systems separately, tested the data of these different modalities under clean and noisy conditions, and compared their performance with the multimodal model. In designing the multimodal model, we tried a variety of fusion strategies and chose the fusion method with the best performance. The experimental results show that the multimodal system performs better than the single-modality systems, especially in dealing with noise, and that the multimodal system achieves an average improvement of 5%.

Keywords: multimodal, three modalities, RGB-D, identity verification

Procedia PDF Downloads 41

655 Dual Biometrics Fusion Based Recognition System

Authors: Prakash, Vikash Kumar, Vinay Bansal, L. N. Das

Abstract:

Dual biometrics is a subpart of multimodal biometrics, which refers to the use of a variety of modalities, rather than just one, to identify and authenticate persons. Combining several modalities limits the risk of mistakes and leaves attackers little chance of collecting the information. Our goal is to collect the precise characteristics of the iris and palmprint, produce a fusion of both, and ensure that authentication is only successful when the biometrics match a particular user. After combining the different modalities, we created an effective strategy with a mean DI and EER of 2.41 and 5.21, respectively. A biometric system has been proposed.

Keywords: multimodal, fusion, palmprint, Iris, EER, DI

Procedia PDF Downloads 107

654 TMIF: Transformer-Based Multi-Modal Interactive Fusion for Rumor Detection

Authors: Jiandong Lv, Xingang Wang, Cuiling Shao

Abstract:

The rapid development of social media platforms has made them an important news source. While they provide people with convenient real-time communication channels, fake news and rumors also spread rapidly through social media platforms, misleading the public and even causing negative social impact. In view of the slow speed and poor consistency of manual rumor detection, we propose an end-to-end rumor detection model, TMIF, which captures the dependencies between multimodal data based on an interactive attention mechanism, uses a transformer for cross-modal feature sequence mapping, and combines hybrid fusion strategies to obtain decision results. The model is evaluated on two multimodal rumor detection datasets, and the results demonstrate its superior performance and early detection performance.

Keywords: hybrid fusion, multimodal fusion, rumor detection, social media, transformer

Procedia PDF Downloads 186

653 Multi Biomertric Personal Identification System Based On Hybird Intellegence Method

Authors: Laheeb M. Ibrahim, Ibrahim A. Salih

Abstract:

Biometrics is a technology that has been widely used in many official and commercial identification applications. Increased security concerns in recent years have resulted in more attention being given to biometric-based verification techniques. Here, a novel fusion approach combining palmprint and dental traits is suggested. These traits have been employed as authentication techniques in a range of biometric applications that can identify a person both postmortem (PM) and antemortem (AM). Besides improving the accuracy, the fusion of biometrics has several advantages such as increasing, deterring spoofing activities and reducing enrolment failure. In this paper, unimodal biometric systems are first built for the palmprint and dental traits, each classified with an artificial neural network and with a hybrid technique that combines swarm intelligence and a neural network; an attempt is then made to combine palmprint and dental biometrics. Principally, the fusion of palmprint and dental biometrics and their potential application as biometric identifiers are explored. To address this issue, the relative performance of several statistical data fusion techniques for integrating the information in both unimodal and multimodal biometrics is investigated. The results of the multimodal approach are also compared with each of the two single-trait authentication approaches. This paper studies the feature and decision fusion levels in multimodal biometrics. To determine the genuine accept rate (GAR), parallel-system decision fusion with the AND, OR, and majority voting rules is used. With backpropagation used for classification, the GAR is 92%, 99%, and 97%, respectively, while with the hybrid technique used for classification, the GAR is 95%, 99%, and 98%, respectively. Feature-level fusion is also used to determine the accuracy of the multibiometric system, with the same preceding methods used for classification; the results are 98% and 99%, respectively, while the GAR of feature-level fusion obtained with different methods is 98%.
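
The AND, OR, and majority rules used for parallel decision-level fusion can be sketched as follows. This is a minimal illustration, not the authors' implementation: each matcher is assumed to output a boolean accept/reject decision, the function names are hypothetical, and the sample attempts are invented to show how a genuine accept rate (GAR) would be computed under each rule.

```python
def and_rule(decisions):
    """Accept only if every matcher accepts."""
    return all(decisions)


def or_rule(decisions):
    """Accept if at least one matcher accepts."""
    return any(decisions)


def majority_rule(decisions):
    """Accept if strictly more than half of the matchers accept."""
    return sum(decisions) > len(decisions) / 2


def genuine_accept_rate(genuine_attempts, rule):
    """Share of genuine attempts that the fused decision accepts."""
    accepted = sum(1 for decisions in genuine_attempts if rule(decisions))
    return accepted / len(genuine_attempts)


# Hypothetical genuine attempts: (palmprint decision, dental decision).
attempts = [(True, True), (True, False), (False, True), (True, True)]
for rule in (and_rule, or_rule, majority_rule):
    print(rule.__name__, genuine_accept_rate(attempts, rule))
```

With only two matchers, a strict majority requires both to accept, so the majority rule coincides with AND; it behaves differently only with three or more decisions.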

Keywords: back propagation neural network (BP ANN), multibiometric system, parallel system decision-fusion, particle swarm optimization (PSO)

Procedia PDF Downloads 502

652 Multimodal Deep Learning for Human Activity Recognition

Authors: Ons Slimene, Aroua Taamallah, Maha Khemaja

Abstract:

In recent years, human activity recognition (HAR) has been a key area of research due to its diverse applications, and it has garnered increasing attention in the field of computer vision. HAR plays an important role in people’s daily lives, as it can learn advanced knowledge about human activities from data. In HAR, activities are usually captured with different types of sensors, such as embedded sensors or visual sensors. However, these sensors have limitations, such as local obstacles, image-related obstacles, sensor unreliability, and consumer concerns. Recently, several deep learning-based approaches have been proposed for HAR; they fall into two categories based on the type of data used: vision-based approaches and sensor-based approaches. This research paper highlights the importance of multimodal data fusion, combining skeleton data obtained from videos with data generated by embedded sensors, using deep neural networks for HAR. We propose a deep multimodal fusion network based on a two-stream architecture. The two streams use a Convolutional Neural Network combined with a Bidirectional LSTM (CNN-BiLSTM) to process the skeleton data and the embedded-sensor data, and fusion is performed at the feature level. The proposed model was evaluated on the public OPPORTUNITY++ dataset and achieved an accuracy of 96.77%.

Keywords: human activity recognition, action recognition, sensors, vision, human-centric sensing, deep learning, context-awareness

Procedia PDF Downloads 59

651 A Comparative Study on Multimodal Metaphors in Public Service Advertising of China and Germany

Authors: Xing Lyu

Abstract:

Multimodal metaphor promotes the further development and refinement of multimodal discourse study. Cultural aspects matter a great deal not only in creating but also in comprehending multimodal metaphors. By analyzing the target domain and the source domain in 10 public service advertisements about environmental protection from China and Germany, this paper compares the sources used when the targets are alike in each multimodal metaphor in order to identify similarities and differences across cultures. The findings are as follows: first, the multimodal metaphors center on three major topics: the earth crisis, consequences of environmental damage, and appeals for environmental protection; second, the multimodal metaphors are mainly grounded in three universal conceptual metaphors: high level is up; earth is mother; and all lives are precious. However, there are five Chinese culture-specific multimodal metaphors that are not found in the German ads: east is high level; a purposeful life is a journey; a nation is a person; good is clean; and water is mother. Since metaphors are excellent instruments for studying ideology, this study can be helpful for intercultural and cross-cultural communication.

Keywords: multimodal metaphor, cultural aspects, public service advertising, cross-cultural communication

Procedia PDF Downloads 135

650 Analysing Techniques for Fusing Multimodal Data in Predictive Scenarios Using Convolutional Neural Networks

Authors: Philipp Ruf, Massiwa Chabbi, Christoph Reich, Djaffar Ould-Abdeslam

Abstract:

In recent years, convolutional neural networks (CNN) have demonstrated high performance in image analysis, but oftentimes, there is only structured data available regarding a specific problem. By interpreting structured data as images, CNNs can effectively learn and extract valuable insights from tabular data, leading to improved predictive accuracy and uncovering hidden patterns that may not be apparent in traditional structured data analysis. In applying a single neural network for analyzing multimodal data, e.g., both structured and unstructured information, significant advantages in terms of time complexity and energy efficiency can be achieved. Converting structured data into images and merging them with existing visual material offers a promising solution for applying CNNs to multimodal datasets, as they often occur in a medical context. By employing suitable preprocessing techniques, structured data is transformed into image representations, where the respective features are expressed as different formations of colors and shapes. In an additional step, these representations are fused with existing images to incorporate both types of information. The resulting image is then analyzed using a CNN.
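
To make the idea of turning structured data into an image concrete, here is a deliberately simple sketch: each feature is min-max scaled to a grey level and the values are laid out on a square grid. The abstract describes features rendered as "formations of colors and shapes", so this grey-level encoding is a swapped-in simplification, and the function name, feature ranges, and sample row are assumptions.

```python
import math


def row_to_image(row, mins, maxs):
    """Map one tabular row to a square grayscale grid with values in 0..255."""
    pixels = []
    for value, lo, hi in zip(row, mins, maxs):
        # Min-max scale each feature; constant features map to 0.
        scaled = 0.0 if hi == lo else (value - lo) / (hi - lo)
        scaled = min(max(scaled, 0.0), 1.0)  # clip values outside the known range
        pixels.append(round(255 * scaled))
    side = math.ceil(math.sqrt(len(pixels)))
    pixels += [0] * (side * side - len(pixels))  # zero-pad to fill the square
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


# Hypothetical row with three features and their assumed value ranges.
image = row_to_image([20, 120, 1], mins=[0, 80, 0], maxs=[100, 180, 1])
print(image)
```

A grid like this could then be stacked as an extra channel or tiled next to an existing image before being passed to a CNN; which of these the authors use is not stated in the abstract.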

Keywords: CNN, image processing, tabular data, mixed dataset, data transformation, multimodal fusion

Procedia PDF Downloads 78

649 Multimodal Convolutional Neural Network for Musical Instrument Recognition

Authors: Yagya Raj Pandeya, Joonwhoan Lee

Abstract:

The dynamic behavior of music and video makes it difficult for a computer system to evaluate musical instrument playing in a video. Any television or film video clip with music information is a rich source for analyzing musical instruments using modern machine learning technologies. In this research, we integrate the audio and video information sources using convolutional neural networks (CNN) and pass the learned features through a recurrent neural network (RNN) to preserve the dynamic behaviors of audio and video. We use different pre-trained CNNs for music and video feature extraction and then fine-tune each model. The music network uses a 2D convolutional network, and the video network uses 3D convolution (C3D). Finally, we concatenate the music and video features while preserving the time-varying features. A long short-term memory (LSTM) network is used for long-term dynamic feature characterization, followed by late fusion with a generalized mean. The proposed audio-video multimodal neural network achieves better performance in recognizing musical instruments.
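
Late fusion with a generalized (power) mean can be sketched as below. The generalized mean of values x_1..x_n with exponent p is ((x_1^p + ... + x_n^p) / n)^(1/p), with the geometric mean as the p = 0 limit. How the authors choose p and where exactly they apply the mean is not stated in the abstract, so the per-class fusion of two probability vectors, the renormalization step, and the sample probabilities are assumptions.

```python
import math


def generalized_mean(values, p):
    """Power mean of positive values; p = 0 is treated as the geometric mean."""
    if p == 0:
        return math.exp(sum(math.log(v) for v in values) / len(values))
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)


def late_fusion(audio_probs, video_probs, p=1.0):
    """Fuse per-class probabilities from two streams and renormalise to sum to 1."""
    fused = [generalized_mean([a, v], p) for a, v in zip(audio_probs, video_probs)]
    total = sum(fused)
    return [f / total for f in fused]


# Hypothetical class probabilities from the audio and the video stream.
fused = late_fusion([0.7, 0.2, 0.1], [0.5, 0.4, 0.1], p=1.0)
print(fused)
```

Small or negative p pulls the fused score toward the weaker stream (both modalities must agree), while large p pulls it toward the stronger one, which is why the exponent is a useful knob for late fusion.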

Keywords: multimodal, 3D convolution, music-video feature extraction, generalized mean

Procedia PDF Downloads 183

648 Age Determination from Epiphyseal Union of Bones at Shoulder Joint in Girls of Central India

Authors: B. Tirpude, V. Surwade, P. Murkey, P. Wankhade, S. Meena

Abstract:

There are no statistical data establishing the variation in epiphyseal fusion among girls in the central Indian population. This significant oversight can lead to the exclusion of persons of interest in a forensic investigation. Epiphyseal fusion of the proximal end of the humerus in eighty females was analyzed radiologically to assess the range of variation of epiphyseal fusion at each age. In the study, the X-ray films of the subjects were divided into three groups on the basis of the degree of fusion: first, those showing No Epiphyseal Fusion (N); second, those showing Partial Union (PC); and third, those showing Complete Fusion (C). The observations were compared with previous studies.

Keywords: epiphyseal union, shoulder joint, proximal end of humerus

Procedia PDF Downloads 454

647 Performance of Hybrid Image Fusion: Implementation of Dual-Tree Complex Wavelet Transform Technique

Authors: Manoj Gupta, Nirmendra Singh Bhadauria

Abstract:

Most applications in image processing require high spatial and high spectral resolution in a single image. For example, satellite imaging systems, traffic monitoring systems, and long-range sensor fusion systems all use image processing. However, most of the available equipment is not capable of providing this type of data. The sensor in a surveillance system can only cover a small area for a particular focus, yet demanding applications of such a system require a view with high coverage of the field. Image fusion provides the possibility of combining different sources of information. In this paper, we decompose the images using the DTCWT, fuse them using the average and a hybrid (maxima and average) pixel-level technique, and then compare the quality of the two fused images using PSNR.
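
The average and maximum fusion rules and the PSNR comparison can be illustrated with a small sketch. In the paper the rules are applied to DTCWT coefficients; the wavelet decomposition is omitted here, and the rules are applied directly to pixel values so the example stays self-contained. The function names, the 8-bit peak value, and the tiny sample images are assumptions.

```python
import math


def fuse_average(a, b):
    """Element-wise average of two equally sized images."""
    return [[(x + y) / 2 for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def fuse_maximum(a, b):
    """Element-wise maximum of two equally sized images."""
    return [[max(x, y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    diffs = [(r - t) ** 2
             for row_r, row_t in zip(reference, test)
             for r, t in zip(row_r, row_t)]
    mse = sum(diffs) / len(diffs)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


# Hypothetical 2x2 reference image and two degraded source images.
reference = [[100, 200], [50, 150]]
source_a = [[90, 200], [50, 150]]
source_b = [[110, 190], [50, 160]]

averaged = fuse_average(source_a, source_b)
maximum = fuse_maximum(source_a, source_b)
print(psnr(reference, averaged), psnr(reference, maximum))
```

A hybrid scheme would pick one rule per sub-band (for instance, averaging the low-frequency coefficients and taking maxima of the high-frequency ones); the abstract does not say which assignment the authors use.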

Keywords: image fusion, DWT, DT-CWT, PSNR, average image fusion, hybrid image fusion

Procedia PDF Downloads 566

646 OPEN-EmoRec-II-A Multimodal Corpus of Human-Computer Interaction

Authors: Stefanie Rukavina, Sascha Gruss, Steffen Walter, Holger Hoffmann, Harald C. Traue

Abstract:

OPEN-EmoRec-II is an open multimodal corpus with experimentally induced emotions. In the first half of the experiment, emotions were induced with standardized picture material, and in the second half during a human-computer interaction (HCI) realized with a Wizard-of-Oz design. The induced emotions are based on the dimensional theory of emotions (valence, arousal and dominance). With these emotional sequences, recorded as multimodal data (mimic reactions, speech, audio and physiological reactions) in a naturalistic-like HCI environment, one can improve classification methods on a multimodal level. This database is the result of an HCI experiment for which 30 subjects in total agreed to the publication of their data, including the video material, for research purposes. The now available open corpus contains the following sensor signals: video, audio, physiology (SCL, respiration, BVP, EMG Corrugator supercilii, EMG Zygomaticus major) and mimic annotations.

Keywords: open multimodal emotion corpus, annotated labels, intelligent interaction

Procedia PDF Downloads 376

645 New Approach for Constructing a Secure Biometric Database

Authors: A. Kebbeb, M. Mostefai, F. Benmerzoug, Y. Chahir

Abstract:

Multimodal biometric identification is the combination of several biometric systems. The challenge of this combination is to reduce some limitations of systems based on a single modality while significantly improving performance. In this paper, we propose a new approach to the construction and protection of a multimodal biometric database dedicated to an identification system. We use topological watermarking to hide the relation between the face image and the registered descriptors extracted from other modalities of the same person for more secure user identification.

Keywords: biometric databases, multimodal biometrics, security authentication, digital watermarking

Procedia PDF Downloads 335

644 Teaching and Learning with Picturebooks: Developing Multimodal Literacy with a Community of Primary School Teachers in China

Authors: Fuling Deng

Abstract:

Today’s children are frequently exposed to multimodal texts that adopt diverse modes to communicate myriad meanings within different cultural contexts. To respond to the new textual landscape, scholars have considered new literacy theories which propose picturebooks as important educational resources. Picturebooks are multimodal, with their meaning conveyed through the synchronisation of multiple modes, including linguistic, visual, spatial, and gestural acting as access to multimodal literacy. Picturebooks have been popular reading materials in primary educational settings in China. However, often viewed as “easy” texts directed at the youngest readers, picturebooks remain on the margins of Chinese upper primary classrooms, where they are predominantly used for linguistic tasks, with little value placed on their multimodal affordances. Practices with picturebooks in the upper grades in Chinese primary schools also encounter many challenges associated with the curation of texts for use, designing curriculum, and assessment. To respond to these issues, a qualitative study was conducted with a community of Chinese primary teachers using multi-methods such as interviews, focus groups, and documents. The findings showed the impact of the teachers’ increased awareness of picturebooks' multimodal affordances on their pedagogical decisions in using picturebooks as educational resources in upper primary classrooms.

Keywords: picturebook education, multimodal literacy, teachers' response to contemporary picturebooks, community of practice

Procedia PDF Downloads 100

643 Changes in the Median Sacral Crest Associated with Sacrocaudal Fusion in the Greyhound

Authors: S. M. Ismail, H-H Yen, C. M. Murray, H. M. S. Davies

Abstract:

A recent study reported a 33% incidence of complete sacrocaudal fusion in greyhounds compared to a 3% incidence in other dogs. In the dog, the median sacral crest is formed by the fusion of sacral spinous processes. Separation of the 1st spinous process from the median crest of the sacrum in the dog has been reported as a diagnostic tool for type one lumbosacral transitional vertebra (LTV). LTV is a congenital spinal anomaly, which includes either sacralization of the caudal lumbar part or lumbarization of the most cranial sacral segment of the spine. In this study, the absence or reduction of fusion (presence of separation) between the 1st and 2nd spinous processes of the median sacral crest has been identified in association with sacrocaudal fusion in the greyhound, without any feature of LTV. In order to provide quantitative data on the absence or reduction of fusion in the median sacral crest between the 1st and 2nd sacral spinous processes in association with sacrocaudal fusion, 204 dog sacrums free of any pathological changes (192 greyhounds, 9 beagles and 3 labradors) were grouped based on the occurrence and types of fusion and the presence, absence, or reduction of the median sacral crest between the 1st and 2nd sacral spinous processes. Sacrums were described and classified as follows: F: complete fusion (crest is present); N: absence (fusion is absent); and R: short crest (fusion reduced but not absent). Among the 204 sacrums, 57% were standard (3 vertebrae) and 43% were fused (4 vertebrae). Type of sacrum had a significant (p < .05) association with the absence and reduction of fusion between the 1st and 2nd sacral spinous processes of the median sacral crest. In the 108 greyhounds with standard sacrums (3 vertebrae), the percentages of F, N and R were 45%, 23% and 23%, respectively, while in the 84 fused (4 vertebrae) sacrums, the percentages of F, N and R were 3%, 87% and 10%, respectively, and these percentages were significantly different between standard (3 vertebrae) and fused (4 vertebrae) sacrums (p < .05). This indicates that absence of spinous process fusion in the median sacral crest was found in a large percentage of the greyhounds in this study and was particularly prevalent in those with sacrocaudal fusion. Therefore, in this breed at least, absence of sacral spinous process fusion may be unlikely to be associated with LTV.

Keywords: greyhound, median sacral crest, sacrocaudal fusion, sacral spinous process

Procedia PDF Downloads 411

642 Implementation of Sensor Fusion Structure of 9-Axis Sensors on the Multipoint Control Unit

Authors: Jun Gil Ahn, Jong Tae Kim

Abstract:

In this paper, we study the sensor fusion structure on the multipoint control unit (MCU). Sensor fusion using a Kalman filter for 9-axis sensors is considered. The 9-axis inertial sensor is the combination of a 3-axis accelerometer, a 3-axis gyroscope and a 3-axis magnetometer. We implement the sensor fusion structure among the sensor hubs in the MCU and measure the execution time, power consumption, and total energy. Experiments with real data from a 9-axis sensor at 20 MHz show that the average power consumption is 44 mW and 48 mW on the Cortex-M0 and Cortex-M3 MCUs, respectively. Execution times are 613.03 µs and 305.6 µs, respectively.
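
The abstract reports average power and execution time but not the resulting energy figures. A back-of-the-envelope estimate follows from energy = average power × execution time. The sketch below assumes that "respectively" pairs the Cortex-M0 with 613.03 µs and the Cortex-M3 with 305.6 µs, and that the average power holds over the whole execution; it is an estimate under those assumptions, not a value reported by the authors.

```python
def energy_microjoules(power_mw, time_us):
    """Energy in microjoules from average power (mW) and duration (µs).

    mW * µs = nJ, so divide by 1000 to express the result in µJ.
    """
    return power_mw * time_us / 1000.0


# Figures from the abstract; the pairing of MCU and execution time is assumed.
measurements = {
    "Cortex-M0": (44.0, 613.03),
    "Cortex-M3": (48.0, 305.6),
}
estimates = {mcu: energy_microjoules(p, t) for mcu, (p, t) in measurements.items()}
for mcu, energy in estimates.items():
    print(f"{mcu}: {energy:.2f} uJ per run")
```

Under these assumptions, the MCU that draws slightly more power but finishes in about half the time comes out ahead on energy per run, which is the usual reason to report energy alongside power.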

Keywords: 9-axis sensor, Kalman filter, MCU, sensor fusion

Procedia PDF Downloads 466

641 Efficient Feature Fusion for Noise Iris in Unconstrained Environment

Authors: Yao-Hong Tsai

Abstract:

This paper presents an efficient fusion algorithm for iris images to generate stable features for recognition in unconstrained environments. Recently, iris recognition systems have focused on real scenarios in daily life without the subject’s cooperation. Given the large variation in the environment, the objective of this paper is to combine information from multiple images of the same iris. The result of image fusion is a new image that is more stable for further iris recognition than each original noisy iris image. A wavelet-based approach for multi-resolution image fusion is applied in the fusion process. The detection of the iris image is based on the AdaBoost algorithm, and a local binary pattern (LBP) histogram is then applied to texture classification with a weighting scheme. Experiments showed that the features generated by the proposed fusion algorithm can improve the performance of a verification system based on iris recognition.
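
For readers unfamiliar with the LBP histogram mentioned above, the sketch below computes the basic 8-neighbour LBP code of each interior pixel and counts the codes in a 256-bin histogram. It shows only the plain LBP operator; the paper's weighting scheme and any LBP variant it may use are not described in the abstract. The neighbour ordering, the `>=` comparison, and the sample patch are assumptions.

```python
def lbp_code(patch):
    """8-neighbour LBP code of the centre pixel of a 3x3 patch."""
    center = patch[1][1]
    # Neighbours visited clockwise from the top-left corner (ordering is a convention).
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(order):
        if patch[r][c] >= center:
            code |= 1 << bit  # set the bit when the neighbour is at least as bright
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels of a 2D image."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    return hist


# Hypothetical 3x3 grey-level patch.
patch = [[5, 9, 1],
         [4, 6, 7],
         [2, 3, 8]]
print(lbp_code(patch))
```

Histograms computed per image region can then be compared (or weighted per region) to classify texture, which is the role the LBP histogram plays in the abstract.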

Keywords: image fusion, iris recognition, local binary pattern, wavelet

Procedia PDF Downloads 340

640 Multimodal Content: Fostering Students’ Language and Communication Competences

Authors: Victoria L. Malakhova

Abstract:

The research is devoted to multimodal content and its effectiveness in developing students’ linguistic and intercultural communicative competences as an indefeasible constituent of their future professional activity. Description of multimodal content both as a linguistic and didactic phenomenon makes the study relevant. The objective of the article is the analysis of creolized texts and the effect they have on fostering higher education students’ skills and their productivity. The main methods used are linguistic text analysis, qualitative and quantitative methods, deduction, generalization. The author studies texts with full and partial creolization, their features and role in composing multimodal textual space. The main verbal and non-verbal markers and paralinguistic means that enhance the linguo-pragmatic potential of creolized texts are covered. To reveal the efficiency of multimodal content application in English teaching, the author conducts an experiment among both undergraduate students and teachers. This allows specifying main functions of creolized texts in the process of language learning, detecting ways of enhancing students’ competences, and increasing their motivation. The described stages of using creolized texts can serve as an algorithm for work with multimodal content in teaching English as a foreign language. The findings contribute to improving the efficiency of the academic process.

Keywords: creolized text, English language learning, higher education, language and communication competences, multimodal content

Procedia PDF Downloads 82

639 Sampling Two-Channel Nonseparable Wavelets and Its Applications in Multispectral Image Fusion

Authors: Bin Liu, Weijie Liu, Bin Sun, Yihui Luo

Abstract:

To address the lower spatial resolution and block effect that fusion methods based on the separable wavelet transform produce in the fused image, a new sampling mode based on multi-resolution analysis of the two-channel nonseparable wavelet transform, whose dilation matrix is [1,1;1,-1], is presented, and a multispectral image fusion method based on this sampling mode is proposed. Filter banks related to this kind of wavelet are constructed, and multiresolution decomposition of the intensity of the MS and panchromatic images is performed in the sampled mode using the constructed filter banks. The low- and high-frequency coefficients are fused by different fusion rules. The experimental results show that this method has a good visual effect. The fusion performance outperforms the IHS fusion method, as well as the fusion methods based on DWT, IHS-DWT, IHS-Contourlet transform, and IHS-Curvelet transform, in preserving both spectral quality and high spatial resolution information. Furthermore, when compared with the fusion method based on the nonsubsampled two-channel nonseparable wavelet, the proposed method has higher spatial resolution and good global spectral information.
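
The dilation matrix [1,1;1,-1] determines which samples survive one stage of downsampling. The sketch below illustrates only that sampling lattice: a point (x, y) is kept if it equals M·(a, b) for integers a, b, which works out to x + y being even (a checkerboard, or quincunx, pattern). Because |det M| = 2, half the samples are kept per stage, matching a two-channel scheme. The filter banks and fusion rules of the paper are not reproduced, and the function names are hypothetical.

```python
M = ((1, 1), (1, -1))  # dilation matrix from the abstract


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def on_lattice(x, y):
    """True if (x, y) = M @ (a, b) for some integers a and b.

    Solving a + b = x and a - b = y gives a = (x + y) / 2 and b = (x - y) / 2,
    which are integers exactly when x + y is even.
    """
    return (x + y) % 2 == 0


def downsample(image):
    """Keep only the samples whose (row, column) index lies on the lattice."""
    return [[v for c, v in enumerate(row) if on_lattice(r, c)]
            for r, row in enumerate(image)]


# Hypothetical 4x4 image holding the values 0..15 in row-major order.
image = [[4 * r + c for c in range(4)] for r in range(4)]
print(downsample(image))
```

Unlike separable downsampling, which drops every other row and column and keeps a quarter of the samples, this lattice keeps half of them per stage with no preferred horizontal or vertical direction.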

Keywords: image fusion, two-channel sampled nonseparable wavelets, multispectral image, panchromatic image

Procedia PDF Downloads 397

638 A Proposal of Multi-modal Teaching Model for College English

Authors: Huang Yajing

Abstract:

Multimodal discourse refers to the phenomenon of using various senses such as hearing, vision, and touch to communicate through various means and symbolic resources such as language, images, sounds, and movements. With the development of modern technology and multimedia, language and technology have become inseparable, and foreign language teaching is becoming increasingly multimodal. Teacher-student communication resorts to multiple senses and uses multiple symbol systems to construct and interpret meaning. The classroom is a semiotic space where multimodal discourses are intertwined. College English multi-modal teaching rationally utilizes traditional teaching methods while mobilizing and coordinating various modern teaching methods to form a joint force that promotes teaching and learning. Multimodal teaching makes full and reasonable use of various meaning resources and can maximize the advantages of multimedia and network environments. Based on the above theories of multimodal discourse and multimedia technology, the present paper proposes a multi-modal teaching model for college English in China.

Keywords: multimodal discourse, multimedia technology, English education, applied linguistics

Procedia PDF Downloads 13

637 Multimodal Pedagogy for Students’ Creative Expressions in Visual Literacy Education

Authors: Yi Meng, Yun Gao

Abstract:

Having spent significant periods studying and working in North America and Europe, we, as two Chinese art educators, have been profoundly shaped by both Eastern and Western cultures. Consequently, our ambition is to enrich students' learning experiences by delving into and merging both cultural perspectives for innovative, creative expressions. This exposition draws on our action research study on students' visual literacy practices in a visual literacy course at a prominent Chinese university. The central premise was to explore innovative art forms by cross-utilizing various aspects of diverse cultures. By examining distinct cultural elements, we encouraged students to break away from familiar approaches and forge new paths in their creative endeavors. In implementing our curriculum, we utilized a multimodal pedagogy that deviated from the predominant print-based presentations typically employed in our classroom settings. This pedagogical approach effectively encouraged students to critically analyze the artifact, imbue it with their understanding and perspectives, and then produce an original piece. This approach also motivated students to leverage the semiotic potential of various communicative modes to address diverse cultural issues through their multimodal designs. To demonstrate the potential for cultural amalgamation, we utilized the artwork of Hong Kong-based artist Tik Ka. His works epitomize the fusion of Chinese traditions with Western pop culture, which served as a visual and conceptual reference point for students. Seeing how these distinct cultural elements could coexist and enrich each other in Tik Ka's work was inspiring and motivating for the students. Taken together, these pedagogical strategies helped create a dialogical space where students could actively experience, analyze, and negotiate complex modes of expression. 
This environment fostered active learning, encouraging students to apply their knowledge, question their assumptions, and reconsider their perspectives. Overall, such a unique approach to visual literacy education has the potential to reshape students' understanding of both cultures. By encouraging them to critically engage with their multimodal designs, we promoted an in-depth, nuanced appreciation of these diverse cultural heritages. The students no longer just interpreted and replicated images—they actively contributed to a dynamic and ongoing conversation between cultures.

Keywords: multimodal pedagogy, creative expressions, visual literacy education, multimodal designs

Procedia PDF Downloads 39

636 An Exploration of Promoting EFL Students’ Language Learning Autonomy Using Multimodal Teaching - A Case Study of an Art University in Western China

Authors: Dian Guan

Abstract:

With the wide application of multimedia and the Internet, the development of teaching theories, and the implementation of teaching reforms, many different university English classroom teaching modes have emerged. The university English teaching mode is changing from the traditional mode based on conversation and text to a multimodal mode that incorporates discussion, pictures, audio, film, etc. Applying multimodal university English teaching models is conducive to cultivating lifelong learning skills, which can also be called learners' autonomous learning skills. Learners' independent learning ability has a significant impact on English learning. However, many university students, especially art and design students, do not know how to learn independently. When they enter university, their English foundation is relatively weak because they have always memorized the language in a traditional way, which, to a certain extent, neglects the cultivation of learners' independent learning ability. As a result, the autonomous learning ability of most university students is not satisfactory. The participants in this study were 60 students and one teacher in their first year at a university in western China. Two observations and interviews were conducted inside and outside the classroom to understand the impact of a multimodal teaching model of university English on students' autonomous learning ability. The results were analyzed, and it was found that the multimodal teaching model of university English significantly affected learners' autonomy. Incorporating classroom presentations and poster exhibitions into multimodal teaching can increase learners' interest in learning and enhance their learning ability outside the classroom. However, further exploration is needed to develop multimodal teaching materials and evaluate multimodal teaching outcomes.
Despite the limitations of this study, the study adopts a scientific research method to analyze the impact of the multimodal teaching mode of university English on students' independent learning ability. It puts forward a different outlook for further research on this topic.

Keywords: art university, EFL education, learner autonomy, multimodal pedagogy

Procedia PDF Downloads 37

635 Multimodal Characterization of Emotion within Multimedia Space

Authors: Dayo Samuel Banjo, Connice Trimmingham, Niloofar Yousefi, Nitin Agarwal

Abstract:

Technological advancement and its omnipresent connection have pushed humans past the boundaries and limitations of a computer screen, physical state, or geographical location. It has provided a depth of avenues that facilitate human-computer interaction that was once inconceivable such as audio and body language detection. Given the complex modularities of emotions, it becomes vital to study human-computer interaction, as it is the commencement of a thorough understanding of the emotional state of users and, in the context of social networks, the producers of multimodal information. This study first acknowledges the accuracy of classification found within multimodal emotion detection systems compared to unimodal solutions. Second, it explores the characterization of multimedia content produced based on their emotions and the coherence of emotion in different modalities by utilizing deep learning models to classify emotion across different modalities.

Keywords: affective computing, deep learning, emotion recognition, multimodal

Procedia PDF Downloads 107

634 Variations in the Angulation of the First Sacral Spinous Process Angle Associated with Sacrocaudal Fusion in Greyhounds

Authors: Sa'ad M. Ismail, Hung-Hsun Yen, Christina M. Murray, Helen M. S. Davies

Abstract:

In the dog, the median sacral crest is formed by the fusion of three sacral spinous processes. In greyhounds with standard sacrums, the median sacral crest consists of three fused sacral spinous processes, whereas it consists of four in greyhounds with sacrocaudal fusion. In the present study, variations in the angulation of the first sacral spinous process in association with different types of sacrocaudal fusion in the greyhound were investigated. Sacrums were collected from 207 greyhounds (102 sacrums of type A (unfused) and 105 with different types of sacrocaudal fusion: types B, C, and D). Sacrums were cleaned by boiling, dried, placed on their ventral surface on a flat surface, and photographed from the left side using a digital camera at a fixed distance. The first sacral spinous process angle (1st SPA) was defined as the angle formed between the cranial border of the cranial ridge of the first sacral spinous process and the line extending across the most dorsal surface points of the spinous processes of S1, S2, and S3. Image-Pro Express Version 5.0 imaging software was used to draw and measure the angles. Two photographs were taken of each sacrum, and two repeat measurements were taken of each angle. The mean value of the 1st SPA in greyhounds with sacrocaudal fusion was lower (98.99°, SD ± 11, n = 105) than that in greyhounds with standard sacrums (99.77°, SD ± 9.18, n = 102), but the difference was not significant at the 0.05 level. Among greyhounds with different types of sacrocaudal fusion, the mean value of the 1st SPA was as follows: type B: 97.73°, SD ± 10.94, n = 39; type C: 101.42°, SD ± 10.51, n = 52; and type D: 94.22°, SD ± 11.30, n = 12. For all types of fusion, these angles were significantly different from each other (P < 0.05).
Comparing the mean value of the 1st SPA in standard sacrums (type A) with that for each type of fusion separately showed that the only significantly different angulation (P < 0.05) was between standard sacrums and sacrums with sacrocaudal fusion type D (only body fusion between S1 and Ca1). Different types of sacrocaudal fusion were associated with variations in the angle of the first sacral spinous process. These variations may affect the alignment and biomechanics of the sacral area and the pattern of movement and/or the force produced by both hind limbs to the cranial parts of the body and may alter the loading of other parts of the body. We concluded that any variations in the sacrum anatomical features might change the function of the sacrum or surrounding anatomical structures during movement.

Keywords: angulation of first sacral spinous process, biomechanics, greyhound, locomotion, sacrocaudal fusion

Procedia PDF Downloads 272

633 Multi-Channel Information Fusion in C-OTDR Monitoring Systems: Various Approaches to Classify of Targeted Events

Authors: Andrey V. Timofeev

Abstract:

The paper presents new results concerning the selection of an optimal information fusion formula for ensembles of C-OTDR channels. The goal of information fusion is to create an integral classifier designed for effective classification of seismoacoustic target events. The LPBoost (LP-β and LP-B variants), Multiple Kernel Learning, and Weighing of Inversely as Lipschitz Constants (WILC) approaches were compared. WILC is a brand new approach to the optimal fusion of Lipschitz classifier ensembles. Results of practical usage are presented.

Keywords: Lipschitz Classifier, classifiers ensembles, LPBoost, C-OTDR systems

Procedia PDF Downloads 426

632 Variations in the 7th Lumbar (L7) Vertebra Length Associated with Sacrocaudal Fusion in Greyhounds

Authors: Sa`ad M. Ismail, Hung-Hsun Yen, Christina M. Murray, Helen M. S. Davies

Abstract:

The lumbosacral junction (where the 7th lumbar vertebra (L7) articulates with the sacrum) is a clinically important area in the dog. The L7 is normally shorter than other lumbar vertebrae, and it has been reported that variations in the L7 length may be associated with other abnormal anatomical findings. These variations included the reduction or absence of a portion of the median sacral crest. In this study, 53 greyhound cadavers were placed in right lateral recumbency, and two lateral radiographs were taken of the lumbosacral region of each greyhound. The lengths of the 6th lumbar (L6) vertebra and L7 were measured using radiographic measurement software; each length was defined as the mean of three lines drawn from the caudal to the cranial edge of the vertebra (a dorsal, middle, and ventral line) between specific landmarks. Sacrocaudal fusion was found in 41.5% of the greyhounds. The mean values of the length of L6, the length of L7, and the L6/L7 length ratio of the greyhounds with sacrocaudal fusion were all greater than those of greyhounds with standard sacrums (three sacral vertebrae). There was a significant difference (P < 0.05) in the mean length of L7 between the greyhounds without sacrocaudal fusion (mean = 29.64, SD ± 2.07) and those with sacrocaudal fusion (mean = 30.86, SD ± 1.80), but there was no significant difference in the mean length of L6. Among the different types of sacrocaudal fusion, the longest L7 was found in greyhounds with sacrum type D, an intermediate length in those with sacrum type B, and the shortest in those with sacrum type C; the mean values of the L6/L7 ratio were 1.11 (SD ± 0.043), 1.15 (SD ± 0.025), and 1.15 (SD ± 0.011) for types B, C, and D, respectively. No significant differences in the mean values of the length of L6 or L7 were found among the different types of sacrocaudal fusion.
The occurrence of sacrocaudal fusion might affect direct anatomically connected structures such as the L7. The variation in the length of L7 between greyhounds with sacrocaudal fusion and those without may reflect the possible sequences of the process of fusion. Variations in the length of the L7 vertebra in greyhounds may be associated with the occurrence of sacrocaudal fusion. The variation in the vertebral length may affect the alignment and biomechanical properties of the sacrum and may alter the loading. We concluded that any variations in the sacrum anatomical features might change the function of the sacrum or the surrounding anatomical structures.

Keywords: biomechanics, Greyhound, sacrocaudal fusion, locomotion, 6th Lumbar (L6) Vertebra, 7th Lumbar (L7) Vertebra, ratio of the L6/L7 length

Procedia PDF Downloads 324

631 Clinical Relevance of TMPRSS2-ERG Fusion Marker for Prostate Cancer

Authors: Shalu Jain, Anju Bansal, Anup Kumar, Sunita Saxena

Abstract:

Objectives: The novel TMPRSS2:ERG gene fusion is a common somatic event in prostate cancer that in some studies is linked with a more aggressive disease phenotype. Thus, this study aims to determine whether clinical variables are associated with the presence of the TMPRSS2:ERG fusion gene transcript in Indian patients with prostate cancer. Methods: We evaluated the association of clinical variables with the presence or absence of TMPRSS2:ERG gene fusion in prostate cancer and BPH patients. Patients referred for prostate biopsy because of abnormal DRE and/or elevated sPSA were enrolled in this prospective clinical study. TMPRSS2:ERG mRNA copies were quantified in prostate biopsy samples (N=42) using a TaqMan chemistry-based real-time PCR assay. The T2:ERG assay detects the gene fusion mRNA isoform TMPRSS2 exon1 to ERG exon4. Results: Histopathology reports confirmed 25 cases as prostate adenocarcinoma (PCa) and 17 patients as benign prostate hyperplasia (BPH). Out of 25 PCa cases, 16 (64%) were T2:ERG fusion positive. All 17 BPH controls were fusion negative. The T2:ERG fusion transcript was exclusively specific for prostate cancer, as no case of BPH was detected having T2:ERG fusion, showing 100% specificity. The positive predictive value of the fusion marker for prostate cancer is thus 100% and the negative predictive value is 65.3%. The T2:ERG fusion marker is significantly associated with clinical variables such as the number of positive cores in prostate biopsy, Gleason score, serum PSA, perineural invasion, perivascular invasion and periprostatic fat involvement. Conclusions: Prostate cancer is a heterogeneous disease that may be defined by molecular subtypes such as the TMPRSS2:ERG fusion. In the present prospective study, the T2:ERG quantitative assay demonstrated high specificity for predicting biopsy outcome; sensitivity was similar to the prevalence of T2:ERG gene fusions in prostate tumors.
These data suggest that further improvement in diagnostic accuracy could be achieved using a nomogram that combines T2:ERG with other markers and risk factors for prostate cancer.
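
The reported diagnostic figures can be reproduced from the counts given in the abstract (25 PCa cases of which 16 were fusion positive, and 17 BPH controls that were all fusion negative). The following is a minimal sketch, not the authors' analysis; the function name `diagnostic_metrics` is a hypothetical helper introduced here for illustration. Note that 17/26 rounds to 65.4%, so the 65.3% quoted in the abstract appears to be truncated rather than rounded.

```python
# Illustrative sketch: standard 2x2 diagnostic metrics computed from the
# counts stated in the abstract. `diagnostic_metrics` is a hypothetical helper.

def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV as fractions."""
    return {
        "sensitivity": tp / (tp + fn),  # fusion-positive share of PCa cases
        "specificity": tn / (tn + fp),  # fusion-negative share of BPH controls
        "ppv": tp / (tp + fp),          # PCa share of fusion-positive samples
        "npv": tn / (tn + fn),          # BPH share of fusion-negative samples
    }

# 16 of 25 PCa cases were fusion positive; all 17 BPH controls were negative.
metrics = diagnostic_metrics(tp=16, fp=0, fn=25 - 16, tn=17)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With these counts the sensitivity equals the 64% fusion-positive rate among the PCa cases, which is consistent with the statement that sensitivity was similar to the prevalence of the fusion in prostate tumors.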

Keywords: prostate cancer, genetic rearrangement, TMPRSS2:ERG fusion, clinical variables

Procedia PDF Downloads 414

630 Integrating Time-Series and High-Spatial Remote Sensing Data Based on Multilevel Decision Fusion

Authors: Xudong Guan, Ainong Li, Gaohuan Liu, Chong Huang, Wei Zhao

Abstract:

Due to the low spatial resolution of MODIS data, the accuracy of small-area plaque extraction with a high degree of landscape fragmentation is greatly limited. To this end, the study combines Landsat data with higher spatial resolution and MODIS data with higher temporal resolution for decision-level fusion. Considering the importance of the land heterogeneity factor in the fusion process, it is superimposed on the weighting factor used to linearly weight the Landsat classification result and the MODIS classification result. Three levels were used to complete the process of data fusion, namely the MODIS pixel level, the Landsat pixel level, and an object level that connects these two levels. The multilevel decision fusion scheme was tested at two sites in the lower Mekong basin. A comparison test showed that the overall classification accuracy was improved compared with the single-data-source classification results. The method was also compared with the two-level combination results and a weighted sum decision rule-based approach. The decision fusion scheme is extensible to other multi-resolution data decision fusion applications.
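
The linear weighting of the two classification results can be illustrated with a small sketch. This is an assumption-laden example, not the authors' scheme: the function `fuse_pixel`, the class names, the per-class scores, and the rule that a higher heterogeneity value shifts weight toward the Landsat result are all hypothetical choices made here for illustration, and the object level described in the abstract is not modeled.

```python
# Illustrative sketch of decision-level fusion by linear weighting.
# Assumption: `heterogeneity` lies in [0, 1] and is used directly as the
# weight of the Landsat result; the abstract does not specify this mapping.

def fuse_pixel(landsat_scores, modis_scores, heterogeneity):
    """Fuse two per-class score vectors for one pixel; return (label, scores)."""
    if len(landsat_scores) != len(modis_scores):
        raise ValueError("score vectors must cover the same classes")
    if not 0.0 <= heterogeneity <= 1.0:
        raise ValueError("heterogeneity must lie in [0, 1]")
    w = heterogeneity  # weight given to the Landsat classification result
    fused = [w * l + (1.0 - w) * m
             for l, m in zip(landsat_scores, modis_scores)]
    # The fused decision is the class with the largest weighted score.
    label = max(range(len(fused)), key=fused.__getitem__)
    return label, fused


classes = ["forest", "cropland", "water"]  # hypothetical class names
landsat = [0.2, 0.7, 0.1]                  # made-up per-class scores
modis = [0.6, 0.3, 0.1]

fragmented_label, _ = fuse_pixel(landsat, modis, heterogeneity=0.8)
homogeneous_label, _ = fuse_pixel(landsat, modis, heterogeneity=0.1)
print(classes[fragmented_label], classes[homogeneous_label])
```

In this toy case the fused label follows the Landsat result where the landscape is fragmented and the MODIS result where it is homogeneous, which is the intuition behind weighting by land heterogeneity.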

Keywords: image classification, decision fusion, multi-temporal, remote sensing

Procedia PDF Downloads 92

629 Multimodal Sentiment Analysis With Web Based Application

Authors: Shreyansh Singh, Afroz Ahmed

Abstract:

Sentiment analysis aims to automatically uncover the underlying attitude that we hold towards an entity. The aggregation of this sentiment over a population represents opinion polling and has numerous applications. Current text-based sentiment analysis relies on the construction of word embeddings and machine learning models that learn sentiment from large text corpora. Sentiment analysis from text is currently widely used for customer satisfaction assessment and brand perception analysis. With the proliferation of social media, multimodal sentiment analysis is set to bring new opportunities with the arrival of complementary data streams for improving and going beyond text-based sentiment analysis using the new transform methods. Since sentiment can be detected through the affective traces it leaves, such as facial and vocal displays, multimodal sentiment analysis offers promising avenues for analyzing facial and vocal expressions in addition to the transcript or textual content. These approaches use Recurrent Neural Networks (RNNs) with LSTM units to increase their performance. In this study, we define sentiment and the problem of multimodal sentiment analysis and review recent developments in multimodal sentiment analysis in various domains, including spoken reviews, images, video blogs, and human-machine and human-human interactions. Challenges and opportunities of this emerging field are also discussed, supporting our thesis that multimodal sentiment analysis holds significant untapped potential.

Keywords: sentiment analysis, RNN, LSTM, word embeddings

Procedia PDF Downloads 77