Search results for: multimodal optimization
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3456

3456 The Optimization of Decision Rules in Multimodal Decision-Level Fusion Scheme

Authors: Andrey V. Timofeev, Dmitry V. Egorov

Abstract:

This paper introduces an original method for parametric optimization of the structure of a multimodal decision-level fusion scheme, which combines the partial solutions of a classification task obtained from an assembly of mono-modal classifiers. As a result, a multimodal fusion classifier with the minimum total error rate is obtained.
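As a rough illustration of choosing a decision rule that minimizes the total error rate, the sketch below searches over k-out-of-n voting rules applied to the decisions of several mono-modal classifiers. The k-out-of-n rule family, the toy votes, and all function names are assumptions made for illustration; the abstract does not specify the authors' parametrization or optimization method.

```python
from typing import List, Sequence, Tuple


def fuse(decisions: Sequence[int], k: int) -> int:
    """k-out-of-n rule: output 1 when at least k classifiers voted 1."""
    return 1 if sum(decisions) >= k else 0


def total_error_rate(votes: List[Sequence[int]], labels: List[int], k: int) -> float:
    """Fraction of samples on which the fused decision differs from the label."""
    errors = sum(fuse(v, k) != y for v, y in zip(votes, labels))
    return errors / len(labels)


def best_rule(votes: List[Sequence[int]], labels: List[int]) -> Tuple[int, float]:
    """Try every threshold k = 1..n and keep the one with the lowest error rate."""
    n = len(votes[0])
    candidates = [(k, total_error_rate(votes, labels, k)) for k in range(1, n + 1)]
    return min(candidates, key=lambda pair: pair[1])


# Hypothetical toy data: each tuple holds the binary decisions of three
# mono-modal classifiers for one sample.
votes = [(1, 1, 0), (1, 0, 1), (0, 1, 1), (1, 0, 0), (0, 0, 0), (0, 1, 0)]
labels = [1, 1, 1, 0, 0, 0]

k, err = best_rule(votes, labels)
print(f"best k = {k}, total error rate = {err:.2f}")
```

On this toy data, majority voting (k = 2) separates the two classes, while k = 1 and k = 3 each misclassify some samples.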

Keywords: classification accuracy, fusion solution, total error rate, multimodal fusion classifier

Procedia PDF Downloads 466
3455 The Whale Optimization Algorithm and Its Implementation in MATLAB

Authors: S. Adhirai, R. P. Mahapatra, Paramjit Singh

Abstract:

Optimization is an important tool in making decisions and in analysing physical systems. In mathematical terms, an optimization problem is the problem of finding the best solution from among the set of all feasible solutions. The paper discusses the Whale Optimization Algorithm (WOA) and its applications in different fields. The algorithm is tested using MATLAB because of its unique and powerful features. The benchmark functions used to test WOA are grouped as unimodal (F1-F7), multimodal (F8-F13), and fixed-dimension multimodal (F14-F23). Of these benchmark functions, we show the experimental results for F7, F11, and F19 for different numbers of iterations. The search space and objective space for the selected functions are drawn, and finally, the best solution and the optimal value of the objective function found by WOA are presented. The algorithmic results demonstrate that WOA performs better than the state-of-the-art meta-heuristic and conventional algorithms.
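To make the unimodal/multimodal distinction concrete, the sketch below defines two widely used test functions: the sphere function (a single minimum) and the Rastrigin function (many local minima). These are illustrative choices written in Python rather than MATLAB; the abstract does not state which functions correspond to which F-number, so no such mapping is assumed here.

```python
import math
from typing import Sequence


def sphere(x: Sequence[float]) -> float:
    """Unimodal: one minimum, value 0 at the origin."""
    return sum(v * v for v in x)


def rastrigin(x: Sequence[float]) -> float:
    """Multimodal: global minimum 0 at the origin, plus many local minima
    created by the cosine term."""
    return sum(v * v - 10.0 * math.cos(2.0 * math.pi * v) + 10.0 for v in x)


# Both functions agree at the global optimum.
print(sphere([0.0, 0.0]), rastrigin([0.0, 0.0]))

# Halfway between two integer points, the cosine term lifts the Rastrigin
# surface well above the sphere, which is what forms the separate basins.
print(sphere([0.5, 0.0]), rastrigin([0.5, 0.0]))
```

A local search started in one of the Rastrigin basins away from the origin tends to stay there, which is why multimodal functions are used to test how well a metaheuristic escapes local optima.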

Keywords: optimization, optimal value, objective function, optimization problems, meta-heuristic optimization algorithms, Whale Optimization Algorithm, implementation, MATLAB

Procedia PDF Downloads 371
3454 Identity Verification Based on Multimodal Machine Learning on Red Green Blue (RGB) Red Green Blue-Depth (RGB-D) Voice Data

Authors: LuoJiaoyang, Yu Hongyang

Abstract:

In this paper, we experimented with a new approach to multimodal identification using RGB, RGB-D and voice data. The multimodal combination of RGB and voice data has been applied in tasks such as emotion recognition, where it has shown good results and stability, and the same holds for identity recognition tasks. We believe that data from different modalities can improve the model through mutual reinforcement. We extend the dual-modality setting to three modalities and try to improve the effectiveness of the network by increasing the number of modalities. We also implemented single-modal identification systems separately, tested the data of these different modalities under clean and noisy conditions, and compared their performance with that of the multimodal model. In designing the multimodal model, we tried a variety of fusion strategies and chose the fusion method with the best performance. The experimental results show that the multimodal system performs better than the single-modality systems, especially in dealing with noise, and that the multimodal system can achieve an average improvement of 5%.

Keywords: multimodal, three modalities, RGB-D, identity verification

Procedia PDF Downloads 70
3453 Improved Particle Swarm Optimization with Cellular Automata and Fuzzy Cellular Automata

Authors: Ramin Javadzadeh

Abstract:

Particle swarm optimization is a metaheuristic optimization method that is widely used in clustering and pattern recognition applications. In multimodal optimization problems, these algorithms are more efficient than genetic algorithms. Their major drawbacks are slow convergence to the global optimum and weak stability across different runs. In this paper, an improved particle swarm optimization is introduced for the first time to overcome these problems. Fuzzy cellular automata are used to improve the efficiency of the algorithm. The credibility of the proposed approach is evaluated by simulations, which show that the proposed approach achieves better results than particle swarm optimization algorithms.
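For reference, a minimal sketch of the baseline particle swarm optimization that the paper sets out to improve might look as follows. It does not include the cellular automata or fuzzy cellular automata extensions, which the abstract does not describe in detail; the parameter values (inertia `w`, coefficients `c1` and `c2`, swarm size) and the test function are assumptions chosen for illustration.

```python
import random
from typing import Callable, List, Sequence, Tuple


def pso(
    f: Callable[[Sequence[float]], float],
    dim: int,
    bounds: Tuple[float, float],
    n_particles: int = 20,
    iters: int = 200,
    w: float = 0.7,    # inertia weight (assumed value)
    c1: float = 1.5,   # pull towards each particle's own best (assumed value)
    c2: float = 1.5,   # pull towards the swarm's best (assumed value)
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Global-best PSO that minimizes f inside a box; returns (position, value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (
                    w * vel[i][d]
                    + c1 * r1 * (pbest[i][d] - pos[i][d])
                    + c2 * r2 * (gbest[d] - pos[i][d])
                )
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x: Sequence[float]) -> float:
    """Simple test function with its minimum, 0, at the origin."""
    return sum(v * v for v in x)


best, best_val = pso(sphere, dim=2, bounds=(-5.0, 5.0))
print(best, best_val)
```

The fixed seed makes repeated runs reproducible. Varying the seed is one simple way to observe the run-to-run variability that the abstract lists as a drawback.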

Keywords: cellular automata, cellular learning automata, local search, optimization, particle swarm optimization

Procedia PDF Downloads 607
3452 A Comparative Study on Multimodal Metaphors in Public Service Advertising of China and Germany

Authors: Xing Lyu

Abstract:

Multimodal metaphor promotes the further development and refinement of multimodal discourse study. Cultural aspects matter a lot not only in creating but also in comprehending multimodal metaphor. By analyzing the target domain and the source domain in 10 public service advertisements of China and Germany about environmental protection, this paper compares the source when the target is alike in each multimodal metaphor in order to seek similarities and differences across cultures. The findings are as follows: first, the multimodal metaphors center around three major topics: the earth crisis, consequences of environmental damage, and appeal for environmental protection; second, the multimodal metaphors are mainly grounded in three universal conceptual metaphors: high level is up; earth is mother; and all lives are precious. However, there are five Chinese culture-specific multimodal metaphors which are not found in the German ads: east is high level; a purposeful life is a journey; a nation is a person; good is clean; and water is mother. Since metaphors are excellent instruments for studying ideology, this study can be helpful for intercultural/cross-cultural communication.

Keywords: multimodal metaphor, cultural aspects, public service advertising, cross-cultural communication

Procedia PDF Downloads 173
3451 Multimodal Optimization of Density-Based Clustering Using Collective Animal Behavior Algorithm

Authors: Kristian Bautista, Ruben A. Idoy

Abstract:

A bio-inspired metaheuristic algorithm based on the theory of collective animal behavior (CAB) was integrated into density-based clustering modeled as a multimodal optimization problem. The algorithm was tested on synthetic, Iris, Glass, Pima and Thyroid data sets in order to measure its effectiveness relative to the CDE-based clustering algorithm. Upon preliminary testing, it was found that one of the parameter settings used was ineffective in performing clustering when applied to the algorithm, prompting the researcher to investigate. It was revealed that fine-tuning the distance δ3, which determines the extent to which a given data point will be clustered, helped improve the quality of the cluster output. Even though the modification of distance δ3 significantly improved the solution quality and cluster output of the algorithm, results suggest that there is no difference between the population means of the solutions obtained using the original and modified parameter settings for all data sets. This implies that using either the original or the modified parameter setting will not have any effect on obtaining the best global and local animal positions. Results also suggest that the CDE-based clustering algorithm is better than the CAB-density clustering algorithm for all data sets. Nevertheless, the CAB-density clustering algorithm is still a good clustering algorithm because it correctly identified the number of classes of some data sets more frequently in a thirty-trial run with a much smaller standard deviation, which indicates potential for clustering high-dimensional data sets. Thus, the researcher recommends further investigation of the post-processing stage of the algorithm.

Keywords: clustering, metaheuristics, collective animal behavior algorithm, density-based clustering, multimodal optimization

Procedia PDF Downloads 230
3450 Multimodal Data Fusion Techniques in Audiovisual Speech Recognition

Authors: Hadeer M. Sayed, Hesham E. El Deeb, Shereen A. Taie

Abstract:

In the big data era, we are facing a diversity of datasets from different sources in different domains that describe a single life event. These datasets consist of multiple modalities, each of which has a different representation, distribution, scale, and density. Multimodal fusion is the concept of integrating information from multiple modalities in a joint representation with the goal of predicting an outcome through a classification task or regression task. In this paper, multimodal fusion techniques are classified into two main classes: model-agnostic techniques and model-based approaches. The paper provides a comprehensive study of recent research in each class and outlines the benefits and limitations of each of them. Furthermore, the audiovisual speech recognition task is presented as a case study of multimodal data fusion approaches, and the open issues arising from the limitations of current studies are discussed. This paper can be considered a powerful guide for researchers interested in the field of multimodal data fusion, and in audiovisual speech recognition in particular.

Keywords: multimodal data, data fusion, audio-visual speech recognition, neural networks

Procedia PDF Downloads 111
3449 Implementation of a Multimodal Biometrics Recognition System with Combined Palm Print and Iris Features

Authors: Rabab M. Ramadan, Elaraby A. Elgallad

Abstract:

With extensive application, unimodal biometric systems have to face a diversity of problems such as signal and background noise, distortion, and environment differences. Therefore, multimodal biometric systems are proposed to solve the above stated problems. This paper introduces a bimodal biometric recognition system based on the extracted features of the human palm print and iris. Palm print biometrics is a fairly new, evolving technology that is used to identify people by their palm features. The iris is a strong competitor together with face and fingerprints for presence in multimodal recognition systems. In this research, we introduce an algorithm for combining the palm and iris-extracted features using a texture-based descriptor, the Scale Invariant Feature Transform (SIFT). Since the feature sets are non-homogeneous, as features of different biometric modalities are used, these features will be concatenated to form a single feature vector. Particle swarm optimization (PSO) is used as a feature selection technique to reduce the dimensionality of the feature vector. The proposed algorithm will be applied to the Indian Institute of Technology Delhi (IITD) database and its performance will be compared with various iris recognition algorithms found in the literature.

Keywords: iris recognition, particle swarm optimization, feature extraction, feature selection, palm print, the Scale Invariant Feature Transform (SIFT)

Procedia PDF Downloads 235
3448 OPEN-EmoRec-II-A Multimodal Corpus of Human-Computer Interaction

Authors: Stefanie Rukavina, Sascha Gruss, Steffen Walter, Holger Hoffmann, Harald C. Traue

Abstract:

OPEN-EmoRec-II is an open multimodal corpus with experimentally induced emotions. In the first half of the experiment, emotions were induced with standardized picture material and in the second half during a human-computer interaction (HCI), realized with a wizard-of-oz design. The induced emotions are based on the dimensional theory of emotions (valence, arousal and dominance). With these emotional sequences, recorded as multimodal data (mimic reactions, speech, audio and physiological reactions) in a naturalistic-like HCI environment, one can improve classification methods on a multimodal level. This database is the result of an HCI experiment, for which 30 subjects in total agreed to a publication of their data including the video material for research purposes. The now available open corpus contains the following sensory signals: video, audio, physiology (SCL, respiration, BVP, EMG Corrugator supercilii, EMG Zygomaticus Major) and mimic annotations.

Keywords: open multimodal emotion corpus, annotated labels, intelligent interaction

Procedia PDF Downloads 416
3447 New Approach for Constructing a Secure Biometric Database

Authors: A. Kebbeb, M. Mostefai, F. Benmerzoug, Y. Chahir

Abstract:

Multimodal biometric identification is the combination of several biometric systems. The challenge of this combination is to reduce some limitations of systems based on a single modality while significantly improving performance. In this paper, we propose a new approach to the construction and the protection of a multimodal biometric database dedicated to an identification system. We use topological watermarking to hide the relation between the face image and the registered descriptors extracted from other modalities of the same person for more secure user identification.

Keywords: biometric databases, multimodal biometrics, security authentication, digital watermarking

Procedia PDF Downloads 390
3446 Harmonizing Cities: Integrating Land Use Diversity and Multimodal Transit for Social Equity

Authors: Zi-Yan Chao

Abstract:

With the rapid development of urbanization and increasing demand for efficient transportation systems, the interaction between land use diversity and transportation resource allocation has become a critical issue in urban planning. Achieving a balance of land use types, such as residential, commercial, and industrial areas, plays a crucial role in ensuring social equity and sustainable urban development. Simultaneously, optimizing multimodal transportation networks, including bus, subway, and car routes, is essential for minimizing total travel time and costs, while ensuring fairness for all social groups, particularly in meeting the transportation needs of low-income populations. This study develops a bilevel programming model to address these challenges, with land use diversity as the foundation for measuring equity. The upper-level model maximizes land use diversity for balanced land distribution across regions. The lower-level model optimizes multimodal transportation networks to minimize travel time and costs while maintaining user equilibrium. The model also incorporates constraints to ensure fair resource allocation, such as balancing transportation accessibility and cost differences across various social groups. A solution approach is developed to solve the bilevel optimization problem, ensuring efficient exploration of the solution space for land use and transportation resource allocation. This study maximizes social equity by maximizing land use diversity and achieving user equilibrium with optimal transportation resource distribution. The proposed method provides a robust framework for addressing urban planning challenges, contributing to sustainable and equitable urban development.

Keywords: bilevel programming model, genetic algorithms, land use diversity, multimodal transportation optimization, social equity

Procedia PDF Downloads 22
3445 Teaching and Learning with Picturebooks: Developing Multimodal Literacy with a Community of Primary School Teachers in China

Authors: Fuling Deng

Abstract:

Today’s children are frequently exposed to multimodal texts that adopt diverse modes to communicate myriad meanings within different cultural contexts. To respond to the new textual landscape, scholars have considered new literacy theories which propose picturebooks as important educational resources. Picturebooks are multimodal, with their meaning conveyed through the synchronisation of multiple modes, including linguistic, visual, spatial, and gestural modes, which provide access to multimodal literacy. Picturebooks have been popular reading materials in primary educational settings in China. However, often viewed as “easy” texts directed at the youngest readers, picturebooks remain on the margins of Chinese upper primary classrooms, where they are predominantly used for linguistic tasks, with little value placed on their multimodal affordances. Practices with picturebooks in the upper grades in Chinese primary schools also encounter many challenges associated with the curation of texts for use, curriculum design, and assessment. To respond to these issues, a qualitative study was conducted with a community of Chinese primary teachers using multiple methods such as interviews, focus groups, and documents. The findings showed the impact of the teachers’ increased awareness of picturebooks' multimodal affordances on their pedagogical decisions in using picturebooks as educational resources in upper primary classrooms.

Keywords: picturebook education, multimodal literacy, teachers' response to contemporary picturebooks, community of practice

Procedia PDF Downloads 136
3444 Multimodal Content: Fostering Students’ Language and Communication Competences

Authors: Victoria L. Malakhova

Abstract:

The research is devoted to multimodal content and its effectiveness in developing students’ linguistic and intercultural communicative competences as an indefeasible constituent of their future professional activity. Description of multimodal content both as a linguistic and didactic phenomenon makes the study relevant. The objective of the article is the analysis of creolized texts and the effect they have on fostering higher education students’ skills and their productivity. The main methods used are linguistic text analysis, qualitative and quantitative methods, deduction, generalization. The author studies texts with full and partial creolization, their features and role in composing multimodal textual space. The main verbal and non-verbal markers and paralinguistic means that enhance the linguo-pragmatic potential of creolized texts are covered. To reveal the efficiency of multimodal content application in English teaching, the author conducts an experiment among both undergraduate students and teachers. This allows specifying main functions of creolized texts in the process of language learning, detecting ways of enhancing students’ competences, and increasing their motivation. The described stages of using creolized texts can serve as an algorithm for work with multimodal content in teaching English as a foreign language. The findings contribute to improving the efficiency of the academic process.

Keywords: creolized text, English language learning, higher education, language and communication competences, multimodal content

Procedia PDF Downloads 112
3443 A Proposal of Multi-modal Teaching Model for College English

Authors: Huang Yajing

Abstract:

Multimodal discourse refers to the phenomenon of using various senses such as hearing, vision, and touch to communicate through various means and symbolic resources such as language, images, sounds, and movements. With the development of modern technology and multimedia, language and technology have become inseparable, and foreign language teaching is becoming more and more multimodal. Teacher-student communication resorts to multiple senses and uses multiple symbol systems to construct and interpret meaning. The classroom is a semiotic space where multimodal discourses are intertwined. College English multi-modal teaching rationally utilizes traditional teaching methods while mobilizing and coordinating various modern teaching methods to form a joint force that promotes teaching and learning. Multimodal teaching makes full and reasonable use of various meaning resources and can maximize the advantages of multimedia and network environments. Based upon the above theories about multimodal discourse and multimedia technology, the present paper proposes a multi-modal teaching model for college English in China.

Keywords: multimodal discourse, multimedia technology, English education, applied linguistics

Procedia PDF Downloads 68
3442 An Exploration of Promoting EFL Students’ Language Learning Autonomy Using Multimodal Teaching - A Case Study of an Art University in Western China

Authors: Dian Guan

Abstract:

With the wide application of multimedia and the Internet, the development of teaching theories, and the implementation of teaching reforms, many different university English classroom teaching modes have emerged. The university English teaching mode is changing from the traditional teaching mode based on conversation and text to the multimodal English teaching mode containing discussion, pictures, audio, film, etc. Applying university English teaching models is conducive to cultivating lifelong learning skills. In addition, lifelong learning skills can also be called learners' autonomous learning skills. Learners' independent learning ability has a significant impact on English learning. However, many university students, especially art and design students, do not know how to learn independently. When they become university students, their English foundation is relatively weak because they have always memorized the language in a traditional way, which, to a certain extent, neglects the cultivation of English learners' independent ability. As a result, the autonomous learning ability of most university students is not satisfactory. The participants in this study were 60 first-year students and one teacher at a university in western China. Two observations and interviews were conducted inside and outside the classroom to understand the impact of a multimodal teaching model of university English on students' autonomous learning ability. The results were analyzed, and it was found that the multimodal teaching model of university English significantly affected learners' autonomy. Incorporating classroom presentations and poster exhibitions into multimodal teaching can increase learners' interest in learning and enhance their learning ability outside the classroom. However, further exploration is needed to develop multimodal teaching materials and evaluate multimodal teaching outcomes.
Despite the limitations of this study, the study adopts a scientific research method to analyze the impact of the multimodal teaching mode of university English on students' independent learning ability. It puts forward a different outlook for further research on this topic.

Keywords: art university, EFL education, learner autonomy, multimodal pedagogy

Procedia PDF Downloads 101
3441 Multimodal Characterization of Emotion within Multimedia Space

Authors: Dayo Samuel Banjo, Connice Trimmingham, Niloofar Yousefi, Nitin Agarwal

Abstract:

Technological advancement and its omnipresent connection have pushed humans past the boundaries and limitations of a computer screen, physical state, or geographical location. It has provided a depth of avenues that facilitate human-computer interaction that was once inconceivable such as audio and body language detection. Given the complex modularities of emotions, it becomes vital to study human-computer interaction, as it is the commencement of a thorough understanding of the emotional state of users and, in the context of social networks, the producers of multimodal information. This study first acknowledges the accuracy of classification found within multimodal emotion detection systems compared to unimodal solutions. Second, it explores the characterization of multimedia content produced based on their emotions and the coherence of emotion in different modalities by utilizing deep learning models to classify emotion across different modalities.

Keywords: affective computing, deep learning, emotion recognition, multimodal

Procedia PDF Downloads 156
3440 An Improved Many Worlds Quantum Genetic Algorithm

Authors: Li Dan, Zhao Junsuo, Zhang Wenjun

Abstract:

To address the shortcomings of the Quantum Genetic Algorithm, such as easily falling into local optima on multimodal function optimization problems and being vulnerable to premature convergence due to the lack of a close relationship between individuals, the paper presents an Improved Many Worlds Quantum Genetic Algorithm (IMWQGA). The paper uses the concept of Many Worlds; uses the derivative approach of parallel worlds’ parallel evolution; puts forward the idea of updating the population according to the main body; and adopts transition methods such as parallel transition, backtracking, and travel forth. In addition, the paper also proposes the quantum training operator and the combinatorial optimization operator as new operators of the quantum genetic algorithm.

Keywords: quantum genetic algorithm, many worlds, quantum training operator, combinatorial optimization operator

Procedia PDF Downloads 744
3439 Comparison of Parallel CUDA and OpenMP Implementations of Memetic Algorithms for Solving Optimization Problems

Authors: Jason Digalakis, John Cotronis

Abstract:

Memetic algorithms (MAs) are useful for solving optimization problems. It is quite difficult to explore the search space of an optimization problem with large dimensions, and it is a challenge to use all the cores of the system. In this study, a sequential implementation of the memetic algorithm is converted into a concurrent version, which is executed on the cores of both CPU and GPU. To this end, OpenMP and CUDA are used to execute the parallel algorithm concurrently on the CPU and GPU, respectively. The aim of this study is to compare CPU and GPU implementations of the memetic algorithm. For this purpose, fourteen benchmark functions are selected as test problems. The obtained results indicate that our approach leads to speedups of up to five thousand times compared to one CPU thread while maintaining reasonable result quality. This clearly shows that GPUs have the potential to accelerate MAs and allow them to solve much more complex tasks.

Keywords: memetic algorithm, CUDA, GPU-based memetic algorithm, open multi processing, multimodal functions, unimodal functions, non-linear optimization problems

Procedia PDF Downloads 101
3438 Multimodal Sentiment Analysis With Web Based Application

Authors: Shreyansh Singh, Afroz Ahmed

Abstract:

Sentiment analysis aims to automatically reveal the underlying attitude that we hold towards an entity. The aggregation of this sentiment over a population represents opinion polling and has various applications. Current text-based sentiment analysis depends on the development of word embeddings and machine learning models that learn sentiment from large text corpora. Sentiment analysis from text is currently widely used for customer satisfaction assessment and brand perception analysis. With the proliferation of online media, multimodal sentiment analysis is set to bring new opportunities with the arrival of complementary data streams for improving and going beyond text-based sentiment analysis using the new transform methods. Since sentiment can be detected through the affective traces it leaves, such as facial and vocal displays, multimodal sentiment analysis offers promising avenues for analyzing facial and vocal expressions in addition to the transcript or textual content. These approaches use Recurrent Neural Networks (RNNs) with the LSTM modes to increase their performance. In this study, we define sentiment and the problem of multimodal sentiment analysis and review recent developments in multimodal sentiment analysis in different domains, including spoken reviews, images, video blogs, human-machine, and human-human interactions. Challenges and opportunities of this emerging field are also discussed, supporting our thesis that multimodal sentiment analysis holds significant untapped potential.

Keywords: sentiment analysis, RNN, LSTM, word embeddings

Procedia PDF Downloads 119
3437 Enhancing Plant Throughput in Mineral Processing Through Multimodal Artificial Intelligence

Authors: Muhammad Bilal Shaikh

Abstract:

Mineral processing plants play a pivotal role in extracting valuable minerals from raw ores, contributing significantly to various industries. However, the optimization of plant throughput remains a complex challenge, necessitating innovative approaches for increased efficiency and productivity. This research paper investigates the application of Multimodal Artificial Intelligence (MAI) techniques to address this challenge, aiming to improve overall plant throughput in mineral processing operations. The integration of multimodal AI leverages a combination of diverse data sources, including sensor data, images, and textual information, to provide a holistic understanding of the complex processes involved in mineral extraction. The paper explores the synergies between various AI modalities, such as machine learning, computer vision, and natural language processing, to create a comprehensive and adaptive system for optimizing mineral processing plants. The primary focus of the research is on developing advanced predictive models that can accurately forecast various parameters affecting plant throughput. Utilizing historical process data, machine learning algorithms are trained to identify patterns, correlations, and dependencies within the intricate network of mineral processing operations. This enables real-time decision-making and process optimization, ultimately leading to enhanced plant throughput. Incorporating computer vision into the multimodal AI framework allows for the analysis of visual data from sensors and cameras positioned throughout the plant. This visual input aids in monitoring equipment conditions, identifying anomalies, and optimizing the flow of raw materials. The combination of machine learning and computer vision enables the creation of predictive maintenance strategies, reducing downtime and improving the overall reliability of mineral processing plants. 
Furthermore, the integration of natural language processing facilitates the extraction of valuable insights from unstructured textual data, such as maintenance logs, research papers, and operator reports. By understanding and analyzing this textual information, the multimodal AI system can identify trends, potential bottlenecks, and areas for improvement in plant operations. This comprehensive approach enables a more nuanced understanding of the factors influencing throughput and allows for targeted interventions. The research also explores the challenges associated with implementing multimodal AI in mineral processing plants, including data integration, model interpretability, and scalability. Addressing these challenges is crucial for the successful deployment of AI solutions in real-world industrial settings. To validate the effectiveness of the proposed multimodal AI framework, the research conducts case studies in collaboration with mineral processing plants. The results demonstrate tangible improvements in plant throughput, efficiency, and cost-effectiveness. The paper concludes with insights into the broader implications of implementing multimodal AI in mineral processing and its potential to revolutionize the industry by providing a robust, adaptive, and data-driven approach to optimizing plant operations. In summary, this research contributes to the evolving field of mineral processing by showcasing the transformative potential of multimodal artificial intelligence in enhancing plant throughput. The proposed framework offers a holistic solution that integrates machine learning, computer vision, and natural language processing to address the intricacies of mineral extraction processes, paving the way for a more efficient and sustainable future in the mineral processing industry.

Keywords: multimodal AI, computer vision, NLP, mineral processing, mining

Procedia PDF Downloads 68
3436 Integrating Critical Stylistics and Visual Grammar: A Multimodal Stylistic Approach to the Analysis of Non-Literary Texts

Authors: Shatha Khuzaee

Abstract:

The study develops a multimodal stylistic approach to analyse a number of BBC online news articles reporting some key events from the so-called ‘Arab Uprisings’. Critical stylistics (CS) and visual grammar (VG) provide insightful arguments about the ways ideology is projected through different verbal and visual modes, yet they are mode-specific: they examine how each mode projects its meaning separately and do not attempt to clarify what happens intersemiotically when the two modes co-occur. This research therefore proposes a multimodal stylistic approach that addresses the issue of ideology construction when the two modes co-occur. Informed by functional grammar and social semiotics, the analysis attempts to integrate three linguistic models developed in critical stylistics, namely transitivity choices, prioritizing, and hypothesizing, along with their visual equivalents adopted from visual grammar, to investigate the way ideology is constructed in multimodal text when text and image participate and interrelate in the process of meaning making on the textual level of analysis. The analysis provides comprehensive theoretical and analytical elaborations on the different points of integration between CS linguistic models and VG equivalents which operate on the textual level of analysis to better account for ideology construction in news as non-literary multimodal texts. It is argued that the analysis offers a well-thought-out plan that marks the first step towards the integration of the well-established linguistic models of critical stylistics and those of visual analysis to analyse multimodal texts on the textual level. Both approaches are compatible and can produce a multimodal stylistic approach because they intend to analyse text and image depending on whatever textual evidence is available. This helps the analysis maintain the rigor and replicability needed for a stylistic analysis like the one undertaken in this study.

Keywords: multimodality, stylistics, visual grammar, social semiotics, functional grammar

Procedia PDF Downloads 221
3435 Two Weeks of Multi-Modal Inpatient Treatment: Patients Suffering from Chronic Musculoskeletal Pain for over 12 Months

Authors: D. Schafer, H. Booke, R. Nordmeier

Abstract:

Patients suffering from chronic musculoskeletal pain (> 12 months) are a challenging clientele for pain specialists. A multimodal approach, characterized by a two-week inpatient treatment, is often the ultimate therapeutic attempt. The lasting effects of such a multimodal approach were analyzed, especially since two weeks of inpatient therapy, although very intense, often seem too short to make a difference in patients who have suffered from chronic pain for years. The study includes 32 consecutive patients suffering from chronic pain for years who underwent a two-week multimodal inpatient treatment of pain. Twelve months after discharge, each patient was interviewed to objectify any lasting effects. Pain was measured on admission and 12 months after discharge using the numeric rating scale (NRS). For statistics, a paired Student's t-test was used. Significance was defined as p < 0.05. The average intensity of pain on admission was 8.6 on the NRS. Twelve months after discharge, the intensity of pain was still reduced by an average of 48% (average NRS 4.4), p < 0.05. Despite this significant improvement in pain severity, two thirds (66%) of the patients still judged their treatment as not sufficient. In conclusion, inpatient treatment of chronic pain has a long-lasting effect on the intensity of pain in patients suffering from chronic musculoskeletal pain for more than 12 months.

Keywords: chronic pain, inpatient treatment, multimodal pain treatment, musculoskeletal pain

Procedia PDF Downloads 165
3434 Navigating the Case-Based Learning Multimodal Learning Environment: A Qualitative Study Across the First-Year Medical Students

Authors: Bhavani Veasuvalingam

Abstract:

Case-based learning (CBL) is a popular instructional method that aims to bridge theory and clinical practice. This study explores how a CBL mixed-modality curriculum influences students’ learning styles and the strategies that support learning. An explanatory sequential mixed-method design was employed. In the initial phase, the 44-item Felderman’s Index of Learning Style (ILS) questionnaire was administered to year one medical students (n=142), selected by convenience sampling, to describe their preferred learning styles. The qualitative phase used three focus group discussions (FGD) to explore in depth the multimodal learning style exhibited by the students. In the ILS analysis, most students preferred a combination of learning styles, that is, reflective, sensing, visual, and sequential, i.e., the RSVISeq style (24.64%). The frequencies of learning preferences from processing to understanding were well balanced: sequential-global domain (66.2%), sensing-intuitive (59.86%), active-reflective (57%), and visual-verbal (51.41%). The qualitative data yielded three major themes, namely Theme 1: CBL mixed modalities navigate learners’ learning styles; Theme 2: multimodal learners’ active learning strategies support learning; and Theme 3: CBL modalities facilitate the translation of theory into clinical knowledge. Both the quantitative and qualitative phases strongly indicate a multimodal learning style among the year one medical students. Medical students use multimodal learning styles to attain clinical knowledge when learning with CBL mixed modalities. Educators’ awareness of the multimodal learning style is crucial for delivering the CBL mixed modalities effectively, with strategic pedagogical support that helps students engage with and learn from CBL in bridging theoretical knowledge and clinical practice.

Keywords: case-based learning, learning style, medical students, learning

Procedia PDF Downloads 95
3433 Analysing Techniques for Fusing Multimodal Data in Predictive Scenarios Using Convolutional Neural Networks

Authors: Philipp Ruf, Massiwa Chabbi, Christoph Reich, Djaffar Ould-Abdeslam

Abstract:

In recent years, convolutional neural networks (CNN) have demonstrated high performance in image analysis, but oftentimes only structured data is available for a specific problem. By interpreting structured data as images, CNNs can effectively learn and extract valuable insights from tabular data, leading to improved predictive accuracy and uncovering hidden patterns that may not be apparent in traditional structured data analysis. Applying a single neural network to multimodal data, e.g., both structured and unstructured information, can yield significant advantages in terms of time complexity and energy efficiency. Converting structured data into images and merging them with existing visual material offers a promising solution for applying CNNs to multimodal datasets, which often occur in a medical context. By employing suitable preprocessing techniques, structured data is transformed into image representations, where the respective features are expressed as different formations of colors and shapes. In an additional step, these representations are fused with existing images to incorporate both types of information. The resulting image is then analyzed using a CNN.
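The two preprocessing steps described in this abstract (rendering structured features as an image, then fusing that image with an existing one) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' method: the abstract does not specify the encoding or the fusion layout, so the grayscale block encoding, the side-by-side fusion, and the names `features_to_image` and `fuse_side_by_side` are all hypothetical.

```python
def features_to_image(row, mins, maxs, block=4):
    """Render one tabular row as a grayscale image (list of pixel rows).

    Assumption: each feature becomes one block x block square whose
    intensity (0..255) is the min-max scaled feature value.
    """
    pixels = []
    for value, lo, hi in zip(row, mins, maxs):
        span = hi - lo
        scaled = 0.0 if span == 0 else (value - lo) / span
        scaled = min(1.0, max(0.0, scaled))  # clamp out-of-range values
        pixels.append(int(round(scaled * 255)))
    # One pixel row: every feature intensity repeated `block` times.
    line = [p for p in pixels for _ in range(block)]
    # The image is `block` identical copies of that pixel row.
    return [list(line) for _ in range(block)]


def fuse_side_by_side(image, feature_image):
    """Fuse by appending the feature image to the right of an existing image.

    Assumption: both images are grayscale and have the same height.
    """
    if len(image) != len(feature_image):
        raise ValueError("images must have the same height")
    return [left + right for left, right in zip(image, feature_image)]
```

The fused pixel grid would then be passed to a CNN as a single input; the network itself is outside the scope of this sketch.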

Keywords: CNN, image processing, tabular data, mixed dataset, data transformation, multimodal fusion

Procedia PDF Downloads 123
3432 Dual Biometrics Fusion Based Recognition System

Authors: Prakash, Vikash Kumar, Vinay Bansal, L. N. Das

Abstract:

Dual biometrics is a subpart of multimodal biometrics, which refers to the use of a variety of modalities to identify and authenticate persons rather than just one. Combining several modalities limits the risk of errors and leaves attackers only a small chance of collecting the information. Our goal is to collect the precise characteristics of iris and palmprint, produce a fusion of both methodologies, and ensure that authentication is only successful when the biometrics match a particular user. After combining different modalities, we created an effective strategy with a mean DI and EER of 2.41 and 5.21, respectively. A biometric system has been proposed.

Keywords: multimodal, fusion, palmprint, iris, EER, DI

Procedia PDF Downloads 147
3431 A Multimodal Approach to Improve the Performance of Biometric System

Authors: Chander Kant, Arun Kumar

Abstract:

Biometric systems automatically recognize an individual based on his/her physiological and behavioral characteristics. There are also some traits, such as weight, age, and height, that may not provide reliable user recognition because of their common and temporary nature. These traits are called soft biometric traits. Although soft biometric traits lack the permanence needed to uniquely and reliably identify an individual, they provide some beneficial evidence about the user's identity and may improve system performance. In this paper, we propose an approach for integrating soft biometrics with fingerprint and face to improve the performance of a personal authentication system. Our approach uses a combined architecture of three different sensors to elevate system performance, covering soft biometrics, fingerprint, and face traits. We have also demonstrated the efficiency of the proposed system in terms of FAR (False Acceptance Ratio) and total response time, with the help of MUBI (Multimodal Biometrics Integration) software.

Keywords: FAR, minutiae point, multimodal biometrics, primary biometric, soft biometric

Procedia PDF Downloads 346
3430 Curve Fitting by Cubic Bezier Curves Using Migrating Birds Optimization Algorithm

Authors: Mitat Uysal

Abstract:

A new metaheuristic optimization algorithm called Migrating Birds Optimization is used for curve fitting by rational cubic Bezier curves. This requires solving a complicated multivariate optimization problem. In this study, the solution of this optimization problem is achieved by the Migrating Birds Optimization algorithm, a powerful nature-inspired metaheuristic well suited to optimization. The results of this study show that the proposed method performs very well and is able to fit the data points to cubic Bezier curves with a high degree of accuracy.
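The optimization problem mentioned in this abstract can be made concrete with a small sketch of a rational cubic Bezier curve and a fitting error that an optimizer would minimise. The abstract does not give the objective function or the parameterisation, so the sum-of-squared-distances error, the fixed parameter values per data point, and the names `rational_cubic_bezier` and `fitting_error` are assumptions for illustration; the Migrating Birds Optimization search itself is not shown.

```python
def rational_cubic_bezier(t, points, weights):
    """Evaluate a rational cubic Bezier curve at parameter t in [0, 1].

    points  : four (x, y) control points
    weights : four positive weights, one per control point
    """
    s = 1.0 - t
    # Cubic Bernstein basis polynomials.
    basis = [s ** 3, 3 * t * s ** 2, 3 * t ** 2 * s, t ** 3]
    denom = sum(w * b for w, b in zip(weights, basis))
    x = sum(w * b * p[0] for w, b, p in zip(weights, basis, points)) / denom
    y = sum(w * b * p[1] for w, b, p in zip(weights, basis, points)) / denom
    return (x, y)


def fitting_error(data, params, points, weights):
    """Sum of squared distances between data points and the curve.

    Assumption: params[i] is the curve parameter assigned to data[i].
    A metaheuristic would search over `points` and `weights` (and
    possibly `params`) to make this value as small as possible.
    """
    total = 0.0
    for (dx, dy), t in zip(data, params):
        cx, cy = rational_cubic_bezier(t, points, weights)
        total += (cx - dx) ** 2 + (cy - dy) ** 2
    return total
```

With all weights equal to 1 the curve reduces to an ordinary (non-rational) cubic Bezier curve, which gives a simple way to sanity-check the evaluation.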

Keywords: algorithms, Bezier curves, heuristic optimization, migrating birds optimization

Procedia PDF Downloads 337
3429 Filmic and Verbal Metaphors

Authors: Manana Rusieshvili, Rusudan Dolidze

Abstract:

This paper aims at 1) investigating the ways in which a traditional, monomodal written verbal metaphor can be transposed as a monomodal non-verbal (visual) or multimodal (aural-visual) filmic metaphor; 2) exploring similarities and differences in the process of encoding and decoding of monomodal and multimodal metaphors. The empirical data on which the research is based embrace three sources: the novel by Harry Gray ‘The Hoods’, the script of the film ‘Once Upon a Time in America’ (English version by David Mills) and the resultant film by Sergio Leone. In order to achieve the above-mentioned goals, the research focuses on the following issues: 1) identification of verbal and non-verbal monomodal and multimodal metaphors in the above-mentioned sources; 2) investigation of the ways and modes in which the specific written monomodal metaphors appearing in the novel and the script are enacted in the film and become visual, aural or visual-aural filmic metaphors; 3) study of the factors which play an important role in contributing to the encoding and decoding of the filmic metaphor. The collection and analysis of the data were carried out in two stages. In the first stage, the relevant data, i.e. the monomodal metaphors from the novel, the script and the film, were identified and collected. In the second, final stage, the metaphors taken from all three sources were analysed and compared, and two types of phenomena were selected for discussion: (1) the monomodal written metaphors found in the novel and/or in the script which become monomodal visual/aural metaphors in the film; (2) the monomodal written metaphors found in the novel and/or in the script which become multimodal, filmic (visual-aural) metaphors in the film.

Keywords: encoding, decoding, filmic metaphor, multimodality

Procedia PDF Downloads 526
3428 Efficient Layout-Aware Pretraining for Multimodal Form Understanding

Authors: Armineh Nourbakhsh, Sameena Shah, Carolyn Rose

Abstract:

Layout-aware language models have been used to create multimodal representations for documents that are in image form, achieving relatively high accuracy in document understanding tasks. However, the large number of parameters in the resulting models makes building and using them prohibitive without access to high-performing processing units with large memory capacity. We propose an alternative approach that can create efficient representations without the need for a neural visual backbone. This leads to an 80% reduction in the number of parameters compared to the smallest SOTA model, widely expanding applicability. In addition, our layout embeddings are pre-trained on spatial and visual cues alone and only fused with text embeddings in downstream tasks, which can facilitate applicability to low-resource or multilingual domains. Despite using 2.5% of training data, we show competitive performance on two form understanding tasks: semantic labeling and link prediction.

Keywords: layout understanding, form understanding, multimodal document understanding, bias-augmented attention

Procedia PDF Downloads 148
3427 A Multimodal Approach towards Intersemiotic Translations of 'The Great Gatsby'

Authors: Neda Razavi Kaleibar, Bahloul Salmani

Abstract:

The present study dealt with the multimodal analysis of two cinematic adaptations of The Great Gatsby as intersemiotic translation. The assessment in this study went beyond faithfulness based on repetition, addition, deletion, and creation, criteria which exclude other aspects from the analysis. In fact, this research aimed to pinpoint the role of multimodality in examining the intersemiotic translations of the novel into film by analyzing the different modes applied. Through a qualitative type of research, the analysis was conducted based on the Kineikonic mode theory proposed by Burn, which is derived from the concept of multimodality. The results of the study revealed that, due to the applied modes, each adaptation represents a sense and meaning different from the other. From the results and discussion, it was concluded that not only do the modes have an undeniable role in film adaptations, but multimodal analysis including different nonverbal modes can also be a useful and functional choice for analyzing intersemiotic translations.

Keywords: cinematic adaptation, intersemiotic translation, kineikonic mode, multimodality

Procedia PDF Downloads 421