Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 833

Search results for: audio segmentation

443 Multimodal Characterization of Emotion within Multimedia Space

Authors: Dayo Samuel Banjo, Connice Trimmingham, Niloofar Yousefi, Nitin Agarwal

Abstract:

Technological advancement and its omnipresent connection have pushed humans past the boundaries and limitations of a computer screen, physical state, or geographical location. It has provided a depth of avenues that facilitate human-computer interaction that was once inconceivable such as audio and body language detection. Given the complex modularities of emotions, it becomes vital to study human-computer interaction, as it is the commencement of a thorough understanding of the emotional state of users and, in the context of social networks, the producers of multimodal information. This study first acknowledges the accuracy of classification found within multimodal emotion detection systems compared to unimodal solutions. Second, it explores the characterization of multimedia content produced based on their emotions and the coherence of emotion in different modalities by utilizing deep learning models to classify emotion across different modalities.

Keywords: affective computing, deep learning, emotion recognition, multimodal

Procedia PDF Downloads 154

442 Musical Composition by Computer with Inspiration from Files of Different Media Types

Authors: Cassandra Pratt Romero, Andres Gomez de Silva Garza

Abstract:

This paper describes a computational system designed to imitate human inspiration during musical composition. The system is called MIS (Musical Inspiration Simulator). The MIS system is inspired by media to which human beings are exposed daily (visual, textual, or auditory) to create new musical compositions based on the emotions detected in said media. After building the system we carried out a series of evaluations with volunteer users who used MIS to compose music based on images, texts, and audio files. The volunteers were asked to judge the harmoniousness and innovation in the system's compositions. An analysis of the results points to the difficulty of computational analysis of the characteristics of the media to which we are exposed daily, as human emotions have a subjective character. This observation will direct future improvements in the system.

Keywords: human inspiration, musical composition, musical composition by computer, theory of sensation and human perception

Procedia PDF Downloads 179

441 Screen Casting Instead of Illegible Scribbles: Making a Mini Movie for Feedback on Students’ Scholarly Papers

Authors: Kerri Alderson

Abstract:

There is pervasive awareness by post secondary faculty that written feedback on course assignments is inconsistently reviewed by students. In order to support student success and growth, a novel method of providing feedback was sought, and screen casting - short, narrated “movies” of audio visual instructor feedback on students’ scholarly papers - was provided as an alternative to traditional means. An overview of the teaching and learning experience as well as the user-friendly software utilized will be presented. This study covers an overview of this more direct, student-centered medium for providing feedback using technology familiar to post secondary students. Reminiscent of direct personal contact, the personalized video feedback is positively evaluated by students as a formative medium for student growth in scholarly writing.

Keywords: education, pedagogy, screen casting, student feedback, teaching and learning

Procedia PDF Downloads 117

440 Recognition of Cursive Arabic Handwritten Text Using Embedded Training Based on Hidden Markov Models (HMMs)

Authors: Rabi Mouhcine, Amrouch Mustapha, Mahani Zouhir, Mammass Driss

Abstract:

In this paper, we present a system for offline recognition cursive Arabic handwritten text based on Hidden Markov Models (HMMs). The system is analytical without explicit segmentation used embedded training to perform and enhance the character models. Extraction features preceded by baseline estimation are statistical and geometric to integrate both the peculiarities of the text and the pixel distribution characteristics in the word image. These features are modelled using hidden Markov models and trained by embedded training. The experiments on images of the benchmark IFN/ENIT database show that the proposed system improves recognition.

Keywords: recognition, handwriting, Arabic text, HMMs, embedded training

Procedia PDF Downloads 352

439 Applications of Visual Ethnography in Public Anthropology

Authors: Subramaniam Panneerselvam, Gunanithi Perumal, KP Subin

Abstract:

The Visual Ethnography is used to document the culture of a community through a visual means. It could be either photography or audio-visual documentation. The visual ethnographic techniques are widely used in visual anthropology. The visual anthropologists use the camera to capture the cultural image of the studied community. There is a scope for subjectivity while the culture is documented by an external person. But the upcoming of the public anthropology provides an opportunity for the participants to document their own culture. There is a need to equip the participants with the skill of doing visual ethnography. The mobile phone technology provides visual documentation facility to everyone to capture the moments instantly. The visual ethnography facilitates the multiple-interpretation for the audiences. This study explores the effectiveness of visual ethnography among the tribal youth through public anthropology perspective. The case study was conducted to equip the tribal youth of Nilgiris in visual ethnography and the outcome of the experiment shared in this paper.

Keywords: visual ethnography, visual anthropology, public anthropology, multiple-interpretation, case study

Procedia PDF Downloads 181

438 An Adaptive CFAR Algorithm Based on Automatic Censoring in Heterogeneous Environments

Authors: Naime Boudemagh

Abstract:

In this work, we aim to improve the detection performances of radar systems. To this end, we propose and analyze a novel censoring technique of undesirable samples, of priori unknown positions, that may be present in the environment under investigation. Therefore, we consider heterogeneous backgrounds characterized by the presence of some irregularities such that clutter edge transitions and/or interfering targets. The proposed detector, termed automatic censoring constant false alarm (AC-CFAR), operates exclusively in a Gaussian background. It is built to allow the segmentation of the environment to regions and switch automatically to the appropriate detector; namely, the cell averaging CFAR (CA-CFAR), the censored mean level CFAR (CMLD-CFAR) or the order statistic CFAR (OS-CFAR). Monte Carlo simulations show that the AC-CFAR detector performs like the CA-CFAR in a homogeneous background. Moreover, the proposed processor exhibits considerable robustness in a heterogeneous background.

Keywords: CFAR, automatic censoring, heterogeneous environments, radar systems

Procedia PDF Downloads 600

437 Using Electronic Books to Enhance the Museum Visitors' Experience

Authors: Elvin Karaaslan Klose

Abstract:

Museums are important sites of informal, often semi-structured and self-paced learning. Challenged by digital alternatives and increased expectations from their visitors, museums have to adapt to the digital age by enriching their collection and educational content with additional options for interactivity. One such option lies in the concept of the electronic book, which can be used either on dedicated devices or downloaded by visitors before entering the exhibition area. These electronic books serve as an alternative or supplement to the classic audio guide and provide visitors with information about artifacts as well as background stories and factoids about the subjects of the exhibition. Bringing such interactive elements into the museum experience has been shown to increase information retention and enjoyment among young aged visitors and adults. This article aims to bring together both theoretical frameworks and practical examples of how interactive media in the form of electronic books can be used to enhance the experience of the museum visitor.

Keywords: electronic books, interactive media, arts education, museum education

Procedia PDF Downloads 211

436 Temperature Contour Detection of Salt Ice Using Color Thermal Image Segmentation Method

Authors: Azam Fazelpour, Saeed Reza Dehghani, Vlastimil Masek, Yuri S. Muzychka

Abstract:

The study uses a novel image analysis based on thermal imaging to detect temperature contours created on salt ice surface during transient phenomena. Thermal cameras detect objects by using their emissivities and IR radiance. The ice surface temperature is not uniform during transient processes. The temperature starts to increase from the boundary of ice towards the center of that. Thermal cameras are able to report temperature changes on the ice surface at every individual moment. Various contours, which show different temperature areas, appear on the ice surface picture captured by a thermal camera. Identifying the exact boundary of these contours is valuable to facilitate ice surface temperature analysis. Image processing techniques are used to extract each contour area precisely. In this study, several pictures are recorded while the temperature is increasing throughout the ice surface. Some pictures are selected to be processed by a specific time interval. An image segmentation method is applied to images to determine the contour areas. Color thermal images are used to exploit the main information. Red, green and blue elements of color images are investigated to find the best contour boundaries. The algorithms of image enhancement and noise removal are applied to images to obtain a high contrast and clear image. A novel edge detection algorithm based on differences in the color of the pixels is established to determine contour boundaries. In this method, the edges of the contours are obtained according to properties of red, blue and green image elements. The color image elements are assessed considering their information. Useful elements proceed to process and useless elements are removed from the process to reduce the consuming time. Neighbor pixels with close intensities are assigned in one contour and differences in intensities determine boundaries. The results are then verified by conducting experimental tests. An experimental setup is performed using ice samples and a thermal camera. To observe the created ice contour by the thermal camera, the samples, which are initially at -20° C, are contacted with a warmer surface. Pictures are captured for 20 seconds. The method is applied to five images ,which are captured at the time intervals of 5 seconds. The study shows the green image element carries no useful information; therefore, the boundary detection method is applied on red and blue image elements. In this case study, the results indicate that proposed algorithm shows the boundaries more effective than other edges detection methods such as Sobel and Canny. Comparison between the contour detection in this method and temperature analysis, which states real boundaries, shows a good agreement. This color image edge detection method is applicable to other similar cases according to their image properties.

Keywords: color image processing, edge detection, ice contour boundary, salt ice, thermal image

Procedia PDF Downloads 312

435 Tank Barrel Surface Damage Detection Algorithm

Authors: Tomáš Dyk, Stanislav Procházka, Martin Drahanský

Abstract:

The article proposes a new algorithm for detecting damaged areas of the tank barrel based on the image of the inner surface of the tank barrel. Damage position is calculated using image processing techniques such as edge detection, discrete wavelet transformation and image segmentation for accurate contour detection. The algorithm can detect surface damage in smoothbore and even in rifled tank barrels. The algorithm also calculates the volume of the detected damage from the depth map generated, for example, from the distance measurement unit. The proposed method was tested on data obtained by a tank barrel scanning device, which generates both surface image data and depth map. The article also discusses tank barrel scanning devices and how damaged surface impacts material resistance.

Keywords: barrel, barrel diagnostic, image processing, surface damage detection, tank

Procedia PDF Downloads 136

434 Differentiation of Customer Types by Stereotypical Characteristics for Modular and Conventional Construction Methods

Authors: Peter Schnell, Phillip Haag

Abstract:

In the course of the structural transformation of the construction industry, the integration of industrialization and digitization has led to the development of construction methods with an increased degree of prefabrication, such as system or modular construction. Compared to conventional construction, these innovative construction methods are characterized by modified structural and procedural properties and expand the range of construction services. Faced with the supply side, it is possible to identify construction-specific customer types with different characteristics and certain preferences as far as the choice of construction method is concerned. The basis for this finding was qualitative expert interviews. By evaluating the stereotypical customer needs, a corresponding segmentation of the demand side can be made along with the basic orientation and decision behavior. This demarcation supports the target- and needs-oriented customer approach and contributes to cooperative and successful project management.

Keywords: differentiation of customer types, modular construction methods, conventional construction methods, stereotypical customer types

Procedia PDF Downloads 108

433 Photogrammetry and Topographic Information for Urban Growth and Change in Amman

Authors: Mahmoud M. S. Albattah

Abstract:

Urbanization results in the expansion of administrative boundaries, mainly at the periphery, ultimately leading to changes in landcover. Agricultural land, naturally vegetated land, and other land types are converted into residential areas with a high density of constructs, such as transportation systems and housing. In urban regions of rapid growth and change, urban planners need regular information on up to date ground change. Amman (the capital of Jordan) is growing at unprecedented rates, creating extensive urban landscapes. Planners interact with these changes without having a global view of their impact. The use of aerial photographs and satellite images data combined with topographic information and field survey could provide effective information to develop urban change and growth inventory which could be explored towards producing a very important signature for the built-up area changes.

Keywords: highway design, satellite technologies, remote sensing, GIS, image segmentation, classification

Procedia PDF Downloads 442

432 Passive Attenuation with Multiple Resonator Rings for Musical Instruments Equalization

Authors: Lorenzo Bonoldi, Gianluca Memoli, Abdelhalim Azbaid El Ouahabi

Abstract:

In this paper, a series of ring-shaped attenuators utilizing Helmholtz and quarter wavelength resonators in variable, fixed, and combined configurations have been manufactured using a 3D printer. We illustrate possible uses by incorporating such devices into musical instruments (e.g. in acoustic guitar sound holes) and audio speakers with a view to controlling such devices tonal emissions without electronic equalization systems. Numerical investigations into the transmission loss values of these ring-shaped attenuators using finite element method simulations (COMSOL Multiphysics) have been presented in the frequency range of 100– 1000 Hz. We compare such results for each attenuator model with experimental measurements using different driving sources such as white noise, a maximum-length sequence (MLS), square and sine sweep pulses, and point scans in the frequency domain. Finally, we present a preliminary discussion on the comparison of numerical and experimental results.

Keywords: equaliser, metamaterials, musical, instruments

Procedia PDF Downloads 172

431 Automatic Number Plate Recognition System Based on Deep Learning

Authors: T. Damak, O. Kriaa, A. Baccar, M. A. Ben Ayed, N. Masmoudi

Abstract:

In the last few years, Automatic Number Plate Recognition (ANPR) systems have become widely used in the safety, the security, and the commercial aspects. Forethought, several methods and techniques are computing to achieve the better levels in terms of accuracy and real time execution. This paper proposed a computer vision algorithm of Number Plate Localization (NPL) and Characters Segmentation (CS). In addition, it proposed an improved method in Optical Character Recognition (OCR) based on Deep Learning (DL) techniques. In order to identify the number of detected plate after NPL and CS steps, the Convolutional Neural Network (CNN) algorithm is proposed. A DL model is developed using four convolution layers, two layers of Maxpooling, and six layers of fully connected. The model was trained by number image database on the Jetson TX2 NVIDIA target. The accuracy result has achieved 95.84%.

Keywords: ANPR, CS, CNN, deep learning, NPL

Procedia PDF Downloads 304

430 Cells Detection and Recognition in Bone Marrow Examination with Deep Learning Method

Authors: Shiyin He, Zheng Huang

Abstract:

In this paper, deep learning methods are applied in bio-medical field to detect and count different types of cells in an automatic way instead of manual work in medical practice, specifically in bone marrow examination. The process is mainly composed of two steps, detection and recognition. Mask-Region-Convolutional Neural Networks (Mask-RCNN) was used for detection and image segmentation to extract cells and then Convolutional Neural Networks (CNN), as well as Deep Residual Network (ResNet) was used to classify. Result of cell detection network shows high efficiency to meet application requirements. For the cell recognition network, two networks are compared and the final system is fully applicable.

Keywords: cell detection, cell recognition, deep learning, Mask-RCNN, ResNet

Procedia PDF Downloads 186

429 Bit Error Rate (BER) Performance of Coherent Homodyne BPSK-OCDMA Network for Multimedia Applications

Authors: Morsy Ahmed Morsy Ismail

Abstract:

In this paper, the structure of a coherent homodyne receiver for the Binary Phase Shift Keying (BPSK) Optical Code Division Multiple Access (OCDMA) network is introduced based on the Multi-Length Weighted Modified Prime Code (ML-WMPC) for multimedia applications. The Bit Error Rate (BER) of this homodyne detection is evaluated as a function of the number of active users and the signal to noise ratio for different code lengths according to the multimedia application such as audio, voice, and video. Besides, the Mach-Zehnder interferometer is used as an external phase modulator in homodyne detection. Furthermore, the Multiple Access Interference (MAI) and the receiver noise in a shot-noise limited regime are taken into consideration in the BER calculations.

Keywords: OCDMA networks, bit error rate, multiple access interference, binary phase-shift keying, multimedia

Procedia PDF Downloads 174

428 Comparison of the Effect of Heart Rate Variability Biofeedback and Slow Breathing Training on Promoting Autonomic Nervous Function Related Performance

Authors: Yi Jen Wang, Yu Ju Chen

Abstract:

Background: Heart rate variability (HRV) biofeedback can promote autonomic nervous function, sleep quality and reduce psychological stress. In HRV biofeedback training, it is hoped that through the guidance of machine video or audio, the patient can breathe slowly according to his own heart rate changes so that the heart and lungs can achieve resonance, thereby promoting the related effects of autonomic nerve function; while, it is also pointed out that if slow breathing of 6 times per minute can also guide the case to achieve the effect of cardiopulmonary resonance. However, there is no relevant research to explore the comparison of the effectiveness of cardiopulmonary resonance by using video or audio HRV biofeedback training and metronome-guided slow breathing. Purpose: To compare the promotion of autonomic nervous function performance between using HRV biofeedback and slow breathing guided by a metronome. Method: This research is a kind of experimental design with convenient sampling; the cases are randomly divided into the heart rate variability biofeedback training group and the slow breathing training group. The HRV biofeedback training group will conduct HRV biofeedback training in a four-week laboratory and use the home training device for autonomous training; while the slow breathing training group will conduct slow breathing training in the four-week laboratory using the mobile phone APP breathing metronome to guide the slow breathing training, and use the mobile phone APP for autonomous training at home. After two groups were enrolled and four weeks after the intervention, the autonomic nervous function-related performance was repeatedly measured. Using the chi-square test, student’s t-test and other statistical methods to analyze the results, and use p <0.05 as the basis for statistical significance. Results: A total of 27 subjects were included in the analysis. After four weeks of training, the HRV biofeedback training group showed significant improvement in the HRV indexes (SDNN, RMSSD, HF, TP) and sleep quality. Although the stress index also decreased, it did not reach statistical significance; the slow breathing training group was not statistically significant after four weeks of training, only sleep quality improved significantly, while the HRV indexes (SDNN, RMSSD, TP) all increased. Although HF and stress indexes decreased, they were not statistically significant. Comparing the difference between the two groups after training, it was found that the HF index improved significantly and reached statistical significance in the HRV biofeedback training group. Although the sleep quality of the two groups improved, it did not reach that level in a statistically significant difference. Conclusion: HRV biofeedback training is more effective in promoting autonomic nervous function than slow breathing training, but the effects of reducing stress and promoting sleep quality need to be explored after increasing the number of samples. The results of this study can provide a reference for clinical or community health promotion. In the future, it can also be further designed to integrate heart rate variability biological feedback training into the development of AI artificial intelligence wearable devices, which can make it more convenient for people to train independently and get effective feedback in time.

Keywords: autonomic nervous function, HRV biofeedback, heart rate variability, slow breathing

Procedia PDF Downloads 174

427 Health Literacy in Jordan: Obstacles for Doctors and Quality Patient Care

Authors: Etaf Alkhlaifat

Abstract:

This study drew conceptually on Communication Accommodation Theory to describe and analyze conversations between doctors and patients to examine the extent to which patients’ level of literacy represents one of the linguistic obstacles that may adversely influence the quality of healthcare services in Jordan. A thematic qualitative approach was employed to interpret the phenomena under study, which required direct observation and interviews with doctors (n=6) and patients (n=15) in natural Jordanian medical settings. This generated a comprehensive corpus of audio and videotaped data, which revealed that most doctors expressed dissatisfaction with patients’ ability to express themselves and comprehend them as a result of a lack of medical awareness and limited health education. The significance of this study rests on its detailed investigation of the impact of health literacy on patients’ health outcomes and while providing unique insights into how low health literacy could contribute to misunderstanding and potential ill-health.

Keywords: doctor-patient communication, health literacy, medical knowledge, communication accommodation theory, qualitative research

Procedia PDF Downloads 3

426 Klippel Feil Syndrome: A Case Report and Review of Literature

Authors: Rim Frikha, Nouha Bouayed Abdelmoula, Afifa Sellami, Salima Daoud, Tarek Rebai

Abstract:

Klippel-Feil Syndrome (KFS) is characterized by congenital vertebral fusion of the cervical spine resulting from faulty segmentation along the embryo's developing axis. A wide spectrum of associated anomalies may be present. This heterogeneity has complicated elucidation of the genetic etiology and management of the syndrome. We report a case of an isolated Klippel-Feil Syndrome with C5-C6 fusion on the cervical spine. It‘s the rarest form of congenital fused cervical vertebrae which is predisposed to the risk of spinal cord injury and neurologic problems. The aim of this paper was to review clinical heterogeneity; radiographic abnormalities and genetic etiology in Klippel-Feil Syndrome. We insist in comprehensive evaluation and delineation of diagnostic and prognostic classes.

Keywords: Klippel–Feil anomaly, genetic, clinical heterogeneity, radiographic abnormalities

Procedia PDF Downloads 482

425 Hear Me: The Learning Experience on “Zoom” of Students With Deafness or Hard of Hearing Impairments

Authors: H. Weigelt-Marom

Abstract:

Over the years and up to the arousal of the COVID-19 pandemic, deaf or hard of hearing students studying in higher education institutions, participated lectures on campus using hearing aids and strategies adapted for frontal learning in a classroom. Usually, these aids were well known to them from their earlier study experience in school. However, the transition to online lessons, due to the latest pandemic, led deaf or hard of hearing students to study outside of their physical, well known learning environment. The change of learning environment and structure rose new challenges for these students. The present study examined the learning experience, limitations, challenges and benefits regarding learning online with lecture and classmates via the “Zoom” video conference program, among deaf or hard of hearing students in academia setting. In addition, emotional and social aspects related to learning in general versus the “Zoom” were examined. The study included 18 students diagnosed as deaf or hard of hearing, studying in various higher education institutions in Israel. All students had experienced lessons on the “Zoom”. Following allocation of the group study by the deaf and hard of hearing non-profit organization “Ma’agalei Shema”, and receiving the participants inform of consent, students were requested to answer a google form questioner and participate in an interview. The questioner included background information (e.g., age, year of studying, faculty etc.), level of computer literacy, and level of hearing and forms of communication (e.g., lip reading, sign language etc.). The interviews included a one on one, semi-structured, in-depth interview, conducted by the main researcher of the study (interview duration: up to 60 minutes). The interviews were held on “ZOOM” using specific adaptations for each interviewee: clear face screen of the interviewer for lip and face reading, and/ or professional sign language or live text transcript of the conversation. Additionally, interviewees used their audio devices if needed. Questions regarded: learning experience, difficulties and advantages studying using “Zoom”, learning in a classroom versus on “Zoom”, and questions concerning emotional and social aspects related to learning. Thematic analysis of the interviews revealed severe difficulties regarding the ability of deaf or hard of hearing students to comprehend during ”Zoom“ lessons without adoptive aids. For example, interviewees indicated difficulties understanding “Zoom” lessons due to their inability to use hearing devices commonly used by them in the classroom (e.g., FM systems). 80% indicated that they could not comprehend “Zoom” lessons since they could not see the lectures face, either because lectures did not agree to open their cameras or, either because they did not keep a straight forward clear face appearance while teaching. However, not all descriptions regarded learning via the “zoom” were negative. For example, 20% reported the recording of “Zoom” lessons as a main advantage. Enabling then to repeatedly watch the lessons at their own pace, mostly assisted by friends and family to translate the audio output into an accessible input. These finding and others regarding the learning experience of the group study on the “Zoom”, as well as their recommendation to enable deaf or hard of hearing students to study inclusively online, will be presented at the conference.

Keywords: deaf or hard of hearing, learning experience, Zoom, qualitative research

Procedia PDF Downloads 115

424 A Hybrid Watermarking Model Based on Frequency of Occurrence

Authors: Hamza A. A. Al-Sewadi, Adnan H. M. Al-Helali, Samaa A. K. Khamis

Abstract:

Ownership proofs of multimedia such as text, image, audio or video files can be achieved by the burial of watermark is them. It is achieved by introducing modifications into these files that are imperceptible to the human senses but easily recoverable by a computer program. These modifications would be in the time domain or frequency domain or both. This paper presents a procedure for watermarking by mixing amplitude modulation with frequency transformation histogram; namely a specific value is used to modulate the intensity component Y of the YIQ components of the carrier image. This scheme is referred to as histogram embedding technique (HET). Results comparison with those of other techniques such as discrete wavelet transform (DWT), discrete cosine transform (DCT) and singular value decomposition (SVD) have shown an enhance efficiency in terms of ease and performance. It has manifested a good degree of robustness against various environment effects such as resizing, rotation and different kinds of noise. This method would prove very useful technique for copyright protection and ownership judgment.

Keywords: authentication, copyright protection, information hiding, ownership, watermarking

Procedia PDF Downloads 564

423 The Museum of Museums: A Mobile Augmented Reality Application

Authors: Qian Jin

Abstract:

Museums have been using interactive technology to spark visitor interest and improve understanding. These technologies can play a crucial role in helping visitors understand more about an exhibition site by using multimedia to provide information. Google Arts and Culture and Smartify are two very successful digital heritage products. They used mobile augmented reality to visualise the museum's 3D models and heritage images but did not include 3D models of the collection and audio information. In this research, service-oriented mobile augmented reality application was developed for users to access collections from multiple museums(including V and A, the British Museum, and British Library). The third-party API (Application Programming Interface) is requested to collect metadata (including images, 3D models, videos, and text) of three museums' collections. The acquired content is then visualized in AR environments. This product will help users who cannot visit the museum offline due to various reasons (inconvenience of transportation, physical disability, time schedule).

Keywords: digital heritage, argument reality, museum, flutter, ARcore

Procedia PDF Downloads 77

422 Frequency of Occurrence Hybrid Watermarking Scheme

Authors: Hamza A. Ali, Adnan H. M. Al-Helali

Abstract:

Generally, a watermark is information that identifies the ownership of multimedia (text, image, audio or video files). It is achieved by introducing modifications into these files that are imperceptible to the human senses but easily recoverable by a computer program. These modifications are done according to a secret key in a descriptive model that would be either in the time domain or frequency domain or both. This paper presents a procedure for watermarking by mixing amplitude modulation with frequency transformation histogram; namely a specific value is used to modulate the intensity component Y of the YIQ components of the carrier image. This scheme is referred to as histogram embedding technique (HET). Results comparison with those of other techniques such as discrete wavelet transform (DWT), discrete cosine transform (DCT) and singular value decomposition (SVD) have shown an enhance efficiency in terms of ease and performance. It has manifested a good degree of robustness against various environment effects such as resizing, rotation and different kinds of noise. This method would prove very useful technique for copyright protection and ownership judgment.

Keywords: watermarking, ownership, copyright protection, steganography, information hiding, authentication

Procedia PDF Downloads 367

421 Voice Signal Processing and Coding in MATLAB Generating a Plasma Signal in a Tesla Coil for a Security System

Authors: Juan Jimenez, Erika Yambay, Dayana Pilco, Brayan Parra

Abstract:

This paper presents an investigation of voice signal processing and coding using MATLAB, with the objective of generating a plasma signal on a Tesla coil within a security system. The approach focuses on using advanced voice signal processing techniques to encode and modulate the audio signal, which is then amplified and applied to a Tesla coil. The result is the creation of a striking visual effect of voice-controlled plasma with specific applications in security systems. The article explores the technical aspects of voice signal processing, the generation of the plasma signal, and its relationship to security. The implications and creative potential of this technology are discussed, highlighting its relevance at the forefront of research in signal processing and visual effect generation in the field of security systems.

Keywords: voice signal processing, voice signal coding, MATLAB, plasma signal, Tesla coil, security system, visual effects, audiovisual interaction

Procedia PDF Downloads 90

420 Multimodal Database of Emotional Speech, Video and Gestures

Authors: Tomasz Sapiński, Dorota Kamińska, Adam Pelikant, Egils Avots, Cagri Ozcinar, Gholamreza Anbarjafari

Abstract:

People express emotions through different modalities. Integration of verbal and non-verbal communication channels creates a system in which the message is easier to understand. Expanding the focus to several expression forms can facilitate research on emotion recognition as well as human-machine interaction. In this article, the authors present a Polish emotional database composed of three modalities: facial expressions, body movement and gestures, and speech. The corpora contains recordings registered in studio conditions, acted out by 16 professional actors (8 male and 8 female). The data is labeled with six basic emotions categories, according to Ekman’s emotion categories. To check the quality of performance, all recordings are evaluated by experts and volunteers. The database is available to academic community and might be useful in the study on audio-visual emotion recognition.

Keywords: body movement, emotion recognition, emotional corpus, facial expressions, gestures, multimodal database, speech

Procedia PDF Downloads 348

419 Facial Emotion Recognition with Convolutional Neural Network Based Architecture

Authors: Koray U. Erbas

Abstract:

Neural networks are appealing for many applications since they are able to learn complex non-linear relationships between input and output data. As the number of neurons and layers in a neural network increase, it is possible to represent more complex relationships with automatically extracted features. Nowadays Deep Neural Networks (DNNs) are widely used in Computer Vision problems such as; classification, object detection, segmentation image editing etc. In this work, Facial Emotion Recognition task is performed by proposed Convolutional Neural Network (CNN)-based DNN architecture using FER2013 Dataset. Moreover, the effects of different hyperparameters (activation function, kernel size, initializer, batch size and network size) are investigated and ablation study results for Pooling Layer, Dropout and Batch Normalization are presented.

Keywords: convolutional neural network, deep learning, deep learning based FER, facial emotion recognition

Procedia PDF Downloads 272

418 “Moves” for Guiding Presentations in French

Authors: Nuchanat Handumrongkul, Suwaree Yordchim, Anantachai Aeka

Abstract:

Despite four years of study in the tourism industry, the Bachelor’s graduates cannot perform their jobs as experienced tour guides. This research aimed to develop French teaching and studying for Tourism with two main purposes: to analyze ‘Moves’ used in oral presentations at tourist attractions; and to study content in guiding presentations or 'Guide Speak'. The study employed audio recording of these presentations as an interview method in authentic situations, having four tour guides as respondents and information providers. The data was analyzed via moves and content analysis. The results found that there were eight moves used; namely: welcoming, introducing oneself, drawing someone’s attention, giving information, explaining, highlighting, persuading, and saying goodbye. In terms of content, the information being presented covered the outstanding characteristics of the places and well-integrated with other related content. The findings were used as guidelines for curriculum development; in particular, the core content and the presentation forming the basis for students to meet the standard requirements of the labor-market and professional schemes.

Keywords: moves, guiding presentation, french, tourism

Procedia PDF Downloads 231

417 Adaptive Few-Shot Deep Metric Learning

Authors: Wentian Shi, Daming Shi, Maysam Orouskhani, Feng Tian

Abstract:

Whereas currently the most prevalent deep learning methods require a large amount of data for training, few-shot learning tries to learn a model from limited data without extensive retraining. In this paper, we present a loss function based on triplet loss for solving few-shot problem using metric based learning. Instead of setting the margin distance in triplet loss as a constant number empirically, we propose an adaptive margin distance strategy to obtain the appropriate margin distance automatically. We implement the strategy in the deep siamese network for deep metric embedding, by utilizing an optimization approach by penalizing the worst case and rewarding the best. Our experiments on image recognition and co-segmentation model demonstrate that using our proposed triplet loss with adaptive margin distance can significantly improve the performance.

Keywords: few-shot learning, triplet network, adaptive margin, deep learning

Procedia PDF Downloads 167

416 Identification of Clinical Characteristics from Persistent Homology Applied to Tumor Imaging

Authors: Eashwar V. Somasundaram, Raoul R. Wadhwa, Jacob G. Scott

Abstract:

The use of radiomics in measuring geometric properties of tumor images such as size, surface area, and volume has been invaluable in assessing cancer diagnosis, treatment, and prognosis. In addition to analyzing geometric properties, radiomics would benefit from measuring topological properties using persistent homology. Intuitively, features uncovered by persistent homology may correlate to tumor structural features. One example is necrotic cavities (corresponding to 2D topological features), which are markers of very aggressive tumors. We develop a data pipeline in R that clusters tumors images based on persistent homology is used to identify meaningful clinical distinctions between tumors and possibly new relationships not captured by established clinical categorizations. A preliminary analysis was performed on 16 Magnetic Resonance Imaging (MRI) breast tissue segments downloaded from the 'Investigation of Serial Studies to Predict Your Therapeutic Response with Imaging and Molecular Analysis' (I-SPY TRIAL or ISPY1) collection in The Cancer Imaging Archive. Each segment represents a patient’s breast tumor prior to treatment. The ISPY1 dataset also provided the estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2) status data. A persistent homology matrix up to 2-dimensional features was calculated for each of the MRI segmentation. Wasserstein distances were then calculated between all pairwise tumor image persistent homology matrices to create a distance matrix for each feature dimension. Since Wasserstein distances were calculated for 0, 1, and 2-dimensional features, three hierarchal clusters were constructed. The adjusted Rand Index was used to see how well the clusters corresponded to the ER/PR/HER2 status of the tumors. Triple-negative cancers (negative status for all three receptors) significantly clustered together in the 2-dimensional features dendrogram (Adjusted Rand Index of .35, p = .031). It is known that having a triple-negative breast tumor is associated with aggressive tumor growth and poor prognosis when compared to non-triple negative breast tumors. The aggressive tumor growth associated with triple-negative tumors may have a unique structure in an MRI segmentation, which persistent homology is able to identify. This preliminary analysis shows promising results in the use of persistent homology on tumor imaging to assess the severity of breast tumors. The next step is to apply this pipeline to other tumor segment images from The Cancer Imaging Archive at different sites such as the lung, kidney, and brain. In addition, whether other clinical parameters, such as overall survival, tumor stage, and tumor genotype data are captured well in persistent homology clusters will be assessed. If analyzing tumor MRI segments using persistent homology consistently identifies clinical relationships, this could enable clinicians to use persistent homology data as a noninvasive way to inform clinical decision making in oncology.

Keywords: cancer biology, oncology, persistent homology, radiomics, topological data analysis, tumor imaging

Procedia PDF Downloads 134

415 Intelligent Grading System of Apple Using Neural Network Arbitration

Authors: Ebenezer Obaloluwa Olaniyi

Abstract:

In this paper, an intelligent system has been designed to grade apple based on either its defective or healthy for production in food processing. This paper is segmented into two different phase. In the first phase, the image processing techniques were employed to extract the necessary features required in the apple. These techniques include grayscale conversion, segmentation where a threshold value is chosen to separate the foreground of the images from the background. Then edge detection was also employed to bring out the features in the images. These extracted features were then fed into the neural network in the second phase of the paper. The second phase is a classification phase where neural network employed to classify the defective apple from the healthy apple. In this phase, the network was trained with back propagation and tested with feed forward network. The recognition rate obtained from our system shows that our system is more accurate and faster as compared with previous work.

Keywords: image processing, neural network, apple, intelligent system

Procedia PDF Downloads 396

414 Text-to-Speech in Azerbaijani Language via Transfer Learning in a Low Resource Environment

Authors: Dzhavidan Zeinalov, Bugra Sen, Firangiz Aslanova

Abstract:

Most text-to-speech models cannot operate well in low-resource languages and require a great amount of high-quality training data to be considered good enough. Yet, with the improvements made in ASR systems, it is now much easier than ever to collect data for the design of custom text-to-speech models. In this work, our work on using the ASR model to collect data to build a viable text-to-speech system for one of the leading financial institutions of Azerbaijan will be outlined. NVIDIA’s implementation of the Tacotron 2 model was utilized along with the HiFiGAN vocoder. As for the training, the model was first trained with high-quality audio data collected from the Internet, then fine-tuned on the bank’s single speaker call center data. The results were then evaluated by 50 different listeners and got a mean opinion score of 4.17, displaying that our method is indeed viable. With this, we have successfully designed the first text-to-speech model in Azerbaijani and publicly shared 12 hours of audiobook data for everyone to use.

Keywords: Azerbaijani language, HiFiGAN, Tacotron 2, text-to-speech, transfer learning, whisper

Procedia PDF Downloads 43