Search results for: audio segmentation
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 796

Search results for: audio segmentation

766 Audio-Visual Aids and the Secondary School Teaching

Authors: Shrikrishna Mishra, Badri Yadav

Abstract:

In this complex society of today where experiences are innumerable and varied, it is not at all possible to present every situation in its original colors hence the opportunities for learning by actual experiences always are not at all possible. It is only through the use of proper audio visual aids that the life situation can be trough in the class room by an enlightened teacher in their simplest form and representing the original to the highest point of similarity which is totally absent in the verbal or lecture method. In the presence of audio aids, the attention is attracted interest roused and suitable atmosphere for proper understanding is automatically created, but in the existing traditional method greater efforts are to be made in order to achieve the aforesaid essential requisite. Inspire of the best and sincere efforts on the side of the teacher the net effect as regards understanding or learning in general is quite negligible.

Keywords: Audio-Visual Aids, the secondary school teaching, complex society, audio

Procedia PDF Downloads 457
765 Effective Parameter Selection for Audio-Based Music Mood Classification for Christian Kokborok Song: A Regression-Based Approach

Authors: Sanchali Das, Swapan Debbarma

Abstract:

Music mood classification is developing in both the areas of music information retrieval (MIR) and natural language processing (NLP). Some of the Indian languages like Hindi English etc. have considerable exposure in MIR. But research in mood classification in regional language is very less. In this paper, powerful audio based feature for Kokborok Christian song is identified and mood classification task has been performed. Kokborok is an Indo-Burman language especially spoken in the northeastern part of India and also some other countries like Bangladesh, Myanmar etc. For performing audio-based classification task, useful audio features are taken out by jMIR software. There are some standard audio parameters are there for the audio-based task but as known to all that every language has its unique characteristics. So here, the most significant features which are the best fit for the database of Kokborok song is analysed. The regression-based model is used to find out the independent parameters that act as a predictor and predicts the dependencies of parameters and shows how it will impact on overall classification result. For classification WEKA 3.5 is used, and selected parameters create a classification model. And another model is developed by using all the standard audio features that are used by most of the researcher. In this experiment, the essential parameters that are responsible for effective audio based mood classification and parameters that do not significantly change for each of the Christian Kokborok songs are analysed, and a comparison is also shown between the two above model.

Keywords: Christian Kokborok song, mood classification, music information retrieval, regression

Procedia PDF Downloads 191
764 Computer-Aided Detection of Liver and Spleen from CT Scans using Watershed Algorithm

Authors: Belgherbi Aicha, Bessaid Abdelhafid

Abstract:

In the recent years a great deal of research work has been devoted to the development of semi-automatic and automatic techniques for the analysis of abdominal CT images. The first and fundamental step in all these studies is the semi-automatic liver and spleen segmentation that is still an open problem. In this paper, a semi-automatic liver and spleen segmentation method by the mathematical morphology based on watershed algorithm has been proposed. Our algorithm is currency in two parts. In the first, we seek to determine the region of interest by applying the morphological to extract the liver and spleen. The second step consists to improve the quality of the image gradient. In this step, we propose a method for improving the image gradient to reduce the over-segmentation problem by applying the spatial filters followed by the morphological filters. Thereafter we proceed to the segmentation of the liver, spleen. The aim of this work is to develop a method for semi-automatic segmentation liver and spleen based on watershed algorithm, improve the accuracy and the robustness of the liver and spleen segmentation and evaluate a new semi-automatic approach with the manual for liver segmentation. To validate the segmentation technique proposed, we have tested it on several images. Our segmentation approach is evaluated by comparing our results with the manual segmentation performed by an expert. The experimental results are described in the last part of this work. The system has been evaluated by computing the sensitivity and specificity between the semi-automatically segmented (liver and spleen) contour and the manually contour traced by radiological experts. Liver segmentation has achieved the sensitivity and specificity; sens Liver=96% and specif Liver=99% respectively. Spleen segmentation achieves similar, promising results sens Spleen=95% and specif Spleen=99%.

Keywords: CT images, liver and spleen segmentation, anisotropic diffusion filter, morphological filters, watershed algorithm

Procedia PDF Downloads 285
763 A Neural Approach for Color-Textured Images Segmentation

Authors: Khalid Salhi, El Miloud Jaara, Mohammed Talibi Alaoui

Abstract:

In this paper, we present a neural approach for unsupervised natural color-texture image segmentation, which is based on both Kohonen maps and mathematical morphology, using a combination of the texture and the image color information of the image, namely, the fractal features based on fractal dimension are selected to present the information texture, and the color features presented in RGB color space. These features are then used to train the network Kohonen, which will be represented by the underlying probability density function, the segmentation of this map is made by morphological watershed transformation. The performance of our color-texture segmentation approach is compared first, to color-based methods or texture-based methods only, and then to k-means method.

Keywords: segmentation, color-texture, neural networks, fractal, watershed

Procedia PDF Downloads 312
762 Defect Detection for Nanofibrous Images with Deep Learning-Based Approaches

Authors: Gaokai Liu

Abstract:

Automatic defect detection for nanomaterial images is widely required in industrial scenarios. Deep learning approaches are considered as the most effective solutions for the great majority of image-based tasks. In this paper, an edge guidance network for defect segmentation is proposed. First, the encoder path with multiple convolution and downsampling operations is applied to the acquisition of shared features. Then two decoder paths both are connected to the last convolution layer of the encoder and supervised by the edge and segmentation labels, respectively, to guide the whole training process. Meanwhile, the edge and encoder outputs from the same stage are concatenated to the segmentation corresponding part to further tune the segmentation result. Finally, the effectiveness of the proposed method is verified via the experiments on open nanofibrous datasets.

Keywords: deep learning, defect detection, image segmentation, nanomaterials

Procedia PDF Downloads 115
761 Design of a Graphical User Interface for Data Preprocessing and Image Segmentation Process in 2D MRI Images

Authors: Enver Kucukkulahli, Pakize Erdogmus, Kemal Polat

Abstract:

The 2D image segmentation is a significant process in finding a suitable region in medical images such as MRI, PET, CT etc. In this study, we have focused on 2D MRI images for image segmentation process. We have designed a GUI (graphical user interface) written in MATLABTM for 2D MRI images. In this program, there are two different interfaces including data pre-processing and image clustering or segmentation. In the data pre-processing section, there are median filter, average filter, unsharp mask filter, Wiener filter, and custom filter (a filter that is designed by user in MATLAB). As for the image clustering, there are seven different image segmentations for 2D MR images. These image segmentation algorithms are as follows: PSO (particle swarm optimization), GA (genetic algorithm), Lloyds algorithm, k-means, the combination of Lloyds and k-means, mean shift clustering, and finally BBO (Biogeography Based Optimization). To find the suitable cluster number in 2D MRI, we have designed the histogram based cluster estimation method and then applied to these numbers to image segmentation algorithms to cluster an image automatically. Also, we have selected the best hybrid method for each 2D MR images thanks to this GUI software.

Keywords: image segmentation, clustering, GUI, 2D MRI

Procedia PDF Downloads 348
760 Meta Mask Correction for Nuclei Segmentation in Histopathological Image

Authors: Jiangbo Shi, Zeyu Gao, Chen Li

Abstract:

Nuclei segmentation is a fundamental task in digital pathology analysis and can be automated by deep learning-based methods. However, the development of such an automated method requires a large amount of data with precisely annotated masks which is hard to obtain. Training with weakly labeled data is a popular solution for reducing the workload of annotation. In this paper, we propose a novel meta-learning-based nuclei segmentation method which follows the label correction paradigm to leverage data with noisy masks. Specifically, we design a fully conventional meta-model that can correct noisy masks by using a small amount of clean meta-data. Then the corrected masks are used to supervise the training of the segmentation model. Meanwhile, a bi-level optimization method is adopted to alternately update the parameters of the main segmentation model and the meta-model. Extensive experimental results on two nuclear segmentation datasets show that our method achieves the state-of-the-art result. In particular, in some noise scenarios, it even exceeds the performance of training on supervised data.

Keywords: deep learning, histopathological image, meta-learning, nuclei segmentation, weak annotations

Procedia PDF Downloads 114
759 Segmentation of the Liver and Spleen From Abdominal CT Images Using Watershed Approach

Authors: Belgherbi Aicha, Hadjidj Ismahen, Bessaid Abdelhafid

Abstract:

The phase of segmentation is an important step in the processing and interpretation of medical images. In this paper, we focus on the segmentation of liver and spleen from the abdomen computed tomography (CT) images. The importance of our study comes from the fact that the segmentation of ROI from CT images is usually a difficult task. This difficulty is the gray’s level of which is similar to the other organ also the ROI are connected to the ribs, heart, kidneys, etc. Our proposed method is based on the anatomical information and mathematical morphology tools used in the image processing field. At first, we try to remove the surrounding and connected organs and tissues by applying morphological filters. This first step makes the extraction of interest regions easier. The second step consists of improving the quality of the image gradient. In this step, we propose a method for improving the image gradient to reduce these deficiencies by applying the spatial filters followed by the morphological filters. Thereafter we proceed to the segmentation of the liver, spleen. To validate the segmentation technique proposed, we have tested it on several images. Our segmentation approach is evaluated by comparing our results with the manual segmentation performed by an expert. The experimental results are described in the last part of this work.The system has been evaluated by computing the sensitivity and specificity between the semi-automatically segmented (liver and spleen) contour and the manually contour traced by radiological experts.

Keywords: CT images, liver and spleen segmentation, anisotropic diffusion filter, morphological filters, watershed algorithm

Procedia PDF Downloads 463
758 A Study on the Improvement of Mobile Device Call Buzz Noise Caused by Audio Frequency Ground Bounce

Authors: Jangje Park, So Young Kim

Abstract:

The market demand for audio quality in mobile devices continues to increase, and audible buzz noise generated in time division communication is a chronic problem that goes against the market demand. In the case of time division type communication, the RF Power Amplifier (RF PA) is driven at the audio frequency cycle, and it makes various influences on the audio signal. In this paper, we measured the ground bounce noise generated by the peak current flowing through the ground network in the RF PA with the audio frequency; it was confirmed that the noise is the cause of the audible buzz noise during a call. In addition, a grounding method of the microphone device that can improve the buzzing noise was proposed. Considering that the level of the audio signal generated by the microphone device is -38dBV based on 94dB Sound Pressure Level (SPL), even ground bounce noise of several hundred uV will fall within the range of audible noise if it is induced by the audio amplifier. Through the grounding method of the microphone device proposed in this paper, it was confirmed that the audible buzz noise power density at the RF PA driving frequency was improved by more than 5dB under the conditions of the Printed Circuit Board (PCB) used in the experiment. A fundamental improvement method was presented regarding the buzzing noise during a mobile phone call.

Keywords: audio frequency, buzz noise, ground bounce, microphone grounding

Procedia PDF Downloads 111
757 Contrastive Learning for Unsupervised Object Segmentation in Sequential Images

Authors: Tian Zhang

Abstract:

Unsupervised object segmentation aims at segmenting objects in sequential images and obtaining the mask of each object without any manual intervention. Unsupervised segmentation remains a challenging task due to the lack of prior knowledge about these objects. Previous methods often require manually specifying the action of each object, which is often difficult to obtain. Instead, this paper does not need action information of objects and automatically learns the actions and relations among objects from the structured environment. To obtain the object segmentation of sequential images, the relationships between objects and images are extracted to infer the action and interaction of objects based on the multi-head attention mechanism. Three types of objects’ relationships in the object segmentation task are proposed: the relationship between objects in the same frame, the relationship between objects in two frames, and the relationship between objects and historical information. Based on these relationships, the proposed model (1) is effective in multiple objects segmentation tasks, (2) just needs images as input, and (3) produces better segmentation results as more relationships are considered. The experimental results on multiple datasets show that this paper’s method achieves state-of-art performance. The quantitative and qualitative analyses of the result are conducted. The proposed method could be easily extended to other similar applications.

Keywords: unsupervised object segmentation, attention mechanism, contrastive learning, structured environment

Procedia PDF Downloads 84
756 The Influence of Noise on Aerial Image Semantic Segmentation

Authors: Pengchao Wei, Xiangzhong Fang

Abstract:

Noise is ubiquitous in this world. Denoising is an essential technology, especially in image semantic segmentation, where noises are generally categorized into two main types i.e. feature noise and label noise. The main focus of this paper is aiming at modeling label noise, investigating the behaviors of different types of label noise on image semantic segmentation tasks using K-Nearest-Neighbor and Convolutional Neural Network classifier. The performance without label noise and with is evaluated and illustrated in this paper. In addition to that, the influence of feature noise on the image semantic segmentation task is researched as well and a feature noise reduction method is applied to mitigate its influence in the learning procedure.

Keywords: convolutional neural network, denoising, feature noise, image semantic segmentation, k-nearest-neighbor, label noise

Procedia PDF Downloads 187
755 Audio-Visual Entrainment and Acupressure Therapy for Insomnia

Authors: Mariya Yeldhos, G. Hema, Sowmya Narayanan, L. Dhiviyalakshmi

Abstract:

Insomnia is one of the most prevalent psychological disorders worldwide. Some of the deficiencies of the current treatments of insomnia are: side effects in the case of sleeping pills and high costs in the case of psychotherapeutic treatment. In this paper, we propose a device which provides a combination of audio visual entrainment and acupressure based compression therapy for insomnia. This device provides drug-free treatment of insomnia through a user friendly and portable device that enables relaxation of brain and muscles, with certain advantages such as low cost, and wide accessibility to a large number of people. Tools adapted towards the treatment of insomnia: -Audio -Continuous exposure to binaural beats of a particular frequency of audible range -Visual -Flash of LED light -Acupressure points -GB-20 -GV-16 -B-10

Keywords: insomnia, acupressure, entrainment, audio-visual entrainment

Procedia PDF Downloads 407
754 Maximum Entropy Based Image Segmentation of Human Skin Lesion

Authors: Sheema Shuja Khattak, Gule Saman, Imran Khan, Abdus Salam

Abstract:

Image segmentation plays an important role in medical imaging applications. Therefore, accurate methods are needed for the successful segmentation of medical images for diagnosis and detection of various diseases. In this paper, we have used maximum entropy to achieve image segmentation. Maximum entropy has been calculated using Shannon, Renyi, and Tsallis entropies. This work has novelty based on the detection of skin lesion caused by the bite of a parasite called Sand Fly causing the disease is called Cutaneous Leishmaniasis.

Keywords: shannon, maximum entropy, Renyi, Tsallis entropy

Procedia PDF Downloads 428
753 Perception of Value Affecting Engagement Through Online Audio Communication

Authors: Apipol Penkitti

Abstract:

The new normal or a new way of life stemmed from the COVID-19 outbreak, gave rise to a new form of social media: audio-based social platforms (ABSPs), known as Clubhouse, Twitter space, and Facebook live audio room. These platforms, on which audio-based communication is featured, became popular in a short span of time. The objective of the research study is to understand ABSPs users’ behaviors in Thailand. The study, in which functional attitude theory, uses and gratifications theory, and social influence theory are referred to, is conducted through consumer perceived utilitarian, hedonic, and social value that affect engagement. This research study is mixed method paradigm, utilizing Model of Triangulation as its framework. The data acquisition is proceeded through questionnaires from a sample of 384 male, female and LGBTQA+ individuals aged 25 - 34 who, from various occupations, have used audio-based social platform applications. This research study employs the structural equation modeling to analyze the relationships between variables, and it uses the semi - structured interviewing to comprehend the rationality of the variables in the study. The study found that hedonic value directly affects engagement.

Keywords: audio based social platform, engagement, hedonic, perceived value, social, utilitarian

Procedia PDF Downloads 84
752 Multi-Atlas Segmentation Based on Dynamic Energy Model: Application to Brain MR Images

Authors: Jie Huo, Jonathan Wu

Abstract:

Segmentation of anatomical structures in medical images is essential for scientific inquiry into the complex relationships between biological structure and clinical diagnosis, treatment and assessment. As a method of incorporating the prior knowledge and the anatomical structure similarity between a target image and atlases, multi-atlas segmentation has been successfully applied in segmenting a variety of medical images, including the brain, cardiac, and abdominal images. The basic idea of multi-atlas segmentation is to transfer the labels in atlases to the coordinate of the target image by matching the target patch to the atlas patch in the neighborhood. However, this technique is limited by the pairwise registration between target image and atlases. In this paper, a novel multi-atlas segmentation approach is proposed by introducing a dynamic energy model. First, the target is mapped to each atlas image by minimizing the dynamic energy function, then the segmentation of target image is generated by weighted fusion based on the energy. The method is tested on MICCAI 2012 Multi-Atlas Labeling Challenge dataset which includes 20 target images and 15 atlases images. The paper also analyzes the influence of different parameters of the dynamic energy model on the segmentation accuracy and measures the dice coefficient by using different feature terms with the energy model. The highest mean dice coefficient obtained with the proposed method is 0.861, which is competitive compared with the recently published method.

Keywords: brain MRI segmentation, dynamic energy model, multi-atlas segmentation, energy minimization

Procedia PDF Downloads 308
751 A Non-Parametric Based Mapping Algorithm for Use in Audio Fingerprinting

Authors: Analise Borg, Paul Micallef

Abstract:

Over the past few years, the online multimedia collection has grown at a fast pace. Several companies showed interest to study the different ways to organize the amount of audio information without the need of human intervention to generate metadata. In the past few years, many applications have emerged on the market which are capable of identifying a piece of music in a short time. Different audio effects and degradation make it much harder to identify the unknown piece. In this paper, an audio fingerprinting system which makes use of a non-parametric based algorithm is presented. Parametric analysis is also performed using Gaussian Mixture Models (GMMs). The feature extraction methods employed are the Mel Spectrum Coefficients and the MPEG-7 basic descriptors. Bin numbers replaced the extracted feature coefficients during the non-parametric modelling. The results show that non-parametric analysis offer potential results as the ones mentioned in the literature.

Keywords: audio fingerprinting, mapping algorithm, Gaussian Mixture Models, MFCC, MPEG-7

Procedia PDF Downloads 394
750 Computer-Aided Detection of Simultaneous Abdominal Organ CT Images by Iterative Watershed Transform

Authors: Belgherbi Aicha, Hadjidj Ismahen, Bessaid Abdelhafid

Abstract:

Interpretation of medical images benefits from anatomical and physiological priors to optimize computer-aided diagnosis applications. Segmentation of liver, spleen and kidneys is regarded as a major primary step in the computer-aided diagnosis of abdominal organ diseases. In this paper, a semi-automated method for medical image data is presented for the abdominal organ segmentation data using mathematical morphology. Our proposed method is based on hierarchical segmentation and watershed algorithm. In our approach, a powerful technique has been designed to suppress over-segmentation based on mosaic image and on the computation of the watershed transform. Our algorithm is currency in two parts. In the first, we seek to improve the quality of the gradient-mosaic image. In this step, we propose a method for improving the gradient-mosaic image by applying the anisotropic diffusion filter followed by the morphological filters. Thereafter, we proceed to the hierarchical segmentation of the liver, spleen and kidney. To validate the segmentation technique proposed, we have tested it on several images. Our segmentation approach is evaluated by comparing our results with the manual segmentation performed by an expert. The experimental results are described in the last part of this work.

Keywords: anisotropic diffusion filter, CT images, morphological filter, mosaic image, simultaneous organ segmentation, the watershed algorithm

Procedia PDF Downloads 413
749 Endocardial Ultrasound Segmentation using Level Set method

Authors: Daoudi Abdelaziz, Mahmoudi Saïd, Chikh Mohamed Amine

Abstract:

This paper presents a fully automatic segmentation method of the left ventricle at End Systolic (ES) and End Diastolic (ED) in the ultrasound images by means of an implicit deformable model (level set) based on Geodesic Active Contour model. A pre-processing Gaussian smoothing stage is applied to the image, which is essential for a good segmentation. Before the segmentation phase, we locate automatically the area of the left ventricle by using a detection approach based on the Hough Transform method. Consequently, the result obtained is used to automate the initialization of the level set model. This initial curve (zero level set) deforms to search the Endocardial border in the image. On the other hand, quantitative evaluation was performed on a data set composed of 15 subjects with a comparison to ground truth (manual segmentation).

Keywords: level set method, transform Hough, Gaussian smoothing, left ventricle, ultrasound images.

Procedia PDF Downloads 436
748 Heterogenous Dimensional Super Resolution of 3D CT Scans Using Transformers

Authors: Helen Zhang

Abstract:

Accurate segmentation of the airways from CT scans is crucial for early diagnosis of lung cancer. However, the existing airway segmentation algorithms often rely on thin-slice CT scans, which can be inconvenient and costly. This paper presents a set of machine learning-based 3D super-resolution algorithms along heterogeneous dimensions to improve the resolution of thicker CT scans to reduce the reliance on thin-slice scans. To evaluate the efficacy of the super-resolution algorithms, quantitative assessments using PSNR (Peak Signal to Noise Ratio) and SSIM (Structural SIMilarity index) were performed. The impact of super-resolution on airway segmentation accuracy is also studied. The proposed approach has the potential to make airway segmentation more accessible and affordable, thereby facilitating early diagnosis and treatment of lung cancer.

Keywords: 3D super-resolution, airway segmentation, thin-slice CT scans, machine learning

Procedia PDF Downloads 80
747 An Overview of Posterior Fossa Associated Pathologies and Segmentation

Authors: Samuel J. Ahmad, Michael Zhu, Andrew J. Kobets

Abstract:

Segmentation tools continue to advance, evolving from manual methods to automated contouring technologies utilizing convolutional neural networks. These techniques have evaluated ventricular and hemorrhagic volumes in the past but may be applied in novel ways to assess posterior fossa-associated pathologies such as Chiari malformations. Herein, we summarize literature pertaining to segmentation in the context of this and other posterior fossa-based diseases such as trigeminal neuralgia, hemifacial spasm, and posterior fossa syndrome. A literature search for volumetric analysis of the posterior fossa identified 27 papers where semi-automated, automated, manual segmentation, linear measurement-based formulas, and the Cavalieri estimator were utilized. These studies produced superior data than older methods utilizing formulas for rough volumetric estimations. The most commonly used segmentation technique was semi-automated segmentation (12 studies). Manual segmentation was the second most common technique (7 studies). Automated segmentation techniques (4 studies) and the Cavalieri estimator (3 studies), a point-counting method that uses a grid of points to estimate the volume of a region, were the next most commonly used techniques. The least commonly utilized segmentation technique was linear measurement-based formulas (1 study). Semi-automated segmentation produced accurate, reproducible results. However, it is apparent that there does not exist a single semi-automated software, open source or otherwise, that has been widely applied to the posterior fossa. Fully-automated segmentation via such open source software as FSL and Freesurfer produced highly accurate posterior fossa segmentations. Various forms of segmentation have been used to assess posterior fossa pathologies and each has its advantages and disadvantages. According to our results, semi-automated segmentation is the predominant method. However, atlas-based automated segmentation is an extremely promising method that produces accurate results. Future evolution of segmentation technologies will undoubtedly yield superior results, which may be applied to posterior fossa related pathologies. Medical professionals will save time and effort analyzing large sets of data due to these advances.

Keywords: chiari, posterior fossa, segmentation, volumetric

Procedia PDF Downloads 68
746 Digital Recording System Identification Based on Audio File

Authors: Michel Kulhandjian, Dimitris A. Pados

Abstract:

The objective of this work is to develop a theoretical framework for reliable digital recording system identification from digital audio files alone, for forensic purposes. A digital recording system consists of a microphone and a digital sound processing card. We view the cascade as a system of unknown transfer function. We expect same manufacturer and model microphone-sound card combinations to have very similar/near identical transfer functions, bar any unique manufacturing defect. Input voice (or other) signals are modeled as non-stationary processes. The technical problem under consideration becomes blind deconvolution with non-stationary inputs as it manifests itself in the specific application of digital audio recording equipment classification.

Keywords: blind system identification, audio fingerprinting, blind deconvolution, blind dereverberation

Procedia PDF Downloads 280
745 Level Set and Morphological Operation Techniques in Application of Dental Image Segmentation

Authors: Abdolvahab Ehsani Rad, Mohd Shafry Mohd Rahim, Alireza Norouzi

Abstract:

Medical image analysis is one of the great effects of computer image processing. There are several processes to analysis the medical images which the segmentation process is one of the challenging and most important step. In this paper the segmentation method proposed in order to segment the dental radiograph images. Thresholding method has been applied to simplify the images and to morphologically open binary image technique performed to eliminate the unnecessary regions on images. Furthermore, horizontal and vertical integral projection techniques used to extract the each individual tooth from radiograph images. Segmentation process has been done by applying the level set method on each extracted images. Nevertheless, the experiments results by 90% accuracy demonstrate that proposed method achieves high accuracy and promising result.

Keywords: integral production, level set method, morphological operation, segmentation

Procedia PDF Downloads 284
744 Audio-Visual Recognition Based on Effective Model and Distillation

Authors: Heng Yang, Tao Luo, Yakun Zhang, Kai Wang, Wei Qin, Liang Xie, Ye Yan, Erwei Yin

Abstract:

Recent years have seen that audio-visual recognition has shown great potential in a strong noise environment. The existing method of audio-visual recognition has explored methods with ResNet and feature fusion. However, on the one hand, ResNet always occupies a large amount of memory resources, restricting the application in engineering. On the other hand, the feature merging also brings some interferences in a high noise environment. In order to solve the problems, we proposed an effective framework with bidirectional distillation. At first, in consideration of the good performance in extracting of features, we chose the light model, Efficientnet as our extractor of spatial features. Secondly, self-distillation was applied to learn more information from raw data. Finally, we proposed a bidirectional distillation in decision-level fusion. In more detail, our experimental results are based on a multi-model dataset from 24 volunteers. Eventually, the lipreading accuracy of our framework was increased by 2.3% compared with existing systems, and our framework made progress in audio-visual fusion in a high noise environment compared with the system of audio recognition without visual.

Keywords: lipreading, audio-visual, Efficientnet, distillation

Procedia PDF Downloads 102
743 A Supervised Face Parts Labeling Framework

Authors: Khalil Khan, Ikram Syed, Muhammad Ehsan Mazhar, Iran Uddin, Nasir Ahmad

Abstract:

Face parts labeling is the process of assigning class labels to each face part. A face parts labeling method (FPL) which divides a given image into its constitutes parts is proposed in this paper. A database FaceD consisting of 564 images is labeled with hand and make publically available. A supervised learning model is built through extraction of features from the training data. The testing phase is performed with two semantic segmentation methods, i.e., pixel and super-pixel based segmentation. In pixel-based segmentation class label is provided to each pixel individually. In super-pixel based method class label is assigned to super-pixel only – as a result, the same class label is given to all pixels inside a super-pixel. Pixel labeling accuracy reported with pixel and super-pixel based methods is 97.68 % and 93.45% respectively.

Keywords: face labeling, semantic segmentation, classification, face segmentation

Procedia PDF Downloads 228
742 Brainbow Image Segmentation Using Bayesian Sequential Partitioning

Authors: Yayun Hsu, Henry Horng-Shing Lu

Abstract:

This paper proposes a data-driven, biology-inspired neural segmentation method of 3D drosophila Brainbow images. We use Bayesian Sequential Partitioning algorithm for probabilistic modeling, which can be used to detect somas and to eliminate cross talk effects. This work attempts to develop an automatic methodology for neuron image segmentation, which nowadays still lacks a complete solution due to the complexity of the image. The proposed method does not need any predetermined, risk-prone thresholds since biological information is inherently included in the image processing procedure. Therefore, it is less sensitive to variations in neuron morphology; meanwhile, its flexibility would be beneficial for tracing the intertwining structure of neurons.

Keywords: brainbow, 3D imaging, image segmentation, neuron morphology, biological data mining, non-parametric learning

Procedia PDF Downloads 458
741 An Improved C-Means Model for MRI Segmentation

Authors: Ying Shen, Weihua Zhu

Abstract:

Medical images are important to help identifying different diseases, for example, Magnetic resonance imaging (MRI) can be used to investigate the brain, spinal cord, bones, joints, breasts, blood vessels, and heart. Image segmentation, in medical image analysis, is usually the first step to find out some characteristics with similar color, intensity or texture so that the diagnosis could be further carried out based on these features. This paper introduces an improved C-means model to segment the MRI images. The model is based on information entropy to evaluate the segmentation results by achieving global optimization. Several contributions are significant. Firstly, Genetic Algorithm (GA) is used for achieving global optimization in this model where fuzzy C-means clustering algorithm (FCMA) is not capable of doing that. Secondly, the information entropy after segmentation is used for measuring the effectiveness of MRI image processing. Experimental results show the outperformance of the proposed model by comparing with traditional approaches.

Keywords: magnetic resonance image (MRI), c-means model, image segmentation, information entropy

Procedia PDF Downloads 203
740 Automatic Facial Skin Segmentation Using Possibilistic C-Means Algorithm for Evaluation of Facial Surgeries

Authors: Elham Alaee, Mousa Shamsi, Hossein Ahmadi, Soroosh Nazem, Mohammad Hossein Sedaaghi

Abstract:

Human face has a fundamental role in the appearance of individuals. So the importance of facial surgeries is undeniable. Thus, there is a need for the appropriate and accurate facial skin segmentation in order to extract different features. Since Fuzzy C-Means (FCM) clustering algorithm doesn’t work appropriately for noisy images and outliers, in this paper we exploit Possibilistic C-Means (PCM) algorithm in order to segment the facial skin. For this purpose, first, we convert facial images from RGB to YCbCr color space. To evaluate performance of the proposed algorithm, the database of Sahand University of Technology, Tabriz, Iran was used. In order to have a better understanding from the proposed algorithm; FCM and Expectation-Maximization (EM) algorithms are also used for facial skin segmentation. The proposed method shows better results than the other segmentation methods. Results include misclassification error (0.032) and the region’s area error (0.045) for the proposed algorithm.

Keywords: facial image, segmentation, PCM, FCM, skin error, facial surgery

Procedia PDF Downloads 552
739 Satisfaction of Distance Education University Students with the Use of Audio Media as a Medium of Instruction: The Case of Mountains of the Moon University in Uganda

Authors: Mark Kaahwa, Chang Zhu, Moses Muhumuza

Abstract:

This study investigates the satisfaction of distance education university students (DEUS) with the use of audio media as a medium of instruction. Studying students’ satisfaction is vital because it shows whether learners are comfortable with a certain instructional strategy or not. Although previous studies have investigated the use of audio media, the satisfaction of students with an instructional strategy that combines radio teaching and podcasts as an independent teaching strategy has not been fully investigated. In this study, all lectures were delivered through the radio and students had no direct contact with their instructors. No modules or any other material in form of text were given to the students. They instead, revised the taught content by listening to podcasts saved on their mobile electronic gadgets. Prior to data collection, DEUS received orientation through workshops on how to use audio media in distance education. To achieve objectives of the study, a survey, naturalistic observations and face-to-face interviews were used to collect data from a sample of 211 undergraduate and graduate students. Findings indicate that there was no statistically significant difference in the levels of satisfaction between male and female students. The results from post hoc analysis show that there is a statistically significant difference in the levels of satisfaction regarding the use of audio media between diploma and graduate students. Diploma students are more satisfied compared to their graduate counterparts. T-test results reveal that there was no statistically significant difference in the general satisfaction with audio media between rural and urban-based students. And ANOVA results indicate that there is no statistically significant difference in the levels of satisfaction with the use of audio media across age groups. Furthermore, results from observations and interviews reveal that DEUS found learning using audio media a pleasurable medium of instruction. This is an indication that audio media can be considered as an instructional strategy on its own merit.

Keywords: audio media, distance education, distance education university students, medium of instruction, satisfaction

Procedia PDF Downloads 96
738 Image Segmentation of Visual Markers in Robotic Tracking System Based on Differential Evolution Algorithm with Connected-Component Labeling

Authors: Shu-Yu Hsu, Chen-Chien Hsu, Wei-Yen Wang

Abstract:

Color segmentation is a basic and simple way for recognizing the visual markers in a robotic tracking system. In this paper, we propose a new method for color segmentation by incorporating differential evolution algorithm and connected component labeling to autonomously preset the HSV threshold of visual markers. To evaluate the effectiveness of the proposed algorithm, a ROBOTIS OP2 humanoid robot is used to conduct the experiment, where five most commonly used color including red, purple, blue, yellow, and green in visual markers are given for comparisons.

Keywords: color segmentation, differential evolution, connected component labeling, humanoid robot

Procedia PDF Downloads 574
737 Semi-Automatic Segmentation of Mitochondria on Transmission Electron Microscopy Images Using Live-Wire and Surface Dragging Methods

Authors: Mahdieh Farzin Asanjan, Erkan Unal Mumcuoglu

Abstract:

Mitochondria are cytoplasmic organelles of the cell, which have a significant role in the variety of cellular metabolic functions. Mitochondria act as the power plants of the cell and are surrounded by two membranes. Significant morphological alterations are often due to changes in mitochondrial functions. A powerful technique in order to study the three-dimensional (3D) structure of mitochondria and its alterations in disease states is Electron microscope tomography. Detection of mitochondria in electron microscopy images due to the presence of various subcellular structures and imaging artifacts is a challenging problem. Another challenge is that each image typically contains more than one mitochondrion. Hand segmentation of mitochondria is tedious and time-consuming and also special knowledge about the mitochondria is needed. Fully automatic segmentation methods lead to over-segmentation and mitochondria are not segmented properly. Therefore, semi-automatic segmentation methods with minimum manual effort are required to edit the results of fully automatic segmentation methods. Here two editing tools were implemented by applying spline surface dragging and interactive live-wire segmentation tools. These editing tools were applied separately to the results of fully automatic segmentation. 3D extension of these tools was also studied and tested. Dice coefficients of 2D and 3D for surface dragging using splines were 0.93 and 0.92. This metric for 2D and 3D for live-wire method were 0.94 and 0.91 respectively. The root mean square symmetric surface distance values of 2D and 3D for surface dragging was measured as 0.69, 0.93. The same metrics for live-wire tool were 0.60 and 2.11. Comparing the results of these editing tools with the results of automatic segmentation method, it shows that these editing tools, led to better results and these results were more similar to ground truth image but the required time was higher than hand-segmentation time

Keywords: medical image segmentation, semi-automatic methods, transmission electron microscopy, surface dragging using splines, live-wire

Procedia PDF Downloads 138