Search results for: word segmentation and recognition
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2757

Search results for: word segmentation and recognition

2697 An Image Segmentation Algorithm for Gradient Target Based on Mean-Shift and Dictionary Learning

Authors: Yanwen Li, Shuguo Xie

Abstract:

In electromagnetic imaging, because of the diffraction limited system, the pixel values could change slowly near the edge of the image targets and they also change with the location in the same target. Using traditional digital image segmentation methods to segment electromagnetic gradient images could result in lots of errors because of this change in pixel values. To address this issue, this paper proposes a novel image segmentation and extraction algorithm based on Mean-Shift and dictionary learning. Firstly, the preliminary segmentation results from adaptive bandwidth Mean-Shift algorithm are expanded, merged and extracted. Then the overlap rate of the extracted image block is detected before determining a segmentation region with a single complete target. Last, the gradient edge of the extracted targets is recovered and reconstructed by using a dictionary-learning algorithm, while the final segmentation results are obtained which are very close to the gradient target in the original image. Both the experimental results and the simulated results show that the segmentation results are very accurate. The Dice coefficients are improved by 70% to 80% compared with the Mean-Shift only method.

Keywords: gradient image, segmentation and extract, mean-shift algorithm, dictionary iearning

Procedia PDF Downloads 263
2696 Effective Texture Features for Segmented Mammogram Images Based on Multi-Region of Interest Segmentation Method

Authors: Ramayanam Suresh, A. Nagaraja Rao, B. Eswara Reddy

Abstract:

Texture features of mammogram images are useful for finding masses or cancer cases in mammography, which have been used by radiologists. Textures are greatly succeeded for segmented images rather than normal images. It is necessary to perform segmentation for exclusive specification of cancer and non-cancer regions separately. Region of interest (ROI) is most commonly used technique for mammogram segmentation. Limitation of this method is that it is unable to explore segmentation for large collection of mammogram images. Therefore, this paper is proposed multi-ROI segmentation for addressing the above limitation. It supports greatly in finding the best texture features of mammogram images. Experimental study demonstrates the effectiveness of proposed work using benchmarked images.

Keywords: texture features, region of interest, multi-ROI segmentation, benchmarked images

Procedia PDF Downloads 305
2695 Sign Language Recognition of Static Gestures Using Kinect™ and Convolutional Neural Networks

Authors: Rohit Semwal, Shivam Arora, Saurav, Sangita Roy

Abstract:

This work proposes a supervised framework with deep convolutional neural networks (CNNs) for vision-based sign language recognition of static gestures. Our approach addresses the acquisition and segmentation of correct inputs for the CNN-based classifier. Microsoft Kinect™ sensor, despite complex environmental conditions, can track hands efficiently. Skin Colour based segmentation is applied on cropped images of hands in different poses, used to depict different sign language gestures. The segmented hand images are used as an input for our classifier. The CNN classifier proposed in the paper is able to classify the input images with a high degree of accuracy. The system was trained and tested on 39 static sign language gestures, including 26 letters of the alphabet and 13 commonly used words. This paper includes a problem definition for building the proposed system, which acts as a sign language translator between deaf/mute and the rest of the society. It is then followed by a focus on reviewing existing knowledge in the area and work done by other researchers. It also describes the working principles behind different components of CNNs in brief. The architecture and system design specifications of the proposed system are discussed in the subsequent sections of the paper to give the reader a clear picture of the system in terms of the capability required. The design then gives the top-level details of how the proposed system meets the requirements.

Keywords: sign language, CNN, HCI, segmentation

Procedia PDF Downloads 156
2694 The Facilitatory Effect of Phonological Priming on Visual Word Recognition in Arabic as a Function of Lexicality and Overlap Positions

Authors: Ali Al Moussaoui

Abstract:

An experiment was designed to assess the performance of 24 Lebanese adults (mean age 29:5 years) in a lexical decision making (LDM) task to find out how the facilitatory effect of phonological priming (PP) affects the speed of visual word recognition in Arabic as lexicality (wordhood) and phonological overlap positions (POP) vary. The experiment falls in line with previous research on phonological priming in the light of the cohort theory and in relation to visual word recognition. The experiment also departs from the research on the Arabic language in which the importance of the consonantal root as a distinct morphological unit is confirmed. Based on previous research, it is hypothesized that (1) PP has a facilitating effect in LDM with words but not with nonwords and (2) final phonological overlap between the prime and the target is more facilitatory than initial overlap. An LDM task was programmed on PsychoPy application. Participants had to decide if a target (e.g., bayn ‘between’) preceded by a prime (e.g., bayt ‘house’) is a word or not. There were 4 conditions: no PP (NP), nonwords priming nonwords (NN), nonwords priming words (NW), and words priming words (WW). The conditions were simultaneously controlled for word length, wordhood, and POP. The interstimulus interval was 700 ms. Within the PP conditions, POP was controlled for in which there were 3 overlap positions between the primes and the targets: initial (e.g., asad ‘lion’ and asaf ‘sorrow’), final (e.g., kattab ‘cause to write’ 2sg-mas and rattab ‘organize’ 2sg-mas), or two-segmented (e.g., namle ‘ant’ and naħle ‘bee’). There were 96 trials, 24 in each condition, using a within-subject design. The results show that concerning (1), the highest average reaction time (RT) is that in NN, followed firstly by NW and finally by WW. There is statistical significance only between the pairs NN-NW and NN-WW. Regarding (2), the shortest RT is that in the two-segmented overlap condition, followed by the final POP in the first place and the initial POP in the last place. The difference between the two-segmented and the initial overlap is significant, while other pairwise comparisons are not. Based on these results, PP emerges as a facilitatory phenomenon that is highly sensitive to lexicality and POP. While PP can have a facilitating effect under lexicality, it shows no facilitation in its absence, which intersects with several previous findings. Participants are found to be more sensitive to the final phonological overlap than the initial overlap, which also coincides with a body of earlier literature. The results contradict the cohort theory’s stress on the onset overlap position and, instead, give more weight to final overlap, and even heavier weight to the two-segmented one. In conclusion, this study confirms the facilitating effect of PP with words but not when stimuli (at least the primes and at most both the primes and targets) are nonwords. It also shows that the two-segmented priming is the most influential in LDM in Arabic.

Keywords: lexicality, phonological overlap positions, phonological priming, visual word recognition

Procedia PDF Downloads 183
2693 Review of the Software Used for 3D Volumetric Reconstruction of the Liver

Authors: P. Strakos, M. Jaros, T. Karasek, T. Kozubek, P. Vavra, T. Jonszta

Abstract:

In medical imaging, segmentation of different areas of human body like bones, organs, tissues, etc. is an important issue. Image segmentation allows isolating the object of interest for further processing that can lead for example to 3D model reconstruction of whole organs. Difficulty of this procedure varies from trivial for bones to quite difficult for organs like liver. The liver is being considered as one of the most difficult human body organ to segment. It is mainly for its complexity, shape versatility and proximity of other organs and tissues. Due to this facts usually substantial user effort has to be applied to obtain satisfactory results of the image segmentation. Process of image segmentation then deteriorates from automatic or semi-automatic to fairly manual one. In this paper, overview of selected available software applications that can handle semi-automatic image segmentation with further 3D volume reconstruction of human liver is presented. The applications are being evaluated based on the segmentation results of several consecutive DICOM images covering the abdominal area of the human body.

Keywords: image segmentation, semi-automatic, software, 3D volumetric reconstruction

Procedia PDF Downloads 287
2692 Abdominal Organ Segmentation in CT Images Based On Watershed Transform and Mosaic Image

Authors: Belgherbi Aicha, Hadjidj Ismahen, Bessaid Abdelhafid

Abstract:

Accurate Liver, spleen and kidneys segmentation in abdominal CT images is one of the most important steps for computer aided abdominal organs pathology diagnosis. In this paper, we have proposed a new semi-automatic algorithm for Liver, spleen and kidneys area extraction in abdominal CT images. Our proposed method is based on hierarchical segmentation and watershed algorithm. In our approach, a powerful technique has been designed to suppress over-segmentation based on mosaic image and on the computation of the watershed transform. The algorithm is currency in two parts. In the first, we seek to improve the quality of the gradient-mosaic image. In this step, we propose a method for improving the gradient-mosaic image by applying the anisotropic diffusion filter followed by the morphological filters. Thereafter we proceed to the hierarchical segmentation of the liver, spleen and kidney. To validate the segmentation technique proposed, we have tested it on several images. Our segmentation approach is evaluated by comparing our results with the manual segmentation performed by an expert. The experimental results are described in the last part of this work.

Keywords: anisotropic diffusion filter, CT images, morphological filter, mosaic image, multi-abdominal organ segmentation, mosaic image, the watershed algorithm

Procedia PDF Downloads 496
2691 Image Segmentation Techniques: Review

Authors: Lindani Mbatha, Suvendi Rimer, Mpho Gololo

Abstract:

Image segmentation is the process of dividing an image into several sections, such as the object's background and the foreground. It is a critical technique in both image-processing tasks and computer vision. Most of the image segmentation algorithms have been developed for gray-scale images and little research and algorithms have been developed for the color images. Most image segmentation algorithms or techniques vary based on the input data and the application. Nearly all of the techniques are not suitable for noisy environments. Most of the work that has been done uses the Markov Random Field (MRF), which involves the computations and is said to be robust to noise. In the past recent years' image segmentation has been brought to tackle problems such as easy processing of an image, interpretation of the contents of an image, and easy analysing of an image. This article reviews and summarizes some of the image segmentation techniques and algorithms that have been developed in the past years. The techniques include neural networks (CNN), edge-based techniques, region growing, clustering, and thresholding techniques and so on. The advantages and disadvantages of medical ultrasound image segmentation techniques are also discussed. The article also addresses the applications and potential future developments that can be done around image segmentation. This review article concludes with the fact that no technique is perfectly suitable for the segmentation of all different types of images, but the use of hybrid techniques yields more accurate and efficient results.

Keywords: clustering-based, convolution-network, edge-based, region-growing

Procedia PDF Downloads 93
2690 Fruit Identification System in Sweet Orange Citrus (L.) Osbeck Using Thermal Imaging and Fuzzy

Authors: Ingrid Argote, John Archila, Marcelo Becker

Abstract:

In agriculture, intelligent systems applications have generated great advances in automating some of the processes in the production chain. In order to improve the efficiency of those systems is proposed a vision system to estimate the amount of fruits in sweet orange trees. This work presents a system proposal using capture of thermal images and fuzzy logic. A bibliographical review has been done to analyze the state-of-the-art of the different systems used in fruit recognition, and also the different applications of thermography in agricultural systems. The algorithm developed for this project uses the metrics of the fuzzines parameter to the contrast improvement and segmentation of the image, for the counting algorith m was used the Hough transform. In order to validate the proposed algorithm was created a bank of images of sweet orange Citrus (L.) Osbeck acquired in the Maringá Farm. The tests with the algorithm Indicated that the variation of the tree branch temperature and the fruit is not very high, Which makes the process of image segmentation using this differentiates, This Increases the amount of false positives in the fruit counting algorithm. Recognition of fruits isolated with the proposed algorithm present an overall accuracy of 90.5 % and grouped fruits. The accuracy was 81.3 %. The experiments show the need for a more suitable hardware to have a better recognition of small temperature changes in the image.

Keywords: Agricultural systems, Citrus, Fuzzy logic, Thermal images.

Procedia PDF Downloads 228
2689 An Automatic Speech Recognition Tool for the Filipino Language Using the HTK System

Authors: John Lorenzo Bautista, Yoon-Joong Kim

Abstract:

This paper presents the development of a Filipino speech recognition tool using the HTK System. The system was trained from a subset of the Filipino Speech Corpus developed by the DSP Laboratory of the University of the Philippines-Diliman. The speech corpus was both used in training and testing the system by estimating the parameters for phonetic HMM-based (Hidden-Markov Model) acoustic models. Experiments on different mixture-weights were incorporated in the study. The phoneme-level word-based recognition of a 5-state HMM resulted in an average accuracy rate of 80.13 for a single-Gaussian mixture model, 81.13 after implementing a phoneme-alignment, and 87.19 for the increased Gaussian-mixture weight model. The highest accuracy rate of 88.70% was obtained from a 5-state model with 6 Gaussian mixtures.

Keywords: Filipino language, Hidden Markov Model, HTK system, speech recognition

Procedia PDF Downloads 478
2688 Morphology Operation and Discrete Wavelet Transform for Blood Vessels Segmentation in Retina Fundus

Authors: Rita Magdalena, N. K. Caecar Pratiwi, Yunendah Nur Fuadah, Sofia Saidah, Bima Sakti

Abstract:

Vessel segmentation of retinal fundus is important for biomedical sciences in diagnosing ailments related to the eye. Segmentation can simplify medical experts in diagnosing retinal fundus image state. Therefore, in this study, we designed a software using MATLAB which enables the segmentation of the retinal blood vessels on retinal fundus images. There are two main steps in the process of segmentation. The first step is image preprocessing that aims to improve the quality of the image to be optimum segmented. The second step is the image segmentation in order to perform the extraction process to retrieve the retina’s blood vessel from the eye fundus image. The image segmentation methods that will be analyzed in this study are Morphology Operation, Discrete Wavelet Transform and combination of both. The amount of data that used in this project is 40 for the retinal image and 40 for manually segmentation image. After doing some testing scenarios, the average accuracy for Morphology Operation method is 88.46 % while for Discrete Wavelet Transform is 89.28 %. By combining the two methods mentioned in later, the average accuracy was increased to 89.53 %. The result of this study is an image processing system that can segment the blood vessels in retinal fundus with high accuracy and low computation time.

Keywords: discrete wavelet transform, fundus retina, morphology operation, segmentation, vessel

Procedia PDF Downloads 192
2687 Computer-Aided Detection of Liver and Spleen from CT Scans using Watershed Algorithm

Authors: Belgherbi Aicha, Bessaid Abdelhafid

Abstract:

In the recent years a great deal of research work has been devoted to the development of semi-automatic and automatic techniques for the analysis of abdominal CT images. The first and fundamental step in all these studies is the semi-automatic liver and spleen segmentation that is still an open problem. In this paper, a semi-automatic liver and spleen segmentation method by the mathematical morphology based on watershed algorithm has been proposed. Our algorithm is currency in two parts. In the first, we seek to determine the region of interest by applying the morphological to extract the liver and spleen. The second step consists to improve the quality of the image gradient. In this step, we propose a method for improving the image gradient to reduce the over-segmentation problem by applying the spatial filters followed by the morphological filters. Thereafter we proceed to the segmentation of the liver, spleen. The aim of this work is to develop a method for semi-automatic segmentation liver and spleen based on watershed algorithm, improve the accuracy and the robustness of the liver and spleen segmentation and evaluate a new semi-automatic approach with the manual for liver segmentation. To validate the segmentation technique proposed, we have tested it on several images. Our segmentation approach is evaluated by comparing our results with the manual segmentation performed by an expert. The experimental results are described in the last part of this work. The system has been evaluated by computing the sensitivity and specificity between the semi-automatically segmented (liver and spleen) contour and the manually contour traced by radiological experts. Liver segmentation has achieved the sensitivity and specificity; sens Liver=96% and specif Liver=99% respectively. Spleen segmentation achieves similar, promising results sens Spleen=95% and specif Spleen=99%.

Keywords: CT images, liver and spleen segmentation, anisotropic diffusion filter, morphological filters, watershed algorithm

Procedia PDF Downloads 323
2686 A Neural Approach for Color-Textured Images Segmentation

Authors: Khalid Salhi, El Miloud Jaara, Mohammed Talibi Alaoui

Abstract:

In this paper, we present a neural approach for unsupervised natural color-texture image segmentation, which is based on both Kohonen maps and mathematical morphology, using a combination of the texture and the image color information of the image, namely, the fractal features based on fractal dimension are selected to present the information texture, and the color features presented in RGB color space. These features are then used to train the network Kohonen, which will be represented by the underlying probability density function, the segmentation of this map is made by morphological watershed transformation. The performance of our color-texture segmentation approach is compared first, to color-based methods or texture-based methods only, and then to k-means method.

Keywords: segmentation, color-texture, neural networks, fractal, watershed

Procedia PDF Downloads 344
2685 A Word-to-Vector Formulation for Word Representation

Authors: Sandra Rizkallah, Amir F. Atiya

Abstract:

This work presents a novel word to vector representation that is based on embedding the words into a sphere, whereby the dot product of the corresponding vectors represents the similarity between any two words. Embedding the vectors into a sphere enabled us to take into consideration the antonymity between words, not only the synonymity, because of the suitability to handle the polarity nature of words. For example, a word and its antonym can be represented as a vector and its negative. Moreover, we have managed to extract an adequate vocabulary. The obtained results show that the proposed approach can capture the essence of the language, and can be generalized to estimate a correct similarity of any new pair of words.

Keywords: natural language processing, word to vector, text similarity, text mining

Procedia PDF Downloads 273
2684 Defect Detection for Nanofibrous Images with Deep Learning-Based Approaches

Authors: Gaokai Liu

Abstract:

Automatic defect detection for nanomaterial images is widely required in industrial scenarios. Deep learning approaches are considered as the most effective solutions for the great majority of image-based tasks. In this paper, an edge guidance network for defect segmentation is proposed. First, the encoder path with multiple convolution and downsampling operations is applied to the acquisition of shared features. Then two decoder paths both are connected to the last convolution layer of the encoder and supervised by the edge and segmentation labels, respectively, to guide the whole training process. Meanwhile, the edge and encoder outputs from the same stage are concatenated to the segmentation corresponding part to further tune the segmentation result. Finally, the effectiveness of the proposed method is verified via the experiments on open nanofibrous datasets.

Keywords: deep learning, defect detection, image segmentation, nanomaterials

Procedia PDF Downloads 146
2683 A Distinct Method Based on Mamba-Unet for Brain Tumor Image Segmentation

Authors: Djallel Bouamama, Yasser R. Haddadi

Abstract:

Accurate brain tumor segmentation is crucial for diagnosis and treatment planning, yet it remains a challenging task due to the variability in tumor shapes and intensities. This paper introduces a distinct approach to brain tumor image segmentation by leveraging an advanced architecture known as Mamba-Unet. Building on the well-established U-Net framework, Mamba-Unet incorporates distinct design enhancements to improve segmentation performance. Our proposed method integrates a multi-scale attention mechanism and a hybrid loss function to effectively capture fine-grained details and contextual information in brain MRI scans. We demonstrate that Mamba-Unet significantly enhances segmentation accuracy compared to conventional U-Net models by utilizing a comprehensive dataset of annotated brain MRI scans. Quantitative evaluations reveal that Mamba-Unet surpasses traditional U-Net architectures and other contemporary segmentation models regarding Dice coefficient, sensitivity, and specificity. The improvements are attributed to the method's ability to manage class imbalance better and resolve complex tumor boundaries. This work advances the state-of-the-art in brain tumor segmentation and holds promise for improving clinical workflows and patient outcomes through more precise and reliable tumor detection.

Keywords: brain tumor classification, image segmentation, CNN, U-NET

Procedia PDF Downloads 30
2682 Text Emotion Recognition by Multi-Head Attention based Bidirectional LSTM Utilizing Multi-Level Classification

Authors: Vishwanath Pethri Kamath, Jayantha Gowda Sarapanahalli, Vishal Mishra, Siddhesh Balwant Bandgar

Abstract:

Recognition of emotional information is essential in any form of communication. Growing HCI (Human-Computer Interaction) in recent times indicates the importance of understanding of emotions expressed and becomes crucial for improving the system or the interaction itself. In this research work, textual data for emotion recognition is used. The text being the least expressive amongst the multimodal resources poses various challenges such as contextual information and also sequential nature of the language construction. In this research work, the proposal is made for a neural architecture to resolve not less than 8 emotions from textual data sources derived from multiple datasets using google pre-trained word2vec word embeddings and a Multi-head attention-based bidirectional LSTM model with a one-vs-all Multi-Level Classification. The emotions targeted in this research are Anger, Disgust, Fear, Guilt, Joy, Sadness, Shame, and Surprise. Textual data from multiple datasets were used for this research work such as ISEAR, Go Emotions, Affect datasets for creating the emotions’ dataset. Data samples overlap or conflicts were considered with careful preprocessing. Our results show a significant improvement with the modeling architecture and as good as 10 points improvement in recognizing some emotions.

Keywords: text emotion recognition, bidirectional LSTM, multi-head attention, multi-level classification, google word2vec word embeddings

Procedia PDF Downloads 173
2681 Design of a Graphical User Interface for Data Preprocessing and Image Segmentation Process in 2D MRI Images

Authors: Enver Kucukkulahli, Pakize Erdogmus, Kemal Polat

Abstract:

The 2D image segmentation is a significant process in finding a suitable region in medical images such as MRI, PET, CT etc. In this study, we have focused on 2D MRI images for image segmentation process. We have designed a GUI (graphical user interface) written in MATLABTM for 2D MRI images. In this program, there are two different interfaces including data pre-processing and image clustering or segmentation. In the data pre-processing section, there are median filter, average filter, unsharp mask filter, Wiener filter, and custom filter (a filter that is designed by user in MATLAB). As for the image clustering, there are seven different image segmentations for 2D MR images. These image segmentation algorithms are as follows: PSO (particle swarm optimization), GA (genetic algorithm), Lloyds algorithm, k-means, the combination of Lloyds and k-means, mean shift clustering, and finally BBO (Biogeography Based Optimization). To find the suitable cluster number in 2D MRI, we have designed the histogram based cluster estimation method and then applied to these numbers to image segmentation algorithms to cluster an image automatically. Also, we have selected the best hybrid method for each 2D MR images thanks to this GUI software.

Keywords: image segmentation, clustering, GUI, 2D MRI

Procedia PDF Downloads 374
2680 TransDrift: Modeling Word-Embedding Drift Using Transformer

Authors: Nishtha Madaan, Prateek Chaudhury, Nishant Kumar, Srikanta Bedathur

Abstract:

In modern NLP applications, word embeddings are a crucial backbone that can be readily shared across a number of tasks. However, as the text distributions change and word semantics evolve over time, the downstream applications using the embeddings can suffer if the word representations do not conform to the data drift. Thus, maintaining word embeddings to be consistent with the underlying data distribution is a key problem. In this work, we tackle this problem and propose TransDrift, a transformer-based prediction model for word embeddings. Leveraging the flexibility of the transformer, our model accurately learns the dynamics of the embedding drift and predicts future embedding. In experiments, we compare with existing methods and show that our model makes significantly more accurate predictions of the word embedding than the baselines. Crucially, by applying the predicted embeddings as a backbone for downstream classification tasks, we show that our embeddings lead to superior performance compared to the previous methods.

Keywords: NLP applications, transformers, Word2vec, drift, word embeddings

Procedia PDF Downloads 88
2679 Meta Mask Correction for Nuclei Segmentation in Histopathological Image

Authors: Jiangbo Shi, Zeyu Gao, Chen Li

Abstract:

Nuclei segmentation is a fundamental task in digital pathology analysis and can be automated by deep learning-based methods. However, the development of such an automated method requires a large amount of data with precisely annotated masks which is hard to obtain. Training with weakly labeled data is a popular solution for reducing the workload of annotation. In this paper, we propose a novel meta-learning-based nuclei segmentation method which follows the label correction paradigm to leverage data with noisy masks. Specifically, we design a fully conventional meta-model that can correct noisy masks by using a small amount of clean meta-data. Then the corrected masks are used to supervise the training of the segmentation model. Meanwhile, a bi-level optimization method is adopted to alternately update the parameters of the main segmentation model and the meta-model. Extensive experimental results on two nuclear segmentation datasets show that our method achieves the state-of-the-art result. In particular, in some noise scenarios, it even exceeds the performance of training on supervised data.

Keywords: deep learning, histopathological image, meta-learning, nuclei segmentation, weak annotations

Procedia PDF Downloads 138
2678 Segmentation of the Liver and Spleen From Abdominal CT Images Using Watershed Approach

Authors: Belgherbi Aicha, Hadjidj Ismahen, Bessaid Abdelhafid

Abstract:

The phase of segmentation is an important step in the processing and interpretation of medical images. In this paper, we focus on the segmentation of liver and spleen from the abdomen computed tomography (CT) images. The importance of our study comes from the fact that the segmentation of ROI from CT images is usually a difficult task. This difficulty is the gray’s level of which is similar to the other organ also the ROI are connected to the ribs, heart, kidneys, etc. Our proposed method is based on the anatomical information and mathematical morphology tools used in the image processing field. At first, we try to remove the surrounding and connected organs and tissues by applying morphological filters. This first step makes the extraction of interest regions easier. The second step consists of improving the quality of the image gradient. In this step, we propose a method for improving the image gradient to reduce these deficiencies by applying the spatial filters followed by the morphological filters. Thereafter we proceed to the segmentation of the liver, spleen. To validate the segmentation technique proposed, we have tested it on several images. Our segmentation approach is evaluated by comparing our results with the manual segmentation performed by an expert. The experimental results are described in the last part of this work.The system has been evaluated by computing the sensitivity and specificity between the semi-automatically segmented (liver and spleen) contour and the manually contour traced by radiological experts.

Keywords: CT images, liver and spleen segmentation, anisotropic diffusion filter, morphological filters, watershed algorithm

Procedia PDF Downloads 492
2677 Contrastive Learning for Unsupervised Object Segmentation in Sequential Images

Authors: Tian Zhang

Abstract:

Unsupervised object segmentation aims at segmenting objects in sequential images and obtaining the mask of each object without any manual intervention. Unsupervised segmentation remains a challenging task due to the lack of prior knowledge about these objects. Previous methods often require manually specifying the action of each object, which is often difficult to obtain. Instead, this paper does not need action information of objects and automatically learns the actions and relations among objects from the structured environment. To obtain the object segmentation of sequential images, the relationships between objects and images are extracted to infer the action and interaction of objects based on the multi-head attention mechanism. Three types of objects’ relationships in the object segmentation task are proposed: the relationship between objects in the same frame, the relationship between objects in two frames, and the relationship between objects and historical information. Based on these relationships, the proposed model (1) is effective in multiple objects segmentation tasks, (2) just needs images as input, and (3) produces better segmentation results as more relationships are considered. The experimental results on multiple datasets show that this paper’s method achieves state-of-art performance. The quantitative and qualitative analyses of the result are conducted. The proposed method could be easily extended to other similar applications.

Keywords: unsupervised object segmentation, attention mechanism, contrastive learning, structured environment

Procedia PDF Downloads 105
2676 Handwriting Recognition of Gurmukhi Script: A Survey of Online and Offline Techniques

Authors: Ravneet Kaur

Abstract:

Character recognition is a very interesting area of pattern recognition. From past few decades, an intensive research on character recognition for Roman, Chinese, and Japanese and Indian scripts have been reported. In this paper, a review of Handwritten Character Recognition work on Indian Script Gurmukhi is being highlighted. Most of the published papers were summarized, various methodologies were analysed and their results are reported.

Keywords: Gurmukhi character recognition, online, offline, HCR survey

Procedia PDF Downloads 422
2675 The Influence of Noise on Aerial Image Semantic Segmentation

Authors: Pengchao Wei, Xiangzhong Fang

Abstract:

Noise is ubiquitous in this world. Denoising is an essential technology, especially in image semantic segmentation, where noises are generally categorized into two main types i.e. feature noise and label noise. The main focus of this paper is aiming at modeling label noise, investigating the behaviors of different types of label noise on image semantic segmentation tasks using K-Nearest-Neighbor and Convolutional Neural Network classifier. The performance without label noise and with is evaluated and illustrated in this paper. In addition to that, the influence of feature noise on the image semantic segmentation task is researched as well and a feature noise reduction method is applied to mitigate its influence in the learning procedure.

Keywords: convolutional neural network, denoising, feature noise, image semantic segmentation, k-nearest-neighbor, label noise

Procedia PDF Downloads 218
2674 Automatic Number Plate Recognition System Based on Deep Learning

Authors: T. Damak, O. Kriaa, A. Baccar, M. A. Ben Ayed, N. Masmoudi

Abstract:

In the last few years, Automatic Number Plate Recognition (ANPR) systems have become widely used in the safety, the security, and the commercial aspects. Forethought, several methods and techniques are computing to achieve the better levels in terms of accuracy and real time execution. This paper proposed a computer vision algorithm of Number Plate Localization (NPL) and Characters Segmentation (CS). In addition, it proposed an improved method in Optical Character Recognition (OCR) based on Deep Learning (DL) techniques. In order to identify the number of detected plate after NPL and CS steps, the Convolutional Neural Network (CNN) algorithm is proposed. A DL model is developed using four convolution layers, two layers of Maxpooling, and six layers of fully connected. The model was trained by number image database on the Jetson TX2 NVIDIA target. The accuracy result has achieved 95.84%.

Keywords: ANPR, CS, CNN, deep learning, NPL

Procedia PDF Downloads 304
2673 Maximum Entropy Based Image Segmentation of Human Skin Lesion

Authors: Sheema Shuja Khattak, Gule Saman, Imran Khan, Abdus Salam

Abstract:

Image segmentation plays an important role in medical imaging applications. Therefore, accurate methods are needed for the successful segmentation of medical images for diagnosis and detection of various diseases. In this paper, we have used maximum entropy to achieve image segmentation. Maximum entropy has been calculated using Shannon, Renyi, and Tsallis entropies. This work has novelty based on the detection of skin lesion caused by the bite of a parasite called Sand Fly causing the disease is called Cutaneous Leishmaniasis.

Keywords: shannon, maximum entropy, Renyi, Tsallis entropy

Procedia PDF Downloads 461
2672 OCR/ICR Text Recognition Using ABBYY FineReader as an Example Text

Authors: A. R. Bagirzade, A. Sh. Najafova, S. M. Yessirkepova, E. S. Albert

Abstract:

This article describes a text recognition method based on Optical Character Recognition (OCR). The features of the OCR method were examined using the ABBYY FineReader program. It describes automatic text recognition in images. OCR is necessary because optical input devices can only transmit raster graphics as a result. Text recognition describes the task of recognizing letters shown as such, to identify and assign them an assigned numerical value in accordance with the usual text encoding (ASCII, Unicode). The peculiarity of this study conducted by the authors using the example of the ABBYY FineReader, was confirmed and shown in practice, the improvement of digital text recognition platforms developed by Electronic Publication.

Keywords: ABBYY FineReader system, algorithm symbol recognition, OCR/ICR techniques, recognition technologies

Procedia PDF Downloads 166
2671 Multi-Atlas Segmentation Based on Dynamic Energy Model: Application to Brain MR Images

Authors: Jie Huo, Jonathan Wu

Abstract:

Segmentation of anatomical structures in medical images is essential for scientific inquiry into the complex relationships between biological structure and clinical diagnosis, treatment and assessment. As a method of incorporating the prior knowledge and the anatomical structure similarity between a target image and atlases, multi-atlas segmentation has been successfully applied in segmenting a variety of medical images, including the brain, cardiac, and abdominal images. The basic idea of multi-atlas segmentation is to transfer the labels in atlases to the coordinate of the target image by matching the target patch to the atlas patch in the neighborhood. However, this technique is limited by the pairwise registration between target image and atlases. In this paper, a novel multi-atlas segmentation approach is proposed by introducing a dynamic energy model. First, the target is mapped to each atlas image by minimizing the dynamic energy function, then the segmentation of target image is generated by weighted fusion based on the energy. The method is tested on MICCAI 2012 Multi-Atlas Labeling Challenge dataset which includes 20 target images and 15 atlases images. The paper also analyzes the influence of different parameters of the dynamic energy model on the segmentation accuracy and measures the dice coefficient by using different feature terms with the energy model. The highest mean dice coefficient obtained with the proposed method is 0.861, which is competitive compared with the recently published method.

Keywords: brain MRI segmentation, dynamic energy model, multi-atlas segmentation, energy minimization

Procedia PDF Downloads 333
2670 The Influence of Audio on Perceived Quality of Segmentation

Authors: Silvio Ricardo Rodrigues Sanches, Bianca Cogo Barbosa, Beatriz Regina Brum, Cléber Gimenez Corrêa

Abstract:

To evaluate the quality of a segmentation algorithm, the authors use subjective or objective metrics. Although subjective metrics are more accurate than objective ones, objective metrics do not require user feedback to test an algorithm. Objective metrics require subjective experiments only during their development. Subjective experiments typically display to users some videos (generated from frames with segmentation errors) that simulate the environment of an application domain. This user feedback is crucial information for metric definition. In the subjective experiments applied to develop some state-of-the-art metrics used to test segmentation algorithms, the videos displayed during the experiments did not contain audio. Audio is an essential component in applications such as videoconference and augmented reality. If the audio influences the user’s perception, using only videos without audio in subjective experiments can compromise the efficiency of an objective metric generated using data from these experiments. This work aims to identify if the audio influences the user’s perception of segmentation quality in background substitution applications with audio. The proposed approach used a subjective method based on formal video quality assessment methods. The results showed that audio influences the quality of segmentation perceived by a user.

Keywords: background substitution, influence of audio, segmentation evaluation, segmentation quality

Procedia PDF Downloads 115
2669 Iris Feature Extraction and Recognition Based on Two-Dimensional Gabor Wavelength Transform

Authors: Bamidele Samson Alobalorun, Ifedotun Roseline Idowu

Abstract:

Biometrics technologies apply the human body parts for their unique and reliable identification based on physiological traits. The iris recognition system is a biometric–based method for identification. The human iris has some discriminating characteristics which provide efficiency to the method. In order to achieve this efficiency, there is a need for feature extraction of the distinct features from the human iris in order to generate accurate authentication of persons. In this study, an approach for an iris recognition system using 2D Gabor for feature extraction is applied to iris templates. The 2D Gabor filter formulated the patterns that were used for training and equally sent to the hamming distance matching technique for recognition. A comparison of results is presented using two iris image subjects of different matching indices of 1,2,3,4,5 filter based on the CASIA iris image database. By comparing the two subject results, the actual computational time of the developed models, which is measured in terms of training and average testing time in processing the hamming distance classifier, is found with best recognition accuracy of 96.11% after capturing the iris localization or segmentation using the Daughman’s Integro-differential, the normalization is confined to the Daugman’s rubber sheet model.

Keywords: Daugman rubber sheet, feature extraction, Hamming distance, iris recognition system, 2D Gabor wavelet transform

Procedia PDF Downloads 63
2668 Computer-Aided Detection of Simultaneous Abdominal Organ CT Images by Iterative Watershed Transform

Authors: Belgherbi Aicha, Hadjidj Ismahen, Bessaid Abdelhafid

Abstract:

Interpretation of medical images benefits from anatomical and physiological priors to optimize computer-aided diagnosis applications. Segmentation of liver, spleen and kidneys is regarded as a major primary step in the computer-aided diagnosis of abdominal organ diseases. In this paper, a semi-automated method for medical image data is presented for the abdominal organ segmentation data using mathematical morphology. Our proposed method is based on hierarchical segmentation and watershed algorithm. In our approach, a powerful technique has been designed to suppress over-segmentation based on mosaic image and on the computation of the watershed transform. Our algorithm is currency in two parts. In the first, we seek to improve the quality of the gradient-mosaic image. In this step, we propose a method for improving the gradient-mosaic image by applying the anisotropic diffusion filter followed by the morphological filters. Thereafter, we proceed to the hierarchical segmentation of the liver, spleen and kidney. To validate the segmentation technique proposed, we have tested it on several images. Our segmentation approach is evaluated by comparing our results with the manual segmentation performed by an expert. The experimental results are described in the last part of this work.

Keywords: anisotropic diffusion filter, CT images, morphological filter, mosaic image, simultaneous organ segmentation, the watershed algorithm

Procedia PDF Downloads 438