Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 2405

Search results for: retinal fundus images

2015 A Comparative Study of Deep Learning Methods for COVID-19 Detection

Abstract:

COVID 19 is a pandemic which has resulted in thousands of deaths around the world and a huge impact on the global economy. Testing is a huge issue as the test kits have limited availability and are expensive to manufacture. Using deep learning methods on radiology images in the detection of the coronavirus as these images contain information about the spread of the virus in the lungs is extremely economical and time-saving as it can be used in areas with a lack of testing facilities. This paper focuses on binary classification and multi-class classification of COVID 19 and other diseases such as pneumonia, tuberculosis, etc. Different deep learning methods such as VGG-19, COVID-Net, ResNET+ SVM, Deep CNN, DarkCovidnet, etc., have been used, and their accuracy has been compared using the Chest X-Ray dataset.

Keywords: deep learning, computer vision, radiology, COVID-19, ResNet, VGG-19, deep neural networks

Procedia PDF Downloads 122

2014 Transparency Phenomenon in Kuew Teow

Authors: Muhammad Heikal Ismail, Law Chung Lim, Hii Ching Lik

Abstract:

In maintaining food quality and shelf life, drying is employed in food industry as the most reliable perseverance technique. In this way, heat pump drying and hot air drying of fresh rice noodles was deduced to freeze drying in achieving quality attributes of oil content Scanning Electron Microscope (SEM) images, texture, and colour. Soxthlet analysis shows freeze dried noodles contain more than 10 times oil content, distinct pores of SEM images, higher hardness by more than three times, and wider colour changes by average more than two times to both methods to explain the less transparency physical outlook of freeze dried samples.

Keywords: freeze drying, heat pump drying, noodles, Soxthlet

Procedia PDF Downloads 460

2013 Analysis of Vocal Fold Vibrations from High-Speed Digital Images Based on Dynamic Time Warping

Authors: A. I. A. Rahman, Sh-Hussain Salleh, K. Ahmad, K. Anuar

Abstract:

Analysis of vocal fold vibration is essential for understanding the mechanism of voice production and for improving clinical assessment of voice disorders. This paper presents a Dynamic Time Warping (DTW) based approach to analyze and objectively classify vocal fold vibration patterns. The proposed technique was designed and implemented on a Glottal Area Waveform (GAW) extracted from high-speed laryngeal images by delineating the glottal edges for each image frame. Feature extraction from the GAW was performed using Linear Predictive Coding (LPC). Several types of voice reference templates from simulations of clear, breathy, fry, pressed and hyperfunctional voice productions were used. The patterns of the reference templates were first verified using the analytical signal generated through Hilbert transformation of the GAW. Samples from normal speakers’ voice recordings were then used to evaluate and test the effectiveness of this approach. The classification of the voice patterns using the technique of LPC and DTW gave the accuracy of 81%.

Keywords: dynamic time warping, glottal area waveform, linear predictive coding, high-speed laryngeal images, Hilbert transform

Procedia PDF Downloads 213

2012 Alteration of Bone Strength in Osteoporosis of Mouse Femora: Computational Study Based on Micro CT Images

Authors: Changsoo Chon, Sangkuy Han, Donghyun Seo, Jihyung Park, Bokku Kang, Hansung Kim, Keyoungjin Chun, Cheolwoong Ko

Abstract:

The purpose of the study is to develop a finite element model based on 3D bone structural images of Micro-CT and to analyze the stress distribution for the osteoporosis mouse femora. In this study, results of finite element analysis show that the early osteoporosis of mouse model decreased a bone density in trabecular region; however, the bone density in cortical region increased.

Keywords: micro-CT, finite element analysis, osteoporosis, bone strength

Procedia PDF Downloads 336

2011 Visual Preferences of Elementary School Children with Autism Spectrum Disorder: An Experimental Study

Authors: Larissa Pliska, Isabel Neitzel, Michael Buschermöhle, Olga Kunina-Habenicht, Ute Ritterfeld

Abstract:

Visual preferences that can be assessed via eye tracking technologies are considered one of the defining hallmarks of Autism Spectrum Disorder (ASD). Specifically, children with ASD show a decreased preference for social compared to geometric images compared to typically developed (TD) children. Such differences are already prevalent at a very early age and indicate the severity of the disorder: toddlers with ASD who preferred the geometric images when presented with social and geometric images also showed higher ASD symptom severity than toddlers with ASD who showed higher social attention. Furthermore, the complexity of social pictures (one child playing vs. two children playing together) as well as the mode of stimulus presentation (video or image), appear to have no influence on the hallmark. Although such visual preferences are also a hallmark of the diagnosis of ASD, studies have primarily been conducted with toddlers and preschool children. Since the age for diagnosis often falls into this age group – the average age of diagnosis for ASD in Germany is 6.5 years – we were investigating whether visual preferences (1) persist into this age range and (2) might be used for a technology-based screening. We examined the visual preferences of 16 boys aged 6 to 11 with ASD and normal cognition and TD children (1:1 matching) within an experimental setting. Matching criteria are the children's age and the parent's level of education. Different stimulus presentation formats (images vs. videos) and different levels of stimulus complexity are included. Children with and without ASD always receive pairs of social and non-social images and video stimuli on a screen. For images, the social stimuli show one or more children playing whereas the non-social show images of the universe. For videos, the social stimuli show a man or a woman making faces, and the non-social are dynamic geometric shapes. During stimulus presentation of approx. 10 s length by image and of approx. 18 s length by video, eye movements (i.e., eye position and gaze direction) are captured. Therefore, KIZMO GmbH developed a customized, native iOS app (KIZMO Face-Analyzer) for use on iPads. In addition, the preferences of the image stimuli are directly measured. Data collection is currently ongoing, with an expected sample size of 32 (16 children with ASD and 16 TD children). One expected finding is that children with ASD demonstrate lower attention (total fixation time) on social stimuli (images and video) compared to their TD peers, even at this elementary school age. We also expect that the stimulus presentation, including different complexity levels, will not affect social attention. The data for the group of children with ASD is already available, while the data for the control group will be collected in the coming months. Preliminary descriptive analyses for 16 children with ASD show that in 80 possible image comparisons 52 times the non-social one was selected, 8 times the social one and 20 are missing or cannot be correctly assigned. The results will be discussed concerning various clinical implications, e.g. implementation of an automated digital screening.

Keywords: autism spectrum disorder, visual preference, hallmark, eye movement

Procedia PDF Downloads 27

2010 A Comparison between Underwater Image Enhancement Techniques

Authors: Ouafa Benaida, Abdelhamid Loukil, Adda Ali Pacha

Abstract:

In recent years, the growing interest of scientists in the field of image processing and analysis of underwater images and videos has been strengthened following the emergence of new underwater exploration techniques, such as the emergence of autonomous underwater vehicles and the use of underwater image sensors facilitating the exploration of underwater mineral resources as well as the search for new species of aquatic life by biologists. Indeed, underwater images and videos have several defects and must be preprocessed before their analysis. Underwater landscapes are usually darkened due to the interaction of light with the marine environment: light is absorbed as it travels through deep waters depending on its wavelength. Additionally, light does not follow a linear direction but is scattered due to its interaction with microparticles in water, resulting in low contrast, low brightness, color distortion, and restricted visibility. The improvement of the underwater image is, therefore, more than necessary in order to facilitate its analysis. The research presented in this paper aims to implement and evaluate a set of classical techniques used in the field of improving the quality of underwater images in several color representation spaces. These methods have the particularity of being simple to implement and do not require prior knowledge of the physical model at the origin of the degradation.

Keywords: underwater image enhancement, histogram normalization, histogram equalization, contrast limited adaptive histogram equalization, single-scale retinex

Procedia PDF Downloads 61

2009 Analysing Techniques for Fusing Multimodal Data in Predictive Scenarios Using Convolutional Neural Networks

Authors: Philipp Ruf, Massiwa Chabbi, Christoph Reich, Djaffar Ould-Abdeslam

Abstract:

In recent years, convolutional neural networks (CNN) have demonstrated high performance in image analysis, but oftentimes, there is only structured data available regarding a specific problem. By interpreting structured data as images, CNNs can effectively learn and extract valuable insights from tabular data, leading to improved predictive accuracy and uncovering hidden patterns that may not be apparent in traditional structured data analysis. In applying a single neural network for analyzing multimodal data, e.g., both structured and unstructured information, significant advantages in terms of time complexity and energy efficiency can be achieved. Converting structured data into images and merging them with existing visual material offers a promising solution for applying CNN in multimodal datasets, as they often occur in a medical context. By employing suitable preprocessing techniques, structured data is transformed into image representations, where the respective features are expressed as different formations of colors and shapes. In an additional step, these representations are fused with existing images to incorporate both types of information. This final image is finally analyzed using a CNN.

Keywords: CNN, image processing, tabular data, mixed dataset, data transformation, multimodal fusion

Procedia PDF Downloads 88

2008 2D Convolutional Networks for Automatic Segmentation of Knee Cartilage in 3D MRI

Authors: Ananya Ananya, Karthik Rao

Abstract:

Accurate segmentation of knee cartilage in 3-D magnetic resonance (MR) images for quantitative assessment of volume is crucial for studying and diagnosing osteoarthritis (OA) of the knee, one of the major causes of disability in elderly people. Radiologists generally perform this task in slice-by-slice manner taking 15-20 minutes per 3D image, and lead to high inter and intra observer variability. Hence automatic methods for knee cartilage segmentation are desirable and are an active field of research. This paper presents design and experimental evaluation of 2D convolutional neural networks based fully automated methods for knee cartilage segmentation in 3D MRI. The architectures are validated based on 40 test images and 60 training images from SKI10 dataset. The proposed methods segment 2D slices one by one, which are then combined to give segmentation for whole 3D images. Proposed methods are modified versions of U-net and dilated convolutions, consisting of a single step that segments the given image to 5 labels: background, femoral cartilage, tibia cartilage, femoral bone and tibia bone; cartilages being the primary components of interest. U-net consists of a contracting path and an expanding path, to capture context and localization respectively. Dilated convolutions lead to an exponential expansion of receptive field with only a linear increase in a number of parameters. A combination of modified U-net and dilated convolutions has also been explored. These architectures segment one 3D image in 8 – 10 seconds giving average volumetric Dice Score Coefficients (DSC) of 0.950 - 0.962 for femoral cartilage and 0.951 - 0.966 for tibia cartilage, reference being the manual segmentation.

Keywords: convolutional neural networks, dilated convolutions, 3 dimensional, fully automated, knee cartilage, MRI, segmentation, U-net

Procedia PDF Downloads 234

2007 Characterization and Monitoring of the Yarn Faults Using Diametric Fault System

Authors: S. M. Ishtiaque, V. K. Yadav, S. D. Joshi, J. K. Chatterjee

Abstract:

The DIAMETRIC FAULTS system has been developed that captures a bi-directional image of yarn continuously in sequentially manner and provides the detailed classification of faults. A novel mathematical framework developed on the acquired bi-directional images forms the basis of fault classification in four broad categories, namely, Thick1, Thick2, Thin and Normal Yarn. A discretised version of Radon transformation has been used to convert the bi-directional images into one-dimensional signals. Images were divided into training and test sample sets. Karhunen–Loève Transformation (KLT) basis is computed for the signals from the images in training set for each fault class taking top six highest energy eigen vectors. The fault class of the test image is identified by taking the Euclidean distance of its signal from its projection on the KLT basis for each sample realization and fault class in the training set. Euclidean distance applied using various techniques is used for classifying an unknown fault class. An accuracy of about 90% is achieved in detecting the correct fault class using the various techniques. The four broad fault classes were further sub classified in four sub groups based on the user set boundary limits for fault length and fault volume. The fault cross-sectional area and the fault length defines the total volume of fault. A distinct distribution of faults is found in terms of their volume and physical dimensions which can be used for monitoring the yarn faults. It has been shown from the configurational based characterization and classification that the spun yarn faults arising out of mass variation, exhibit distinct characteristics in terms of their contours, sizes and shapes apart from their frequency of occurrences.

Keywords: Euclidean distance, fault classification, KLT, Radon Transform

Procedia PDF Downloads 243

2006 An Image Processing Based Approach for Assessing Wheelchair Cushions

Authors: B. Farahani, R. Fadil, A. Aboonabi, B. Hoffmann, J. Loscheider, K. Tavakolian, S. Arzanpour

Abstract:

Wheelchair users spend long hours in a sitting position, and selecting the right cushion is highly critical in preventing pressure ulcers in that demographic. Pressure mapping systems (PMS) are typically used in clinical settings by therapists to identify the sitting profile and pressure points in the sitting area to select the cushion that fits the best for the users. A PMS is a flexible mat composed of arrays of distributed networks of flexible sensors. The output of the PMS systems is a color-coded image that shows the intensity of the pressure concentration. Therapists use the PMS images to compare different cushions fit for each user. This process is highly subjective and requires good visual memory for the best outcome. This paper aims to develop an image processing technique to analyze the images of PMS and provide an objective measure to assess the cushions based on their pressure distribution mappings. In this paper, we first reviewed the skeletal anatomy of the human sitting area and its relation to the PMS image. This knowledge is then used to identify the important features that must be considered in image processing. We then developed an algorithm based on those features to analyze the images and rank them according to their fit to the users' needs.

Keywords: dynamic cushion, image processing, pressure mapping system, wheelchair

Procedia PDF Downloads 139

2005 Design of Speed Bump Recognition System Integrated with Adjustable Shock Absorber Control

Authors: Ming-Yen Chang, Sheng-Hung Ke

Abstract:

This research focuses on the development of a speed bump identification system for real-time control of adjustable shock absorbers in vehicular suspension systems. The study initially involved the collection of images of various speed bumps, and rubber speed bump profiles found on roadways. These images were utilized for training and recognition purposes through the deep learning object detection algorithm YOLOv5. Subsequently, the trained speed bump identification program was integrated with an in-vehicle camera system for live image capture during driving. These images were instantly transmitted to a computer for processing. Using the principles of monocular vision ranging, the distance between the vehicle and an approaching speed bump was determined. The appropriate control distance was established through both practical vehicle measurements and theoretical calculations. Collaboratively, with the electronically adjustable shock absorbers equipped in the vehicle, a shock absorber control system was devised to dynamically adapt the damping force just prior to encountering a speed bump. This system effectively mitigates passenger discomfort and enhances ride quality.

Keywords: adjustable shock absorbers, image recognition, monocular vision ranging, ride

Procedia PDF Downloads 37

2004 An Image Enhancement Method Based on Curvelet Transform for CBCT-Images

Authors: Shahriar Farzam, Maryam Rastgarpour

Abstract:

Image denoising plays extremely important role in digital image processing. Enhancement of clinical image research based on Curvelet has been developed rapidly in recent years. In this paper, we present a method for image contrast enhancement for cone beam CT (CBCT) images based on fast discrete curvelet transforms (FDCT) that work through Unequally Spaced Fast Fourier Transform (USFFT). These transforms return a table of Curvelet transform coefficients indexed by a scale parameter, an orientation and a spatial location. Accordingly, the coefficients obtained from FDCT-USFFT can be modified in order to enhance contrast in an image. Our proposed method first uses a two-dimensional mathematical transform, namely the FDCT through unequal-space fast Fourier transform on input image and then applies thresholding on coefficients of Curvelet to enhance the CBCT images. Consequently, applying unequal-space fast Fourier Transform leads to an accurate reconstruction of the image with high resolution. The experimental results indicate the performance of the proposed method is superior to the existing ones in terms of Peak Signal to Noise Ratio (PSNR) and Effective Measure of Enhancement (EME).

Keywords: curvelet transform, CBCT, image enhancement, image denoising

Procedia PDF Downloads 266

2003 Clinical and Etiological Particularities of Infectious Uveitis in HIV+ and HIV- Patients in the Internal Medicine Department

Authors: N. Jait, M. Maamar, H. Khibri, H. Harmouche, N. Mouatssim, W. Ammouri, Z. Tazimezaelek, M. Adnaoui

Abstract:

Introduction: Uveitis presents with inflammation of the uvea, intraocular, of heterogeneous etiology and presentation. The objective of our study is to describe the clinical and therapeutic characteristics of infectious uveitis in HIV+ and HIV- patients. Patients and Methods: This is a retrospective study conducted at the internal medicine department of CHU Ibn Sina in Rabat over a period of 12 years (2010–2021), collecting 42 cases of infectious uveitis. Results: 42 patients were identified. 34% (14 cases) had acquired immunosuppression (9 cases: 22% had HIV infection and 12% were on chemotherapy), and 66% were immunocompetent. The M/F sex ratio was 1.1. The average age was 39 years old. Uveitis revealed HIV in a single case; 8/9 patients have already been followed, their average viral load is 3.4 log and an average CD4 count is 356/mm³. The revealing functional signs were: ocular redness (27%), decreased visual acuity (63%), visual blurring (40%), ocular pain (18%), scotoma (13%), and headaches (4%). The uveitis was site: anterior (30%), intermediate (6%), posterior (32%), and pan-uveitis (32%); unilateral in 80% of patients and bilateral in 20%. The etiologies of uveitis in HIV+ were: 3 cases of CMV, 2 cases of toxoplasmosis, 1 case of tuberculosis, 1 case of HSV, 1 case of VZV, and 1 case of syphilis. Etiologies of immunocompetent patients: tuberculosis (41%), toxoplasmosis (18%), syphilis (15%), CMV infection (4 cases: 10%), HSV infection (4 cases: 10%) , lepromatous uveitis (1 case: 2%), VZV infection (1 case: 2%), a locoregional infectious cause such as dental abscess (1 case: 2%), and one case of borreliosis (3% ). 50% of tuberculous uveitis was of the pan-uveitis type, 75% of the uveitis by toxoplasmosis was of the posterior type. Uveitis was associated with other pathologies in 2 seropositive cases (cerebral vasculitis, multifocal tuberculosis). A specific treatment was prescribed in all patients. The initial evolution was favorable in 67%, including 12% HIV+. 11% presented relapses of the same seat during uveitis of the toxoplasmic, tuberculous and herpetic type. 47% presented complications, of which 4 patients were HIV+: 3 retinal detachments; 7 Retinal hemorrhages. 6 unilateral blindness (including 2 HIV+ patients). Conclusion: In our series, the etiologies of infectious uveitis differ between HIV+ and HIV- patients. In HIV+ patients most often had toxoplasmosis and CMV, while HIV - patients mainly presented with tuberculosis and toxoplasmosis. The association between HIV and uveitis is undetermined, but HIV infection was an independent risk factor for uveitis.

Keywords: uveitis, HIV, immunosuppression, infection

Procedia PDF Downloads 68

2002 Size Reduction of Images Using Constraint Optimization Approach for Machine Communications

Authors: Chee Sun Won

Abstract:

This paper presents the size reduction of images for machine-to-machine communications. Here, the salient image regions to be preserved include the image patches of the key-points such as corners and blobs. Based on a saliency image map from the key-points and their image patches, an axis-aligned grid-size optimization is proposed for the reduction of image size. To increase the size-reduction efficiency the aspect ratio constraint is relaxed in the constraint optimization framework. The proposed method yields higher matching accuracy after the size reduction than the conventional content-aware image size-reduction methods.

Keywords: image compression, image matching, key-point detection and description, machine-to-machine communication

Procedia PDF Downloads 390

2001 Efficient Schemes of Classifiers for Remote Sensing Satellite Imageries of Land Use Pattern Classifications

Authors: S. S. Patil, Sachidanand Kini

Abstract:

Classification of land use patterns is compelling in complexity and variability of remote sensing imageries data. An imperative research in remote sensing application exploited to mine some of the significant spatially variable factors as land cover and land use from satellite images for remote arid areas in Karnataka State, India. The diverse classification techniques, unsupervised and supervised consisting of maximum likelihood, Mahalanobis distance, and minimum distance are applied in Bellary District in Karnataka State, India for the classification of the raw satellite images. The accuracy evaluations of results are compared visually with the standard maps with ground-truths. We initiated with the maximum likelihood technique that gave the finest results and both minimum distance and Mahalanobis distance methods over valued agriculture land areas. In meanness of mislaid few irrelevant features due to the low resolution of the satellite images, high-quality accord between parameters extracted automatically from the developed maps and field observations was found.

Keywords: Mahalanobis distance, minimum distance, supervised, unsupervised, user classification accuracy, producer's classification accuracy, maximum likelihood, kappa coefficient

Procedia PDF Downloads 154

2000 Clinical Application of Measurement of Eyeball Movement for Diagnose of Autism

Authors: Ippei Torii, Kaoruko Ohtani, Takahito Niwa, Naohiro Ishii

Abstract:

This paper shows developing an objectivity index using the measurement of subtle eyeball movement to diagnose autism. The developmentally disabled assessment varies, and the diagnosis depends on the subjective judgment of professionals. Therefore, a supplementary inspection method that will enable anyone to obtain the same quantitative judgment is needed. The diagnosis are made based on a comparison of the time of gazing an object in the conventional autistic study, but the results do not match. First, we divided the pupil into four parts from the center using measurements of subtle eyeball movement and comparing the number of pixels in the overlapping parts based on an afterimage. Then we developed the objective evaluation indicator to judge non-autistic and autistic people more clearly than conventional methods by analyzing the differences of subtle eyeball movements between the right and left eyes. Even when a person gazes at one point and his/her eyeballs always stay fixed at that point, their eyes perform subtle fixating movements (ie. tremors, drifting, microsaccades) to keep the retinal image clear. Particularly, the microsaccades link with nerves and reflect the mechanism that process the sight in a brain. We converted the differences between these movements into numbers. The process of the conversion is as followed: 1) Select the pixel indicating the subject's pupil from images of captured frames. 2) Set up a reference image, known as an afterimage, from the pixel indicating the subject's pupil. 3) Divide the pupil of the subject into four from the center in the acquired frame image. 4) Select the pixel in each divided part and count the number of the pixels of the overlapping part with the present pixel based on the afterimage. 5) Process the images with precision in 24 - 30fps from a camera and convert the amount of change in the pixels of the subtle movements of the right and left eyeballs in to numbers. The difference in the area of the amount of change occurs by measuring the difference between the afterimage in consecutive frames and the present frame. We set the amount of change to the quantity of the subtle eyeball movements. This method made it possible to detect a change of the eyeball vibration in numerical value. By comparing the numerical value between the right and left eyes, we found that there is a difference in how much they move. We compared the difference in these movements between non-autistc and autistic people and analyzed the result. Our research subjects consists of 8 children and 10 adults with autism, and 6 children and 18 adults with no disability. We measured the values through pasuit movements and fixations. We converted the difference in subtle movements between the right and left eyes into a graph and define it in multidimensional measure. Then we set the identification border with density function of the distribution, cumulative frequency function, and ROC curve. With this, we established an objective index to determine autism, normal, false positive, and false negative.

Keywords: subtle eyeball movement, autism, microsaccade, pursuit eye movements, ROC curve

Procedia PDF Downloads 256

1999 Integration of an Augmented Reality System for the Visualization of the HRMAS NMR Analysis of Brain Biopsy Specimens Using the Brainlab Cranial Navigation System

Authors: Abdelkrim Belhaoua, Jean-Pierre Radoux, Mariana Kuras, Vincent Récamier, Martial Piotto, Karim Elbayed, François Proust, Izzie Namer

Abstract:

This paper proposes an augmented reality system dedicated to neurosurgery in order to assist the surgeon during an operation. This work is part of the ExtempoRMN project (Funded by Bpifrance) which aims at analyzing during a surgical operation the metabolic content of tumoral brain biopsy specimens by HRMAS NMR. Patients affected with a brain tumor (gliomas) frequently need to undergo an operation in order to remove the tumoral mass. During the operation, the neurosurgeon removes biopsy specimens using image-guided surgery. The biopsy specimens removed are then sent for HRMAS NMR analysis in order to obtain a better diagnosis and prognosis. Image-guided refers to the use of MRI images and a computer to precisely locate and target a lesion (abnormal tissue) within the brain. This is performed using preoperative MRI images and the BrainLab neuro-navigation system. With the patient MRI images loaded on the Brainlab Cranial neuro-navigation system in the operating theater, surgeons can better identify their approach before making an incision. The Brainlab neuro-navigation tool tracks in real time the position of the instruments and displays their position on the patient MRI data. The results of the biopsy analysis by 1H HRMAS NMR are then sent back to the operating theater and superimposed on the 3D localization system directly on the MRI images. The method we have developed to communicate between the HRMAS NMR analysis software and Brainlab makes use of a combination of C++, VTK and the Insight Toolkit using OpenIGTLink protocol.

Keywords: neuro-navigation, augmented reality, biopsy, BrainLab, HR-MAS NMR

Procedia PDF Downloads 344

1998 Performance Comparison of Deep Convolutional Neural Networks for Binary Classification of Fine-Grained Leaf Images

Authors: Kamal KC, Zhendong Yin, Dasen Li, Zhilu Wu

Abstract:

Intra-plant disease classification based on leaf images is a challenging computer vision task due to similarities in texture, color, and shape of leaves with a slight variation of leaf spot; and external environmental changes such as lighting and background noises. Deep convolutional neural network (DCNN) has proven to be an effective tool for binary classification. In this paper, two methods for binary classification of diseased plant leaves using DCNN are presented; model created from scratch and transfer learning. Our main contribution is a thorough evaluation of 4 networks created from scratch and transfer learning of 5 pre-trained models. Training and testing of these models were performed on a plant leaf images dataset belonging to 16 distinct classes, containing a total of 22,265 images from 8 different plants, consisting of a pair of healthy and diseased leaves. We introduce a deep CNN model, Optimized MobileNet. This model with depthwise separable CNN as a building block attained an average test accuracy of 99.77%. We also present a fine-tuning method by introducing the concept of a convolutional block, which is a collection of different deep neural layers. Fine-tuned models proved to be efficient in terms of accuracy and computational cost. Fine-tuned MobileNet achieved an average test accuracy of 99.89% on 8 pairs of [healthy, diseased] leaf ImageSet.

Keywords: deep convolution neural network, depthwise separable convolution, fine-grained classification, MobileNet, plant disease, transfer learning

Procedia PDF Downloads 158

1997 Grain Boundary Detection Based on Superpixel Merges

Authors: Gaokai Liu

Abstract:

The distribution of material grain sizes reflects the strength, fracture, corrosion and other properties, and the grain size can be acquired via the grain boundary. In recent years, the automatic grain boundary detection is widely required instead of complex experimental operations. In this paper, an effective solution is applied to acquire the grain boundary of material images. First, the initial superpixel segmentation result is obtained via a superpixel approach. Then, a region merging method is employed to merge adjacent regions based on certain similarity criterions, the experimental results show that the merging strategy improves the superpixel segmentation result on material datasets.

Keywords: grain boundary detection, image segmentation, material images, region merging

Procedia PDF Downloads 142

1996 Image Segmentation Techniques: Review

Authors: Lindani Mbatha, Suvendi Rimer, Mpho Gololo

Abstract:

Image segmentation is the process of dividing an image into several sections, such as the object's background and the foreground. It is a critical technique in both image-processing tasks and computer vision. Most of the image segmentation algorithms have been developed for gray-scale images and little research and algorithms have been developed for the color images. Most image segmentation algorithms or techniques vary based on the input data and the application. Nearly all of the techniques are not suitable for noisy environments. Most of the work that has been done uses the Markov Random Field (MRF), which involves the computations and is said to be robust to noise. In the past recent years' image segmentation has been brought to tackle problems such as easy processing of an image, interpretation of the contents of an image, and easy analysing of an image. This article reviews and summarizes some of the image segmentation techniques and algorithms that have been developed in the past years. The techniques include neural networks (CNN), edge-based techniques, region growing, clustering, and thresholding techniques and so on. The advantages and disadvantages of medical ultrasound image segmentation techniques are also discussed. The article also addresses the applications and potential future developments that can be done around image segmentation. This review article concludes with the fact that no technique is perfectly suitable for the segmentation of all different types of images, but the use of hybrid techniques yields more accurate and efficient results.

Keywords: clustering-based, convolution-network, edge-based, region-growing

Procedia PDF Downloads 59

1995 Implementation of a Low-Cost Driver Drowsiness Evaluation System Using a Thermal Camera

Authors: Isa Moazen, Ali Nahvi

Abstract:

Driver drowsiness is a major cause of vehicle accidents, and facial images are highly valuable to detect drowsiness. In this paper, we perform our research via a thermal camera to record drivers' facial images on a driving simulator. A robust real-time algorithm extracts the features using horizontal and vertical integration projection, contours, contour orientations, and cropping tools. The features are included four target areas on the cheeks and forehead. Qt compiler and OpenCV are used with two cameras with different resolutions. A high-resolution thermal camera is used for fifteen subjects, and a low-resolution one is used for a person. The results are investigated by four temperature plots and evaluated by observer rating of drowsiness.

Keywords: advanced driver assistance systems, thermal imaging, driver drowsiness detection, feature extraction

Procedia PDF Downloads 103

1994 Application of Pattern Recognition Technique to the Quality Characterization of Superficial Microstructures in Steel Coatings

Authors: H. Gonzalez-Rivera, J. L. Palmeros-Torres

Abstract:

This paper describes the application of traditional computer vision techniques as a procedure for automatic measurement of the secondary dendrite arm spacing (SDAS) from microscopic images. The algorithm is capable of finding the lineal or curve-shaped secondary column of the main microstructure, measuring its length size in a micro-meter and counting the number of spaces between dendrites. The automatic characterization was compared with a set of 1728 manually characterized images, leading to an accuracy of −0.27 µm for the length size determination and a precision of ± 2.78 counts for dendrite spacing counting, also reducing the characterization time from 7 hours to 2 minutes.

Keywords: dendrite arm spacing, microstructure inspection, pattern recognition, polynomial regression

Procedia PDF Downloads 18

1993 Secure Message Transmission Using Meaningful Shares

Authors: Ajish Sreedharan

Abstract:

Visual cryptography encodes a secret image into shares of random binary patterns. If the shares are exerted onto transparencies, the secret image can be visually decoded by superimposing a qualified subset of transparencies, but no secret information can be obtained from the superposition of a forbidden subset. The binary patterns of the shares, however, have no visual meaning and hinder the objectives of visual cryptography. In the Secret Message Transmission through Meaningful Shares a secret message to be transmitted is converted to grey scale image. Then (2,2) visual cryptographic shares are generated from this converted gray scale image. The shares are encrypted using A Chaos-Based Image Encryption Algorithm Using Wavelet Transform. Two separate color images which are of the same size of the shares, taken as cover image of the respective shares to hide the shares into them. The encrypted shares which are covered by meaningful images so that a potential eavesdropper wont know there is a message to be read. The meaningful shares are transmitted through two different transmission medium. During decoding shares are fetched from received meaningful images and decrypted using A Chaos-Based Image Encryption Algorithm Using Wavelet Transform. The shares are combined to regenerate the grey scale image from where the secret message is obtained.

Keywords: visual cryptography, wavelet transform, meaningful shares, grey scale image

Procedia PDF Downloads 427

1992 The Stereotypical Images of Marginalized Women in the Poetry of Rita Dove

Authors: Wafaa Kamal Isaac

Abstract:

This paper attempts to shed light upon the stereotypical images of marginalized black women as shown through the poetry of Rita Dove. Meanwhile, it explores how stereotypical images held by the society and public perceptions perpetuate the marginalization of black women. Dove is considered one of the most fundamental African-American poets who devoted her writings to explore the problem of identity that confronted marginalized women in America. Besides tackling the issue of black women’s stereotypical images, this paper focuses upon the psychological damage which the black women had suffered from due to their stripped identity. In ‘Thomas and Beulah’, Dove reflects the black woman’s longing for her homeland in order to make up for her lost identity. This poem represents atavistic feelings deal with certain recurrent images, both aural and visual, like the image of Beulah who represents the African-American woman who searches for an identity, as she is being denied and humiliated one in the newly founded society. In an attempt to protest against the stereotypical mule image that had been imposed upon black women in America, Dove in ‘On the Bus with Rosa Parks’ tries to ignite the beaten spirits to struggle for their own rights by revitalizing the rebellious nature and strong determination of the historical figure ‘Rosa Parks’ that sparked the Civil Rights Movement. In ‘Daystar’, Dove proves that black women are subjected to double-edged oppression; firstly, in terms of race as a black woman in an unjust white society that violates her rights due to her black origins and secondly, in terms of gender as a member of the female sex that is meant to exist only to serve man’s needs. Similarly, in the ‘Adolescence’ series, Dove focuses on the double marginalization which the black women had experienced. It concludes that the marginalization of black women has resulted from the domination of the masculine world and the oppression of the white world. Moreover, Dove’s ‘Beauty and the Beast’ investigates the African-American women’s problem of estrangement and identity crisis in America. It also sheds light upon the psychological consequences that resulted from the violation of marginalized women’s identity. Furthermore, this poem shows the black women’s self-debasement, helplessness, and double consciousness that emanate from the sense of uprootedness. Finally, this paper finds out that the negative, debased and inferior stereotypical image held by the society did not only contribute to the marginalization of black women but also silenced and muted their voices.

Keywords: stereotypical images, marginalized women, Rita Dove, identity

Procedia PDF Downloads 134

1991 The Development of Congeneric Elicited Writing Tasks to Capture Language Decline in Alzheimer Patients

Authors: Lise Paesen, Marielle Leijten

Abstract:

People diagnosed with probable Alzheimer disease suffer from an impairment of their language capacities; a gradual impairment which affects both their spoken and written communication. Our study aims at characterising the language decline in DAT patients with the use of congeneric elicited writing tasks. Within these tasks, a descriptive text has to be written based upon images with which the participants are confronted. A randomised set of images allows us to present the participants with a different task on every encounter, thus allowing us to avoid a recognition effect in this iterative study. This method is a revision from previous studies, in which participants were presented with a larger picture depicting an entire scene. In order to create the randomised set of images, existing pictures were adapted following strict criteria (e.g. frequency, AoA, colour, ...). The resulting data set contained 50 images, belonging to several categories (vehicles, animals, humans, and objects). A pre-test was constructed to validate the created picture set; most images had been used before in spoken picture naming tasks. Hence the same reaction times ought to be triggered in the typed picture naming task. Once validated, the effectiveness of the descriptive tasks was assessed. First, the participants (n=60 students, n=40 healthy elderly) performed a typing task, which provided information about the typing speed of each individual. Secondly, two descriptive writing tasks were carried out, one simple and one complex. The simple task contains 4 images (1 animal, 2 objects, 1 vehicle) and only contains elements with high frequency, a young AoA (<6 years), and fast reaction times. Slow reaction times, a later AoA (≥ 6 years) and low frequency were criteria for the complex task. This task uses 6 images (2 animals, 1 human, 2 objects and 1 vehicle). The data were collected with the keystroke logging programme Inputlog. Keystroke logging tools log and time stamp keystroke activity to reconstruct and describe text production processes. The data were analysed using a selection of writing process and product variables, such as general writing process measures, detailed pause analysis, linguistic analysis, and text length. As a covariate, the intrapersonal interkey transition times from the typing task were taken into account. The pre-test indicated that the new images lead to similar or even faster reaction times compared to the original images. All the images were therefore used in the main study. The produced texts of the description tasks were significantly longer compared to previous studies, providing sufficient text and process data for analyses. Preliminary analysis shows that the amount of words produced differed significantly between the healthy elderly and the students, as did the mean length of production bursts, even though both groups needed the same time to produce their texts. However, the elderly took significantly more time to produce the complex task than the simple task. Nevertheless, the amount of words per minute remained comparable between simple and complex. The pauses within and before words varied, even when taking personal typing abilities (obtained by the typing task) into account.

Keywords: Alzheimer's disease, experimental design, language decline, writing process

Procedia PDF Downloads 250

1990 Lifting Wavelet Transform and Singular Values Decomposition for Secure Image Watermarking

Authors: Siraa Ben Ftima, Mourad Talbi, Tahar Ezzedine

Abstract:

In this paper, we present a technique of secure watermarking of grayscale and color images. This technique consists in applying the Singular Value Decomposition (SVD) in LWT (Lifting Wavelet Transform) domain in order to insert the watermark image (grayscale) in the host image (grayscale or color image). It also uses signature in the embedding and extraction steps. The technique is applied on a number of grayscale and color images. The performance of this technique is proved by the PSNR (Pick Signal to Noise Ratio), the MSE (Mean Square Error) and the SSIM (structural similarity) computations.

Keywords: lifting wavelet transform (LWT), sub-space vectorial decomposition, secure, image watermarking, watermark

Procedia PDF Downloads 238

1989 Secured Transmission and Reserving Space in Images Before Encryption to Embed Data

Authors: G. R. Navaneesh, E. Nagarajan, C. H. Rajam Raju

Abstract:

Nowadays the multimedia data are used to store some secure information. All previous methods allocate a space in image for data embedding purpose after encryption. In this paper, we propose a novel method by reserving space in image with a boundary surrounded before encryption with a traditional RDH algorithm, which makes it easy for the data hider to reversibly embed data in the encrypted images. The proposed method can achieve real time performance, that is, data extraction and image recovery are free of any error. A secure transmission process is also discussed in this paper, which improves the efficiency by ten times compared to other processes as discussed.

Keywords: secure communication, reserving room before encryption, least significant bits, image encryption, reversible data hiding

Procedia PDF Downloads 378

1988 Detection of Intentional Attacks in Images Based on Watermarking

Authors: Hazem Munawer Al-Otum

Abstract:

In this work, an efficient watermarking technique is proposed and can be used for detecting intentional attacks in RGB color images. The proposed technique can be implemented for image authentication and exhibits high robustness against unintentional common image processing attacks. It deploys two measures to discern between intentional and unintentional attacks based on using a quantization-based technique in a modified 2D multi-pyramidal DWT transform. Simulations have shown high accuracy in detecting intentionally attacked regions while exhibiting high robustness under moderate to severe common image processing attacks.

Keywords: image authentication, copyright protection, semi-fragile watermarking, tamper detection

Procedia PDF Downloads 229

1987 Modeling Visual Memorability Assessment with Autoencoders Reveals Characteristics of Memorable Images

Authors: Elham Bagheri, Yalda Mohsenzadeh

Abstract:

Image memorability refers to the phenomenon where certain images are more likely to be remembered by humans than others. It is a quantifiable and intrinsic attribute of an image. Understanding how visual perception and memory interact is important in both cognitive science and artificial intelligence. It reveals the complex processes that support human cognition and helps to improve machine learning algorithms by mimicking the brain's efficient data processing and storage mechanisms. To explore the computational underpinnings of image memorability, this study examines the relationship between an image's reconstruction error, distinctiveness in latent space, and its memorability score. A trained autoencoder is used to replicate human-like memorability assessment inspired by the visual memory game employed in memorability estimations. This study leverages a VGG-based autoencoder that is pre-trained on the vast ImageNet dataset, enabling it to recognize patterns and features that are common to a wide and diverse range of images. An empirical analysis is conducted using the MemCat dataset, which includes 10,000 images from five broad categories: animals, sports, food, landscapes, and vehicles, along with their corresponding memorability scores. The memorability score assigned to each image represents the probability of that image being remembered by participants after a single exposure. The autoencoder is finetuned for one epoch with a batch size of one, attempting to create a scenario similar to human memorability experiments where memorability is quantified by the likelihood of an image being remembered after being seen only once. The reconstruction error, which is quantified as the difference between the original and reconstructed images, serves as a measure of how well the autoencoder has learned to represent the data. The reconstruction error of each image, the error reduction, and its distinctiveness in latent space are calculated and correlated with the memorability score. Distinctiveness is measured as the Euclidean distance between each image's latent representation and its nearest neighbor within the autoencoder's latent space. Different structural and perceptual loss functions are considered to quantify the reconstruction error. The results indicate that there is a strong correlation between the reconstruction error and the distinctiveness of images and their memorability scores. This suggests that images with more unique distinct features that challenge the autoencoder's compressive capacities are inherently more memorable. There is also a negative correlation between the reduction in reconstruction error compared to the autoencoder pre-trained on ImageNet, which suggests that highly memorable images are harder to reconstruct, probably due to having features that are more difficult to learn by the autoencoder. These insights suggest a new pathway for evaluating image memorability, which could potentially impact industries reliant on visual content and mark a step forward in merging the fields of artificial intelligence and cognitive science. The current research opens avenues for utilizing neural representations as instruments for understanding and predicting visual memory.

Keywords: autoencoder, computational vision, image memorability, image reconstruction, memory retention, reconstruction error, visual perception

Procedia PDF Downloads 44

1986 Meteosat Second Generation Image Compression Based on the Radon Transform and Linear Predictive Coding: Comparison and Performance

Authors: Cherifi Mehdi, Lahdir Mourad, Ameur Soltane

Abstract:

Image compression is used to reduce the number of bits required to represent an image. The Meteosat Second Generation satellite (MSG) allows the acquisition of 12 image files every 15 minutes. Which results a large databases sizes. The transform selected in the images compression should contribute to reduce the data representing the images. The Radon transform retrieves the Radon points that represent the sum of the pixels in a given angle for each direction. Linear predictive coding (LPC) with filtering provides a good decorrelation of Radon points using a Predictor constitute by the Symmetric Nearest Neighbor filter (SNN) coefficients, which result losses during decompression. Finally, Run Length Coding (RLC) gives us a high and fixed compression ratio regardless of the input image. In this paper, a novel image compression method based on the Radon transform and linear predictive coding (LPC) for MSG images is proposed. MSG image compression based on the Radon transform and the LPC provides a good compromise between compression and quality of reconstruction. A comparison of our method with other whose two based on DCT and one on DWT bi-orthogonal filtering is evaluated to show the power of the Radon transform in its resistibility against the quantization noise and to evaluate the performance of our method. Evaluation criteria like PSNR and the compression ratio allows showing the efficiency of our method of compression.

Keywords: image compression, radon transform, linear predictive coding (LPC), run lengthcoding (RLC), meteosat second generation (MSG)

Procedia PDF Downloads 390