Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 2963

Search results for: image retrieval

1733 Iterative Reconstruction Techniques as a Dose Reduction Tool in Pediatric Computed Tomography Imaging: A Phantom Study

Abstract:

Background and Purpose: Computed Tomography (CT) scans have become the largest source of radiation in radiological imaging. The purpose of this study was to compare the quality of pediatric Computed Tomography (CT) images reconstructed using Filtered Back Projection (FBP) with images reconstructed using different strengths of Iterative Reconstruction (IR) technique, and to perform a feasibility study to assess the use of IR techniques as a dose reduction tool. Materials and Methods: An anthropomorphic phantom representing a 5-year old child was scanned, in two stages, using a Siemens Somatom CT unit. In stage one, scans of the head, chest and abdomen were performed using standard protocols recommended by the scanner manufacturer. Images were reconstructed using FBP and 5 different strengths of IR. Contrast-to-Noise Ratios (CNR) were calculated from average CT number and its standard deviation measured in regions of interest created in the lungs, bone, and soft tissues regions of the phantom. Paired t-test and the one-way ANOVA were used to compare the CNR from FBP images with IR images, at p = 0.05 level. The lowest strength value of IR that produced the highest CNR was identified. In the second stage, scans of the head was performed with decreased mA(s) values relative to the increase in CNR compared to the standard FBP protocol. CNR values were compared in this stage using Paired t-test at p = 0.05 level. Results: Images reconstructed using IR technique had higher CNR values (p < 0.01.) in all regions compared to the FBP images, at all strengths of IR. The CNR increased with increasing IR strength of up to 3, in the head and chest images. Increases beyond this strength were insignificant. In abdomen images, CNR continued to increase up to strength 5. The results also indicated that, IR techniques improve CNR by a up to factor of 1.5. Based on the CNR values at strength 3 of IR images and CNR values of FBP images, a reduction in mA(s) of about 20% was identified. The images of the head acquired at 20% reduced mA(s) and reconstructed using IR at strength 3, had similar CNR as FBP images at standard mA(s). In the head scans of the phantom used in this study, it was demonstrated that similar CNR can be achieved even when the mA(s) is reduced by about 20% if IR technique with strength of 3 is used for reconstruction. Conclusions: The IR technique produced better image quality at all strengths of IR in comparison to FBP. IR technique can provide approximately 20% dose reduction in pediatric head CT while maintaining the same image quality as FBP technique.

Keywords: filtered back projection, image quality, iterative reconstruction, pediatric computed tomography imaging

Procedia PDF Downloads 130

1732 Quantitative Phase Imaging System Based on a Three-Lens Common-Path Interferometer

Authors: Alexander Machikhin, Olga Polschikova, Vitold Pozhar, Alina Ramazanova

Abstract:

White-light quantitative phase imaging is an effective technique for achieving sub-nanometer phase sensitivity. Highly stable interferometers based on common-path geometry have been developed in recent years to solve this task. Some of these methods also apply multispectral approach. The purpose of this research is to suggest a simple and effective interferometer for such systems. We developed a three-lens common-path interferometer, which can be used for quantitative phase imaging with or without multispectral modality. The lens system consists of two components, the first one of which is a compound lens, consisting of two lenses. A pinhole is placed between the components. The lens-in-lens approach enables effective light transmission and high stability of the interferometer. The multispectrality is easily implemented by placing a tunable filter in front of the interferometer. In our work, we used an acousto-optical tunable filter. Some design considerations are discussed and multispectral quantitative phase retrieval is demonstrated.

Keywords: acousto-optical tunable filter, common-path interferometry, digital holography, multispectral quantitative phase imaging

Procedia PDF Downloads 293

1731 Optimized Text Summarization Model on Mobile Screens for Sight-Interpreters: An Empirical Study

Authors: Jianhua Wang

Abstract:

To obtain key information quickly from long texts on small screens of mobile devices, sight-interpreters need to establish optimized summarization model for fast information retrieval. Four summarization models based on previous studies were studied including title+key words (TKW), title+topic sentences (TTS), key words+topic sentences (KWTS) and title+key words+topic sentences (TKWTS). Psychological experiments were conducted on the four models for three different genres of interpreting texts to establish the optimized summarization model for sight-interpreters. This empirical study shows that the optimized summarization model for sight-interpreters to quickly grasp the key information of the texts they interpret is title+key words (TKW) for cultural texts, title+key words+topic sentences (TKWTS) for economic texts and topic sentences+key words (TSKW) for political texts.

Keywords: different genres, mobile screens, optimized summarization models, sight-interpreters

Procedia PDF Downloads 296

1730 Use of Machine Learning Algorithms to Pediatric MR Images for Tumor Classification

Authors: I. Stathopoulos, V. Syrgiamiotis, E. Karavasilis, A. Ploussi, I. Nikas, C. Hatzigiorgi, K. Platoni, E. P. Efstathopoulos

Abstract:

Introduction: Brain and central nervous system (CNS) tumors form the second most common group of cancer in children, accounting for 30% of all childhood cancers. MRI is the key imaging technique used for the visualization and management of pediatric brain tumors. Initial characterization of tumors from MRI scans is usually performed via a radiologist’s visual assessment. However, different brain tumor types do not always demonstrate clear differences in visual appearance. Using only conventional MRI to provide a definite diagnosis could potentially lead to inaccurate results, and so histopathological examination of biopsy samples is currently considered to be the gold standard for obtaining definite diagnoses. Machine learning is defined as the study of computational algorithms that can use, complex or not, mathematical relationships and patterns from empirical and scientific data to make reliable decisions. Concerning the above, machine learning techniques could provide effective and accurate ways to automate and speed up the analysis and diagnosis for medical images. Machine learning applications in radiology are or could potentially be useful in practice for medical image segmentation and registration, computer-aided detection and diagnosis systems for CT, MR or radiography images and functional MR (fMRI) images for brain activity analysis and neurological disease diagnosis. Purpose: The objective of this study is to provide an automated tool, which may assist in the imaging evaluation and classification of brain neoplasms in pediatric patients by determining the glioma type, grade and differentiating between different brain tissue types. Moreover, a future purpose is to present an alternative way of quick and accurate diagnosis in order to save time and resources in the daily medical workflow. Materials and Methods: A cohort, of 80 pediatric patients with a diagnosis of posterior fossa tumor, was used: 20 ependymomas, 20 astrocytomas, 20 medulloblastomas and 20 healthy children. The MR sequences used, for every single patient, were the following: axial T1-weighted (T1), axial T2-weighted (T2), FluidAttenuated Inversion Recovery (FLAIR), axial diffusion weighted images (DWI), axial contrast-enhanced T1-weighted (T1ce). From every sequence only a principal slice was used that manually traced by two expert radiologists. Image acquisition was carried out on a GE HDxt 1.5-T scanner. The images were preprocessed following a number of steps including noise reduction, bias-field correction, thresholding, coregistration of all sequences (T1, T2, T1ce, FLAIR, DWI), skull stripping, and histogram matching. A large number of features for investigation were chosen, which included age, tumor shape characteristics, image intensity characteristics and texture features. After selecting the features for achieving the highest accuracy using the least number of variables, four machine learning classification algorithms were used: k-Nearest Neighbour, Support-Vector Machines, C4.5 Decision Tree and Convolutional Neural Network. The machine learning schemes and the image analysis are implemented in the WEKA platform and MatLab platform respectively. Results-Conclusions: The results and the accuracy of images classification for each type of glioma by the four different algorithms are still on process.

Keywords: image classification, machine learning algorithms, pediatric MRI, pediatric oncology

Procedia PDF Downloads 131

1729 Autism Disease Detection Using Transfer Learning Techniques: Performance Comparison between Central Processing Unit vs. Graphics Processing Unit Functions for Neural Networks

Authors: Mst Shapna Akter, Hossain Shahriar

Abstract:

Neural network approaches are machine learning methods used in many domains, such as healthcare and cyber security. Neural networks are mostly known for dealing with image datasets. While training with the images, several fundamental mathematical operations are carried out in the Neural Network. The operation includes a number of algebraic and mathematical functions, including derivative, convolution, and matrix inversion and transposition. Such operations require higher processing power than is typically needed for computer usage. Central Processing Unit (CPU) is not appropriate for a large image size of the dataset as it is built with serial processing. While Graphics Processing Unit (GPU) has parallel processing capabilities and, therefore, has higher speed. This paper uses advanced Neural Network techniques such as VGG16, Resnet50, Densenet, Inceptionv3, Xception, Mobilenet, XGBOOST-VGG16, and our proposed models to compare CPU and GPU resources. A system for classifying autism disease using face images of an autistic and non-autistic child was used to compare performance during testing. We used evaluation matrices such as Accuracy, F1 score, Precision, Recall, and Execution time. It has been observed that GPU runs faster than the CPU in all tests performed. Moreover, the performance of the Neural Network models in terms of accuracy increases on GPU compared to CPU.

Keywords: autism disease, neural network, CPU, GPU, transfer learning

Procedia PDF Downloads 94

1728 Progressive Multimedia Collection Structuring via Scene Linking

Authors: Aman Berhe, Camille Guinaudeau, Claude Barras

Abstract:

In order to facilitate information seeking in large collections of multimedia documents with long and progressive content (such as broadcast news or TV series), one can extract the semantic links that exist between semantically coherent parts of documents, i.e., scenes. The links can then create a coherent collection of scenes from which it is easier to perform content analysis, topic extraction, or information retrieval. In this paper, we focus on TV series structuring and propose two approaches for scene linking at different levels of granularity (episode and season): a fuzzy online clustering technique and a graph-based community detection algorithm. When evaluated on the two first seasons of the TV series Game of Thrones, we found that the fuzzy online clustering approach performed better compared to graph-based community detection at the episode level, while graph-based approaches show better performance at the season level.

Keywords: multimedia collection structuring, progressive content, scene linking, fuzzy clustering, community detection

Procedia PDF Downloads 79

1727 Digital Preservation in Nigeria Universities Libraries: A Comparison between University of Nigeria Nsukka and Ahmadu Bello University Zaria

Authors: Suleiman Musa, Shuaibu Sidi Safiyanu

Abstract:

This study examined the digital preservation in Nigeria university libraries. A comparison between the university of Nigeria Nsukka (UNN) and Ahmadu Bello University Zaria (ABU, Zaria). The study utilized primary source of data obtained from two selected institution librarians. Finding revealed varying results in terms of skills acquired by librarians before and after digitization of the two institutions. The study reports that journals publication, text book, CD-ROMS, conference papers and proceedings, theses, dissertations and seminar papers are among the information resources available for digitization. The study further documents that copyright issue, power failure, and unavailability of needed materials are among the challenges facing the digitization of library of the institution. On the basis of the finding, the study concluded that digitization of library enhances efficiency in organization and retrieval of information services. The study therefore recommended that software should be upgraded with backup, training of the librarians on digital process, installation of antivirus and enhancement of technical collaboration between the library and MIS.

Keywords: digitalization, preservation, libraries, comparison

Procedia PDF Downloads 312

1726 Application of Compressed Sensing and Different Sampling Trajectories for Data Reduction of Small Animal Magnetic Resonance Image

Authors: Matheus Madureira Matos, Alexandre Rodrigues Farias

Abstract:

Magnetic Resonance Imaging (MRI) is a vital imaging technique used in both clinical and pre-clinical areas to obtain detailed anatomical and functional information. However, MRI scans can be expensive, time-consuming, and often require the use of anesthetics to keep animals still during the imaging process. Anesthetics are commonly administered to animals undergoing MRI scans to ensure they remain still during the imaging process. However, prolonged or repeated exposure to anesthetics can have adverse effects on animals, including physiological alterations and potential toxicity. Minimizing the duration and frequency of anesthesia is, therefore, crucial for the well-being of research animals. In recent years, various sampling trajectories have been investigated to reduce the number of MRI measurements leading to shorter scanning time and minimizing the duration of animal exposure to the effects of anesthetics. Compressed sensing (CS) and sampling trajectories, such as cartesian, spiral, and radial, have emerged as powerful tools to reduce MRI data while preserving diagnostic quality. This work aims to apply CS and cartesian, spiral, and radial sampling trajectories for the reconstruction of MRI of the abdomen of mice sub-sampled at levels below that defined by the Nyquist theorem. The methodology of this work consists of using a fully sampled reference MRI of a female model C57B1/6 mouse acquired experimentally in a 4.7 Tesla MRI scanner for small animals using Spin Echo pulse sequences. The image is down-sampled by cartesian, radial, and spiral sampling paths and then reconstructed by CS. The quality of the reconstructed images is objectively assessed by three quality assessment techniques RMSE (Root mean square error), PSNR (Peak to Signal Noise Ratio), and SSIM (Structural similarity index measure). The utilization of optimized sampling trajectories and CS technique has demonstrated the potential for a significant reduction of up to 70% of image data acquisition. This result translates into shorter scan times, minimizing the duration and frequency of anesthesia administration and reducing the potential risks associated with it.

Keywords: compressed sensing, magnetic resonance, sampling trajectories, small animals

Procedia PDF Downloads 51

1725 Temporal Characteristics of Human Perception to Significant Variation of Block Structures

Authors: Kuo-Cheng Liu

Abstract:

In the latest research efforts, the structures of the image in the spatial domain have been successfully analyzed and proved to deduce the visual masking for accurately estimating the visibility thresholds of the image. If the structural properties of the video sequence in the temporal domain are taken into account to estimate the temporal masking, the improvement and enhancement of the as-sessing spatio-temporal visibility thresholds are reasonably expected. In this paper, the temporal characteristics of human perception to the change in block structures on the time axis are analyzed. The temporal characteristics of human perception are represented in terms of the significant variation in block structures for the analysis of human visual system (HVS). Herein, the block structure in each frame is computed by combined the pattern masking and the contrast masking simultaneously. The contrast masking always overestimates the visibility thresholds of edge regions and underestimates that of texture regions, while the pattern masking is weak on a uniform background and is strong on the complex background with spatial patterns. Under considering the significant variation of block structures between successive frames, we extend the block structures of images in the spatial domain to that of video sequences in the temporal domain to analyze the relation between the inter-frame variation of structures and the temporal masking. Meanwhile, the subjective viewing test and the fair rating process are designed to evaluate the consistency of the temporal characteristics with the HVS under a specified viewing condition.

Keywords: temporal characteristic, block structure, pattern masking, contrast masking

Procedia PDF Downloads 390

1724 Direct Integration of 3D Ultrasound Scans with Patient Educational Mobile Application

Authors: Zafar Iqbal, Eugene Chan, Fareed Ahmed, Mohamed Jama, Avez Rizvi

Abstract:

Advancements in Ultrasound Technology have enabled machines to capture 3D and 4D images with intricate features of the growing fetus. Sonographers can now capture clear 3D images and 4D videos of the fetus, especially of the face. Fetal faces are often seen on the ultrasound scan of the third trimester where anatomical features become more defined. Parents often want 3D/4D images and videos of their ultrasounds, and particularly image that capture the child’s face. Sidra Medicine developed a patient education mobile app called 10 Moons to improve care and provide useful information during the length of their pregnancy. In addition to general information, we built the ability to send ultrasound images directly from the modality to the mobile application, allowing expectant mothers to easily store and share images of their baby. 10 Moons represent the length of the pregnancy on a lunar calendar, which has both cultural and religious significance in the Middle East. During the third trimester scan, sonographers can capture 3D pictures of the fetus. Ultrasound machines are connected with a local 10 Moons Server with a Digital Imaging and Communications in Medicine (DICOM) application running on it. Sonographers are able to send images directly to the DICOM server by a preprogrammed button on the ultrasound modality. Mothers can also request which pictures they would like to be available on the app. An internally built DICOM application receives the image and saves the patient information from DICOM header (for verification purpose). The application also anonymizes the image by removing all the DICOM header information and subsequently converts it into a lossless JPEG. Finally, and the application passes the image to the mobile application server. On the 10 Moons mobile app – patients enter their Medical Record Number (MRN) and Date of Birth (DOB) to receive a One Time Password (OTP) for security reasons to view the images. Patients can also share the images anonymized images with friends and family. Furthermore, patients can also request 3D printed mementos of their child through 10 Moons. 10 Moons is unique patient education and information application where expected mothers can also see 3D ultrasound images of their children. Sidra Medicine staff has the added benefit of a full content management administrative backend where updates to content can be made. The app is available on secure infrastructure with both local and public interfaces. The application is also available in both English and Arabic languages to facilitate most of the patients in the region. Innovation is at the heart of modern healthcare management. With Innovation being one of Sidra Medicine’s core values, our 10 Moons application provides expectant mothers with unique educational content as well as the ability to store and share images of their child and purchase 3D printed mementos.

Keywords: patient educational mobile application, ultrasound images, digital imaging and communications in medicine (DICOM), imaging informatics

Procedia PDF Downloads 112

1723 Design of Liquid Crystal Based Interface to Study the Interaction of Gram Negative Bacterial Endotoxin with Milk Protein Lactoferrin

Authors: Dibyendu Das, Santanu Kumar Pal

Abstract:

Milk protein lactoferrin (Lf) exhibits potent antibacterial activity due to its interaction with Gram-negative bacterial cell membrane component, lipopolysaccharide (LPS). This paper represents fabrication of new Liquid crystals (LCs) based biosensors to explore the interaction between Lf and LPS. LPS self-assembled at aqueous/LCs interface and orients interfacial nematic 4-cyano-4’- pentylbiphenyl (5CB) LCs in a homeotropic fashion (exhibiting dark optical image under polarized optical microscope). Interestingly, on the exposure of Lf on LPS decorated aqueous/LCs interface, an optical image of LCs changed from dark to bright indicating an ordering alteration of interfacial LCs from homeotropic to tilted/planar state. The ordering transition reflects strong binding between Lf and interfacial LPS that, in turn, perturbs the orientation of LCs. With the help of epifluorescence microscopy, we further affirmed the interfacial LPS-Lf binding event by imaging the presence of FITC tagged Lf at the LPS laden aqueous/LCs interface. Finally, we have investigated the conformational behavior of Lf in solution as well as in the presence of LPS using Circular Dichroism (CD) spectroscopy and further reconfirmed with Vibrational Circular Dichroism (VCD) spectroscopy where we found that Lf undergoes alpha-helix to random coil-like structure in the presence of LPS. As a whole the entire results described in this paper establish a robust approach to envisage the interaction between LPS and Lf through the ordering transitions of LCs at aqueous/LCs interface.

Keywords: endotoxin, interface, lactoferrin, lipopolysaccharide

Procedia PDF Downloads 249

1722 Cosmetic Surgery on the Rise: The Impact of Remote Communication

Authors: Bruno Di Pace, Roxanne H. Padley

Abstract:

Aims: The recent increase in remote video interaction has increased the number of requests for teleconsultations with plastic surgeons in private practice (70% in the UK and 64% in the USA). This study investigated the motivations for such an increase and the underlying psychological impact on patients. Method: An anonymous web-based poll of 8 questions was designed and distributed to patients seeking cosmetic surgery through social networks in both Italy and the UK. The questions gathered responses regarding 1. Reasons for pursuing cosmetic surgery; 2. The effects of delays caused by the SARS-COV-2 pandemic; 3. The effects on mood; 4. The influence of video conferencing on body-image perception. Results: 85 respondents completed the online poll. Overall, 68% of respondents stated that seeing themselves more frequently online had influenced their decision to seek cosmetic surgery. The types of surgeries indicated were predominantly to the upper body and face (82%). Delays and access to surgeons during the pandemic were perceived as negatively impacting patients' moods (95%). Body-image perception and self-esteem were lower than in the pre-pandemic, particularly during lockdown (72%). Patients were more inclined to undergo cosmetic surgery during the pandemic, both due to the wish to improve their “lockdown face” for video conferencing (77%) and also due to the benefits of home recovery while in smart working (58%). Conclusions: Overall, findings suggest that video conferencing has led to a significant increase in requests for cosmetic surgery and the so-called “Zoom Boom” effect.

Keywords: cosmetic surgery, remote communication, telehealth, zoom boom

Procedia PDF Downloads 157

1721 Morphological Processing of Punjabi Text for Sentiment Analysis of Farmer Suicides

Authors: Jaspreet Singh, Gurvinder Singh, Prabhsimran Singh, Rajinder Singh, Prithvipal Singh, Karanjeet Singh Kahlon, Ravinder Singh Sawhney

Abstract:

Morphological evaluation of Indian languages is one of the burgeoning fields in the area of Natural Language Processing (NLP). The evaluation of a language is an eminent task in the era of information retrieval and text mining. The extraction and classification of knowledge from text can be exploited for sentiment analysis and morphological evaluation. This study coalesce morphological evaluation and sentiment analysis for the task of classification of farmer suicide cases reported in Punjab state of India. The pre-processing of Punjabi text involves morphological evaluation and normalization of Punjabi word tokens followed by the training of proposed model using deep learning classification on Punjabi language text extracted from online Punjabi news reports. The class-wise accuracies of sentiment prediction for four negatively oriented classes of farmer suicide cases are 93.85%, 88.53%, 83.3%, and 95.45% respectively. The overall accuracy of sentiment classification obtained using proposed framework on 275 Punjabi text documents is found to be 90.29%.

Keywords: deep neural network, farmer suicides, morphological processing, punjabi text, sentiment analysis

Procedia PDF Downloads 299

1720 Reconfigurable Device for 3D Visualization of Three Dimensional Surfaces

Authors: Robson da C. Santos, Carlos Henrique de A. S. P. Coutinho, Lucas Moreira Dias, Gerson Gomes Cunha

Abstract:

The article refers to the development of an augmented reality 3D display, through the control of servo motors and projection of image with aid of video projector on the model. Augmented Reality is a branch that explores multiple approaches to increase real-world view by viewing additional information along with the real scene. The article presents the broad use of electrical, electronic, mechanical and industrial automation for geospatial visualizations, applications in mathematical models with the visualization of functions and 3D surface graphics and volumetric rendering that are currently seen in 2D layers. Application as a 3D display for representation and visualization of Digital Terrain Model (DTM) and Digital Surface Models (DSM), where it can be applied in the identification of canyons in the marine area of the Campos Basin, Rio de Janeiro, Brazil. The same can execute visualization of regions subject to landslides, as in Serra do Mar - Agra dos Reis and Serranas cities both in the State of Rio de Janeiro. From the foregoing, loss of human life and leakage of oil from pipelines buried in these regions may be anticipated in advance. The physical design consists of a table consisting of a 9 x 16 matrix of servo motors, totalizing 144 servos, a mesh is used on the servo motors for visualization of the models projected by a retro projector. Each model for by an image pre-processing, is sent to a server to be converted and viewed from a software developed in C # Programming Language.

Keywords: visualization, 3D models, servo motors, C# programming language

Procedia PDF Downloads 323

1719 Deep Feature Augmentation with Generative Adversarial Networks for Class Imbalance Learning in Medical Images

Authors: Rongbo Shen, Jianhua Yao, Kezhou Yan, Kuan Tian, Cheng Jiang, Ke Zhou

Abstract:

This study proposes a generative adversarial networks (GAN) framework to perform synthetic sampling in feature space, i.e., feature augmentation, to address the class imbalance problem in medical image analysis. A feature extraction network is first trained to convert images into feature space. Then the GAN framework incorporates adversarial learning to train a feature generator for the minority class through playing a minimax game with a discriminator. The feature generator then generates features for minority class from arbitrary latent distributions to balance the data between the majority class and the minority class. Additionally, a data cleaning technique, i.e., Tomek link, is employed to clean up undesirable conflicting features introduced from the feature augmentation and thus establish well-defined class clusters for the training. The experiment section evaluates the proposed method on two medical image analysis tasks, i.e., mass classification on mammogram and cancer metastasis classification on histopathological images. Experimental results suggest that the proposed method obtains superior or comparable performance over the state-of-the-art counterparts. Compared to all counterparts, our proposed method improves more than 1.5 percentage of accuracy.

Keywords: class imbalance, synthetic sampling, feature augmentation, generative adversarial networks, data cleaning

Procedia PDF Downloads 111

1718 Object Oriented Classification Based on Feature Extraction Approach for Change Detection in Coastal Ecosystem across Kochi Region

Authors: Mohit Modi, Rajiv Kumar, Manojraj Saxena, G. Ravi Shankar

Abstract:

Change detection of coastal ecosystem plays a vital role in monitoring and managing natural resources along the coastal regions. The present study mainly focuses on the decadal change in Kochi islands connecting the urban flatland areas and the coastal regions where sand deposits have taken place. With this, in view, the change detection has been monitored in the Kochi area to apprehend the urban growth and industrialization leading to decrease in the wetland ecosystem. The region lies between 76°11'19.134"E to 76°25'42.193"E and 9°52'35.719"N to 10°5'51.575"N in the south-western coast of India. The IRS LISS-IV satellite image has been processed using a rule-based algorithm to classify the LULC and to interpret the changes between 2005 & 2015. The approach takes two steps, i.e. extracting features as a single GIS vector layer using different parametric values and to dissolve them. The multi-resolution segmentation has been carried out on the scale ranging from 10-30. The different classes like aquaculture, agricultural land, built-up, wetlands etc. were extracted using parameters like NDVI, mean layer values, the texture-based feature with corresponding threshold values using a rule set algorithm. The objects obtained in the segmentation process were visualized to be overlaying the satellite image at a scale of 15. This layer was further segmented using the spectral difference segmentation rule between the objects. These individual class layers were dissolved in the basic segmented layer of the image and were interpreted in vector-based GIS programme to achieve higher accuracy. The result shows a rapid increase in an industrial area of 40% based on industrial area statistics of 2005. There is a decrease in wetlands area which has been converted into built-up. New roads have been constructed which are connecting the islands to urban areas as well as highways. The increase in coastal region has been visualized due to sand depositions. The outcome is well supported by quantitative assessments which will empower rich understanding of land use land cover change for appropriate policy intervention and further monitoring.

Keywords: land use land cover, multiresolution segmentation, NDVI, object based classification

Procedia PDF Downloads 168

1717 Melanoma and Non-Melanoma, Skin Lesion Classification, Using a Deep Learning Model

Authors: Shaira L. Kee, Michael Aaron G. Sy, Myles Joshua T. Tan, Hezerul Abdul Karim, Nouar AlDahoul

Abstract:

Skin diseases are considered the fourth most common disease, with melanoma and non-melanoma skin cancer as the most common type of cancer in Caucasians. The alarming increase in Skin Cancer cases shows an urgent need for further research to improve diagnostic methods, as early diagnosis can significantly improve the 5-year survival rate. Machine Learning algorithms for image pattern analysis in diagnosing skin lesions can dramatically increase the accuracy rate of detection and decrease possible human errors. Several studies have shown the diagnostic performance of computer algorithms outperformed dermatologists. However, existing methods still need improvements to reduce diagnostic errors and generate efficient and accurate results. Our paper proposes an ensemble method to classify dermoscopic images into benign and malignant skin lesions. The experiments were conducted using the International Skin Imaging Collaboration (ISIC) image samples. The dataset contains 3,297 dermoscopic images with benign and malignant categories. The results show improvement in performance with an accuracy of 88% and an F1 score of 87%, outperforming other existing models such as support vector machine (SVM), Residual network (ResNet50), EfficientNetB0, EfficientNetB4, and VGG16.

Keywords: deep learning - VGG16 - efficientNet - CNN – ensemble – dermoscopic images - melanoma

Procedia PDF Downloads 66

1716 Wolof Voice Response Recognition System: A Deep Learning Model for Wolof Audio Classification

Authors: Krishna Mohan Bathula, Fatou Bintou Loucoubar, FNU Kaleemunnisa, Christelle Scharff, Mark Anthony De Castro

Abstract:

Voice recognition algorithms such as automatic speech recognition and text-to-speech systems with African languages can play an important role in bridging the digital divide of Artificial Intelligence in Africa, contributing to the establishment of a fully inclusive information society. This paper proposes a Deep Learning model that can classify the user responses as inputs for an interactive voice response system. A dataset with Wolof language words ‘yes’ and ‘no’ is collected as audio recordings. A two stage Data Augmentation approach is adopted for enhancing the dataset size required by the deep neural network. Data preprocessing and feature engineering with Mel-Frequency Cepstral Coefficients are implemented. Convolutional Neural Networks (CNNs) have proven to be very powerful in image classification and are promising for audio processing when sounds are transformed into spectra. For performing voice response classification, the recordings are transformed into sound frequency feature spectra and then applied image classification methodology using a deep CNN model. The inference model of this trained and reusable Wolof voice response recognition system can be integrated with many applications associated with both web and mobile platforms.

Keywords: automatic speech recognition, interactive voice response, voice response recognition, wolof word classification

Procedia PDF Downloads 95

1715 Integrated Geophysical Approach for Subsurface Delineation in Srinagar, Uttarakhand, India

Authors: Pradeep Kumar Singh Chauhan, Gayatri Devi, Zamir Ahmad, Komal Chauhan, Abha Mittal

Abstract:

The application of geophysical methods to study the subsurface profile for site investigation is becoming popular globally. These methods are non-destructive and provide the image of subsurface at shallow depths. Seismic refraction method is one of the most common and efficient method being used for civil engineering site investigations particularly for knowing the seismic velocity of the subsurface layers. Resistivity imaging technique is a geo-electrical method used to image the subsurface, water bearing zone, bedrock and layer thickness. Integrated approach combining seismic refraction and 2-D resistivity imaging will provide a better and reliable picture of the subsurface. These are economical and less time-consuming field survey which provide high resolution image of the subsurface. Geophysical surveys carried out in this study include seismic refraction and 2D resistivity imaging method for delineation of sub-surface strata in different parts of Srinagar, Garhwal Himalaya, India. The aim of this survey was to map the shallow subsurface in terms of geological and geophysical properties mainly P-wave velocity, resistivity, layer thickness, and lithology of the area. Both sides of the river, Alaknanda which flows through the centre of the city, have been covered by taking two profiles on each side using both methods. Seismic and electrical surveys were carried out at the same locations to complement the results of each other. The seismic refraction survey was carried out using ABEM TeraLoc 24 channel Seismograph and 2D resistivity imaging was performed using ABEM Terrameter LS equipment. The results show three distinct layers on both sides of the river up to the depth of 20 m. The subsurface is divided into three distinct layers namely, alluvium extending up to, 3 m depth, conglomerate zone lying between the depth of 3 m to 15 m, and compacted pebbles and cobbles beyond 15 m. P-wave velocity in top layer is found in the range of 400 – 600 m/s, in second layer it varies from 700 – 1100 m/s and in the third layer it is 1500 – 3300 m/s. The resistivity results also show similar pattern and were in good agreement with seismic refraction results. The results obtained in this study were validated with an available exposed river scar at one site. The study established the efficacy of geophysical methods for subsurface investigations.

Keywords: 2D resistivity imaging, P-wave velocity, seismic refraction survey, subsurface

Procedia PDF Downloads 236

1714 Multi-Temporal Cloud Detection and Removal in Satellite Imagery for Land Resources Investigation

Authors: Feng Yin

Abstract:

Clouds are inevitable contaminants in optical satellite imagery, and prevent the satellite imaging systems from acquiring clear view of the earth surface. The presence of clouds in satellite imagery bring negative influences for remote sensing land resources investigation. As a consequence, detecting the locations of clouds in satellite imagery is an essential preprocessing step, and further remove the existing clouds is crucial for the application of imagery. In this paper, a multi-temporal based satellite imagery cloud detection and removal method is proposed, which will be used for large-scale land resource investigation. The proposed method is mainly composed of four steps. First, cloud masks are generated for cloud contaminated images by single temporal cloud detection based on multiple spectral features. Then, a cloud-free reference image of target areas is synthesized by weighted averaging time-series images in which cloud pixels are ignored. Thirdly, the refined cloud detection results are acquired by multi-temporal analysis based on the reference image. Finally, detected clouds are removed via multi-temporal linear regression. The results of a case application in Hubei province indicate that the proposed multi-temporal cloud detection and removal method is effective and promising for large-scale land resource investigation.

Keywords: cloud detection, cloud remove, multi-temporal imagery, land resources investigation

Procedia PDF Downloads 261

1713 Harnessing Emerging Creative Technology for Knowledge Discovery of Multiwavelenght Datasets

Authors: Basiru Amuneni

Abstract:

Astronomy is one domain with a rise in data. Traditional tools for data management have been employed in the quest for knowledge discovery. However, these traditional tools become limited in the face of big. One means of maximizing knowledge discovery for big data is the use of scientific visualisation. The aim of the work is to explore the possibilities offered by emerging creative technologies of Virtual Reality (VR) systems and game engines to visualize multiwavelength datasets. Game Engines are primarily used for developing video games, however their advanced graphics could be exploited for scientific visualization which provides a means to graphically illustrate scientific data to ease human comprehension. Modern astronomy is now in the era of multiwavelength data where a single galaxy for example, is captured by the telescope several times and at different electromagnetic wavelength to have a more comprehensive picture of the physical characteristics of the galaxy. Visualising this in an immersive environment would be more intuitive and natural for an observer. This work presents a standalone VR application that accesses galaxy FITS files. The application was built using the Unity Game Engine for the graphics underpinning and the OpenXR API for the VR infrastructure. The work used a methodology known as Design Science Research (DSR) which entails the act of ‘using design as a research method or technique’. The key stages of the galaxy modelling pipeline are FITS data preparation, Galaxy Modelling, Unity 3D Visualisation and VR Display. The FITS data format cannot be read by the Unity Game Engine directly. A DLL (CSHARPFITS) which provides a native support for reading and writing FITS files was used. The Galaxy modeller uses an approach that integrates cleaned FITS image pixels into the graphics pipeline of the Unity3d game Engine. The cleaned FITS images are then input to the galaxy modeller pipeline phase, which has a pre-processing script that extracts, pixel, galaxy world position, and colour maps the FITS image pixels. The user can visualise image galaxies in different light bands, control the blend of the image with similar images from different sources or fuse images for a holistic view. The framework will allow users to build tools to realise complex workflows for public outreach and possibly scientific work with increased scalability, near real time interactivity with ease of access. The application is presented in an immersive environment and can use all commercially available headset built on the OpenXR API. The user can select galaxies in the scene, teleport to the galaxy, pan, zoom in/out, and change colour gradients of the galaxy. The findings and design lessons learnt in the implementation of different use cases will contribute to the development and design of game-based visualisation tools in immersive environment by enabling informed decisions to be made.

Keywords: astronomy, visualisation, multiwavelenght dataset, virtual reality

Procedia PDF Downloads 72

1712 Estimating the Ladder Angle and the Camera Position From a 2D Photograph Based on Applications of Projective Geometry and Matrix Analysis

Authors: Inigo Beckett

Abstract:

In forensic investigations, it is often the case that the most potentially useful recorded evidence derives from coincidental imagery, recorded immediately before or during an incident, and that during the incident (e.g. a ‘failure’ or fire event), the evidence is changed or destroyed. To an image analysis expert involved in photogrammetric analysis for Civil or Criminal Proceedings, traditional computer vision methods involving calibrated cameras is often not appropriate because image metadata cannot be relied upon. This paper presents an approach for resolving this problem, considering in particular and by way of a case study, the angle of a simple ladder shown in a photograph. The UK Health and Safety Executive (HSE) guidance document published in 2014 (INDG455) advises that a leaning ladder should be erected at 75 degrees to the horizontal axis. Personal injury cases can arise in the construction industry because a ladder is too steep or too shallow. Ad-hoc photographs of such ladders in their incident position provide a basis for analysis of their angle. This paper presents a direct approach for ascertaining the position of the camera and the angle of the ladder simultaneously from the photograph(s) by way of a workflow that encompasses a novel application of projective geometry and matrix analysis. Mathematical analysis shows that for a given pixel ratio of directly measured collinear points (i.e. features that lie on the same line segment) from the 2D digital photograph with respect to a given viewing point, we can constrain the 3D camera position to a surface of a sphere in the scene. Depending on what we know about the ladder, we can enforce another independent constraint on the possible camera positions which enables us to constrain the possible positions even further. Experiments were conducted using synthetic and real-world data. The synthetic data modeled a vertical plane with a ladder on a horizontally flat plane resting against a vertical wall. The real-world data was captured using an Apple iPhone 13 Pro and 3D laser scan survey data whereby a ladder was placed in a known location and angle to the vertical axis. For each case, we calculated camera positions and the ladder angles using this method and cross-compared them against their respective ‘true’ values.

Keywords: image analysis, projective geometry, homography, photogrammetry, ladders, Forensics, Mathematical modeling, planar geometry, matrix analysis, collinear, cameras, photographs

Procedia PDF Downloads 31

1711 Accurate Mass Segmentation Using U-Net Deep Learning Architecture for Improved Cancer Detection

Authors: Ali Hamza

Abstract:

Accurate segmentation of breast ultrasound images is of paramount importance in enhancing the diagnostic capabilities of breast cancer detection. This study presents an approach utilizing the U-Net architecture for segmenting breast ultrasound images aimed at improving the accuracy and reliability of mass identification within the breast tissue. The proposed method encompasses a multi-stage process. Initially, preprocessing techniques are employed to refine image quality and diminish noise interference. Subsequently, the U-Net architecture, a deep learning convolutional neural network (CNN), is employed for pixel-wise segmentation of regions of interest corresponding to potential breast masses. The U-Net's distinctive architecture, characterized by a contracting and expansive pathway, enables accurate boundary delineation and detailed feature extraction. To evaluate the effectiveness of the proposed approach, an extensive dataset of breast ultrasound images is employed, encompassing diverse cases. Quantitative performance metrics such as the Dice coefficient, Jaccard index, sensitivity, specificity, and Hausdorff distance are employed to comprehensively assess the segmentation accuracy. Comparative analyses against traditional segmentation methods showcase the superiority of the U-Net architecture in capturing intricate details and accurately segmenting breast masses. The outcomes of this study emphasize the potential of the U-Net-based segmentation approach in bolstering breast ultrasound image analysis. The method's ability to reliably pinpoint mass boundaries holds promise for aiding radiologists in precise diagnosis and treatment planning. However, further validation and integration within clinical workflows are necessary to ascertain their practical clinical utility and facilitate seamless adoption by healthcare professionals. In conclusion, leveraging the U-Net architecture for breast ultrasound image segmentation showcases a robust framework that can significantly enhance diagnostic accuracy and advance the field of breast cancer detection. This approach represents a pivotal step towards empowering medical professionals with a more potent tool for early and accurate breast cancer diagnosis.

Keywords: mage segmentation, U-Net, deep learning, breast cancer detection, diagnostic accuracy, mass identification, convolutional neural network

Procedia PDF Downloads 64

1710 Hope in the Ruins of 'Ozymandias': Reimagining Temporal Horizons in Felicia Hemans 'the Image in Lava'

Authors: Lauren Schuldt Wilson

Abstract:

Felicia Hemans’ memorializing of the unwritten lives of women and the consequent allowance for marginalized voices to remember and be remembered has been considered by many critics in terms of ekphrasis and elegy, terms which privilege the question of whether Hemans’ poeticizing can represent lost voices of history or only her poetic expression. Amy Gates, Brian Elliott, and others point out Hemans’ acknowledgement of the self-projection necessary for imaginatively filling the absences of unrecorded histories. Yet, few have examined the complex temporal positioning Hemans inscribes in these moments of self-projection and imaginative historicizing. In poems like ‘The Image in Lava,’ Hemans maps not only a lost past, but also a lost potential future onto the image of a dead infant in its mother’s arms, the discovery and consideration of which moves the imagined viewer to recover and incorporate the ‘hope’ encapsulated in the figure of the infant into a reevaluation of national time embodied by the ‘relics / Left by the pomps of old.’ By examining Hemans’ acknowledgement and response to Percy Bysshe Shelley’s ‘Ozymandias,’ this essay explores how Hemans’ depictions of imaginative historicizing open new horizons of possibility and reevaluate temporal value structures by imagining previously undiscovered or unexplored potentialities of the past. Where Shelley’s poem mocks the futility of national power and time, this essay outlines Hemans’ suggestion of alternative threads of identity and temporal meaning-making which, regardless of historical veracity, exist outside of and against the structures Shelley challenges. Counter to previous readings of Hemans’ poem as celebration of either recovered or poetically constructed maternal love, this essay argues that Hemans offers a meditation on sites of reproduction—both of personal reproductive futurity and of national reproduction of power. This meditation culminates in Hemans’ gesturing towards a method of historicism by which the imagined viewer reinvigorates the sterile, ‘shattered visage’ of national time by forming temporal identity through the imagining of trans-historical hope inscribed on the infant body of the universal, individual subject rather than the broken monument of the king.

Keywords: futurity, national temporalities, reproduction, revisionary histories

Procedia PDF Downloads 146

1709 Optimizing the Efficiency of Measuring Instruments in Ouagadougou-Burkina Faso

Authors: Moses Emetere, Marvel Akinyemi, S. E. Sanni

Abstract:

At the moment, AERONET or AMMA database shows a large volume of data loss. With only about 47% data set available to the scientist, it is evident that accurate nowcast or forecast cannot be guaranteed. The calibration constants of most radiosonde or weather stations are not compatible with the atmospheric conditions of the West African climate. A dispersion model was developed to incorporate salient mathematical representations like a Unified number. The Unified number was derived to describe the turbulence of the aerosols transport in the frictional layer of the lower atmosphere. Fourteen years data set from Multi-angle Imaging SpectroRadiometer (MISR) was tested using the dispersion model. A yearly estimation of the atmospheric constants over Ouagadougou using the model was obtained with about 87.5% accuracy. It further revealed that the average atmospheric constant for Ouagadougou-Niger is a_1 = 0.626, a_2 = 0.7999 and the tuning constants is n_1 = 0.09835 and n_2 = 0.266. Also, the yearly atmospheric constants affirmed the lower atmosphere of Ouagadougou is very dynamic. Hence, it is recommended that radiosonde and weather station manufacturers should constantly review the atmospheric constant over a geographical location to enable about eighty percent data retrieval.

Keywords: aerosols retention, aerosols loading, statistics, analytical technique

Procedia PDF Downloads 293

1708 Accurate Positioning Method of Indoor Plastering Robot Based on Line Laser

Authors: Guanqiao Wang, Hongyang Yu

Abstract:

There is a lot of repetitive work in the traditional construction industry. These repetitive tasks can significantly improve production efficiency by replacing manual tasks with robots. There- fore, robots appear more and more frequently in the construction industry. Navigation and positioning are very important tasks for construction robots, and the requirements for accuracy of positioning are very high. Traditional indoor robots mainly use radiofrequency or vision methods for positioning. Compared with ordinary robots, the indoor plastering robot needs to be positioned closer to the wall for wall plastering, so the requirements for construction positioning accuracy are higher, and the traditional navigation positioning method has a large error, which will cause the robot to move. Without the exact position, the wall cannot be plastered, or the error of plastering the wall is large. A new positioning method is proposed, which is assisted by line lasers and uses image processing-based positioning to perform more accurate positioning on the traditional positioning work. In actual work, filter, edge detection, Hough transform and other operations are performed on the images captured by the camera. Each time the position of the laser line is found, it is compared with the standard value, and the position of the robot is moved or rotated to complete the positioning work. The experimental results show that the actual positioning error is reduced to less than 0.5 mm by this accurate positioning method.

Keywords: indoor plastering robot, navigation, precise positioning, line laser, image processing

Procedia PDF Downloads 127

1707 Rapid Flood Damage Assessment of Population and Crops Using Remotely Sensed Data

Authors: Urooj Saeed, Sajid Rashid Ahmad, Iqra Khalid, Sahar Mirza, Imtiaz Younas

Abstract:

Pakistan, a flood-prone country, has experienced worst floods in the recent past which have caused extensive damage to the urban and rural areas by loss of lives, damage to infrastructure and agricultural fields. Poor flood management system in the country has projected the risks of damages as the increasing frequency and magnitude of floods are felt as a consequence of climate change; affecting national economy directly or indirectly. To combat the needs of flood emergency, this paper focuses on remotely sensed data based approach for rapid mapping and monitoring of flood extent and its damages so that fast dissemination of information can be done, from local to national level. In this research study, spatial extent of the flooding caused by heavy rains of 2014 has been mapped by using space borne data to assess the crop damages and affected population in sixteen districts of Punjab. For this purpose, moderate resolution imaging spectroradiometer (MODIS) was used to daily mark the flood extent by using Normalised Difference Water Index (NDWI). The highest flood value data was integrated with the LandScan 2014, 1km x 1km grid based population, to calculate the affected population in flood hazard zone. It was estimated that the floods covered an area of 16,870 square kilometers, with 3.0 million population affected. Moreover, to assess the flood damages, Object Based Image Analysis (OBIA) aided with spectral signatures was applied on Landsat image to attain the thematic layers of healthy (0.54 million acre) and damaged crops (0.43 million acre). The study yields that the population of Jhang district (28% of 2.5 million population) was affected the most. Whereas, in terms of crops, Jhang and Muzzafargarh are the ‘highest damaged’ ranked district of floods 2014 in Punjab. This study was completed within 24 hours of the peak flood time, and proves to be an effective methodology for rapid assessment of damages due to flood hazard

Keywords: flood hazard, space borne data, object based image analysis, rapid damage assessment

Procedia PDF Downloads 308

1706 Rapid Fetal MRI Using SSFSE, FIESTA and FSPGR Techniques

Authors: Chen-Chang Lee, Po-Chou Chen, Jo-Chi Jao, Chun-Chung Lui, Leung-Chit Tsang, Lain-Chyr Hwang

Abstract:

Fetal Magnetic Resonance Imaging (MRI) is a challenge task because the fetal movements could cause motion artifact in MR images. The remedy to overcome this problem is to use fast scanning pulse sequences. The Single-Shot Fast Spin-Echo (SSFSE) T2-weighted imaging technique is routinely performed and often used as a gold standard in clinical examinations. Fast spoiled gradient-echo (FSPGR) T1-Weighted Imaging (T1WI) is often used to identify fat, calcification and hemorrhage. Fast Imaging Employing Steady-State Acquisition (FIESTA) is commonly used to identify fetal structures as well as the heart and vessels. The contrast of FIESTA image is related to T1/T2 and is different from that of SSFSE. The advantages and disadvantages of these two scanning sequences for fetal imaging have not been clearly demonstrated yet. This study aimed to compare these three rapid MRI techniques (SSFSE, FIESTA, and FSPGR) for fetal MRI examinations. The image qualities and influencing factors among these three techniques were explored. A 1.5T GE Discovery 450 clinical MR scanner with an eight-channel high-resolution abdominal coil was used in this study. Twenty-five pregnant women were recruited to enroll fetal MRI examination with SSFSE, FIESTA and FSPGR scanning. Multi-oriented and multi-slice images were acquired. Afterwards, MR images were interpreted and scored by two senior radiologists. The results showed that both SSFSE and T2W-FIESTA can provide good image quality among these three rapid imaging techniques. Vessel signals on FIESTA images are higher than those on SSFSE images. The Specific Absorption Rate (SAR) of FIESTA is lower than that of the others two techniques, but it is prone to cause banding artifacts. FSPGR-T1WI renders lower Signal-to-Noise Ratio (SNR) because it severely suffers from the impact of maternal and fetal movements. The scan times for these three scanning sequences were 25 sec (T2W-SSFSE), 20 sec (FIESTA) and 18 sec (FSPGR). In conclusion, all these three rapid MR scanning sequences can produce high contrast and high spatial resolution images. The scan time can be shortened by incorporating parallel imaging techniques so that the motion artifacts caused by fetal movements can be reduced. Having good understanding of the characteristics of these three rapid MRI techniques is helpful for technologists to obtain reproducible fetal anatomy images with high quality for prenatal diagnosis.

Keywords: fetal MRI, FIESTA, FSPGR, motion artifact, SSFSE

Procedia PDF Downloads 507

1705 Video Compression Using Contourlet Transform

Authors: Delara Kazempour, Mashallah Abasi Dezfuli, Reza Javidan

Abstract:

Video compression used for channels with limited bandwidth and storage devices has limited storage capabilities. One of the most popular approaches in video compression is the usage of different transforms. Discrete cosine transform is one of the video compression methods that have some problems such as blocking, noising and high distortion inappropriate effect in compression ratio. wavelet transform is another approach is better than cosine transforms in balancing of compression and quality but the recognizing of curve curvature is so limit. Because of the importance of the compression and problems of the cosine and wavelet transforms, the contourlet transform is most popular in video compression. In the new proposed method, we used contourlet transform in video image compression. Contourlet transform can save details of the image better than the previous transforms because this transform is multi-scale and oriented. This transform can recognize discontinuity such as edges. In this approach we lost data less than previous approaches. Contourlet transform finds discrete space structure. This transform is useful for represented of two dimension smooth images. This transform, produces compressed images with high compression ratio along with texture and edge preservation. Finally, the results show that the majority of the images, the parameters of the mean square error and maximum signal-to-noise ratio of the new method based contourlet transform compared to wavelet transform are improved but in most of the images, the parameters of the mean square error and maximum signal-to-noise ratio in the cosine transform is better than the method based on contourlet transform.

Keywords: video compression, contourlet transform, discrete cosine transform, wavelet transform

Procedia PDF Downloads 421

1704 Human Action Recognition Using Wavelets of Derived Beta Distributions

Authors: Neziha Jaouedi, Noureddine Boujnah, Mohamed Salim Bouhlel

Abstract:

In the framework of human machine interaction systems enhancement, we focus throw this paper on human behavior analysis and action recognition. Human behavior is characterized by actions and reactions duality (movements, psychological modification, verbal and emotional expression). It’s worth noting that many information is hidden behind gesture, sudden motion points trajectories and speeds, many research works reconstructed an information retrieval issues. In our work we will focus on motion extraction, tracking and action recognition using wavelet network approaches. Our contribution uses an analysis of human subtraction by Gaussian Mixture Model (GMM) and body movement through trajectory models of motion constructed from kalman filter. These models allow to remove the noise using the extraction of the main motion features and constitute a stable base to identify the evolutions of human activity. Each modality is used to recognize a human action using wavelets of derived beta distributions approach. The proposed approach has been validated successfully on a subset of KTH and UCF sports database.

Keywords: feautures extraction, human action classifier, wavelet neural network, beta wavelet

Procedia PDF Downloads 393