Search results for: Goutam%20Kumar%20Ghorai
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 12

Search results for: Goutam%20Kumar%20Ghorai

12 Application of a Novel Audio Compression Scheme in Automatic Music Recommendation, Digital Rights Management and Audio Fingerprinting

Authors: Anindya Roy, Goutam Saha

Abstract:

Rapid progress in audio compression technology has contributed to the explosive growth of music available in digital form today. In a reversal of ideas, this work makes use of a recently proposed efficient audio compression scheme to develop three important applications in the context of Music Information Retrieval (MIR) for the effective manipulation of large music databases, namely automatic music recommendation (AMR), digital rights management (DRM) and audio finger-printing for song identification. The performance of these three applications has been evaluated with respect to a database of songs collected from a diverse set of genres.

Keywords: Audio compression, Music Information Retrieval, Digital Rights Management, Audio Fingerprinting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1491
11 Speech Enhancement by Marginal Statistical Characterization in the Log Gabor Wavelet Domain

Authors: Suman Senapati, Goutam Saha

Abstract:

This work presents a fusion of Log Gabor Wavelet (LGW) and Maximum a Posteriori (MAP) estimator as a speech enhancement tool for acoustical background noise reduction. The probability density function (pdf) of the speech spectral amplitude is approximated by a Generalized Laplacian Distribution (GLD). Compared to earlier estimators the proposed method estimates the underlying statistical model more accurately by appropriately choosing the model parameters of GLD. Experimental results show that the proposed estimator yields a higher improvement in Segmental Signal-to-Noise Ratio (S-SNR) and lower Log-Spectral Distortion (LSD) in two different noisy environments compared to other estimators.

Keywords: Speech Enhancement, Generalized Laplacian Distribution, Log Gabor Wavelet, Bayesian MAP Marginal Estimator.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1583
10 A Differential Calculus Based Image Steganography with Crossover

Authors: Srilekha Mukherjee, Subha Ash, Goutam Sanyal

Abstract:

Information security plays a major role in uplifting the standard of secured communications via global media. In this paper, we have suggested a technique of encryption followed by insertion before transmission. Here, we have implemented two different concepts to carry out the above-specified tasks. We have used a two-point crossover technique of the genetic algorithm to facilitate the encryption process. For each of the uniquely identified rows of pixels, different mathematical methodologies are applied for several conditions checking, in order to figure out all the parent pixels on which we perform the crossover operation. This is done by selecting two crossover points within the pixels thereby producing the newly encrypted child pixels, and hence the encrypted cover image. In the next lap, the first and second order derivative operators are evaluated to increase the security and robustness. The last lap further ensures reapplication of the crossover procedure to form the final stego-image. The complexity of this system as a whole is huge, thereby dissuading the third party interferences. Also, the embedding capacity is very high. Therefore, a larger amount of secret image information can be hidden. The imperceptible vision of the obtained stego-image clearly proves the proficiency of this approach.

Keywords: Steganography, Crossover, Differential Calculus, Peak Signal to Noise Ratio, Cross-correlation Coefficient.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1336
9 Improved Text-Independent Speaker Identification using Fused MFCC and IMFCC Feature Sets based on Gaussian Filter

Authors: Sandipan Chakroborty, Goutam Saha

Abstract:

A state of the art Speaker Identification (SI) system requires a robust feature extraction unit followed by a speaker modeling scheme for generalized representation of these features. Over the years, Mel-Frequency Cepstral Coefficients (MFCC) modeled on the human auditory system has been used as a standard acoustic feature set for speech related applications. On a recent contribution by authors, it has been shown that the Inverted Mel- Frequency Cepstral Coefficients (IMFCC) is useful feature set for SI, which contains complementary information present in high frequency region. This paper introduces the Gaussian shaped filter (GF) while calculating MFCC and IMFCC in place of typical triangular shaped bins. The objective is to introduce a higher amount of correlation between subband outputs. The performances of both MFCC & IMFCC improve with GF over conventional triangular filter (TF) based implementation, individually as well as in combination. With GMM as speaker modeling paradigm, the performances of proposed GF based MFCC and IMFCC in individual and fused mode have been verified in two standard databases YOHO, (Microphone Speech) and POLYCOST (Telephone Speech) each of which has more than 130 speakers.

Keywords: Gaussian Filter, Triangular Filter, Subbands, Correlation, MFCC, IMFCC, GMM.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2384
8 Improved Closed Set Text-Independent Speaker Identification by Combining MFCC with Evidence from Flipped Filter Banks

Authors: Sandipan Chakroborty, Anindya Roy, Goutam Saha

Abstract:

A state of the art Speaker Identification (SI) system requires a robust feature extraction unit followed by a speaker modeling scheme for generalized representation of these features. Over the years, Mel-Frequency Cepstral Coefficients (MFCC) modeled on the human auditory system has been used as a standard acoustic feature set for SI applications. However, due to the structure of its filter bank, it captures vocal tract characteristics more effectively in the lower frequency regions. This paper proposes a new set of features using a complementary filter bank structure which improves distinguishability of speaker specific cues present in the higher frequency zone. Unlike high level features that are difficult to extract, the proposed feature set involves little computational burden during the extraction process. When combined with MFCC via a parallel implementation of speaker models, the proposed feature set outperforms baseline MFCC significantly. This proposition is validated by experiments conducted on two different kinds of public databases namely YOHO (microphone speech) and POLYCOST (telephone speech) with Gaussian Mixture Models (GMM) as a Classifier for various model orders.

Keywords: Complementary Information, Filter Bank, GMM, IMFCC, MFCC, Speaker Identification, Speaker Recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2208
7 Attention Based Fully Convolutional Neural Network for Simultaneous Detection and Segmentation of Optic Disc in Retinal Fundus Images

Authors: Sandip Sadhukhan, Arpita Sarkar, Debprasad Sinha, Goutam Kumar Ghorai, Gautam Sarkar, Ashis K. Dhara

Abstract:

Accurate segmentation of the optic disc is very important for computer-aided diagnosis of several ocular diseases such as glaucoma, diabetic retinopathy, and hypertensive retinopathy. The paper presents an accurate and fast optic disc detection and segmentation method using an attention based fully convolutional network. The network is trained from scratch using the fundus images of extended MESSIDOR database and the trained model is used for segmentation of optic disc. The false positives are removed based on morphological operation and shape features. The result is evaluated using three-fold cross-validation on six public fundus image databases such as DIARETDB0, DIARETDB1, DRIVE, AV-INSPIRE, CHASE DB1 and MESSIDOR. The attention based fully convolutional network is robust and effective for detection and segmentation of optic disc in the images affected by diabetic retinopathy and it outperforms existing techniques.

Keywords: Ocular diseases, retinal fundus image, optic disc detection and segmentation, fully convolutional network, overlap measure.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 705
6 In Search of an SVD and QRcp Based Optimization Technique of ANN for Automatic Classification of Abnormal Heart Sounds

Authors: Samit Ari, Goutam Saha

Abstract:

Artificial Neural Network (ANN) has been extensively used for classification of heart sounds for its discriminative training ability and easy implementation. However, it suffers from overparameterization if the number of nodes is not chosen properly. In such cases, when the dataset has redundancy within it, ANN is trained along with this redundant information that results in poor validation. Also a larger network means more computational expense resulting more hardware and time related cost. Therefore, an optimum design of neural network is needed towards real-time detection of pathological patterns, if any from heart sound signal. The aims of this work are to (i) select a set of input features that are effective for identification of heart sound signals and (ii) make certain optimum selection of nodes in the hidden layer for a more effective ANN structure. Here, we present an optimization technique that involves Singular Value Decomposition (SVD) and QR factorization with column pivoting (QRcp) methodology to optimize empirically chosen over-parameterized ANN structure. Input nodes present in ANN structure is optimized by SVD followed by QRcp while only SVD is required to prune undesirable hidden nodes. The result is presented for classifying 12 common pathological cases and normal heart sound.

Keywords: ANN, Classification of heart diseases, murmurs, optimization, Phonocardiogram, QRcp, SVD.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2011
5 Metallurgical Analysis of Surface Defect in Telescopic Front Fork

Authors: Souvik Das, Janak Lal, Arthita Dey, Goutam Mukhopadhyay, Sandip Bhattacharya

Abstract:

Telescopic Front Fork (TFF) used in two wheelers, mainly motorcycle, is made from high strength steel, and is manufactured by high frequency induction welding process wherein hot rolled and pickled coils are used as input raw material for rolling of hollow tubes followed by heat treatment, surface treatment, cold drawing, tempering, etc. The final application demands superior quality TFF tubes w.r.t. surface finish and dimensional tolerances. This paper presents the investigation of two different types of failure of fork during operation. The investigation consists of visual inspection, chemical analysis, characterization of microstructure, and energy dispersive spectroscopy. In this paper, comprehensive investigations of two failed tube samples were investigated. In case of Sample #1, the result revealed that there was a pre-existing crack, known as hook crack, which leads to the cracking of the tube. Metallographic examination exhibited that during field operation the pre-existing hook crack was surfaced out leading to crack in the pipe. In case of Sample #2, presence of internal oxidation with decarburised grains inside the material indicates origin of the defect from slab stage.

Keywords: Telescopic front fork, induction welding, hook crack, internal oxidation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 769
4 Asynchronous Parallel Distributed Genetic Algorithm with Elite Migration

Authors: Kazunori Kojima, Masaaki Ishigame, Goutam Chakraborty, Hiroshi Hatsuo, Shozo Makino

Abstract:

In most of the popular implementation of Parallel GAs the whole population is divided into a set of subpopulations, each subpopulation executes GA independently and some individuals are migrated at fixed intervals on a ring topology. In these studies, the migrations usually occur 'synchronously' among subpopulations. Therefore, CPUs are not used efficiently and the communication do not occur efficiently either. A few studies tried asynchronous migration but it is hard to implement and setting proper parameter values is difficult. The aim of our research is to develop a migration method which is easy to implement, which is easy to set parameter values, and which reduces communication traffic. In this paper, we propose a traffic reduction method for the Asynchronous Parallel Distributed GA by migration of elites only. This is a Server-Client model. Every client executes GA on a subpopulation and sends an elite information to the server. The server manages the elite information of each client and the migrations occur according to the evolution of sub-population in a client. This facilitates the reduction in communication traffic. To evaluate our proposed model, we apply it to many function optimization problems. We confirm that our proposed method performs as well as current methods, the communication traffic is less, and setting of the parameters are much easier.

Keywords: Parallel Distributed Genetic Algorithm (PDGA), asynchronousPDGA, Server-Client configuration, Elite Migration

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1312
3 Online Optic Disk Segmentation Using Fractals

Authors: Srinivasan Aruchamy, Partha Bhattacharjee, Goutam Sanyal

Abstract:

Optic disk segmentation plays a key role in the mass screening of individuals with diabetic retinopathy and glaucoma ailments. An efficient hardware-based algorithm for optic disk localization and segmentation would aid for developing an automated retinal image analysis system for real time applications. Herein, TMS320C6416DSK DSP board pixel intensity based fractal analysis algorithm for an automatic localization and segmentation of the optic disk is reported. The experiment has been performed on color and fluorescent angiography retinal fundus images. Initially, the images were pre-processed to reduce the noise and enhance the quality. The retinal vascular tree of the image was then extracted using canny edge detection technique. Finally, a pixel intensity based fractal analysis is performed to segment the optic disk by tracing the origin of the vascular tree. The proposed method is examined on three publicly available data sets of the retinal image and also with the data set obtained from an eye clinic. The average accuracy achieved is 96.2%. To the best of the knowledge, this is the first work reporting the use of TMS320C6416DSK DSP board and pixel intensity based fractal analysis algorithm for an automatic localization and segmentation of the optic disk. This will pave the way for developing devices for detection of retinal diseases in the future.

Keywords: Color retinal fundus images, Diabetic retinopathy, Fluorescein angiography retinal fundus images, Fractal analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2456
2 Nonlinear Thermal Hydraulic Model to Analyze Parallel Channel Density Wave Instabilities in Natural Circulation Boiling Water Reactor with Asymmetric Power Distribution

Authors: Sachin Kumar, Vivek Tiwari, Goutam Dutta

Abstract:

The paper investigates parallel channel instabilities of natural circulation boiling water reactor. A thermal-hydraulic model is developed to simulate two-phase flow behavior in the natural circulation boiling water reactor (NCBWR) with the incorporation of ex-core components and recirculation loop such as steam separator, down-comer, lower-horizontal section and upper-horizontal section and then, numerical analysis is carried out for parallel channel instabilities of the reactor undergoing both in-phase and out-of-phase modes of oscillations. To analyze the relative effect on stability of the reactor due to inclusion of various ex-core components and recirculation loop, marginal stable point is obtained at a particular inlet enthalpy of the reactor core without the inclusion of ex-core components and recirculation loop and then with the inclusion of the same. Numerical simulations are also conducted to determine the relative dominance between two modes of oscillations i.e. in-phase and out-of-phase. Simulations are also carried out when the channels are subjected to asymmetric power distribution keeping the inlet enthalpy same.

Keywords: Asymmetric power distribution, Density wave oscillations, In-phase and out-of-phase modes of instabilities, Natural circulation boiling water reactor

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2216
1 Speaker Identification by Joint Statistical Characterization in the Log Gabor Wavelet Domain

Authors: Suman Senapati, Goutam Saha

Abstract:

Real world Speaker Identification (SI) application differs from ideal or laboratory conditions causing perturbations that leads to a mismatch between the training and testing environment and degrade the performance drastically. Many strategies have been adopted to cope with acoustical degradation; wavelet based Bayesian marginal model is one of them. But Bayesian marginal models cannot model the inter-scale statistical dependencies of different wavelet scales. Simple nonlinear estimators for wavelet based denoising assume that the wavelet coefficients in different scales are independent in nature. However wavelet coefficients have significant inter-scale dependency. This paper enhances this inter-scale dependency property by a Circularly Symmetric Probability Density Function (CS-PDF) related to the family of Spherically Invariant Random Processes (SIRPs) in Log Gabor Wavelet (LGW) domain and corresponding joint shrinkage estimator is derived by Maximum a Posteriori (MAP) estimator. A framework is proposed based on these to denoise speech signal for automatic speaker identification problems. The robustness of the proposed framework is tested for Text Independent Speaker Identification application on 100 speakers of POLYCOST and 100 speakers of YOHO speech database in three different noise environments. Experimental results show that the proposed estimator yields a higher improvement in identification accuracy compared to other estimators on popular Gaussian Mixture Model (GMM) based speaker model and Mel-Frequency Cepstral Coefficient (MFCC) features.

Keywords: Speaker Identification, Log Gabor Wavelet, Bayesian Bivariate Estimator, Circularly Symmetric Probability Density Function, SIRP.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1591