Search results for: trademark image classification
3322 Grid Pattern Recognition and Suppression in Computed Radiographic Images
Authors: Igor Belykh
Abstract:
Anti-scatter grids used in radiographic imaging for the contrast enhancement leave specific artifacts. Those artifacts may be visible or may cause Moiré effect when a digital image is resized on a diagnostic monitor. In this paper, we propose an automated grid artifacts detection and suppression algorithm which is still an actual problem. Grid artifacts detection is based on statistical approach in spatial domain. Grid artifacts suppression is based on Kaiser bandstop filter transfer function design and application avoiding ringing artifacts. Experimental results are discussed and concluded with description of advantages over existing approaches.Keywords: grid, computed radiography, pattern recognition, image processing, filtering
Procedia PDF Downloads 2813321 Life Expansion: Autobiography, Ficctionalized Digital Diaries and Forged Narratives of Everyday Life on Instagram
Authors: Pablo M. S. Vallejos
Abstract:
The article aims to analyze the autobiographical practices of users on Instagram, observing the instrumentalization of image resources in the construction of visual narratives that make up that archive and digital diary. Through bibliographical review, discourse exploration and case studies, the research also aims to present a new theoretical perception about everyday records - edited with a collage of filters and aesthetic tools - that permeate that social network, understanding it as a platform fictionalizing and an expansion of life. In this way, therefore, the work reflects on possible futures in the elaboration of representations and identities in the context of digital spaces in the 21st century.Keywords: visual culture, social media, autobiography, image
Procedia PDF Downloads 793320 Rank-Based Chain-Mode Ensemble for Binary Classification
Authors: Chongya Song, Kang Yen, Alexander Pons, Jin Liu
Abstract:
In the field of machine learning, the ensemble has been employed as a common methodology to improve the performance upon multiple base classifiers. However, the true predictions are often canceled out by the false ones during consensus due to a phenomenon called “curse of correlation” which is represented as the strong interferences among the predictions produced by the base classifiers. In addition, the existing practices are still not able to effectively mitigate the problem of imbalanced classification. Based on the analysis on our experiment results, we conclude that the two problems are caused by some inherent deficiencies in the approach of consensus. Therefore, we create an enhanced ensemble algorithm which adopts a designed rank-based chain-mode consensus to overcome the two problems. In order to evaluate the proposed ensemble algorithm, we employ a well-known benchmark data set NSL-KDD (the improved version of dataset KDDCup99 produced by University of New Brunswick) to make comparisons between the proposed and 8 common ensemble algorithms. Particularly, each compared ensemble classifier uses the same 22 base classifiers, so that the differences in terms of the improvements toward the accuracy and reliability upon the base classifiers can be truly revealed. As a result, the proposed rank-based chain-mode consensus is proved to be a more effective ensemble solution than the traditional consensus approach, which outperforms the 8 ensemble algorithms by 20% on almost all compared metrices which include accuracy, precision, recall, F1-score and area under receiver operating characteristic curve.Keywords: consensus, curse of correlation, imbalance classification, rank-based chain-mode ensemble
Procedia PDF Downloads 1343319 Intelligent Rheumatoid Arthritis Identification System Based Image Processing and Neural Classifier
Authors: Abdulkader Helwan
Abstract:
Rheumatoid joint inflammation is characterized as a perpetual incendiary issue which influences the joints by hurting body tissues Therefore, there is an urgent need for an effective intelligent identification system of knee Rheumatoid arthritis especially in its early stages. This paper is to develop a new intelligent system for the identification of Rheumatoid arthritis of the knee utilizing image processing techniques and neural classifier. The system involves two principle stages. The first one is the image processing stage in which the images are processed using some techniques such as RGB to gryascale conversion, rescaling, median filtering, background extracting, images subtracting, segmentation using canny edge detection, and features extraction using pattern averaging. The extracted features are used then as inputs for the neural network which classifies the X-ray knee images as normal or abnormal (arthritic) based on a backpropagation learning algorithm which involves training of the network on 400 X-ray normal and abnormal knee images. The system was tested on 400 x-ray images and the network shows good performance during that phase, resulting in a good identification rate 97%.Keywords: rheumatoid arthritis, intelligent identification, neural classifier, segmentation, backpropoagation
Procedia PDF Downloads 5313318 Similarity Based Retrieval in Case Based Reasoning for Analysis of Medical Images
Authors: M. Dasgupta, S. Banerjee
Abstract:
Content Based Image Retrieval (CBIR) coupled with Case Based Reasoning (CBR) is a paradigm that is becoming increasingly popular in the diagnosis and therapy planning of medical ailments utilizing the digital content of medical images. This paper presents a survey of some of the promising approaches used in the detection of abnormalities in retina images as well in mammographic screening and detection of regions of interest in MRI scans of the brain. We also describe our proposed algorithm to detect hard exudates in fundus images of the retina of Diabetic Retinopathy patients.Keywords: case based reasoning, exudates, retina image, similarity based retrieval
Procedia PDF Downloads 3473317 Attention Multiple Instance Learning for Cancer Tissue Classification in Digital Histopathology Images
Authors: Afaf Alharbi, Qianni Zhang
Abstract:
The identification of malignant tissue in histopathological slides holds significant importance in both clinical settings and pathology research. This paper introduces a methodology aimed at automatically categorizing cancerous tissue through the utilization of a multiple-instance learning framework. This framework is specifically developed to acquire knowledge of the Bernoulli distribution of the bag label probability by employing neural networks. Furthermore, we put forward a neural network based permutation-invariant aggregation operator, equivalent to attention mechanisms, which is applied to the multi-instance learning network. Through empirical evaluation of an openly available colon cancer histopathology dataset, we provide evidence that our approach surpasses various conventional deep learning methods.Keywords: attention multiple instance learning, MIL and transfer learning, histopathological slides, cancer tissue classification
Procedia PDF Downloads 1093316 Detecting Tomato Flowers in Greenhouses Using Computer Vision
Authors: Dor Oppenheim, Yael Edan, Guy Shani
Abstract:
This paper presents an image analysis algorithm to detect and count yellow tomato flowers in a greenhouse with uneven illumination conditions, complex growth conditions and different flower sizes. The algorithm is designed to be employed on a drone that flies in greenhouses to accomplish several tasks such as pollination and yield estimation. Detecting the flowers can provide useful information for the farmer, such as the number of flowers in a row, and the number of flowers that were pollinated since the last visit to the row. The developed algorithm is designed to handle the real world difficulties in a greenhouse which include varying lighting conditions, shadowing, and occlusion, while considering the computational limitations of the simple processor in the drone. The algorithm identifies flowers using an adaptive global threshold, segmentation over the HSV color space, and morphological cues. The adaptive threshold divides the images into darker and lighter images. Then, segmentation on the hue, saturation and volume is performed accordingly, and classification is done according to size and location of the flowers. 1069 images of greenhouse tomato flowers were acquired in a commercial greenhouse in Israel, using two different RGB Cameras – an LG G4 smartphone and a Canon PowerShot A590. The images were acquired from multiple angles and distances and were sampled manually at various periods along the day to obtain varying lighting conditions. Ground truth was created by manually tagging approximately 25,000 individual flowers in the images. Sensitivity analyses on the acquisition angle of the images, periods throughout the day, different cameras and thresholding types were performed. Precision, recall and their derived F1 score were calculated. Results indicate better performance for the view angle facing the flowers than any other angle. Acquiring images in the afternoon resulted with the best precision and recall results. Applying a global adaptive threshold improved the median F1 score by 3%. Results showed no difference between the two cameras used. Using hue values of 0.12-0.18 in the segmentation process provided the best results in precision and recall, and the best F1 score. The precision and recall average for all the images when using these values was 74% and 75% respectively with an F1 score of 0.73. Further analysis showed a 5% increase in precision and recall when analyzing images acquired in the afternoon and from the front viewpoint.Keywords: agricultural engineering, image processing, computer vision, flower detection
Procedia PDF Downloads 3283315 Classification Based on Deep Neural Cellular Automata Model
Authors: Yasser F. Hassan
Abstract:
Deep learning structure is a branch of machine learning science and greet achievement in research and applications. Cellular neural networks are regarded as array of nonlinear analog processors called cells connected in a way allowing parallel computations. The paper discusses how to use deep learning structure for representing neural cellular automata model. The proposed learning technique in cellular automata model will be examined from structure of deep learning. A deep automata neural cellular system modifies each neuron based on the behavior of the individual and its decision as a result of multi-level deep structure learning. The paper will present the architecture of the model and the results of simulation of approach are given. Results from the implementation enrich deep neural cellular automata system and shed a light on concept formulation of the model and the learning in it.Keywords: cellular automata, neural cellular automata, deep learning, classification
Procedia PDF Downloads 1943314 Categorical Metadata Encoding Schemes for Arteriovenous Fistula Blood Flow Sound Classification: Scaling Numerical Representations Leads to Improved Performance
Authors: George Zhou, Yunchan Chen, Candace Chien
Abstract:
Kidney replacement therapy is the current standard of care for end-stage renal diseases. In-center or home hemodialysis remains an integral component of the therapeutic regimen. Arteriovenous fistulas (AVF) make up the vascular circuit through which blood is filtered and returned. Naturally, AVF patency determines whether adequate clearance and filtration can be achieved and directly influences clinical outcomes. Our aim was to build a deep learning model for automated AVF stenosis screening based on the sound of blood flow through the AVF. A total of 311 patients with AVF were enrolled in this study. Blood flow sounds were collected using a digital stethoscope. For each patient, blood flow sounds were collected at 6 different locations along the patient’s AVF. The 6 locations are artery, anastomosis, distal vein, middle vein, proximal vein, and venous arch. A total of 1866 sounds were collected. The blood flow sounds are labeled as “patent” (normal) or “stenotic” (abnormal). The labels are validated from concurrent ultrasound. Our dataset included 1527 “patent” and 339 “stenotic” sounds. We show that blood flow sounds vary significantly along the AVF. For example, the blood flow sound is loudest at the anastomosis site and softest at the cephalic arch. Contextualizing the sound with location metadata significantly improves classification performance. How to encode and incorporate categorical metadata is an active area of research1. Herein, we study ordinal (i.e., integer) encoding schemes. The numerical representation is concatenated to the flattened feature vector. We train a vision transformer (ViT) on spectrogram image representations of the sound and demonstrate that using scalar multiples of our integer encodings improves classification performance. Models are evaluated using a 10-fold cross-validation procedure. The baseline performance of our ViT without any location metadata achieves an AuROC and AuPRC of 0.68 ± 0.05 and 0.28 ± 0.09, respectively. Using the following encodings of Artery:0; Arch: 1; Proximal: 2; Middle: 3; Distal 4: Anastomosis: 5, the ViT achieves an AuROC and AuPRC of 0.69 ± 0.06 and 0.30 ± 0.10, respectively. Using the following encodings of Artery:0; Arch: 10; Proximal: 20; Middle: 30; Distal 40: Anastomosis: 50, the ViT achieves an AuROC and AuPRC of 0.74 ± 0.06 and 0.38 ± 0.10, respectively. Using the following encodings of Artery:0; Arch: 100; Proximal: 200; Middle: 300; Distal 400: Anastomosis: 500, the ViT achieves an AuROC and AuPRC of 0.78 ± 0.06 and 0.43 ± 0.11. respectively. Interestingly, we see that using increasing scalar multiples of our integer encoding scheme (i.e., encoding “venous arch” as 1,10,100) results in progressively improved performance. In theory, the integer values do not matter since we are optimizing the same loss function; the model can learn to increase or decrease the weights associated with location encodings and converge on the same solution. However, in the setting of limited data and computation resources, increasing the importance at initialization either leads to faster convergence or helps the model escape a local minimum.Keywords: arteriovenous fistula, blood flow sounds, metadata encoding, deep learning
Procedia PDF Downloads 863313 A Nonlinear Parabolic Partial Differential Equation Model for Image Enhancement
Authors: Tudor Barbu
Abstract:
We present a robust nonlinear parabolic partial differential equation (PDE)-based denoising scheme in this article. Our approach is based on a second-order anisotropic diffusion model that is described first. Then, a consistent and explicit numerical approximation algorithm is constructed for this continuous model by using the finite-difference method. Finally, our restoration experiments and method comparison, which prove the effectiveness of this proposed technique, are discussed in this paper.Keywords: anisotropic diffusion, finite differences, image denoising and restoration, nonlinear PDE model, anisotropic diffusion, numerical approximation schemes
Procedia PDF Downloads 3113312 An 8-Bit, 100-MSPS Fully Dynamic SAR ADC for Ultra-High Speed Image Sensor
Authors: F. Rarbi, D. Dzahini, W. Uhring
Abstract:
In this paper, a dynamic and power efficient 8-bit and 100-MSPS Successive Approximation Register (SAR) Analog-to-Digital Converter (ADC) is presented. The circuit uses a non-differential capacitive Digital-to-Analog (DAC) architecture segmented by 2. The prototype is produced in a commercial 65-nm 1P7M CMOS technology with 1.2-V supply voltage. The size of the core ADC is 208.6 x 103.6 µm2. The post-layout noise simulation results feature a SNR of 46.9 dB at Nyquist frequency, which means an effective number of bit (ENOB) of 7.5-b. The total power consumption of this SAR ADC is only 1.55 mW at 100-MSPS. It achieves then a figure of merit of 85.6 fJ/step.Keywords: CMOS analog to digital converter, dynamic comparator, image sensor application, successive approximation register
Procedia PDF Downloads 4163311 A Four-Step Ortho-Rectification Procedure for Geo-Referencing Video Streams from a Low-Cost UAV
Authors: B. O. Olawale, C. R. Chatwin, R. C. D. Young, P. M. Birch, F. O. Faithpraise, A. O. Olukiran
Abstract:
Ortho-rectification is the process of geometrically correcting an aerial image such that the scale is uniform. The ortho-image formed from the process is corrected for lens distortion, topographic relief, and camera tilt. This can be used to measure true distances, because it is an accurate representation of the Earth’s surface. Ortho-rectification and geo-referencing are essential to pin point the exact location of targets in video imagery acquired at the UAV platform. This can only be achieved by comparing such video imagery with an existing digital map. However, it is only when the image is ortho-rectified with the same co-ordinate system as an existing map that such a comparison is possible. The video image sequences from the UAV platform must be geo-registered, that is, each video frame must carry the necessary camera information before performing the ortho-rectification process. Each rectified image frame can then be mosaicked together to form a seamless image map covering the selected area. This can then be used for comparison with an existing map for geo-referencing. In this paper, we present a four-step ortho-rectification procedure for real-time geo-referencing of video data from a low-cost UAV equipped with multi-sensor system. The basic procedures for the real-time ortho-rectification are: (1) Decompilation of video stream into individual frames; (2) Finding of interior camera orientation parameters; (3) Finding the relative exterior orientation parameters for each video frames with respect to each other; (4) Finding the absolute exterior orientation parameters, using self-calibration adjustment with the aid of a mathematical model. Each ortho-rectified video frame is then mosaicked together to produce a 2-D planimetric mapping, which can be compared with a well referenced existing digital map for the purpose of georeferencing and aerial surveillance. A test field located in Abuja, Nigeria was used for testing our method. Fifteen minutes video and telemetry data were collected using the UAV and the data collected were processed using the four-step ortho-rectification procedure. The results demonstrated that the geometric measurement of the control field from ortho-images are more reliable than those from original perspective photographs when used to pin point the exact location of targets on the video imagery acquired by the UAV. The 2-D planimetric accuracy when compared with the 6 control points measured by a GPS receiver is between 3 to 5 meters.Keywords: geo-referencing, ortho-rectification, video frame, self-calibration
Procedia PDF Downloads 4773310 A Combination of Independent Component Analysis, Relative Wavelet Energy and Support Vector Machine for Mental State Classification
Authors: Nguyen The Hoang Anh, Tran Huy Hoang, Vu Tat Thang, T. T. Quyen Bui
Abstract:
Mental state classification is an important step for realizing a control system based on electroencephalography (EEG) signals which could benefit a lot of paralyzed people including the locked-in or Amyotrophic Lateral Sclerosis. Considering that EEG signals are nonstationary and often contaminated by various types of artifacts, classifying thoughts into correct mental states is not a trivial problem. In this work, our contribution is that we present and realize a novel model which integrates different techniques: Independent component analysis (ICA), relative wavelet energy, and support vector machine (SVM) for the same task. We applied our model to classify thoughts in two types of experiment whether with two or three mental states. The experimental results show that the presented model outperforms other models using Artificial Neural Network, K-Nearest Neighbors, etc.Keywords: EEG, ICA, SVM, wavelet
Procedia PDF Downloads 3823309 An Intelligent Nondestructive Testing System of Ultrasonic Infrared Thermal Imaging Based on Embedded Linux
Authors: Hao Mi, Ming Yang, Tian-yue Yang
Abstract:
Ultrasonic infrared nondestructive testing is a kind of testing method with high speed, accuracy and localization. However, there are still some problems, such as the detection requires manual real-time field judgment, the methods of result storage and viewing are still primitive. An intelligent non-destructive detection system based on embedded linux is put forward in this paper. The hardware part of the detection system is based on the ARM (Advanced Reduced Instruction Set Computer Machine) core and an embedded linux system is built to realize image processing and defect detection of thermal images. The CLAHE algorithm and the Butterworth filter are used to process the thermal image, and then the boa server and CGI (Common Gateway Interface) technology are used to transmit the test results to the display terminal through the network for real-time monitoring and remote monitoring. The system also liberates labor and eliminates the obstacle of manual judgment. According to the experiment result, the system provides a convenient and quick solution for industrial non-destructive testing.Keywords: remote monitoring, non-destructive testing, embedded Linux system, image processing
Procedia PDF Downloads 2213308 Assessment of Agricultural Land Use Land Cover, Land Surface Temperature and Population Changes Using Remote Sensing and GIS: Southwest Part of Marmara Sea, Turkey
Authors: Melis Inalpulat, Levent Genc
Abstract:
Land Use Land Cover (LULC) changes due to human activities and natural causes have become a major environmental concern. Assessment of temporal remote sensing data provides information about LULC impacts on environment. Land Surface Temperature (LST) is one of the important components for modeling environmental changes in climatological, hydrological, and agricultural studies. In this study, LULC changes (September 7, 1984 and July 8, 2014) especially in agricultural lands together with population changes (1985-2014) and LST status were investigated using remotely sensed and census data in South Marmara Watershed, Turkey. LULC changes were determined using Landsat TM and Landsat OLI data acquired in 1984 and 2014 summers. Six-band TM and OLI images were classified using supervised classification method to prepare LULC map including five classes including Forest (F), Grazing Land (G), Agricultural Land (A), Water Surface (W), and Residential Area-Bare Soil (R-B) classes. The LST image was also derived from thermal bands of the same dates. LULC classification results showed that forest areas, agricultural lands, water surfaces and residential area-bare soils were increased as 65751 ha, 20163 ha, 1924 ha and 20462 ha respectively. In comparison, a dramatic decrement occurred in grazing land (107985 ha) within three decades. The population increased % 29 between years 1984-2014 in whole study area. Along with the natural causes, migration also caused this increase since the study area has an important employment potential. LULC was transformed among the classes due to the expansion in residential, commercial and industrial areas as well as political decisions. In the study, results showed that agricultural lands around the settlement areas transformed to residential areas in 30 years. The LST images showed that mean temperatures were ranged between 26-32 °C in 1984 and 27-33 °C in 2014. Minimum temperature of agricultural lands was increased 3 °C and reached to 23 °C. In contrast, maximum temperature of A class decreased to 41 °C from 44 °C. Considering temperatures of the 2014 R-B class and 1984 status of same areas, it was seen that mean, min and max temperatures increased by 2 °C. As a result, the dynamism of population, LULC and LST resulted in increasing mean and maximum surface temperatures, living spaces/industrial areas and agricultural lands.Keywords: census data, landsat, land surface temperature (LST), land use land cover (LULC)
Procedia PDF Downloads 3913307 Digital Material Characterization Using the Quantum Fourier Transform
Authors: Felix Givois, Nicolas R. Gauger, Matthias Kabel
Abstract:
The efficient digital material characterization is of great interest to many fields of application. It consists of the following three steps. First, a 3D reconstruction of 2D scans must be performed. Then, the resulting gray-value image of the material sample is enhanced by image processing methods. Finally, partial differential equations (PDE) are solved on the segmented image, and by averaging the resulting solutions fields, effective properties like stiffness or conductivity can be computed. Due to the high resolution of current CT images, the latter is typically performed with matrix-free solvers. Among them, a solver that uses the explicit formula of the Green-Eshelby operator in Fourier space has been proposed by Moulinec and Suquet. Its algorithmic, most complex part is the Fast Fourier Transformation (FFT). In our talk, we will discuss the potential quantum advantage that can be obtained by replacing the FFT with the Quantum Fourier Transformation (QFT). We will especially show that the data transfer for noisy intermediate-scale quantum (NISQ) devices can be improved by using appropriate boundary conditions for the PDE, which also allows using semi-classical versions of the QFT. In the end, we will compare the results of the QFT-based algorithm for simple geometries with the results of the FFT-based homogenization method.Keywords: most likelihood amplitude estimation (MLQAE), numerical homogenization, quantum Fourier transformation (QFT), NISQ devises
Procedia PDF Downloads 753306 View Synthesis of Kinetic Depth Imagery for 3D Security X-Ray Imaging
Authors: O. Abusaeeda, J. P. O. Evans, D. Downes
Abstract:
We demonstrate the synthesis of intermediary views within a sequence of X-ray images that exhibit depth from motion or kinetic depth effect in a visual display. Each synthetic image replaces the requirement for a linear X-ray detector array during the image acquisition process. Scale invariant feature transform, SIFT, in combination with epipolar morphing is employed to produce synthetic imagery. Comparison between synthetic and ground truth images is reported to quantify the performance of the approach. Our work is a key aspect in the development of a 3D imaging modality for the screening of luggage at airport checkpoints. This programme of research is in collaboration with the UK Home Office and the US Dept. of Homeland Security.Keywords: X-ray, kinetic depth, KDE, view synthesis
Procedia PDF Downloads 2633305 Moving Object Detection Using Histogram of Uniformly Oriented Gradient
Authors: Wei-Jong Yang, Yu-Siang Su, Pau-Choo Chung, Jar-Ferr Yang
Abstract:
Moving object detection (MOD) is an important issue in advanced driver assistance systems (ADAS). There are two important moving objects, pedestrians and scooters in ADAS. In real-world systems, there exist two important challenges for MOD, including the computational complexity and the detection accuracy. The histogram of oriented gradient (HOG) features can easily detect the edge of object without invariance to changes in illumination and shadowing. However, to reduce the execution time for real-time systems, the image size should be down sampled which would lead the outlier influence to increase. For this reason, we propose the histogram of uniformly-oriented gradient (HUG) features to get better accurate description of the contour of human body. In the testing phase, the support vector machine (SVM) with linear kernel function is involved. Experimental results show the correctness and effectiveness of the proposed method. With SVM classifiers, the real testing results show the proposed HUG features achieve better than classification performance than the HOG ones.Keywords: moving object detection, histogram of oriented gradient, histogram of uniformly-oriented gradient, linear support vector machine
Procedia PDF Downloads 5923304 Analysis of Two Phase Hydrodynamics in a Column Flotation by Particle Image Velocimetry
Authors: Balraju Vadlakonda, Narasimha Mangadoddy
Abstract:
The hydrodynamic behavior in a laboratory column flotation was analyzed using particle image velocimetry. For complete characterization of column flotation, it is necessary to determine the flow velocity induced by bubbles in the liquid phase, the bubble velocity and bubble characteristics:diameter,shape and bubble size distribution. An experimental procedure for analyzing simultaneous, phase-separated velocity measurements in two-phase flows was introduced. The non-invasive PIV technique has used to quantify the instantaneous flow field, as well as the time averaged flow patterns in selected planes of the column. Using the novel particle velocimetry (PIV) technique by the combination of fluorescent tracer particles, shadowgraphy and digital phase separation with masking technique measured the bubble velocity as well as the Reynolds stresses in the column. Axial and radial mean velocities as well as fluctuating components were determined for both phases by averaging the sufficient number of double images. Bubble size distribution was cross validated with high speed video camera. Average turbulent kinetic energy of bubble were analyzed. Different air flow rates were considered in the experiments.Keywords: particle image velocimetry (PIV), bubble velocity, bubble diameter, turbulent kinetic energy
Procedia PDF Downloads 5103303 A Framework of Product Information Service System Using Mobile Image Retrieval and Text Mining Techniques
Authors: Mei-Yi Wu, Shang-Ming Huang
Abstract:
The online shoppers nowadays often search the product information on the Internet using some keywords of products. To use this kind of information searching model, shoppers should have a preliminary understanding about their interesting products and choose the correct keywords. However, if the products are first contact (for example, the worn clothes or backpack of passengers which you do not have any idea about the brands), these products cannot be retrieved due to insufficient information. In this paper, we discuss and study the applications in E-commerce using image retrieval and text mining techniques. We design a reasonable E-commerce application system containing three layers in the architecture to provide users product information. The system can automatically search and retrieval similar images and corresponding web pages on Internet according to the target pictures which taken by users. Then text mining techniques are applied to extract important keywords from these retrieval web pages and search the prices on different online shopping stores with these keywords using a web crawler. Finally, the users can obtain the product information including photos and prices of their favorite products. The experiments shows the efficiency of proposed system.Keywords: mobile image retrieval, text mining, product information service system, online marketing
Procedia PDF Downloads 3573302 Training a Neural Network to Segment, Detect and Recognize Numbers
Authors: Abhisek Dash
Abstract:
This study had three neural networks, one for number segmentation, one for number detection and one for number recognition all of which are coupled to one another. All networks were trained on the MNIST dataset and were convolutional. It was assumed that the images had lighter background and darker foreground. The segmentation network took 28x28 images as input and had sixteen outputs. Segmentation training starts when a dark pixel is encountered. Taking a window(7x7) over that pixel as focus, the eight neighborhood of the focus was checked for further dark pixels. The segmentation network was then trained to move in those directions which had dark pixels. To this end the segmentation network had 16 outputs. They were arranged as “go east”, ”don’t go east ”, “go south east”, “don’t go south east”, “go south”, “don’t go south” and so on w.r.t focus window. The focus window was resized into a 28x28 image and the network was trained to consider those neighborhoods which had dark pixels. The neighborhoods which had dark pixels were pushed into a queue in a particular order. The neighborhoods were then popped one at a time stitched to the existing partial image of the number one at a time and trained on which neighborhoods to consider when the new partial image was presented. The above process was repeated until the image was fully covered by the 7x7 neighborhoods and there were no more uncovered black pixels. During testing the network scans and looks for the first dark pixel. From here on the network predicts which neighborhoods to consider and segments the image. After this step the group of neighborhoods are passed into the detection network. The detection network took 28x28 images as input and had two outputs denoting whether a number was detected or not. Since the ground truth of the bounds of a number was known during training the detection network outputted in favor of number not found until the bounds were not met and vice versa. The recognition network was a standard CNN that also took 28x28 images and had 10 outputs for recognition of numbers from 0 to 9. This network was activated only when the detection network votes in favor of number detected. The above methodology could segment connected and overlapping numbers. Additionally the recognition unit was only invoked when a number was detected which minimized false positives. It also eliminated the need for rules of thumb as segmentation is learned. The strategy can also be extended to other characters as well.Keywords: convolutional neural networks, OCR, text detection, text segmentation
Procedia PDF Downloads 1603301 Depth Estimation in DNN Using Stereo Thermal Image Pairs
Authors: Ahmet Faruk Akyuz, Hasan Sakir Bilge
Abstract:
Depth estimation using stereo images is a challenging problem in computer vision. Many different studies have been carried out to solve this problem. With advancing machine learning, tackling this problem is often done with neural network-based solutions. The images used in these studies are mostly in the visible spectrum. However, the need to use the Infrared (IR) spectrum for depth estimation has emerged because it gives better results than visible spectra in some conditions. At this point, we recommend using thermal-thermal (IR) image pairs for depth estimation. In this study, we used two well-known networks (PSMNet, FADNet) with minor modifications to demonstrate the viability of this idea.Keywords: thermal stereo matching, deep neural networks, CNN, Depth estimation
Procedia PDF Downloads 2783300 Sentiment Analysis of Consumers’ Perceptions on Social Media about the Main Mobile Providers in Jamaica
Authors: Sherrene Bogle, Verlia Bogle, Tyrone Anderson
Abstract:
In recent years, organizations have become increasingly interested in the possibility of analyzing social media as a means of gaining meaningful feedback about their products and services. The aspect based sentiment analysis approach is used to predict the sentiment for Twitter datasets for Digicel and Lime, the main mobile companies in Jamaica, using supervised learning classification techniques. The results indicate an average of 82.2 percent accuracy in classifying tweets when comparing three separate classification algorithms against the purported baseline of 70 percent and an average root mean squared error of 0.31. These results indicate that the analysis of sentiment on social media in order to gain customer feedback can be a viable solution for mobile companies looking to improve business performance.Keywords: machine learning, sentiment analysis, social media, supervised learning
Procedia PDF Downloads 4403299 Visual Search Based Indoor Localization in Low Light via RGB-D Camera
Authors: Yali Zheng, Peipei Luo, Shinan Chen, Jiasheng Hao, Hong Cheng
Abstract:
Most of traditional visual indoor navigation algorithms and methods only consider the localization in ordinary daytime, while we focus on the indoor re-localization in low light in the paper. As RGB images are degraded in low light, less discriminative infrared and depth image pairs are taken, as the input, by RGB-D cameras, the most similar candidates, as the output, are searched from databases which is built in the bag-of-word framework. Epipolar constraints can be used to relocalize the query infrared and depth image sequence. We evaluate our method in two datasets captured by Kinect2. The results demonstrate very promising re-localization results for indoor navigation system in low light environments.Keywords: indoor navigation, low light, RGB-D camera, vision based
Procedia PDF Downloads 4583298 Content Based Video Retrieval System Using Principal Object Analysis
Authors: Van Thinh Bui, Anh Tuan Tran, Quoc Viet Ngo, The Bao Pham
Abstract:
Video retrieval is a searching problem on videos or clips based on content in which they are relatively close to an input image or video. The application of this retrieval consists of selecting video in a folder or recognizing a human in security camera. However, some recent approaches have been in challenging problem due to the diversity of video types, frame transitions and camera positions. Besides, that an appropriate measures is selected for the problem is a question. In order to overcome all obstacles, we propose a content-based video retrieval system in some main steps resulting in a good performance. From a main video, we process extracting keyframes and principal objects using Segmentation of Aggregating Superpixels (SAS) algorithm. After that, Speeded Up Robust Features (SURF) are selected from those principal objects. Then, the model “Bag-of-words” in accompanied by SVM classification are applied to obtain the retrieval result. Our system is performed on over 300 videos in diversity from music, history, movie, sports, and natural scene to TV program show. The performance is evaluated in promising comparison to the other approaches.Keywords: video retrieval, principal objects, keyframe, segmentation of aggregating superpixels, speeded up robust features, bag-of-words, SVM
Procedia PDF Downloads 3003297 Mapping of Geological Structures Using Aerial Photography
Authors: Ankit Sharma, Mudit Sachan, Anurag Prakash
Abstract:
Rapid growth in data acquisition technologies through drones, have led to advances and interests in collecting high-resolution images of geological fields. Being advantageous in capturing high volume of data in short flights, a number of challenges have to overcome for efficient analysis of this data, especially while data acquisition, image interpretation and processing. We introduce a method that allows effective mapping of geological fields using photogrammetric data of surfaces, drainage area, water bodies etc, which will be captured by airborne vehicles like UAVs, we are not taking satellite images because of problems in adequate resolution, time when it is captured may be 1 yr back, availability problem, difficult to capture exact image, then night vision etc. This method includes advanced automated image interpretation technology and human data interaction to model structures and. First Geological structures will be detected from the primary photographic dataset and the equivalent three dimensional structures would then be identified by digital elevation model. We can calculate dip and its direction by using the above information. The structural map will be generated by adopting a specified methodology starting from choosing the appropriate camera, camera’s mounting system, UAVs design ( based on the area and application), Challenge in air borne systems like Errors in image orientation, payload problem, mosaicing and geo referencing and registering of different images to applying DEM. The paper shows the potential of using our method for accurate and efficient modeling of geological structures, capture particularly from remote, of inaccessible and hazardous sites.Keywords: digital elevation model, mapping, photogrammetric data analysis, geological structures
Procedia PDF Downloads 6853296 Encephalon-An Implementation of a Handwritten Mathematical Expression Solver
Authors: Shreeyam, Ranjan Kumar Sah, Shivangi
Abstract:
Recognizing and solving handwritten mathematical expressions can be a challenging task, particularly when certain characters are segmented and classified. This project proposes a solution that uses Convolutional Neural Network (CNN) and image processing techniques to accurately solve various types of equations, including arithmetic, quadratic, and trigonometric equations, as well as logical operations like logical AND, OR, NOT, NAND, XOR, and NOR. The proposed solution also provides a graphical solution, allowing users to visualize equations and their solutions. In addition to equation solving, the platform, called CNNCalc, offers a comprehensive learning experience for students. It provides educational content, a quiz platform, and a coding platform for practicing programming skills in different languages like C, Python, and Java. This all-in-one solution makes the learning process engaging and enjoyable for students. The proposed methodology includes horizontal compact projection analysis and survey for segmentation and binarization, as well as connected component analysis and integrated connected component analysis for character classification. The compact projection algorithm compresses the horizontal projections to remove noise and obtain a clearer image, contributing to the accuracy of character segmentation. Experimental results demonstrate the effectiveness of the proposed solution in solving a wide range of mathematical equations. CNNCalc provides a powerful and user-friendly platform for solving equations, learning, and practicing programming skills. With its comprehensive features and accurate results, CNNCalc is poised to revolutionize the way students learn and solve mathematical equations. The platform utilizes a custom-designed Convolutional Neural Network (CNN) with image processing techniques to accurately recognize and classify symbols within handwritten equations. The compact projection algorithm effectively removes noise from horizontal projections, leading to clearer images and improved character segmentation. Experimental results demonstrate the accuracy and effectiveness of the proposed solution in solving a wide range of equations, including arithmetic, quadratic, trigonometric, and logical operations. CNNCalc features a user-friendly interface with a graphical representation of equations being solved, making it an interactive and engaging learning experience for users. The platform also includes tutorials, testing capabilities, and programming features in languages such as C, Python, and Java. Users can track their progress and work towards improving their skills. CNNCalc is poised to revolutionize the way students learn and solve mathematical equations with its comprehensive features and accurate results.Keywords: AL, ML, hand written equation solver, maths, computer, CNNCalc, convolutional neural networks
Procedia PDF Downloads 1213295 Visual Inspection of Road Conditions Using Deep Convolutional Neural Networks
Authors: Christos Theoharatos, Dimitris Tsourounis, Spiros Oikonomou, Andreas Makedonas
Abstract:
This paper focuses on the problem of visually inspecting and recognizing the road conditions in front of moving vehicles, targeting automotive scenarios. The goal of road inspection is to identify whether the road is slippery or not, as well as to detect possible anomalies on the road surface like potholes or body bumps/humps. Our work is based on an artificial intelligence methodology for real-time monitoring of road conditions in autonomous driving scenarios, using state-of-the-art deep convolutional neural network (CNN) techniques. Initially, the road and ego lane are segmented within the field of view of the camera that is integrated into the front part of the vehicle. A novel classification CNN is utilized to identify among plain and slippery road textures (e.g., wet, snow, etc.). Simultaneously, a robust detection CNN identifies severe surface anomalies within the ego lane, such as potholes and speed bumps/humps, within a distance of 5 to 25 meters. The overall methodology is illustrated under the scope of an integrated application (or system), which can be integrated into complete Advanced Driver-Assistance Systems (ADAS) systems that provide a full range of functionalities. The outcome of the proposed techniques present state-of-the-art detection and classification results and real-time performance running on AI accelerator devices like Intel’s Myriad 2/X Vision Processing Unit (VPU).Keywords: deep learning, convolutional neural networks, road condition classification, embedded systems
Procedia PDF Downloads 1333294 Mammographic Multi-View Cancer Identification Using Siamese Neural Networks
Authors: Alisher Ibragimov, Sofya Senotrusova, Aleksandra Beliaeva, Egor Ushakov, Yuri Markin
Abstract:
Mammography plays a critical role in screening for breast cancer in women, and artificial intelligence has enabled the automatic detection of diseases in medical images. Many of the current techniques used for mammogram analysis focus on a single view (mediolateral or craniocaudal view), while in clinical practice, radiologists consider multiple views of mammograms from both breasts to make a correct decision. Consequently, computer-aided diagnosis (CAD) systems could benefit from incorporating information gathered from multiple views. In this study, the introduce a method based on a Siamese neural network (SNN) model that simultaneously analyzes mammographic images from tri-view: bilateral and ipsilateral. In this way, when a decision is made on a single image of one breast, attention is also paid to two other images – a view of the same breast in a different projection and an image of the other breast as well. Consequently, the algorithm closely mimics the radiologist's practice of paying attention to the entire examination of a patient rather than to a single image. Additionally, to the best of our knowledge, this research represents the first experiments conducted using the recently released Vietnamese dataset of digital mammography (VinDr-Mammo). On an independent test set of images from this dataset, the best model achieved an AUC of 0.87 per image. Therefore, this suggests that there is a valuable automated second opinion in the interpretation of mammograms and breast cancer diagnosis, which in the future may help to alleviate the burden on radiologists and serve as an additional layer of verification.Keywords: breast cancer, computer-aided diagnosis, deep learning, multi-view mammogram, siamese neural network
Procedia PDF Downloads 1363293 The Residual Effects of Special Merchandising Sections on Consumers' Shopping Behavior
Authors: Shih-Ching Wang, Mark Lang
Abstract:
This paper examines the secondary effects and consequences of special displays on subsequent shopping behavior. Special displays are studied as a prominent form of in-store or shopper marketing activity. Two experiments are performed using special value and special quality-oriented displays in an online simulated store environment. The impact of exposure to special displays on mindsets and resulting product choices are tested in a shopping task. Impact on store image is also tested. The experiments find that special displays do trigger shopping mindsets that affect product choices and shopping basket composition and value. There are intended and unintended positive and negative effects found. Special value displays improve store price image but trigger a price sensitive shopping mindset that causes more lower-priced items to be purchased, lowering total basket dollar value. Special natural food displays improve store quality image and trigger a quality-oriented mindset that causes fewer lower-priced items to be purchased, increasing total basket dollar value. These findings extend the theories of product categorization, mind-sets, and price sensitivity found in communication research into the retail store environment. Findings also warn retailers to consider the total effects and consequences of special displays when designing and executing in-store or shopper marketing activity.Keywords: special displays, mindset, shopping behavior, price consciousness, product categorization, store image
Procedia PDF Downloads 281