A Neural Network Classifier for Identifying Duplicate Image Entries in Real-Estate Databases
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 87786
A Neural Network Classifier for Identifying Duplicate Image Entries in Real-Estate Databases

Authors: Sergey Ermolin, Olga Ermolin

Abstract:

A Deep Convolution Neural Network with Triplet Loss is used to identify duplicate images in real-estate advertisements in the presence of image artifacts such as watermarking, cropping, hue/brightness adjustment, and others. The effects of batch normalization, spatial dropout, and various convergence methodologies on the resulting detection accuracy are discussed. For comparative Return-on-Investment study (per industry request), end-2-end performance is benchmarked on both Nvidia Titan GPUs and Intel’s Xeon CPUs. A new real-estate dataset from San Francisco Bay Area is used for this work. Sufficient duplicate detection accuracy is achieved to supplement other database-grounded methods of duplicate removal. The implemented method is used in a Proof-of-Concept project in the real-estate industry.

Keywords: visual recognition, convolutional neural networks, triplet loss, spatial batch normalization with dropout, duplicate removal, advertisement technologies, performance benchmarking

Procedia PDF Downloads 341