Depth Estimation in DNN Using Stereo Thermal Image Pairs

Authors: Ahmet Faruk Akyuz, Hasan Sakir Bilge


Depth estimation from stereo images is a challenging problem in computer vision, and many studies have addressed it. With advances in machine learning, the problem is now often tackled with neural network-based solutions. The images used in these studies are mostly in the visible spectrum. However, a need to use the infrared (IR) spectrum for depth estimation has emerged, because it gives better results than the visible spectrum under some conditions. We therefore recommend using thermal-thermal (IR) image pairs for depth estimation. In this study, we used two well-known networks, PSMNet and FADNet, with minor modifications to demonstrate the viability of this idea.

Keywords: thermal stereo matching, depth estimation, deep neural networks, CNN


