Unsupervised Deep Learning Model for Homography Estimation

Authors

  • Lang Zhou

DOI:

https://doi.org/10.62051/ijcsit.v4n3.46

Keywords:

Deep homography estimation, Unsupervised, Image stitching

Abstract

This paper proposes an unsupervised deep learning model for homography estimation that addresses limitations of both traditional feature-based methods and supervised learning approaches. By using reprojection error as the optimization objective, the model eliminates the need for labeled data while still achieving precise homography estimation. The framework comprises a feature extraction module, a feature difference module, and a homography regression network. Extensive experiments on the MS-COCO and HPatches datasets show that the proposed model achieves a mean reprojection error (MRE) of 3.67 pixels and an accuracy (ACC) of 88.3%, closely approaching the performance of supervised methods such as DeepHomography while significantly outperforming the classical SIFT + RANSAC pipeline. The model's lightweight design requires only 12 ms per estimation, making it suitable for real-time applications. Ablation studies validate the effectiveness of key components, in particular the feature difference module and the regularization loss, and highlight their contributions to performance. Compared with traditional methods, the proposed approach is more robust to varying lighting conditions, viewpoint changes, and noise interference. Because it removes the dependency on labeled data, it also reduces application costs and barriers. This unsupervised framework presents a practical and efficient solution for homography estimation and offers potential for broader applications in multi-view geometry and 3D reconstruction tasks.
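The abstract reports a mean reprojection error (MRE) in pixels but does not spell out how it is computed. A minimal sketch of one common point-based definition follows, assuming the error is the average Euclidean distance between source points warped by the estimated homography and their reference positions. The function names, the 128 x 128 patch corners, and the two translation matrices are hypothetical illustrations, not taken from the paper.

```python
import math


def apply_homography(H, point):
    """Map a 2D point through a 3x3 homography given as nested lists."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    # Divide by the homogeneous coordinate to return to pixel coordinates.
    return (u / w, v / w)


def mean_reprojection_error(H_est, src_points, dst_points):
    """Average Euclidean distance (pixels) between H_est(src) and dst."""
    errors = []
    for src, dst in zip(src_points, dst_points):
        px, py = apply_homography(H_est, src)
        errors.append(math.hypot(px - dst[0], py - dst[1]))
    return sum(errors) / len(errors)


# Hypothetical example: corners of a 128 x 128 patch (assumed size).
corners = [(0.0, 0.0), (127.0, 0.0), (127.0, 127.0), (0.0, 127.0)]

# Assumed reference homography: pure translation by (5, -2) pixels.
H_ref = [[1.0, 0.0, 5.0], [0.0, 1.0, -2.0], [0.0, 0.0, 1.0]]
# Assumed estimate: translation by (8, 2), i.e. off by (3, 4) pixels.
H_est = [[1.0, 0.0, 8.0], [0.0, 1.0, 2.0], [0.0, 0.0, 1.0]]

reference = [apply_homography(H_ref, p) for p in corners]
mre = mean_reprojection_error(H_est, corners, reference)
print(f"MRE = {mre:.2f} px")
```

In this toy case every corner is displaced by the same (3, 4) offset, so the per-point error and the mean are both 5 pixels. The paper's own metric may use a different point set or normalization.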

Published

23-12-2024

How to Cite

Zhou, L. (2024). Unsupervised Deep Learning Model for Homography Estimation. International Journal of Computer Science and Information Technology, 4(3), 408-419. https://doi.org/10.62051/ijcsit.v4n3.46