Multi-Modal Pedestrian Detection with Misalignment

In this research, we study pedestrian detection from RGB (visible)
and long-wavelength infrared (thermal) images when misalignment between them is present.
Our main concern is how to efficiently use the information from both modalities.
Contributions

- We analyzed the misalignment problem in existing multi-modal detectors.
- We proposed new evaluation metrics for multi-modal detection: the multi-modal IoU (IoU_M) and the multi-modal miss rate (MR_M); see the sketch after this list.
- We proposed a multi-modal Faster R-CNN for pedestrian detection based on modal-wise regression and IoU_M.
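To make the IoU_M idea concrete, here is a minimal illustrative sketch, not the authors' implementation: it assumes each detection and each ground-truth object carries one bounding box per modality (RGB and thermal), and it illustrates IoU_M as the average of the two per-modal IoUs. That combination rule and the variable names are assumptions for illustration only; the exact definitions of IoU_M and MR_M, and the modal-wise regression head, are given in the paper.

```python
# Illustrative sketch only (assumed formulation, not the paper's exact definition).
# Boxes are axis-aligned (x1, y1, x2, y2).

def iou(box_a, box_b):
    """Standard single-modal IoU between two boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def multi_modal_iou(pred_rgb, pred_thermal, gt_rgb, gt_thermal):
    """Illustrative IoU_M: here taken as the mean of the RGB and thermal IoUs.

    Averaging is an assumption made for this sketch; the point is that a
    detection is matched against per-modal ground-truth boxes, so a box
    that fits only one modality under misalignment scores lower.
    """
    return 0.5 * (iou(pred_rgb, gt_rgb) + iou(pred_thermal, gt_thermal))

# Example: the RGB box matches perfectly, but the thermal box is shifted
# by misalignment, so IoU_M drops below the single-modal RGB IoU of 1.0.
pred = ((10, 10, 50, 100), (18, 10, 58, 100))   # (RGB box, thermal box)
gt   = ((10, 10, 50, 100), (10, 10, 50, 100))
print(multi_modal_iou(*pred, *gt))  # ~0.83
```

With per-modal boxes available, MR_M can then be computed like the usual log-average miss rate but with IoU_M as the matching criterion, so that a detector is only credited when it localizes the pedestrian in both modalities.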
Framework Comparison

Proposed Network Overview

Visualization Examples
(Qualitative comparison of detection results: MSDS-RCNN | AR-CNN | MBNet | Ours)

References
MSDS-RCNN: Chengyang Li, Dan Song, Ruofeng Tong, and Min Tang. Multispectral Pedestrian Detection via Simultaneous Detection and Segmentation. In British Machine Vision Conference (BMVC), 2018.
AR-CNN: Lu Zhang, Xiangyu Zhu, Xiangyu Chen, Xu Yang, Zhen Lei, and Zhiyong Liu. Weakly Aligned Cross-Modal Learning for Multispectral Pedestrian Detection. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2019.
MBNet: Kailai Zhou, Linsen Chen, and Xun Cao. Improving Multispectral Pedestrian Detection by Addressing Modality Imbalance Problems. In Proceedings of the European Conference on Computer Vision (ECCV), pages 787-803, 2020.
Publications
- Multi-Modal Pedestrian Detection with Large Misalignment Based on Modal-Wise Regression and Multi-Modal IoU [arXiv]
- Napat Wanchaitanawong, Masayuki Tanaka, Takashi Shibata, and Masatoshi Okutomi
- Proceedings of the 17th International Conference on Machine Vision Applications (MVA2021), pp.O1-1-4-1-6, July 2021
- Multi-Modal Pedestrian Detection with Large Misalignment Based on Modal-Wise Regression and Multi-Modal IoU [SPIE]
- Napat Wanchaitanawong, Masayuki Tanaka, Takashi Shibata, and Masatoshi Okutomi
- Journal of Electronic Imaging, Vol.32, Issue 1, pp.013025-1-19, February 2023