Reference camera poses for the query images of the San Francisco Landmarks dataset
This package provides the 6DoF reference camera poses computed in our CVPR17 paper [1]. It also includes the localization benchmarks (Figures 3a, 3b, and 3c in the paper) that evaluate the positional accuracy of 2D image-based and 3D structure-based localization baselines.
Data format description
The reference poses for the query images of the San Francisco Landmarks dataset [2,3] are provided in two formats:
- reference_poses_442.txt (plain text)

  Each line in this file contains the query name, the rotation as a quaternion, and the camera position in UTM coordinates, e.g.

  0 <query name> <1x4 quaternion> <1x3 camera position>

  Note that we call C the camera position, where t = -R*C for P = [R | t].

- reference_poses_442.mat (MATLAB binary)

  This file contains a struct array poses with the fields name and P. For example, poses(1).name returns the name of the query image and poses(1).P returns the 3x4 projection matrix P = [R | t] of the query in UTM coordinates.

- sf0bundler2utm_similarity_transformation.txt (plain text)

  This file contains the similarity transformation from the SF-0 model [4,5] to UTM coordinates:

  cs (1x1 scale) Rs (3x3 rotation matrix) ts (3x1 translation vector)

  A 3D point X in SF-0 coordinates can be transformed to UTM coordinates by Xutm = cs * Rs * X + ts.
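As a minimal sketch of how a line of reference_poses_442.txt could be turned into a projection matrix P = [R | t] with t = -R*C: the quaternion ordering (assumed here to be (w, x, y, z)) and the exact column layout are assumptions, not specified by this README, so verify them against the data before relying on this.

```python
import numpy as np

def quat_to_rot(q):
    """Convert a unit quaternion, assumed (w, x, y, z) order, to a 3x3 rotation matrix."""
    w, x, y, z = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

def pose_from_line(line):
    """Parse one pose line into (query name, 3x4 matrix P = [R | t]).

    Assumes the last 7 fields are the 1x4 quaternion followed by the 1x3
    camera position C, with the query name immediately before them (this
    tolerates an optional leading index column).
    """
    fields = line.split()
    name = fields[-8]
    q = np.array(fields[-7:-3], dtype=float)
    C = np.array(fields[-3:], dtype=float)
    R = quat_to_rot(q / np.linalg.norm(q))  # normalize for safety
    t = -R @ C                              # t = -R*C as stated above
    return name, np.hstack([R, t[:, None]])

# Hypothetical example line (identity rotation, camera at (1, 2, 3) in UTM):
name, P = pose_from_line("0 query.jpg 1 0 0 0 1 2 3")
```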
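The similarity transformation Xutm = cs * Rs * X + ts can be applied as follows; the numeric values of cs, Rs, and ts below are placeholders, not the actual contents of sf0bundler2utm_similarity_transformation.txt.

```python
import numpy as np

def sf0_to_utm(X, cs, Rs, ts):
    """Map a 3D point X from SF-0 coordinates to UTM: Xutm = cs * Rs @ X + ts."""
    return cs * (Rs @ X) + ts

# Placeholder transformation parameters (the real ones come from the .txt file):
cs = 2.0                              # 1x1 scale
Rs = np.eye(3)                        # 3x3 rotation matrix
ts = np.array([10.0, 20.0, 30.0])     # 3x1 translation vector

X = np.array([1.0, 1.0, 1.0])         # a point in SF-0 coordinates
Xutm = sf0_to_utm(X, cs, Rs, ts)      # → [12. 22. 32.] with these placeholders
```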
References
[1] T. Sattler, A. Torii, J. Sivic, M. Pollefeys, H. Taira, M. Okutomi, T. Pajdla: Are Large-Scale 3D Models Really Necessary for Accurate Visual Localization? CVPR 2017.
[2] D. Chen, G. Baatz, K. Koeser, S. Tsai, R. Vedantham, T. Pylvanainen, K. Roimela, X. Chen, J. Bach, M. Pollefeys, B. Girod, and R. Grzeszczuk: City-scale landmark identification on mobile devices. CVPR 2011.
[3] San Francisco Landmark Dataset. https://purl.stanford.edu/vn158kj2087
[4] Y. Li, N. Snavely, D. Huttenlocher, P. Fua: Worldwide Pose Estimation using 3D Point Clouds. ECCV 2012.
[5] http://landmark.cs.cornell.edu/