Unlike dense stereo, optical flow or multi-view stereo, template-based tracking lacks benchmark datasets that allow a fair comparison between state-of-the-art algorithms. Until now, the performance and robustness of template-based tracking algorithms have mainly been evaluated on synthetically generated image sequences. The evaluation is therefore often intrinsically biased.
This website accompanies our upcoming ISMAR 2009 paper "A Dataset and Evaluation Methodology for Template-based Tracking Algorithms", in which we describe how we acquired image sequences of real scenes with very precise and accurate ground truth poses, using an industrial camera rigidly mounted on the end-effector of a high-precision robotic measurement arm. For the acquisition, we considered most of the critical parameters that influence the tracking results: the texture richness and the texture repeatability of the objects to be tracked, the camera motion and speed, the changes of the object scale in the images, and the variations of the lighting conditions over time. We designed an evaluation scheme for object detection and inter-frame tracking algorithms and used the image sequences to apply this scheme to several state-of-the-art algorithms. The image sequences are freely available for testing, submitting and evaluating new template-based tracking algorithms.

On this page you will find the datasets generated so far. Each dataset consists of a movie, an image of the tracking target, the intrinsics of the camera used, and a file providing undistorted ground truth positions for every 250th frame.
All movies consist of 1200 frames, and we offer the movies both as they were captured (i.e. distorted) and rectified (i.e. undistorted, using parameters from undist.txt). The distortion model we used is from section 2.3.2 of the book "Nahbereichsphotogrammetrie" by Prof. Luhmann, second edition, 2003.
There are five movies per target focusing on:
The movies are encoded with the lossless FFV1 codec from the ffmpeg project; a DirectShow codec is available at Sourceforge. You can use VirtualDub to convert the sequences into still images if necessary.
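As an alternative to VirtualDub, the ffmpeg command-line tool can also write a movie out as numbered still images. The following is a minimal sketch that only builds the command; the file names `sequence.avi` and `frames` are hypothetical placeholders, and PNG is chosen here as an assumption because it is lossless, like FFV1.

```python
from pathlib import Path


def ffmpeg_extract_command(movie, out_dir, pattern="frame_%04d.png"):
    """Build an ffmpeg command that writes every frame of `movie` as a PNG.

    `movie`, `out_dir` and `pattern` are placeholders chosen for this
    sketch; they are not names used by the dataset itself.
    """
    # "%04d" leaves room for the 1200 frames of each sequence.
    return ["ffmpeg", "-i", str(movie), str(Path(out_dir) / pattern)]


if __name__ == "__main__":
    cmd = ffmpeg_extract_command("sequence.avi", "frames")
    print(" ".join(cmd))
    # To actually run it (requires ffmpeg on the PATH and an existing
    # output directory):
    #   import subprocess; subprocess.run(cmd, check=True)
```

Check how the numbering of the extracted images relates to the frame indices used in the ground truth file before comparing results, since the starting index of the image files depends on the tool and its options.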
The task is to detect the target image in the frames of the movie. All reference targets are 640x480 images. For every 250th frame, we provide the coordinates of four corner points placed at the pixels (±512; ±384), where the origin of the tracking target is at its center (see the image on the right: the white frame represents the 640x480 px target, and the reference points given for initialization lie on its diagonals). All images have their origin in the upper left corner.
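The relation between the two coordinate conventions can be sketched as follows. The four reference points are simply the corners of the 640x480 target, (±320; ±240), scaled by a factor of 1.6, which is why they lie on the diagonals outside the white frame. The sketch assumes that the u and v axes of the centered target frame point in the same directions as the image axes and that the center sits at (320; 240); neither detail is stated on this page, so verify both against the initialization frames.

```python
TARGET_WIDTH = 640
TARGET_HEIGHT = 480

# Reference points in the centered target frame, in the order used
# for the log files: (+512,+384), (-512,+384), (-512,-384), (+512,-384).
REFERENCE_POINTS = [(512, 384), (-512, 384), (-512, -384), (512, -384)]


def to_upper_left_origin(point, width=TARGET_WIDTH, height=TARGET_HEIGHT):
    """Shift a point from the centered frame to an upper-left origin.

    Assumption: both frames share the same axis directions, and the
    target center is at (width / 2, height / 2).
    """
    u, v = point
    return (u + width / 2, v + height / 2)


def scale_relative_to_corners(point, width=TARGET_WIDTH, height=TARGET_HEIGHT):
    """Return how far a point lies along the diagonal, per axis,
    relative to the target corner (1.0 means exactly on the corner)."""
    u, v = point
    return (abs(u) / (width / 2), abs(v) / (height / 2))


if __name__ == "__main__":
    for p in REFERENCE_POINTS:
        print(p, to_upper_left_origin(p), scale_relative_to_corners(p))
```

Under these assumptions the reference points fall outside the 640x480 target image, for example at (832; 624) for the point (+512; +384).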


We offer to evaluate the results you obtain with your tracking algorithm and to send you the outcome. If you agree, we can additionally publish your results on this web page. To have your results evaluated against the ground truth, which we have for every frame, please send an email to research and attach one tab-separated log file per sequence, formatted like this example. Please use the same order of the pixels as in the example, i.e. (oc1u; oc1v) is the current position of pixel (+512; +384) of the reference template, (oc2u; oc2v) corresponds to (-512; +384), (oc3u; oc3v) to (-512; -384) and (oc4u; oc4v) to (+512; -384). We evaluate your log files and then send you the results (see the example results for SIFT on the right). As the error measure we use the RMS over the four pixels. A frame is considered successfully tracked if the RMS is below 10 px.
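To check your own results against the ground truth provided for every 250th frame before submitting, the error measure can be sketched as below. This is a minimal sketch, not the evaluation code used for the paper: it assumes that the RMS is the square root of the mean squared Euclidean distance between the four tracked points and their ground truth positions, and the function names are chosen for illustration only.

```python
import math

SUCCESS_THRESHOLD_PX = 10.0  # a frame counts as tracked below this RMS


def rms_error(tracked, ground_truth):
    """RMS distance in pixels over the four reference points.

    Both arguments are sequences of four (u, v) pairs in the same
    order: (+512,+384), (-512,+384), (-512,-384), (+512,-384).
    Assumption: RMS = sqrt(mean of squared Euclidean distances).
    """
    if len(tracked) != 4 or len(ground_truth) != 4:
        raise ValueError("expected four (u, v) points per frame")
    squared = [
        (tu - gu) ** 2 + (tv - gv) ** 2
        for (tu, tv), (gu, gv) in zip(tracked, ground_truth)
    ]
    return math.sqrt(sum(squared) / len(squared))


def is_tracked(tracked, ground_truth, threshold=SUCCESS_THRESHOLD_PX):
    """True if the frame counts as successfully tracked."""
    return rms_error(tracked, ground_truth) < threshold


def success_ratio(frames, threshold=SUCCESS_THRESHOLD_PX):
    """Fraction of (tracked, ground_truth) frame pairs below the threshold."""
    frames = list(frames)
    if not frames:
        return 0.0
    hits = sum(1 for t, g in frames if is_tracked(t, g, threshold))
    return hits / len(frames)
```

For example, if all four tracked points are offset from the ground truth by 3 px in u and 4 px in v, each point is 5 px off, the RMS is 5 px, and the frame counts as successfully tracked.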
For the evaluation results of SIFT, SURF, FERNS and ESM please refer to our ISMAR 2009 paper "A Dataset and Evaluation Methodology for Template-based Tracking Algorithms".
This work was partially supported by BMBF grant Avilus / 01 IM08001 P.
For comments and suggestions, please contact research.
Target: Bump Sign
Group: Low Texture
Download:
Sequences Undistorted
Sequences Distorted
Camera Calibration
Tracking Target
Initialization Frames
Target: Stop Sign
Group: Low Texture
Download:
Sequences Undistorted
Sequences Distorted
Camera Calibration
Tracking Target
Initialization Frames
Target: Lucent
Group: Repetitive Texture
Download:
Sequences Undistorted
Sequences Distorted
Camera Calibration
Tracking Target
Initialization Frames
Target: Mac Mini board
Group: Repetitive Texture
Download:
Sequences Undistorted
Sequences Distorted
Camera Calibration
Tracking Target
Initialization Frames
Target: Isetta
Group: Normal Texture
Download:
Sequences Undistorted
Sequences Distorted
Camera Calibration
Tracking Target
Initialization Frames
Target: Philadelphia
Group: Normal Texture
Download:
Sequences Undistorted
Sequences Distorted
Camera Calibration
Tracking Target
Initialization Frames
Target: Grass
Group: High Texture
Download:
Sequences Undistorted
Sequences Distorted
Camera Calibration
Tracking Target
Initialization Frames
Target: Wall
Group: High Texture
Download:
Sequences Undistorted
Sequences Distorted
Camera Calibration
Tracking Target
Initialization Frames