TAPIP3D: Tracking Any Point in Persistent 3D Geometry
This repository contains the TAPIP3D model as presented in TAPIP3D: Tracking Any Point in Persistent 3D Geometry.
Code: https://github.com/zbww/tapip3d
Overview
TAPIP3D is a method for long-term feed-forward 3D point tracking in monocular RGB and RGB-D video sequences. It introduces a 3D feature cloud representation that lifts image features into a persistent world coordinate space, canceling out camera motion and enabling accurate trajectory estimation across frames.
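The lifting step can be illustrated with standard pinhole-camera geometry. The sketch below is a minimal, hypothetical illustration rather than the repository's implementation: it unprojects a depth map through the intrinsics into camera-frame points and then applies a camera-to-world extrinsic matrix, the kind of rigid transform that cancels out camera motion (the repository's actual extrinsic convention may differ).

import numpy as np

def lift_to_world(depth, intrinsics, cam_to_world):
    # Illustrative only: unproject a (H, W) depth map to world-frame 3D points.
    # depth:        (H, W) float32
    # intrinsics:   (3, 3) pinhole matrix K
    # cam_to_world: (4, 4) camera-to-world extrinsic (assumed convention)
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))              # pixel grid
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)
    rays = pix @ np.linalg.inv(intrinsics).T                    # back-project through K^-1
    pts_cam = rays * depth[..., None]                           # scale rays by depth
    ones = np.ones((H, W, 1), dtype=pts_cam.dtype)
    pts_hom = np.concatenate([pts_cam, ones], axis=-1)          # homogeneous coordinates
    pts_world = pts_hom @ cam_to_world.T                        # rigid transform to world frame
    return pts_world[..., :3]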
Demo Usage
We provide a simple demo script, inference.py, along with sample input data in the demo_inputs/ directory. The script accepts either an .mp4 video file or an .npz file as input. If an .npz file is provided, it should have the following format (a construction sketch follows the list):
- video: array of shape (T, H, W, 3), dtype: uint8
- depths (optional): array of shape (T, H, W), dtype: float32
- intrinsics (optional): array of shape (T, 3, 3), dtype: float32
- extrinsics (optional): array of shape (T, 4, 4), dtype: float32
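For reference, a compatible .npz file can be assembled with numpy as in the sketch below. The shapes and dtypes follow the format above; the file name and the placeholder arrays are illustrative, not taken from the repository.

import numpy as np

T, H, W = 24, 480, 640                                        # arbitrary example dimensions

video = np.zeros((T, H, W, 3), dtype=np.uint8)                # RGB frames
depths = np.ones((T, H, W), dtype=np.float32)                 # optional per-pixel depth
intrinsics = np.tile(np.eye(3, dtype=np.float32), (T, 1, 1))  # optional per-frame K
extrinsics = np.tile(np.eye(4, dtype=np.float32), (T, 1, 1))  # optional per-frame pose

np.savez(
    "demo_inputs/custom_input.npz",                           # hypothetical output path
    video=video,
    depths=depths,
    intrinsics=intrinsics,
    extrinsics=extrinsics,
)

The optional keys can simply be omitted from the np.savez call if depths or camera parameters are not available.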
For demonstration purposes, the script uses a 32x32 grid of points at the first frame as queries.
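Such a grid could be built as in the following sketch; the (t, x, y) ordering of each query row is an assumed convention for illustration only and may not match the script's internal format.

import numpy as np

def make_query_grid(H, W, n=32):
    # Illustrative: n x n query points on frame 0, returned as (t, x, y) rows.
    ys = np.linspace(0, H - 1, n)
    xs = np.linspace(0, W - 1, n)
    xx, yy = np.meshgrid(xs, ys)
    t = np.zeros_like(xx)                                 # all queries anchored at frame 0
    return np.stack([t, xx, yy], axis=-1).reshape(-1, 3)  # (n * n, 3)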
Inference with Monocular Video
When a video is provided as --input_path, the script first runs MegaSAM with MoGe to estimate depth maps and camera parameters. The model then processes these inputs within the global frame.
To run inference:
python inference.py --input_path demo_inputs/sheep.mp4 --checkpoint checkpoints/tapip3d_final.pth --resolution_factor 2
An .npz file will be saved to outputs/inference/. To visualize the results:
python visualize.py <result_npz_path>
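The exact contents of the result file depend on the script, so rather than assuming specific keys, the saved arrays can be inspected generically:

import sys
import numpy as np

# List every array stored in the result .npz with its shape and dtype.
result = np.load(sys.argv[1])
for key in result.files:
    print(key, result[key].shape, result[key].dtype)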
Inference with Known Depths and Camera Parameters
If an .npz file containing all four keys (video, depths, intrinsics, extrinsics) is provided, the model operates in an aligned global frame and generates point trajectories in world coordinates.
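If per-frame camera-coordinate trajectories are needed downstream, world-frame tracks can be mapped back with the extrinsics. The sketch below assumes the extrinsics are camera-to-world matrices; if they are stored as world-to-camera, the inversion should be skipped.

import numpy as np

def world_to_camera(tracks_world, extrinsics):
    # tracks_world: (T, N, 3) world-frame trajectories
    # extrinsics:   (T, 4, 4) assumed camera-to-world matrices
    world_to_cam = np.linalg.inv(extrinsics)                       # invert to world-to-camera
    ones = np.ones((*tracks_world.shape[:2], 1), dtype=tracks_world.dtype)
    pts_hom = np.concatenate([tracks_world, ones], axis=-1)        # (T, N, 4)
    pts_cam = np.einsum("tij,tnj->tni", world_to_cam, pts_hom)     # per-frame rigid transform
    return pts_cam[..., :3]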
Citation
If you find this project useful, please consider citing:
@article{tapip3d,
title={TAPIP3D: Tracking Any Point in Persistent 3D Geometry},
author={Zhang, Bowei and Ke, Lei and Harley, Adam W and Fragkiadaki, Katerina},
journal={arXiv preprint arXiv:2504.14717},
year={2025}
}