TAPIP3D: Tracking Any Point in Persistent 3D Geometry

This repository contains the TAPIP3D model, as presented in the paper "TAPIP3D: Tracking Any Point in Persistent 3D Geometry".

Overview

TAPIP3D is a method for long-term feed-forward 3D point tracking in monocular RGB and RGB-D video sequences. It introduces a 3D feature cloud representation that lifts image features into a persistent world coordinate space, canceling out camera motion and enabling accurate trajectory estimation across frames.

Demo Usage

We provide a simple demo script, inference.py, along with sample input data in the demo_inputs/ directory. The script accepts either an .mp4 video file or an .npz file as input. An .npz file should contain the following keys:

video: array of shape (T, H, W, 3), dtype: uint8
depths (optional): array of shape (T, H, W), dtype: float32
intrinsics (optional): array of shape (T, 3, 3), dtype: float32
extrinsics (optional): array of shape (T, 4, 4), dtype: float32
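
As a minimal sketch of how an input file in this format could be assembled with NumPy, the helper below writes placeholder arrays with the shapes and dtypes listed above. The function name make_demo_npz, the placeholder values, and the focal length are illustrative assumptions, not part of this repository; replace them with your own frames, depths, and calibration.

```python
import numpy as np

def make_demo_npz(path, T=4, H=48, W=64):
    """Write a placeholder .npz with the four documented keys (hypothetical helper)."""
    rng = np.random.default_rng(0)
    # RGB frames: (T, H, W, 3), uint8
    video = rng.integers(0, 256, size=(T, H, W, 3), dtype=np.uint8)
    # Per-pixel depth: (T, H, W), float32 (constant placeholder depth)
    depths = np.ones((T, H, W), dtype=np.float32)
    # Pinhole intrinsics repeated per frame: (T, 3, 3), float32
    # The focal length here is an arbitrary placeholder.
    K = np.array([[W, 0, W / 2], [0, W, H / 2], [0, 0, 1]], dtype=np.float32)
    intrinsics = np.tile(K, (T, 1, 1))
    # Identity camera poses repeated per frame: (T, 4, 4), float32
    extrinsics = np.tile(np.eye(4, dtype=np.float32), (T, 1, 1))
    np.savez(path, video=video, depths=depths,
             intrinsics=intrinsics, extrinsics=extrinsics)
    return path
```

The resulting file can then be passed to inference.py through --input_path in place of a video.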

For demonstration purposes, the script uses a 32x32 grid of points at the first frame as queries.
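
To illustrate what a 32x32 grid of first-frame queries could look like, here is a small sketch. The (t, x, y) column layout and the edge-to-edge spacing are assumptions made for this example; the demo script may order the columns differently or leave a margin around the image border.

```python
import numpy as np

def first_frame_grid_queries(height, width, grid_size=32):
    """Return (grid_size**2, 3) queries as (t, x, y) rows, all at frame 0.

    Hypothetical helper: the column layout is an assumption, not the
    format used by inference.py.
    """
    ys = np.linspace(0, height - 1, grid_size)
    xs = np.linspace(0, width - 1, grid_size)
    grid_x, grid_y = np.meshgrid(xs, ys)       # each of shape (grid_size, grid_size)
    t = np.zeros(grid_size * grid_size)        # every query starts at the first frame
    queries = np.stack([t, grid_x.ravel(), grid_y.ravel()], axis=1)
    return queries.astype(np.float32)

queries = first_frame_grid_queries(height=480, width=640)
print(queries.shape)
```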

Inference with Monocular Video

When a video is passed as --input_path, the script first runs MegaSAM with MoGe to estimate depth maps and camera parameters. The model then processes these inputs in the global frame.

To run inference:

python inference.py --input_path demo_inputs/sheep.mp4 --checkpoint checkpoints/tapip3d_final.pth --resolution_factor 2

An .npz file will be saved to outputs/inference/. To visualize the results:

python visualize.py <result_npz_path>
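
Before visualizing, it can be useful to check what a result file contains. The keys stored in the output are not listed here, so the sketch below makes no assumption about them and simply reports whatever arrays are present. The helper name summarize_npz is hypothetical.

```python
import numpy as np

def summarize_npz(path):
    """Map each array name in an .npz file to its (shape, dtype string)."""
    with np.load(path) as data:
        return {key: (data[key].shape, str(data[key].dtype)) for key in data.files}
```

Calling summarize_npz on a saved result and printing the returned dictionary shows each array's shape and dtype.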

Inference with Known Depths and Camera Parameters

If an .npz file containing all four keys (video, depths, intrinsics, extrinsics) is provided, the model will operate in an aligned global frame, generating point trajectories in world coordinates.
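
For intuition about how depths and camera parameters place pixels in world coordinates, the sketch below unprojects a single depth map. It rests on assumptions not stated in this README: a pinhole camera, depth measured along the camera z-axis, and extrinsics given as world-to-camera matrices. If your extrinsics are camera-to-world, skip the inversion. The function name unproject_to_world is hypothetical and this is not the model's internal code.

```python
import numpy as np

def unproject_to_world(depth, K, w2c):
    """Lift an (H, W) depth map to (H, W, 3) world-space points.

    Assumes a pinhole K (3x3), z-depth, and a world-to-camera w2c (4x4).
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    # Homogeneous pixel coordinates (u, v, 1) for every pixel
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)
    # Back-project through the intrinsics, then scale by depth
    rays = pix @ np.linalg.inv(K).T
    cam_points = rays * depth[..., None]
    # Move from the camera frame to the world frame
    c2w = np.linalg.inv(w2c)
    return cam_points @ c2w[:3, :3].T + c2w[:3, 3]
```

Applying this to every frame with its own intrinsics and extrinsics yields points in a single shared frame, which is what makes trajectories independent of camera motion.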

Citation

If you find this project useful, please consider citing:

@article{tapip3d,
  title={TAPIP3D: Tracking Any Point in Persistent 3D Geometry},
  author={Zhang, Bowei and Ke, Lei and Harley, Adam W and Fragkiadaki, Katerina},
  journal={arXiv preprint arXiv:2504.14717},
  year={2025}
}