SkeletonDiffusion Model Card

This model card describes SkeletonDiffusion, the model introduced in the paper Nonisotropic Gaussian Diffusion for Realistic 3D Human Motion Prediction. The codebase is available here.

SkeletonDiffusion is a probabilistic human motion prediction model that takes 0.5s of human motion as input and generates 2s of future motion, with an inference time of 0.4s. The generated motions are both realistic and diverse. It is a latent diffusion model with a custom graph attention architecture, trained with nonisotropic Gaussian diffusion.
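To illustrate what "nonisotropic" means here, the following is a minimal, generic sketch of a forward diffusion step whose noise is correlated across joints rather than independent per joint. It is not the formulation from the paper or the codebase: the 3-joint chain, the covariance values in `SIGMA`, the fixed-covariance noising rule, and all function names are assumptions made purely for illustration.

```python
import math
import random


def cholesky(matrix):
    """Return lower-triangular L with L L^T == matrix (symmetric positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def sample_correlated_noise(lower, rng):
    """Draw eps = L z with z ~ N(0, I), so that eps ~ N(0, L L^T)."""
    n = len(lower)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(n)]


def noisy_sample(x0, alpha_bar, lower, rng):
    """Simplified forward step: sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps."""
    eps = sample_correlated_noise(lower, rng)
    return [
        math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * e
        for x, e in zip(x0, eps)
    ]


# Hypothetical 3-joint chain (0 - 1 - 2). Off-diagonal entries couple
# neighbouring joints; an isotropic model would use the identity instead.
# The values are illustrative only.
SIGMA = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.5],
    [0.0, 0.5, 1.0],
]
L_FACTOR = cholesky(SIGMA)

rng = random.Random(0)
x0 = [0.1, 0.2, 0.3]  # one scalar coordinate per joint, for brevity
x_t = noisy_sample(x0, alpha_bar=0.5, lower=L_FACTOR, rng=rng)
```

Replacing `SIGMA` with the identity matrix recovers ordinary isotropic Gaussian noise, which makes the difference between the two settings easy to compare in this toy setup.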

We provide one model for each dataset used in the paper (AMASS, FreeMan, Human3.6M), plus an additional model trained on AMASS with hand joints (AMASS-MANO).


Online demo

The model trained on AMASS is accessible in a demo workflow that predicts future motions from videos. The demo extracts 3D human poses from video via Neural Localizer Fields (NLF) by Sarandi et al., and SkeletonDiffusion then generates future motions conditioned on the extracted poses. Although SkeletonDiffusion was not trained on noisy real-world data, it handles most cases reasonably well.

Usage

Direct use

You can use the model for any purpose permitted under the BSD 2-Clause License.

Training and Inference

Please refer to our GitHub codebase for both use cases.
