TripoSG-scribble - Fast 3D Shape Prototyping with Scribble and Prompt
TripoSG-scribble converts a scribble image and a text prompt into a 3D shape. It is a variant of TripoSG, a state-of-the-art image-to-3D generation foundation model that leverages large-scale rectified flow transformers to produce high-fidelity 3D shapes from single images.
Model Description
Model Architecture
TripoSG utilizes a novel architecture combining:
- Rectified Flow (RF) based Transformer for stable, linear trajectory modeling
- Advanced VAE with SDF-based representation and hybrid geometric supervision
- Cross-attention mechanism for image feature conditioning
- 1.5B parameters operating on 2048 latent tokens
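To illustrate what "linear trajectory modeling" means for a rectified flow sampler, here is a minimal toy sketch. It is not the TripoSG implementation: the velocity function, the 2D points, and the time direction (t = 0 for noise, t = 1 for data) are all assumptions chosen for illustration. In the real model, the velocity would be predicted by the transformer over the latent tokens.

```python
# Toy illustration of rectified-flow sampling (NOT the TripoSG implementation).
# Assumption: t = 0 is the noise sample and t = 1 is the data sample.

def toy_velocity(x, t, target):
    """Velocity of the straight path x_t = (1 - t) * x0 + t * target,
    expressed in terms of the current point x_t (hypothetical stand-in
    for a learned velocity network)."""
    return [(g - xi) / (1.0 - t) for xi, g in zip(x, target)]


def euler_sample(x0, target, num_steps):
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with fixed Euler steps."""
    dt = 1.0 / num_steps
    x = list(x0)
    trajectory = [list(x)]
    for k in range(num_steps):
        t = k * dt  # the largest t used is 1 - dt, so 1 - t is never zero
        v = toy_velocity(x, t, target)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        trajectory.append(list(x))
    return trajectory


# Start from a "noise" point and flow toward a "data" point in 4 steps.
trajectory = euler_sample(x0=[0.0, 2.0], target=[4.0, -2.0], num_steps=4)
for point in trajectory:
    print(point)
```

Because the path is a straight line, the velocity is constant along it and the Euler steps are evenly spaced points between start and target. This is the intuition behind why straight trajectories tolerate few sampling steps.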
To improve inference efficiency, TripoSG-scribble differs from TripoSG in two ways:
- TripoSG-scribble is a CFG-distilled model (CFG: classifier-free guidance) and should be used with CFG=0
- TripoSG-scribble is trained with 512 latent tokens instead of 2048
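The efficiency gain from these two differences can be sketched as follows. This is a hedged illustration, not code from the repository: `CountingModel`, `predict_velocity`, and the `guidance_scale` parameter name are hypothetical, and treating a scale of 0 as "guidance disabled" is an assumed convention. Standard classifier-free guidance needs a conditional and an unconditional forward pass per step, whereas a CFG-distilled model needs only one.

```python
class CountingModel:
    """Hypothetical stand-in for the denoiser; it only counts forward passes."""

    def __init__(self):
        self.calls = 0

    def __call__(self, x, t, cond):
        self.calls += 1
        # Toy prediction: conditioning shifts the output by a constant.
        offset = 0.0 if cond is None else 1.0
        return [xi + offset for xi in x]


def predict_velocity(model, x, t, cond, guidance_scale):
    """Assumed convention: guidance_scale > 0 enables CFG, 0 disables it."""
    if guidance_scale > 0:
        v_cond = model(x, t, cond)    # pass 1: with the condition
        v_uncond = model(x, t, None)  # pass 2: without the condition
        return [u + guidance_scale * (c - u) for c, u in zip(v_cond, v_uncond)]
    # CFG disabled: a single conditional pass per sampling step.
    return model(x, t, cond)


with_cfg = CountingModel()
predict_velocity(with_cfg, [0.0, 1.0], 0.5, "a chair", guidance_scale=7.0)

distilled = CountingModel()
v = predict_velocity(distilled, [0.0, 1.0], 0.5, "a chair", guidance_scale=0.0)

print(with_cfg.calls, distilled.calls)

# Self-attention cost grows roughly quadratically with token count, so the
# attention part of a step at 512 tokens costs about this fraction of 2048:
attention_ratio = (512 / 2048) ** 2
print(attention_ratio)
```

The quadratic estimate covers only the self-attention term; other layers scale closer to linearly with token count, so the overall speedup would be smaller than this ratio suggests.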
Intended Uses
This model is designed for:
- Converting a scribble image and a text prompt into high-quality 3D meshes
- Creative and design applications
- Gaming and VFX asset creation
- Prototyping and visualization
Requirements
- CUDA-capable GPU (>8GB VRAM)
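A quick way to check this requirement before loading the model might look like the sketch below. It assumes PyTorch is installed and reads ">8GB" as more than 8 GiB; both are assumptions, and the helper names are hypothetical rather than part of the TripoSG codebase.

```python
# Assumption: ">8GB" is interpreted as strictly more than 8 GiB.
MIN_VRAM_BYTES = 8 * 1024**3


def meets_vram_requirement(total_bytes, minimum=MIN_VRAM_BYTES):
    """Return True if the reported VRAM is strictly above the minimum."""
    return total_bytes > minimum


def check_local_gpu(device_index=0):
    """Query the local GPU through PyTorch (assumed to be installed)."""
    import torch  # imported lazily so the helper above works without PyTorch

    if not torch.cuda.is_available():
        return False
    total = torch.cuda.get_device_properties(device_index).total_memory
    return meets_vram_requirement(total)


# Example with hard-coded sizes: a 12 GiB card passes, an 8 GiB card does not.
print(meets_vram_requirement(12 * 1024**3))
print(meets_vram_requirement(8 * 1024**3))
```

Calling `check_local_gpu()` on the target machine reports whether the first CUDA device clears the threshold.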
Usage
For detailed usage instructions, please visit our GitHub repository.
About
TripoSG-scribble is developed by Tripo, VAST AI Research, pushing the boundaries of 3D Generative AI.