library_name: trellis
pipeline_tag: image-to-3d
license: mit
language:
- en
TRELLIS Image Large
TRELLIS Image Large generates 3D objects from images. The inputs are images (.jpg, .png) and the outputs are meshes (.glb) and splats (.ply).
Model Details
- Name: TRELLIS-image-large
- Type: Image-to-3D
- Size: 1.2 billion parameters
- Code: https://github.com/microsoft/TRELLIS
- Paper: https://arxiv.org/abs/2412.01506
- Training Data: TRELLIS-500K
Usage
Minimal Example
Here is an example of how to use the pretrained models for 3D asset generation.
import os
# os.environ['ATTN_BACKEND'] = 'xformers' # Can be 'flash-attn' or 'xformers', default is 'flash-attn'
os.environ['SPCONV_ALGO'] = 'native' # Can be 'native' or 'auto', default is 'auto'.
# 'auto' is faster but will do benchmarking at the beginning.
# Recommended to set to 'native' if run only once.
import imageio
from PIL import Image
from trellis.pipelines import TrellisImageTo3DPipeline
from trellis.utils import render_utils, postprocessing_utils
# Load a pipeline from a model folder or a Hugging Face model hub.
pipeline = TrellisImageTo3DPipeline.from_pretrained("JeffreyXiang/TRELLIS-image-large")
pipeline.cuda()
# Load an image
image = Image.open("assets/example_image/T.png")
# Run the pipeline
outputs = pipeline.run(
    image,
    seed=1,
    # Optional parameters
    # sparse_structure_sampler_params={
    #     "steps": 12,
    #     "cfg_strength": 7.5,
    # },
    # slat_sampler_params={
    #     "steps": 12,
    #     "cfg_strength": 3,
    # },
)
# outputs is a dictionary containing generated 3D assets in different formats:
# - outputs['gaussian']: a list of 3D Gaussians
# - outputs['radiance_field']: a list of radiance fields
# - outputs['mesh']: a list of meshes
# Render the outputs
video = render_utils.render_video(outputs['gaussian'][0])['color']
imageio.mimsave("sample_gs.mp4", video, fps=30)
video = render_utils.render_video(outputs['radiance_field'][0])['color']
imageio.mimsave("sample_rf.mp4", video, fps=30)
video = render_utils.render_video(outputs['mesh'][0])['normal']
imageio.mimsave("sample_mesh.mp4", video, fps=30)
# GLB files can be extracted from the outputs
glb = postprocessing_utils.to_glb(
    outputs['gaussian'][0],
    outputs['mesh'][0],
    # Optional parameters
    simplify=0.95,      # Ratio of triangles to remove during simplification (0.95 keeps roughly 5%)
    texture_size=1024,  # Size of the texture used for the GLB
)
glb.export("sample.glb")
# Save Gaussians as PLY files
outputs['gaussian'][0].save_ply("sample.ply")
After running the code, you will get the following files:
- sample_gs.mp4: a video showing the 3D Gaussian representation
- sample_rf.mp4: a video showing the Radiance Field representation
- sample_mesh.mp4: a video showing the mesh representation
- sample.glb: a GLB file containing the extracted textured mesh
- sample.ply: a PLY file containing the 3D Gaussian representation
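As a quick sanity check on the exported files, the sketch below reads only the headers of a PLY and a GLB file using the Python standard library. It assumes that sample.ply follows the standard PLY header layout and that sample.glb is a binary glTF container. The helper names `read_ply_header` and `read_glb_header` are hypothetical and are not part of the TRELLIS code base.

```python
import json
import struct


def read_ply_header(path):
    """Return (format, vertex_count, vertex_property_names) from a PLY header."""
    fmt, vertex_count, props = None, 0, []
    in_vertex = False
    with open(path, "rb") as f:
        if f.readline().strip() != b"ply":
            raise ValueError("not a PLY file")
        for raw in f:
            tokens = raw.decode("ascii").split()
            if not tokens:
                continue
            if tokens[0] == "end_header":
                break  # stop before the (possibly binary) payload
            if tokens[0] == "format":
                fmt = " ".join(tokens[1:])
            elif tokens[0] == "element":
                in_vertex = tokens[1] == "vertex"
                if in_vertex:
                    vertex_count = int(tokens[2])
            elif tokens[0] == "property" and in_vertex:
                props.append(tokens[-1])  # last token is the property name
    return fmt, vertex_count, props


def read_glb_header(path):
    """Return (version, declared_length, json_document) from a GLB file."""
    with open(path, "rb") as f:
        # 12-byte header: magic, container version, total length in bytes
        magic, version, length = struct.unpack("<4sII", f.read(12))
        if magic != b"glTF":
            raise ValueError("not a GLB file")
        # The first chunk is expected to hold the glTF JSON document
        chunk_len, chunk_type = struct.unpack("<I4s", f.read(8))
        if chunk_type != b"JSON":
            raise ValueError("first GLB chunk is not JSON")
        return version, length, json.loads(f.read(chunk_len))
```

For example, `read_ply_header("sample.ply")` reports how many points the file stores and which per-point properties it carries, and `read_glb_header("sample.glb")[2]["meshes"]` lists the meshes declared in the GLB, provided the file declares any.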
License
TRELLIS models and the majority of the code are licensed under the MIT License. The following submodules may have different licenses:
- diffoctreerast: We developed a CUDA-based real-time differentiable octree renderer for rendering radiance fields as part of this project. This renderer is derived from the diff-gaussian-rasterization project and is available under its own LICENSE.
- Modified Flexicubes: In this project, we used a modified version of Flexicubes to support vertex attributes. This modified version is licensed under its own LICENSE.
Citation
If you find this work helpful, please consider citing our paper:
@article{xiang2024structured,
    title   = {Structured 3D Latents for Scalable and Versatile 3D Generation},
    author  = {Xiang, Jianfeng and Lv, Zelong and Xu, Sicheng and Deng, Yu and Wang, Ruicheng and Zhang, Bowen and Chen, Dong and Tong, Xin and Yang, Jiaolong},
    journal = {arXiv preprint arXiv:2412.01506},
    year    = {2024}
}