LGM Full
NLP to 3D Model – Custom Pipeline
Overview
This project showcases an experimental pipeline that bridges natural language prompts to 3D model generation using a modified version of a pre-trained multi-view diffusion model.
It is part of the final-year Comprehensive Creative Technologies Project at UWE Bristol. The primary aim was to explore the potential of AI-assisted 3D content creation driven by natural language input.
Model Source & Attribution
This project relies on the pre-trained model from LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation (Tang et al., 2024), accessed via the LGM-tiny space created for the Hugging Face ML for 3D Course.
- Original Model: https://huggingface.co/spaces/dylanebert/LGM-tiny
- Paper: arXiv:2402.05054
- License: MIT
I do not claim authorship of the model architecture or training process. This space serves as a custom wrapper for experimentation with text-to-3D workflows.
What This Model Does
- Accepts a natural language description as input
- Internally maps the prompt to a representative image or multi-view description
- Generates a 3D model using the LGM pipeline (see the sketch below)
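The two-stage flow can be sketched with the `diffusers` library. This is a minimal illustration, not this project's exact code: the text-to-image checkpoint is an assumed stand-in, and the LGM call signature and `save_ply` export follow the publicly documented `dylanebert/LGM-full` community pipeline.

```python
# Minimal sketch of the text -> image -> 3D flow (assumed model IDs and
# call signatures; mirrors the public dylanebert/LGM-full example,
# not this project's exact code).
import torch
from diffusers import DiffusionPipeline

device = "cuda"  # LGM expects a GPU; CPU inference is impractically slow

# Stage 1 (assumed): map the natural language prompt to a representative image.
text_to_image = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
).to(device)
image = text_to_image("a small wooden rocking chair").images[0]

# Stage 2: feed the image to the LGM community pipeline, which predicts
# multi-view images and fuses them into 3D Gaussians.
lgm = DiffusionPipeline.from_pretrained(
    "dylanebert/LGM-full",
    custom_pipeline="dylanebert/LGM-full",
    torch_dtype=torch.float16,
    trust_remote_code=True,
).to(device)
gaussians = lgm("", image)             # empty prompt + conditioning image
lgm.save_ply(gaussians, "output.ply")  # export as a Gaussian splat .ply
```

Keeping the wrapper as two off-the-shelf stages means all 3D-specific heavy lifting stays inside the pre-trained LGM weights, which is also why output quality is bounded by the base model (see Limitations).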
Limitations
- This is a prototype for academic use only.
- The model's ability to handle complex or abstract text is limited.
- Performance and quality depend entirely on the base pre-trained model.
Acknowledgements
Thanks to Hugging Face and the authors of LGM for making their models publicly available.
Author
Gordon CHIN HO AU
Final Year BSc Digital Media
University of the West of England, Bristol