Jonathan Lorraine (lorraine2)

AI & ML interests

machine learning, computer vision, generative AI


Organizations: Social Post Explorers

lorraine2's activity

posted an update about 1 month ago
🦙New NVIDIA paper: LLaMA-Mesh 🦙

We enable large language models to generate and understand 3D meshes by representing them as text and fine-tuning. This unifies the 3D and text modalities in a single model and preserves language abilities, unlocking conversational 3D creation with mesh understanding.
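The mesh-as-text idea can be sketched in a few lines. Below is a minimal illustration of serializing a triangle mesh as OBJ-style text with coordinates quantized to integer bins, so an LLM can consume it as ordinary tokens. The uniform quantizer and bin count here are assumptions for illustration; the exact representation used by LLaMA-Mesh may differ.

```python
# Minimal sketch: serialize a mesh as OBJ-style text with quantized
# integer coordinates, so it reads as plain tokens for an LLM.
def mesh_to_text(vertices, faces, bins=64):
    """Quantize [0, 1] vertex coordinates into `bins` levels and emit OBJ lines."""
    lines = []
    for x, y, z in vertices:
        # Map each coordinate in [0, 1] to an integer bin index.
        q = [min(int(c * bins), bins - 1) for c in (x, y, z)]
        lines.append("v {} {} {}".format(*q))
    for face in faces:
        # OBJ face indices are 1-based.
        lines.append("f " + " ".join(str(i + 1) for i in face))
    return "\n".join(lines)

# A single triangle in the unit cube:
verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
faces = [(0, 1, 2)]
print(mesh_to_text(verts, faces))
```

Running this prints four short OBJ lines (three `v` records and one `f` record) that a text model can read and generate like any other string.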

🔎 Project Page: https://research.nvidia.com/labs/toronto-ai/LLaMA-Mesh/
🕹️ Interactive Demo: Zhengyi/LLaMA-Mesh (courtesy of Hugging Face and Gradio)
📖 Full Paper: https://arxiv.org/abs/2411.09595
👨‍💻Code: https://github.com/nv-tlabs/LLaMa-Mesh
💾 Model Checkpoint: Zhengyi/LLaMA-Mesh
🧩 Blender Addon: https://github.com/huggingface/meshgen (courtesy of Dylan Ebert)
🎥 5-min Overview Video: https://youtu.be/eZNazN-1lPo?si=-idQa5aaceVw0Bbj (courtesy of AI Papers Academy)
reacted to their post with 👀 about 2 months ago
New NVIDIA paper: ⚡ Multi-student Diffusion Distillation for Better One-step Generators ⚡

Do you want to make your diffusion models (a) run in a single step, (b) run with a smaller model, and (c) have improved quality simultaneously? Check out our multi-student distillation (MSD) method, which is simple and applicable to most diffusion models! The only catch is that we now have to distill (and store) a mixture of expert student generators.
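At inference time, the mixture-of-students idea amounts to routing each condition to exactly one small one-step generator. The sketch below uses a simple modulo rule over condition IDs as a hypothetical router; the paper's actual partitioning of the condition space may differ.

```python
# Hedged sketch of multi-student inference: each one-step student serves a
# disjoint slice of the condition space, so a request touches one small model.
def route(condition_id, num_students):
    """Pick which student generator serves this condition (illustrative rule)."""
    return condition_id % num_students

def generate(condition_id, noise, students):
    student = students[route(condition_id, len(students))]
    return student(condition_id, noise)  # one forward pass, one step

# Toy "students": each just tags its output with its own index.
students = [lambda c, z, i=i: f"student{i}:cond{c}" for i in range(4)]
print(generate(6, noise=None, students=students))  # condition 6 routes to student 2
```

The design point: only the selected student runs, so per-sample cost matches a single small model even though total storage grows with the number of students.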

Explore the MSD project page to learn more: https://research.nvidia.com/labs/toronto-ai/MSD/

Work led by Yanke Song, along with Weili Nie, Karsten Kreis, and James Lucas.

Check out more work from the Toronto AI Lab here: https://research.nvidia.com/labs/toronto-ai/
reacted to their post with 👀 about 2 months ago
New NeurIPS paper: “Training Data Attribution via Approximate Unrolling”

Ever wondered how individual data points influence AI decisions? 🤔 We explore how specific training data pieces affect machine learning models' behavior, which can be crucial for making AI systems more transparent, trustworthy, and fair.

Our method, SOURCE, bridges the gap between implicit differentiation and unrolling approaches, combining computational efficiency with flexibility. This makes it suitable for non-converged models and multi-stage training pipelines.

📄 Full paper: https://openreview.net/pdf?id=3NaqGg92KZ

Work led by Juhan Bae, along with Wu Lin and Roger Grosse.

Supported by the University of Toronto, Vector Institute, NVIDIA, and Anthropic
posted an update 6 months ago
🚨 Code now available for "Using Large Language Models for Hyperparameter Optimization" at https://github.com/michaelrzhang/LLM-HyperOpt 🚨

TLDR: You can just ask LLMs which hyperparameters to use, and it works pretty well! You can even directly optimize your model’s code as a hyperparameter with this.
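The "just ask the LLM" loop can be sketched as: show the model the trial history, ask for the next configuration, parse it, evaluate, repeat. In the sketch below, `query_llm` is a hypothetical stand-in for whatever chat API you use, and the prompt format is illustrative rather than the one from the paper.

```python
import json

def propose_hyperparameters(history, query_llm):
    """Ask an LLM for the next config given (config, val_loss) history."""
    prompt = (
        "You are tuning a neural network. Past trials (config -> val_loss):\n"
        + "\n".join(f"{json.dumps(cfg)} -> {loss:.4f}" for cfg, loss in history)
        + "\nReply with only a JSON object with keys learning_rate and batch_size."
    )
    return json.loads(query_llm(prompt))

# Stub LLM returning a fixed next configuration (a real call would hit an API):
def stub_llm(prompt):
    return json.dumps({"learning_rate": 0.0005, "batch_size": 64})

history = [({"learning_rate": 0.001, "batch_size": 64}, 0.42)]
print(propose_hyperparameters(history, stub_llm))
```

In practice you would append each evaluated (config, loss) pair back into `history` and loop, which is what makes the LLM act as a sequential optimizer.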

Check out the paper at https://arxiv.org/abs/2312.04528, with Michael Zhang, Nishkrit Desai, Juhan Bae, and Jimmy Ba.
reacted to their post with 👍🔥 6 months ago
⚡ My PhD thesis, “Scalable Nested Optimization for Deep Learning,” is now on arXiv! ⚡

tl;dr: We develop various optimization tools; highlights include:
· Making the momentum coefficient complex for adversarial games like GANs.
· Optimizing millions of hyperparameters using implicit differentiation.
· Tuning hyperparameters using hypernetworks.
· Differentiably finding bifurcations in optimization for diverse solutions.

https://arxiv.org/abs/2407.01526
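The complex-momentum idea in the first bullet can be shown with a toy heavy-ball update: make the momentum coefficient beta complex, keep the parameters real by applying only the real part of the buffer. The beta and learning rate below are illustrative values, not tuned settings from the thesis.

```python
# Toy heavy-ball step with a complex momentum coefficient. The imaginary
# part rotates the momentum buffer, which is the mechanism aimed at
# adversarial games like GANs; parameters themselves stay real.
def complex_momentum_step(w, m, grad, lr=0.1, beta=0.8 + 0.3j):
    m = beta * m - grad   # complex momentum buffer
    w = w + lr * m.real   # update parameters with the real part only
    return w, m

# One step on f(w) = 0.5 * w^2 (gradient = w), starting from w = 1:
w, m = 1.0, 0j
w, m = complex_momentum_step(w, m, grad=w)
print(w)  # 0.9 after one step
```

With beta = 0 this reduces to plain gradient descent; the complex phase only matters once the buffer accumulates over multiple steps.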
reacted to their post with 👍 6 months ago
New NVIDIA GTC24 paper 🎊

We generate high-quality 3D assets in only 400ms from text by combining (a) amortized optimization for speed, (b) surface rendering for quality, and (c) 3D data for robustness.

☕ LATTE3D project details: https://research.nvidia.com/labs/toronto-ai/LATTE3D/
reacted to their post with 🚀🤗🔥 6 months ago
New #NVIDIA paper: Improving Hyperparameter Optimization with Checkpointed Model Weights

Hyperparameter optimization often dominates the cost of model design. So, we want cheap surrogate functions that approximate model performance to guide our search. Existing methods can train on optimization metadata – like a trajectory of losses – to build these surrogates.

In our work, we add the ability to train our hyperparameter optimization surrogates on checkpointed model weights with a graph metanetwork. This allows us to leverage a large, pre-existing source of information that can featurize the architecture, dataset, losses, and optimization procedure.
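The surrogate idea can be sketched with trajectory features alone: featurize each trial from its loss curve and fit a predictor of final performance. This toy uses simple trajectory statistics and a 1-nearest-neighbor regressor as stand-ins; the paper's contribution is adding features computed from checkpointed weights via a graph metanetwork, which this sketch omits.

```python
def trajectory_features(losses):
    # Crude summary of an optimization trajectory: start, end, total drop.
    return [losses[0], losses[-1], losses[0] - losses[-1]]

def fit_nearest_neighbor(trials):
    """trials: list of (loss_trajectory, final_accuracy) pairs."""
    feats = [(trajectory_features(t), acc) for t, acc in trials]
    def predict(losses):
        f = trajectory_features(losses)
        # 1-nearest-neighbor surrogate by squared feature distance.
        best = min(feats, key=lambda fa: sum((a - b) ** 2 for a, b in zip(f, fa[0])))
        return best[1]
    return predict

surrogate = fit_nearest_neighbor([
    ([2.0, 1.0, 0.5], 0.90),  # fast-converging trial
    ([2.0, 1.9, 1.8], 0.55),  # stalled trial
])
print(surrogate([2.1, 1.1, 0.6]))  # closest to the first trial -> 0.9
```

A search procedure would then rank candidate configurations by the surrogate's prediction and only fully train the most promising ones.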

🔍 Project page: https://research.nvidia.com/labs/toronto-ai/FMS/
👨‍💻 Code for reproduction: https://github.com/NVlabs/forecasting-model-search
📄 Full Paper: https://arxiv.org/abs/2406.18630

Our project was a collaboration between NVIDIA’s Toronto AI Lab and the TAO team.

Check out more work from Toronto AI Lab here: https://research.nvidia.com/labs/toronto-ai/

You can view the TAO toolkit here: https://developer.nvidia.com/tao-toolkit
replied to their post 11 months ago

We include a narrated 30-second summary video here; our project webpage additionally has a video demonstrating our model's usage and a 3-minute overview explaining our method.
