
Jonathan Lorraine

lorraine2

AI & ML interests

machine learning, computer vision, generative AI

lorraine2's activity

posted an update 3 months ago
🚨 Code now available for "Using Large Language Models for Hyperparameter Optimization" at https://github.com/michaelrzhang/LLM-HyperOpt 🚨

TL;DR: You can simply ask an LLM which hyperparameters to use, and it works surprisingly well. You can even optimize your model’s code directly as a hyperparameter this way.

Check out the paper at https://arxiv.org/abs/2312.04528 - with Michael Zhang, Nishkrit Desai, Juhan Bae, and Jimmy Ba
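Below is a minimal sketch of this kind of loop (not the paper's exact protocol): the LLM is shown the search history and asked for the next configuration to try. The prompt format and the `train_and_evaluate` placeholder are illustrative assumptions, and the snippet assumes an OpenAI-style chat API with a key configured in the environment.

```python
# Hedged sketch: LLM-suggested hyperparameter search.
# `train_and_evaluate` is a placeholder for your own training code, and the
# prompt/model choice are illustrative assumptions, not the paper's setup.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def train_and_evaluate(config: dict) -> float:
    """Placeholder: train with `config` and return the validation loss."""
    raise NotImplementedError

history = []  # (config, val_loss) pairs shown back to the LLM
for step in range(10):
    prompt = (
        "You are tuning hyperparameters (learning_rate, weight_decay, batch_size) "
        "to minimize validation loss.\n"
        f"Results so far: {json.dumps(history)}\n"
        "Reply with only a JSON object giving the next configuration to try."
    )
    reply = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    config = json.loads(reply.choices[0].message.content)  # assumes the model complies
    val_loss = train_and_evaluate(config)
    history.append((config, val_loss))
```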
posted an update 3 months ago
⚡ My PhD thesis, “Scalable Nested Optimization for Deep Learning,” is now on arXiv! ⚡

tl;dr: We develop a range of optimization tools; highlights include:
· Making the momentum coefficient complex for adversarial games like GANs.
· Optimizing millions of hyperparameters using implicit differentiation (see the sketch after this post).
· Tuning hyperparameters using hypernetworks.
· Differentiably finding bifurcations in optimization for diverse solutions.

https://arxiv.org/abs/2407.01526
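As one example of what optimizing hyperparameters by implicit differentiation means in practice, here is a hedged PyTorch sketch of an implicit-function-theorem hypergradient with a truncated Neumann approximation of the inverse Hessian. It is a simplified illustration under the assumption that the weights are near a training-loss optimum, not the thesis' exact implementation; `w` and `lam` are lists of tensors with `requires_grad=True`.

```python
# Hedged sketch: IFT-style hypergradient with a truncated Neumann series.
# Assumes `w` (weights) are approximately optimal for the current `lam`
# (hyperparameters), and that lam only enters train_loss (direct term omitted).
import torch

def hypergradient(train_loss, val_loss, w, lam, lr=0.01, neumann_steps=20):
    v = torch.autograd.grad(val_loss, w, retain_graph=True)          # d val / d w
    dtrain_dw = torch.autograd.grad(train_loss, w, create_graph=True)

    # H^{-1} ≈ lr * sum_k (I - lr*H)^k, applied to v via Hessian-vector products.
    p = [vi.clone() for vi in v]
    acc = [vi.clone() for vi in v]
    for _ in range(neumann_steps):
        hvp = torch.autograd.grad(dtrain_dw, w, grad_outputs=p, retain_graph=True)
        p = [pi - lr * hi for pi, hi in zip(p, hvp)]
        acc = [ai + pi for ai, pi in zip(acc, p)]
    ihvp = [lr * ai for ai in acc]                                    # ≈ v @ H^{-1}

    # Chain through the mixed second derivative d^2 train / (dw dlam).
    mixed = torch.autograd.grad(dtrain_dw, lam, grad_outputs=ihvp, retain_graph=True)
    return [-m for m in mixed]  # indirect term of d val / d lam
```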
posted an update 4 months ago
New #NVIDIA paper: Improving Hyperparameter Optimization with Checkpointed Model Weights

Hyperparameter optimization often dominates the cost of model design. So, we want cheap surrogate functions that approximate model performance to guide our search. Existing methods can train on optimization metadata – like a trajectory of losses – to build these surrogates.

In our work, we add the ability to train our hyperparameter optimization surrogates on checkpointed model weights with a graph metanetwork. This allows us to leverage a large, pre-existing source of information that can featurize the architecture, dataset, losses, and optimization procedure.
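As a rough illustration of the idea (and not the paper's graph metanetwork), the sketch below trains a small MLP surrogate on hyperparameters plus cheap per-tensor statistics of a checkpoint's weights to predict validation loss; all names here are hypothetical.

```python
# Simplified stand-in for a weight-aware surrogate: an MLP over hyperparameters
# and per-tensor checkpoint statistics (the paper uses a graph metanetwork).
import torch
import torch.nn as nn

def weight_features(state_dict):
    """Summarize each checkpointed tensor by its mean, std, and norm."""
    feats = []
    for w in state_dict.values():
        w = w.float().flatten()
        feats += [w.mean(), w.std(), w.norm()]
    return torch.stack(feats)

class HparamSurrogate(nn.Module):
    def __init__(self, n_hparams, n_weight_feats, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_hparams + n_weight_feats, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),  # predicted validation loss
        )

    def forward(self, hparams, weight_feats):
        return self.net(torch.cat([hparams, weight_feats], dim=-1))

# Usage (illustrative): fit with MSE on a dataset of
# (hyperparameters, checkpoint weights, observed validation loss) triples,
# then rank candidate configurations by the surrogate's predictions.
```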

🔍 Project page: https://research.nvidia.com/labs/toronto-ai/FMS/
👨‍💻 Code for reproduction: https://github.com/NVlabs/forecasting-model-search
📄 Full Paper: https://arxiv.org/abs/2406.18630

Our project was a collaboration between NVIDIA’s Toronto AI Lab and the TAO team.

Check out more work from Toronto AI Lab here: https://research.nvidia.com/labs/toronto-ai/

You can view the TAO toolkit here: https://developer.nvidia.com/tao-toolkit
replied to their post 8 months ago

We include a narrated 30-second summary video here; our project webpage additionally has a video demonstrating the model's usage and a 3-minute overview explaining our method.

posted an update 8 months ago
New NVIDIA GTC24 paper 🎊

We generate high-quality 3D assets in only 400ms from text by combining (a) amortized optimization for speed, (b) surface rendering for quality, and (c) 3D data for robustness.

☕ LATTE3D project details: https://research.nvidia.com/labs/toronto-ai/LATTE3D/