@lorraine2 on Hugging Face: "New #NVIDIA paper: Improving Hyperparameter Optimization with Checkpointed…"

Post

2396

New #NVIDIA paper: Improving Hyperparameter Optimization with Checkpointed Model Weights

Hyperparameter optimization often dominates the cost of model design. So, we want cheap surrogate functions that approximate model performance to guide our search. Existing methods can train on optimization metadata – like a trajectory of losses – to build these surrogates.

In our work, we add the ability to train our hyperparameter optimization surrogates on checkpointed model weights with a graph metanetwork. This allows us to leverage a large, pre-existing source of information that can featurize the architecture, dataset, losses, and optimization procedure.

🔍Project page: https://research.nvidia.com/labs/toronto-ai/FMS/
👨‍💻 Code for reproduction: https://github.com/NVlabs/forecasting-model-search
📄 Full Paper: https://arxiv.org/abs/2406.18630

Our project was a collaboration between NVIDIA’s Toronto AI Lab and the TAO team.

Check out more work from Toronto AI Lab here: https://research.nvidia.com/labs/toronto-ai/

You can view the TAO toolkit here: https://developer.nvidia.com/tao-toolkit

Join the conversation