New #NVIDIA paper: Improving Hyperparameter Optimization with Checkpointed Model Weights
Hyperparameter optimization often dominates the cost of model design. So, we want cheap surrogate functions that approximate model performance to guide our search. Existing methods can train on optimization metadata – like a trajectory of losses – to build these surrogates.
In our work, we add the ability to train our hyperparameter optimization surrogates on checkpointed model weights with a graph metanetwork. This allows us to leverage a large, pre-existing source of information that can featurize the architecture, dataset, losses, and optimization procedure.
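For intuition, here is a toy sketch of the general idea, not the method from the paper. It assumes a k-nearest-neighbour surrogate and replaces the graph metanetwork with crude per-layer weight statistics. All function names and numbers below are hypothetical and exist only for illustration.

```python
import math
from statistics import fmean, pstdev

def featurize_checkpoint(layers):
    """Summarize each layer's weights by mean, std, and L2 norm.
    Assumption: a crude stand-in for a learned graph metanetwork."""
    feats = []
    for w in layers:
        feats += [fmean(w), pstdev(w), math.sqrt(sum(x * x for x in w))]
    return feats

def make_features(hparams, layers):
    """Concatenate hyperparameters with checkpoint-derived features."""
    return list(hparams) + featurize_checkpoint(layers)

def surrogate_predict(query, history, k=2):
    """Average the final scores of the k past runs closest to the query."""
    nearest = sorted(history, key=lambda run: math.dist(query, run[0]))[:k]
    return fmean(score for _, score in nearest)

# Toy history: (features, final validation score) for three finished runs.
# Each run has one hyperparameter (a learning rate) and two tiny "layers".
history = [
    (make_features([0.1],  [[0.5, -0.5], [1.0, 1.0]]), 0.60),
    (make_features([0.01], [[0.1, -0.1], [0.2, 0.2]]), 0.90),
    (make_features([0.02], [[0.1, -0.2], [0.2, 0.3]]), 0.80),
]

# A new, partially trained run whose final score we want to forecast.
query = make_features([0.015], [[0.1, -0.1], [0.2, 0.25]])
pred = surrogate_predict(query, history, k=2)
print(f"predicted final score: {pred:.2f}")
```

The point of the sketch is only that the surrogate sees features derived from the weights themselves, in addition to the hyperparameters, when ranking candidate runs.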
🔍 Project page: https://research.nvidia.com/labs/toronto-ai/FMS/
👨‍💻 Code for reproduction: https://github.com/NVlabs/forecasting-model-search
📄 Full Paper: https://arxiv.org/abs/2406.18630
Our project was a collaboration between NVIDIA’s Toronto AI Lab and the TAO team.
Check out more work from Toronto AI Lab here: https://research.nvidia.com/labs/toronto-ai/
You can view the TAO toolkit here: https://developer.nvidia.com/tao-toolkit