Abstract
Recent research suggests that tree search algorithms (e.g. Monte Carlo Tree Search) can dramatically boost LLM performance on complex mathematical reasoning tasks. However, they often require more than 10 times the computational resources of greedy decoding due to wasteful search strategies, making them difficult to deploy in practical applications. This study introduces a novel guided tree search algorithm with dynamic node selection and node-level exploration budget (maximum number of children) calculation to tackle this issue. By considering the search progress towards the final answer (history) and the guidance from a value network (future) trained without any step-wise annotations, our algorithm iteratively selects the most promising tree node and then expands it within the boundaries of the allocated computational budget. Experiments conducted on the GSM8K and TabMWP datasets demonstrate that our approach not only offers competitive performance but also incurs significantly lower computational costs than baseline methods.
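The select-then-expand loop described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the selection score or the budget formula, so `child_budget`, `guided_search`, the use of depth as a stand-in for search progress, and the toy integer task (with `value_fn` replacing a learned value network and `propose` replacing LLM step sampling) are all hypothetical choices made for illustration.

```python
import heapq
import math

TARGET = 10  # toy goal: reach 10 from 0 by adding 1, 2, or 3 per step


def child_budget(value, depth, max_children=4):
    """Hypothetical budget rule (an assumption, not the paper's formula):
    nodes with a high value or a large depth get fewer children."""
    return max(1, math.ceil(max_children * (1.0 - value) / (1 + depth)))


def guided_search(root, propose, value_fn, is_terminal,
                  max_expansions=50, max_children=4):
    """Best-first search that pops the highest-value node, then expands it
    with at most child_budget(...) children."""
    # Heap entries: (negated value, insertion counter as tie-breaker, state, depth)
    frontier = [(-value_fn(root), 0, root, 0)]
    counter = 1
    expansions = 0
    while frontier and expansions < max_expansions:
        neg_value, _, state, depth = heapq.heappop(frontier)
        if is_terminal(state):
            return state, expansions
        budget = child_budget(-neg_value, depth, max_children)
        for child in propose(state, budget):
            heapq.heappush(frontier, (-value_fn(child), counter, child, depth + 1))
            counter += 1
        expansions += 1
    return None, expansions


def value_fn(state):
    # Stand-in for a value network: closeness to the target, in [0, 1].
    return max(0.0, 1.0 - abs(TARGET - state) / TARGET)


def propose(state, k):
    # Stand-in for sampling k next reasoning steps from an LLM.
    return [state + s for s in (3, 2, 1) if state + s <= TARGET][:k]


def is_terminal(state):
    return state == TARGET


result = guided_search(0, propose, value_fn, is_terminal)
print(result)
```

Under these assumptions the budget shrinks as the search gets deeper and more confident, so most expansions are spent near the root, where uncertainty is highest.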
Community
Now API costs won't be so high for my experiments. I am very happy.
I wonder if this algorithm is also useful for non-LLM search.
Also, will there be a GitHub repo at some point for this? I know the algo is easy to implement but I am still curious
This is what I thought tree of thought should have been. Well done.