arxiv:2508.08401

Mol-R1: Towards Explicit Long-CoT Reasoning in Molecule Discovery

Published on Aug 11 · Submitted by weidawang on Aug 14
#3 Paper of the day
Authors:

Abstract

AI-generated summary: The Mol-R1 framework enhances molecule discovery by improving reasoning performance and explainability through the PRID and MoIA strategies.

Large language models (LLMs), especially Explicit Long Chain-of-Thought (CoT) reasoning models like DeepSeek-R1 and QwQ, have demonstrated powerful reasoning capabilities, achieving impressive performance in commonsense reasoning and mathematical inference. Despite their effectiveness, Long-CoT reasoning models are often criticized for limited ability and low efficiency in knowledge-intensive domains such as molecule discovery. Success in this field requires a precise understanding of domain knowledge, including molecular structures and chemical principles, which is challenging due to the inherent complexity of molecular data and the scarcity of high-quality expert annotations. To bridge this gap, we introduce Mol-R1, a novel framework designed to improve the explainability and reasoning performance of R1-like Explicit Long-CoT reasoning LLMs in text-based molecule generation. Our approach begins with a high-quality reasoning dataset curated through Prior Regulation via In-context Distillation (PRID), a dedicated distillation strategy that generates paired reasoning traces guided by prior regulations. Building on this, we introduce MoIA (Molecular Iterative Adaptation), a training strategy that iteratively combines Supervised Fine-tuning (SFT) with Reinforced Policy Optimization (RPO), tailored to boost the reasoning performance of R1-like reasoning models for molecule discovery. Finally, we evaluate Mol-R1 on the text-based molecule reasoning generation task, showing superior performance over existing baselines.
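
Since the abstract describes PRID only in prose, here is a minimal, self-contained sketch of what prior-regulation-guided in-context distillation could look like: domain rules are placed in the teacher prompt so the distilled reasoning trace stays grounded in chemistry. Everything below is an illustrative assumption; `build_prid_prompt`, `teacher_generate`, and the data layout are hypothetical stand-ins, not the authors' implementation.

```python
# Illustrative sketch only: PRID-style in-context distillation, assuming a teacher
# LLM (e.g. an R1-like model) is available behind `teacher_generate`.
from typing import Callable, Dict, List

def build_prid_prompt(regulations: List[str], description: str, smiles: str) -> str:
    """Compose an in-context distillation prompt guided by prior regulations."""
    rules = "\n".join(f"- {r}" for r in regulations)
    return (
        "You are a chemistry expert. Follow these prior regulations:\n"
        f"{rules}\n\n"
        f"Molecule description: {description}\n"
        f"Target SMILES: {smiles}\n"
        "Explain, step by step, the reasoning that leads from the description "
        "to this molecule, then restate the final SMILES."
    )

def distill_dataset(
    pairs: List[Dict[str, str]],             # each item: {"description": ..., "smiles": ...}
    regulations: List[str],                   # prior regulations (domain rules)
    teacher_generate: Callable[[str], str],   # hypothetical LLM call
) -> List[Dict[str, str]]:
    """Produce (description, reasoning trace, SMILES) triples for later SFT."""
    dataset = []
    for pair in pairs:
        prompt = build_prid_prompt(regulations, pair["description"], pair["smiles"])
        trace = teacher_generate(prompt)
        dataset.append({**pair, "reasoning": trace})
    return dataset
```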

Community

Paper author · Paper submitter · edited 9 days ago

Mol-R1 introduces explicit long chain-of-thought reasoning into molecule generation through Prior Regulation via In-context Distillation (PRID) plus iterative SFT/RPO within MoIA, balancing interpretability and accuracy and achieving robust gains under limited annotations.
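
To make the iterative SFT/RPO idea concrete, here is a minimal sketch of an alternating training loop, assuming PRID-distilled triples as the SFT data. `run_sft` and `run_rpo` are hypothetical placeholders for the actual training routines; this is not the paper's code.

```python
# Illustrative sketch only: a MoIA-style loop that alternates supervised
# fine-tuning with a reinforcement step, so each round's policy seeds the next.
from typing import Callable, Dict, List

def moia_train(
    model,                                        # any trainable policy object
    sft_data: List[Dict[str, str]],               # PRID-distilled (description, reasoning, SMILES) triples
    run_sft: Callable[[object, list], object],    # hypothetical SFT routine
    run_rpo: Callable[[object], object],          # hypothetical reinforced policy optimization routine
    rounds: int = 3,
):
    """Iteratively adapt the model: SFT anchors the explicit reasoning format, RPO refines it."""
    for _ in range(rounds):
        model = run_sft(model, sft_data)   # imitate explicit long-CoT traces
        model = run_rpo(model)             # optimize a molecule validity / fidelity reward
    return model
```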

No pre-existing benchmark used. No comparison to the many non-LLM methods that are SOTA. In practice, models trained from scratch are much better at molecule generation.

Bro, I'd have to push back on that. The whole criticism of not comparing to existing non-LLM methods fundamentally misses the point. Those methods normally can't do text-based molecule generation. I'd argue that this work represents a crucial step forward for the application of long-CoT reasoning in molecule generation, which has significant implications for future research. You can't just compare it with models trained from scratch.


Models citing this paper 0

Datasets citing this paper 0

Spaces citing this paper 0

Collections including this paper 5