whitead committed
Commit d97863d · 1 Parent(s): e4a8589

Added images for repo
.gitattributes CHANGED
@@ -34,3 +34,6 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 tokenizer.json filter=lfs diff=lfs merge=lfs -text
+*.png filter=lfs diff=lfs merge=lfs -text
+*.jpg filter=lfs diff=lfs merge=lfs -text
+*.svg filter=lfs diff=lfs merge=lfs -text
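The three added patterns are what `git lfs track "*.png"` (etc.) would append: any file matching the glob is stored as an LFS pointer. A minimal sketch of how such basename globs select paths, using Python's `fnmatch` as a stand-in for git's attribute matcher (an assumption for illustration, not git's actual implementation):

```python
from fnmatch import fnmatch

# Patterns added to .gitattributes in this commit; matching files use Git LFS.
LFS_PATTERNS = ["*.png", "*.jpg", "*.svg"]

def uses_lfs(path: str) -> bool:
    # A gitattributes pattern without a slash is matched against the basename.
    name = path.rsplit("/", 1)[-1]
    return any(fnmatch(name, pattern) for pattern in LFS_PATTERNS)

print(uses_lfs("images/benchmarks.png"))  # True
print(uses_lfs("README.md"))              # False
```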
README.md CHANGED
@@ -13,6 +13,10 @@ tags:
 
 # ether0
 
+<img src="./images/ether0_logo.svg" width="200">
+
+
+
 ether0 is a 24B language model trained to reason in English and output molecular structures as SMILES.
 It is derived from fine-tuning and reinforcement learning training from Mistral-Small-24B-Instruct-2501.
 Ask questions in English, but they may also include molecules specified as SMILES. The SMILES do not need to be canonical and may contain stereochemistry information.
@@ -39,6 +43,8 @@ It has been trained specifically for these tasks:
 * natural product elucidation (formula + organism to SMILES)
 * blood-brain barrier permeability
 
+<img src="./images/benchmarks.png" width="800">
+
 For example, you can ask "Propose a molecule with a pKa of 9.2" or "Modify CCCCC(O)=OH to increase its pKa by about 1 unit." You cannot ask it "What is the pKa of CCCCC(O)=OH?"
 If you ask it questions that lie significantly beyond those tasks, it can fail. You can combine properties, although we haven't significantly benchmarked this.
 
@@ -50,6 +56,11 @@ For example, we have observed that the model often confuses lysine and glutamic
 
 ## Training data and details
 
+We first pre-trained Mistral-Small-24B-Instruct-2501 on mostly incorrect reasoning traces from DeepSeek r1 to elicit reasoning and follow the new tokens/templates. Next, we ran independent rounds of specialists trained with GRPO and verifiable rewards, each on one of the above tasks. We then aggregated and filtered reasoning traces (correct answers with reasoning) from the specialists to again fine-tune Mistral-Small-24B-Instruct-2501. Then, we did GRPO over all tasks. This last model was then put through safety post-training.
+
+
+<img src="./images/training_info.png" width="800">
+
 See our [preprint](arxiv.org) for details on data and training process.
 
 ## Safety
@@ -57,9 +68,21 @@ See our [preprint](arxiv.org) for details on data and training process.
 We performed refusal post-training for compounds listed on OPCW schedules 1 and 2.
 We also post-trained ether0 to refuse questions about standard malicious topics like making explosives or poisons.
 As the model knows pharmacokinetics, it can modulate toxicity.
-However, the tructure of toxic or narcotic compounds are generally known and thus we do not consider this a safety risk. The model can provide
+However, the structure of toxic or narcotic compounds is generally known, and thus we do not consider this a safety risk. The model can provide
 no uplift on "tacit knowledge" tasks like purification, scale-up, or processing beyond a web search or similar sized language model.
 
+## Citation
+
+```bibtex
+@article{narayanan2025training,
+  title={Training a Scientific Reasoning Model for Chemistry},
+  author={Narayanan, Siddharth M. and Braza, James D. and Griffiths, Ryan-Rhys and Bou, Albert and Wellawatte, Geemi P. and Ramos, Mayk Caldas and Mitchener, Ludovico and Rodriques, Samuel G. and White, Andrew D.},
+  journal={arXiv preprint arXiv:XXXX.XXXXX},
+  year={2025}
+}
+```
+
+
 ## License
 
 Open-weights (Apache 2.0)
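The README's new training paragraph describes a five-stage recipe: trace SFT, per-task GRPO specialists, trace filtering plus a second SFT, all-task GRPO, then safety post-training. That control flow can be sketched as plain Python; `sft` and `grpo` here are hypothetical stubs for illustration, not the authors' training code:

```python
# Toy sketch of the staged recipe from the "Training data and details"
# paragraph. Both helpers are hypothetical stubs, NOT real training code.

def sft(model: str, data: str) -> str:
    """Supervised fine-tuning of `model` on a reasoning-trace dataset."""
    return f"{model}+sft[{data}]"

def grpo(model: str, tasks: list[str]) -> str:
    """GRPO with verifiable rewards for `model` on the given tasks."""
    return f"{model}+grpo[{','.join(tasks)}]"

def train_ether0(tasks: list[str]) -> list[str]:
    base = "Mistral-Small-24B-Instruct-2501"
    stages = []
    # 1. Elicit the reasoning format with distilled r1 traces.
    model = sft(base, "r1_traces")
    stages.append("sft_on_r1_traces")
    # 2. Independent per-task GRPO specialists.
    specialists = [grpo(model, [task]) for task in tasks]
    stages.append(f"grpo_specialists_x{len(specialists)}")
    # 3. Aggregate/filter correct specialist traces; fine-tune the base again.
    model = sft(base, "filtered_specialist_traces")
    stages.append("sft_on_filtered_traces")
    # 4. GRPO over all tasks at once.
    model = grpo(model, tasks)
    stages.append("grpo_all_tasks")
    # 5. Refusal/safety post-training.
    stages.append("safety_post_training")
    return stages

print(train_ether0(["pKa", "solubility"]))
```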
images/benchmarks.png ADDED

Git LFS Details

  • SHA256: a73504ea5a2d775ebc57b01f90d199d9bc76e31b3ff4aaa48df6569ea21aeb65
  • Pointer size: 131 Bytes
  • Size of remote file: 348 kB
images/ether0_logo.svg ADDED

Git LFS Details

  • SHA256: e0867c7e4ab64596bdf22702cf5928c06a31eb793769439eaf676f42ef0b578a
  • Pointer size: 129 Bytes
  • Size of remote file: 7.36 kB
images/training_info.png ADDED

Git LFS Details

  • SHA256: 00385743792f71434cf262a549ce424507adee96096a8b60bbbdc358db702a43
  • Pointer size: 131 Bytes
  • Size of remote file: 587 kB