openai/gpt-oss-20b


danielhanchen committed on Oct 6

Commit a4594b0 · verified · 1 Parent(s): 6cee5e8

Reinforcement Learning example

@dkundel-openai :)

Files changed (1)

README.md +10 -2

@@ -163,9 +163,17 @@ The gpt-oss models are excellent for:
 # Fine-tuning
-Both gpt-oss models can be fine-tuned for a variety of specialized use cases.
-This smaller model `gpt-oss-20b` can be fine-tuned on consumer hardware, whereas the larger [`gpt-oss-120b`](https://huggingface.co/openai/gpt-oss-120b) can be fine-tuned on a single H100 node.
+Both gpt-oss models can be fine-tuned for a variety of specialized use-cases by using [transformers](https://github.com/huggingface/transformers) and [Unsloth](https://docs.unsloth.ai/new/gpt-oss-how-to-run-and-fine-tune).
+This smaller model `gpt-oss-20b` can be fine-tuned on consumer hardware, whereas the larger [`gpt-oss-120b`](https://huggingface.co/openai/gpt-oss-120b) can be fine-tuned on a single H100 GPU.
+You can learn more about fine-tuning gpt-oss from [Hugging Face](https://cookbook.openai.com/articles/gpt-oss/fine-tune-transfomers) or [Unsloth’s guide](https://docs.unsloth.ai/new/gpt-oss-how-to-run-and-fine-tune#fine-tuning-gpt-oss-with-unsloth).
+## Reinforcement Fine-tuning
+You can also train `gpt-oss` with reinforcement learning (RL).
+[OpenAI’s notebook](https://github.com/openai/gpt-oss/blob/main/examples/reinforcement-fine-tuning.ipynb) shows how you can train `gpt-oss-20b` with RL to autonomously solve the 2048 game.
 # Citation
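
Note on the new "Reinforcement Fine-tuning" section: the added text says the notebook trains `gpt-oss-20b` with RL to play 2048, but it does not say what the reward signal looks like. The sketch below is only an illustration of one plausible reward for a single "left" move. It is not taken from the linked notebook, and the names `slide_row_left`, `move_left`, and `reward_for_move`, as well as the -1.0 penalty, are hypothetical.

```python
from typing import List, Tuple

Board = List[List[int]]  # 0 marks an empty cell


def slide_row_left(row: List[int]) -> Tuple[List[int], int]:
    """Slide one 2048 row to the left; return (new_row, points_gained)."""
    tiles = [v for v in row if v != 0]  # drop empty cells
    merged: List[int] = []
    gained = 0
    i = 0
    while i < len(tiles):
        # Merge a pair of equal neighbours once, then skip both tiles.
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            merged.append(tiles[i] * 2)
            gained += tiles[i] * 2
            i += 2
        else:
            merged.append(tiles[i])
            i += 1
    merged += [0] * (len(row) - len(merged))  # pad back to full width
    return merged, gained


def move_left(board: Board) -> Tuple[Board, int]:
    """Apply a left move to every row; return (new_board, total_points)."""
    new_board: Board = []
    total = 0
    for row in board:
        new_row, gained = slide_row_left(row)
        new_board.append(new_row)
        total += gained
    return new_board, total


def reward_for_move(board: Board) -> float:
    """Hypothetical reward: points gained, or -1.0 if the move changes nothing."""
    new_board, gained = move_left(board)
    if new_board == board:
        return -1.0  # assumed penalty for a no-op move
    return float(gained)
```

In an RL setup of this kind, the policy would propose a move, the environment would apply it, and a scalar such as `reward_for_move` would be fed back to the trainer. The actual environment and reward used in the notebook may differ.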