Update README.md
README.md
CHANGED
@@ -2,4 +2,12 @@
 license: apache-2.0
 ---
 
-[Release](https://osmosis.ai/blog/applying-rl-mcp)
+Read our full release here: [Release](https://osmosis.ai/blog/applying-rl-mcp)
+
+Using reinforcement learning, we trained a 4B model that can plug into any MCP client and work with any MCP server.
+
+We trained it with Dr. GRPO on synthetic multi-turn data that requires calls to multiple MCP servers (for example: given the weather in San Francisco, what are the top locations to hike?).
+
+We observe that, after training on this data, the model samples much more predictably and relies more on the available tools than on intuition.
+
+Through this initial training process, we hope to build strong SLMs that can reason their way to a solution when the environment is sufficient, i.e. when the correct tools are available to the model.
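
The README text in this change names Dr. GRPO as the training method but does not show it. As a rough illustration only, the sketch below computes group-relative advantages the way Dr. GRPO is usually described: each sampled response's reward minus the mean reward of its group, without the division by the group's standard deviation that standard GRPO applies. The function names and the toy rewards are hypothetical and are not taken from the Osmosis training code. The other change Dr. GRPO is known for, dropping per-response length normalization in the loss, is not shown here.

```python
from statistics import mean, pstdev


def dr_grpo_advantages(rewards):
    """Group-relative advantages as usually described for Dr. GRPO.

    Hypothetical helper: subtracts the group mean reward and does NOT
    divide by the group's standard deviation.
    """
    baseline = mean(rewards)
    return [r - baseline for r in rewards]


def grpo_advantages(rewards, eps=1e-6):
    """Standard GRPO advantages, shown only for comparison.

    Hypothetical helper: mean-centred rewards divided by the group's
    (population) standard deviation, with eps to avoid division by zero.
    """
    baseline = mean(rewards)
    scale = pstdev(rewards) + eps
    return [(r - baseline) / scale for r in rewards]


if __name__ == "__main__":
    # Toy rewards for four sampled responses to one prompt
    # (e.g. 1.0 when the right MCP tools were called, else 0.0).
    group_rewards = [1.0, 0.0, 0.0, 1.0]
    print("Dr. GRPO:", dr_grpo_advantages(group_rewards))
    print("GRPO:    ", grpo_advantages(group_rewards))
```

Leaving out the standard-deviation division means prompts whose sampled responses all score similarly are not given inflated advantages, which is the motivation usually cited for that change.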