PommesPeter
AI & ML interests
MM-LLM
Recent Activity
updated a model about 5 hours ago: PommesPeter/dp_ckpts
reacted to IliaLarchenko's post 2 days ago
I am presenting the Decoder-Only Transformer (DOT) Policy, a simple behavioral control policy that outperforms SOTA models on two simple benchmark tasks:
- PushT (pushing an object to a goal): 84% success on keypoints, 74% on images (previous best: 75% / 69%)
- ALOHA Insert (precise bimanual insertion): 30% success (previous best: ~21%)
The best part? DOT is much smaller (sometimes 100 times fewer parameters) than previous SOTA models, trains faster, and avoids complexity:
- No generative models (Diffusion, VAE, GANs)
- No discretization/tokenization of actions
- No reinforcement learning or multi-stage training
- Just learns from human demos, plain and simple
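To make the "plain supervised learning" idea concrete, here is a minimal PyTorch sketch of a decoder-only transformer policy trained by behavioral cloning: observations go in, continuous action chunks come out, and the loss is plain regression against human demonstrations. This is not the author's implementation; all dimensions, class names, and the learned-query design are invented for illustration (see the linked repo for the real code).

```python
import torch
import torch.nn as nn

class DOTPolicySketch(nn.Module):
    """Hypothetical sketch: a decoder-only transformer that maps an
    observation history to a chunk of continuous future actions.
    No diffusion, no action tokenization, no RL."""

    def __init__(self, obs_dim=16, act_dim=2, d_model=64, n_layers=2, horizon=8):
        super().__init__()
        self.obs_proj = nn.Linear(obs_dim, d_model)
        # one learned query per future action step
        self.action_queries = nn.Parameter(torch.zeros(horizon, d_model))
        layer = nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, act_dim)

    def forward(self, obs_seq):
        # obs_seq: (batch, T_obs, obs_dim) -> actions: (batch, horizon, act_dim)
        memory = self.obs_proj(obs_seq)
        queries = self.action_queries.unsqueeze(0).expand(obs_seq.size(0), -1, -1)
        return self.head(self.decoder(queries, memory))

# Behavioral cloning step: plain MSE against demonstrated action chunks
policy = DOTPolicySketch()
obs = torch.randn(4, 5, 16)          # batch of observation histories
demo_actions = torch.randn(4, 8, 2)  # matching human demo action chunks
loss = nn.functional.mse_loss(policy(obs), demo_actions)
loss.backward()
```

The point of the sketch is that the whole training signal is one regression loss on continuous actions, which is what removes the generative-model and tokenization machinery listed above.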
This is still early: more complex real-life tasks need testing, and there are no guarantees it will work well there, but I think it's worth sharing. Sometimes simpler approaches can be just as effective (or even better) than complex ones.
Open-source code and detailed description: https://github.com/IliaLarchenko/dot_policy
Trained models on Hugging Face:
https://huggingface.co/IliaLarchenko/dot_pusht_keypoints
https://huggingface.co/IliaLarchenko/dot_pusht_images
https://huggingface.co/IliaLarchenko/dot_bimanual_insert