gchnkang committed on
Commit 62c1c50 · verified · 1 Parent(s): ccff9c5

Update README.md

Files changed (1)
  1. README.md +44 -3
README.md CHANGED
@@ -1,3 +1,44 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ datasets:
+ - clip-rt/modified_libero_hdf5
+ language:
+ - en
+ tags:
+ - robotics
+ - vla
+ - clip
+ - contrastive_learning
+ ---
+
+ # CLIP-RT Finetuned on LIBERO-Goal
+
+ We finetune the original [CLIP-RT model](https://clip-rt.github.io/) with a 300M-parameter action decoder to enable continuous action prediction. This checkpoint is the model finetuned on the [LIBERO](https://libero-project.github.io/main.html) Goal task suite.
+
+ ## Hyperparameters
+
+ | Category             | Details                                                |
+ |----------------------|--------------------------------------------------------|
+ | **Train**            | 8 × H100 GPUs, each with 80 GB VRAM (batch size: 256)  |
+ | **Model size**       | 1.3B (CLIP-RT base + 0.3B action decoder)              |
+ | **Action dimension** | 7D end-effector action × 8 action chunks               |
+ | **Loss**             | L1 regression                                          |
+ | **Epochs**           | 128                                                    |
+ | **Performance**      | 92.2% success rate on the LIBERO-Goal task suite       |
+ | **Throughput**       | 163 Hz                                                 |
+ | **Inference**        | One GPU with 9 GB VRAM                                 |
+
+ ## Usage Instructions
+
+ To evaluate this model on the LIBERO simulator, please refer to the [CLIP-RT GitHub repository](https://github.com/clip-rt/clip-rt/tree/main/libero).
+
+ ## Citation
+
+ ```bibtex
+ @article{kang2024cliprt,
+   title={CLIP-RT: Learning Language-Conditioned Robotic Policies from Natural Language Supervision},
+   author={Kang, Gi-Cheon and Kim, Junghyun and Shim, Kyuhwan and Lee, Jun Ki and Zhang, Byoung-Tak},
+   journal={arXiv preprint arXiv:2411.00508},
+   year={2024}
+ }
+ ```
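
As a supplement to the usage instructions above, here is a minimal sketch of fetching this checkpoint from the Hugging Face Hub before pointing the LIBERO evaluation scripts at it; the `repo_id` used here is a placeholder assumption, so substitute the actual model id shown on this model page.

```python
# Minimal sketch: download the finetuned checkpoint so the LIBERO-Goal
# evaluation scripts from the clip-rt GitHub repository can point to it.
# NOTE: the repo_id below is a placeholder, not a confirmed model id.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="clip-rt/clip-rt-finetuned-libero-goal")
print(f"Checkpoint files downloaded to: {local_dir}")
```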