underthelights committed on
Commit ccff9c5 · verified · 1 Parent(s): a98504e

Upload folder using huggingface_hub

Files changed (2)
  1. README.md +3 -41
  2. cliprt_libero_goal.pt +3 -0
README.md CHANGED
@@ -1,41 +1,3 @@
- ---
- license: mit
- datasets:
- - clip-rt/modified_libero_hdf5
- language:
- - en
- tags:
- - robotics
- - vla
- - clip
- - contrastive_learning
- ---
-
- # CLIP-RT Finetuned on LIBERO-Goal
-
- This model was produced by fine-tuning the [CLIP-RT model](https://clip-rt.github.io/) with a 0.3B parameter action decoder added to enable continuous action prediction on the LIBERO-Goal dataset from the [LIBERO simulation benchmark](https://libero-project.github.io/main.html).
-
- ## Hyperparemeters
-
- | Category | Details |
- |----------------------|---------------------------------------------------------------------|
- | **Hardware** | 8 × H100 GPUs with 80GB memory |
- | **Model size** | 1.3B (CLIP-RT base + 0.3B action decoder) |
- | **Action dimension** | 7D per step × 8 steps (chunked) |
- | **Loss** | L1 regression |
- | **Batch size** | 256 |
- | **Epochs** | 128 |
-
- ## Usage Instructions
- To evaluate this model on the LIBERO simulator or in your own imitation learning pipeline, use the action decoder module with precomputed CLIP image and language embeddings. Refer to the original CLIP-RT GitHub repository for code and inference scripts.
-
- ## Citation
-
- ```bibtex
- @article{kang2024cliprt,
- title={CLIP-RT: Learning Language-Conditioned Robotic Policies from Natural Language Supervision},
- author={Kang, Gi-Cheon and Kim, Junghyun and Shim, Kyuhwan and Lee, Jun Ki and Zhang, Byoung-Tak},
- journal={arXiv preprint arXiv:2411.00508},
- year = {2024}
- }
- ```
+ ---
+ license: mit
+ ---
cliprt_libero_goal.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d0b142300bf8d12afad7a1d6e610de0aa780c72e82e2545ef5f82c587e0cc7bd
+ size 16203732598
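
The added `cliprt_libero_goal.pt` entry is a Git LFS pointer, not the checkpoint itself: it records the SHA-256 digest and byte size (about 16.2 GB) of the real file. A downloaded copy can be checked against those two fields. The following is a minimal sketch of that check, not part of this commit; the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and it assumes the checkpoint has already been downloaded to a local path.

```python
import hashlib
from pathlib import Path

# Pointer contents copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:d0b142300bf8d12afad7a1d6e610de0aa780c72e82e2545ef5f82c587e0cc7bd
size 16203732598
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer (hypothetical helper)."""
    fields = dict(
        line.split(" ", 1) for line in text.splitlines() if line.strip()
    )
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and digest."""
    path = Path(path)
    # Cheap check first: a size mismatch avoids hashing a multi-GB file.
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as f:
        # Stream in chunks so the whole checkpoint never sits in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

With a local copy in place, `matches_pointer("cliprt_libero_goal.pt", parse_lfs_pointer(POINTER_TEXT))` would report whether the download is complete and uncorrupted; the local path here is an assumption.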