guyuchao committed · Commit cca1829 · verified · 1 Parent(s): 433b076

Update README.md

Files changed (1)
  1. README.md +9 -9
README.md CHANGED
@@ -8,9 +8,9 @@ license: mit
8
  <div align="center">
9
 
10
  [![Project Page](https://img.shields.io/badge/Project-Website-orange)](https://farlongctx.github.io/)
11
- [![arXiv](https://img.shields.io/badge/arXiv-2503.19325-b31b1b.svg)](https://arxiv.org/abs/2503.19325)&nbsp;
12
- [![huggingface weights](https://img.shields.io/badge/%F0%9F%A4%97%20Weights-FAR-yellow)](https://huggingface.co/guyuchao/FAR_Models)&nbsp;
13
- [![SOTA](https://img.shields.io/badge/State%20of%20the%20Art-Video%20Generation%20-32B1B4?logo=data%3Aimage%2Fsvg%2Bxml%3Bbase64%2CPHN2ZyB3aWR0aD0iNjA2IiBoZWlnaHQ9IjYwNiIgeG1sbnM9Imh0dHA6Ly93d3cudzMub3JnLzIwMDAvc3ZnIiB4bWxuczp4bGluaz0iaHR0cDovL3d3dy53My5vcmcvMTk5OS94bGluayIgb3ZlcmZsb3c9ImhpZGRlbiI%2BPGRlZnM%2BPGNsaXBQYXRoIGlkPSJjbGlwMCI%2BPHJlY3QgeD0iLTEiIHk9Ii0xIiB3aWR0aD0iNjA2IiBoZWlnaHQ9IjYwNiIvPjwvY2xpcFBhdGg%2BPC9kZWZzPjxnIGNsaXAtcGF0aD0idXJsKCNjbGlwMCkiIHRyYW5zbGF0ZSgxIDEpIj48cmVjdCB4PSI1MjkiIHk9IjY2IiB3aWR0aD0iNTYiIGhlaWdodD0iNDczIiBmaWxsPSIjNDRGMkY2Ii8%2BPHJlY3QgeD0iMTkiIHk9IjY2IiB3aWR0aD0iNTciIGhlaWdodD0iNDczIiBmaWxsPSIjNDRGMkY2Ii8%2BPHJlY3QgeD0iMjc0IiB5PSIxNTEiIHdpZHRoPSI1NyIgaGVpZ2h0PSIzMDIiIGZpbGw9IiM0NEYyRjYiLz48cmVjdCB4PSIxMDQiIHk9IjE1MSIgd2lkdGg9IjU3IiBoZWlnaHQ9IjMwMiIgZmlsbD0iIzQ0RjJGNiIvPjxyZWN0IHg9IjQ0NCIgeT0iMTUxIiB3aWR0aD0iNTciIGhlaWdodD0iMzAyIiBmaWxsPSIjNDRGMkY2Ii8%2BPHJlY3QgeD0iMzU5IiB5PSIxNzAiIHdpZHRoPSI1NiIgaGVpZ2h0PSIyNjQiIGZpbGw9IiM0NEYyRjYiLz48cmVjdCB4PSIxODgiIHk9IjE3MCIgd2lkdGg9IjU3IiBoZWlnaHQ9IjI2NCIgZmlsbD0iIzQ0RjJGNiIvPjxyZWN0IHg9Ijc2IiB5PSI2NiIgd2lkdGg9IjQ3IiBoZWlnaHQ9IjU3IiBmaWxsPSIjNDRGMkY2Ii8%2BPHJlY3QgeD0iNDgyIiB5PSI2NiIgd2lkdGg9IjQ3IiBoZWlnaHQ9IjU3IiBmaWxsPSIjNDRGMkY2Ii8%2BPHJlY3QgeD0iNzYiIHk9IjQ4MiIgd2lkdGg9IjQ3IiBoZWlnaHQ9IjU3IiBmaWxsPSIjNDRGMkY2Ii8%2BPHJlY3QgeD0iNDgyIiB5PSI0ODIiIHdpZHRoPSI0NyIgaGVpZ2h0PSI1NyIgZmlsbD0iIzQ0RjJGNiIvPjwvZz48L3N2Zz4%3D)](https://paperswithcode.com/sota/video-generation-on-ucf-101)
14
 
15
  </div>
16
 
@@ -18,7 +18,7 @@ license: mit
18
  <a href="https://arxiv.org/abs/2503.19325">Long-Context Autoregressive Video Modeling with Next-Frame Prediction</a>
19
  </p>
20
 
21
- ![dmlab_sample](https://github.com/showlab/FAR/blob/main/assets/dmlab_sample.png)
22
 
23
  ## 📢 News
24
 
@@ -31,12 +31,12 @@ license: mit
31
 
32
  FAR (i.e., <u>**F**</u>rame <u>**A**</u>uto<u>**R**</u>egressive Model) learns to predict continuous frames based on an autoregressive context. Its objective aligns well with video modeling, similar to the next-token prediction in language modeling.
33
 
34
- ![dmlab_sample](./assets/pipeline.png)
35
 
36
  ### 🔥 FAR achieves better convergence than video diffusion models with the same continuous latent space
37
 
38
  <p align="center">
39
- <img src="./assets/converenge.jpg" width=55%>
40
  <p>
41
 
42
  ### 🔥 FAR leverages clean visual context without additional image-to-video fine-tuning:
@@ -44,19 +44,19 @@ FAR (i.e., <u>**F**</u>rame <u>**A**</u>uto<u>**R**</u>egressive Model) learns t
44
  Unconditional pretraining on UCF-101 achieves state-of-the-art results in both video generation (context frame = 0) and video prediction (context frame ≥ 1) within a single model.
45
 
46
  <p align="center">
47
- <img src="./assets/performance.png" width=75%>
48
  <p>
49
 
50
  ### 🔥 FAR supports 16x longer temporal extrapolation at test time
51
 
52
  <p align="center">
53
- <img src="./assets/extrapolation.png" width=100%>
54
  <p>
55
 
56
  ### 🔥 FAR supports efficient training on long-video sequence with managable token lengths
57
 
58
  <p align="center">
59
- <img src="./assets/long_short_term_ctx.jpg" width=55%>
60
  <p>
61
 
62
  #### 📚 For more details, check out our [paper](https://arxiv.org/abs/2503.19325).
 
8
  <div align="center">
9
 
10
  [![Project Page](https://img.shields.io/badge/Project-Website-orange)](https://farlongctx.github.io/)
11
+ [![arXiv](https://img.shields.io/badge/arXiv-2503.19325-b31b1b.svg)](https://arxiv.org/abs/2503.19325)
12
+ [![huggingface weights](https://img.shields.io/badge/%F0%9F%A4%97%20Weights-FAR-yellow)](https://huggingface.co/guyuchao/FAR_Models)
13
+ [![SOTA](https://img.shields.io/badge/State%20of%20the%20Art-Video%20Generation%20-32B1B4)](https://paperswithcode.com/sota/video-generation-on-ucf-101)
14
 
15
  </div>
16
 
 
18
  <a href="https://arxiv.org/abs/2503.19325">Long-Context Autoregressive Video Modeling with Next-Frame Prediction</a>
19
  </p>
20
 
21
+ ![dmlab_sample](https://github.com/showlab/FAR/blob/main/assets/dmlab_sample.png?raw=true)
22
 
23
  ## 📢 News
24
 
 
31
 
32
  FAR (i.e., <u>**F**</u>rame <u>**A**</u>uto<u>**R**</u>egressive Model) learns to predict continuous frames based on an autoregressive context. Its objective aligns well with video modeling, similar to the next-token prediction in language modeling.
33
 
34
+ ![dmlab_sample](https://github.com/showlab/FAR/blob/main/assets/pipeline.png?raw=true)
35
 
36
  ### 🔥 FAR achieves better convergence than video diffusion models with the same continuous latent space
37
 
38
  <p align="center">
39
+ <img src="https://github.com/showlab/FAR/blob/main/assets/converenge.jpg?raw=true" width=55%>
40
  <p>
41
 
42
  ### 🔥 FAR leverages clean visual context without additional image-to-video fine-tuning:
 
44
  Unconditional pretraining on UCF-101 achieves state-of-the-art results in both video generation (context frame = 0) and video prediction (context frame ≥ 1) within a single model.
45
 
46
  <p align="center">
47
+ <img src="https://github.com/showlab/FAR/blob/main/assets/performance.png?raw=true" width=75%>
48
  <p>
49
 
50
  ### 🔥 FAR supports 16x longer temporal extrapolation at test time
51
 
52
  <p align="center">
53
+ <img src="https://github.com/showlab/FAR/blob/main/assets/extrapolation.png?raw=true" width=100%>
54
  <p>
55
 
56
  ### 🔥 FAR supports efficient training on long-video sequence with managable token lengths
57
 
58
  <p align="center">
59
+ <img src="https://github.com/showlab/FAR/blob/main/assets/long_short_term_ctx.jpg?raw=true" width=55%>
60
  <p>
61
 
62
  #### 📚 For more details, check out our [paper](https://arxiv.org/abs/2503.19325).
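
Most of the `+` lines in this diff apply one pattern: a relative `./assets/...` image path becomes an absolute GitHub URL with `?raw=true` appended, presumably so the images render when the README is viewed outside the GitHub repository. The following is a minimal sketch of that rewrite, not the method the author used. The helper name `absolutize_assets` and the constant `RAW_BASE` are hypothetical; the base URL is copied from the `+` lines above.

```python
import re

# Assumption: base URL copied from the "+" lines of this diff.
RAW_BASE = "https://github.com/showlab/FAR/blob/main/"

# Matches a relative asset reference such as ./assets/pipeline.png
_ASSET_PATTERN = re.compile(r"\./(assets/[\w.\-]+)")


def absolutize_assets(markdown: str, base: str = RAW_BASE) -> str:
    """Rewrite ./assets/... references to absolute URLs ending in ?raw=true.

    Hypothetical helper for illustration only. It does not touch links
    that are already absolute.
    """
    return _ASSET_PATTERN.sub(
        lambda m: f"{base}{m.group(1)}?raw=true", markdown
    )


if __name__ == "__main__":
    before = '<img src="./assets/performance.png" width=75%>'
    print(absolutize_assets(before))
```

The sketch covers only the relative paths. The `dmlab_sample.png` link was already absolute before this commit and only gained the `?raw=true` suffix, so it would need separate handling.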