Update README.md
README.md CHANGED
@@ -21,16 +21,13 @@ tags:

# Sundial

-Sundial is a familiy of **generative** time series foundation models.
+Sundial is a family of **generative** time series foundation models. The model can make zero-shot predictions for both **point** and **probabilistic** forecasting.

-The
+The base version is pre-trained on **1 trillion** time points with **128M** parameters. For more information, please see this [paper](https://arxiv.org/pdf/2502.00816) and [GitHub](https://github.com/thuml/Sundial).

-
-
-For more information, please see this [paper](https://arxiv.org/pdf/2502.00816) and [GitHub](https://github.com/thuml/Sundial).
-
-
+[figure: overall architecture of Sundial]

+Figure 1. Overall architecture of Sundial. The input time series is divided into patch tokens, which are embedded from the original continuous values. The patch embeddings are fed into a decoder-only Transformer (a stabilized and accelerated variant) that learns token representations via causal self-attention. The model is optimized with our TimeFlow Loss, a parameterized loss function that models the per-token probability distribution conditioned on the learned representations and generates multiple plausible predictions under the flow-matching framework.

# Evaluation

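To make the zero-shot usage described above concrete, here is a minimal sketch of loading the model and sampling forecasts. The checkpoint name `thuml/sundial-base-128m` and the `generate()` keyword arguments (`max_new_tokens`, `num_samples`) are assumptions about how the remote-code model is exposed through `transformers`; the notebook and GitHub repository linked in this README are the authoritative reference.

```python
# Minimal zero-shot forecasting sketch. The repo id and the generate() keyword
# arguments (max_new_tokens, num_samples) are assumptions; consult the linked
# notebook and GitHub repository for the authoritative usage.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "thuml/sundial-base-128m",  # assumed checkpoint name
    trust_remote_code=True,
)

# A batch of univariate lookback windows, shape (batch, context_length).
lookback = torch.randn(2, 2880)

# Sampling several trajectories yields a probabilistic forecast; averaging the
# samples (or taking quantiles across them) yields a point forecast.
forecast = model.generate(lookback, max_new_tokens=96, num_samples=20)
```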
@@ -83,6 +80,7 @@ A notebook example is also provided [here](https://github.com/thuml/Sundial/blob

* Patch Length: 16
* Parameter Count: 128M
* Number of Layers: 12
+* Speedup with KV Cache & FlashAttention

## Acknowledgments

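To make the patch-tokenization step from Figure 1 concrete, the sketch below splits a lookback series into non-overlapping patches of length 16, matching the "Patch Length: 16" specification. It is illustrative only; the shapes and the absence of any normalization or embedding are simplifying assumptions, not the model's internal code.

```python
# Illustrative only: splits a series into non-overlapping patches of length 16,
# mirroring the patch-tokenization step described in Figure 1. This is not the
# model's internal implementation.
import torch

PATCH_LEN = 16  # matches the "Patch Length: 16" specification above

def patchify(series: torch.Tensor) -> torch.Tensor:
    """(batch, length) -> (batch, num_patches, PATCH_LEN); length must divide evenly."""
    batch, length = series.shape
    assert length % PATCH_LEN == 0, "pad or truncate the series to a multiple of 16"
    return series.reshape(batch, length // PATCH_LEN, PATCH_LEN)

tokens = patchify(torch.randn(4, 2880))  # -> (4, 180, 16): 180 patch tokens per series
```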