Bit Transformer Dashboard
Initialize Model
d_model:
nhead:
num_layers:
dim_feedforward:
max_seq_len:
chunk_size:
overlap:
Reversible:
Gradient Checkpointing:
act_threshold:
c_floor:
s_floor:
Init
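The initialization fields above map naturally onto a small encoder-only transformer over a binary vocabulary. A minimal sketch in PyTorch, assuming hypothetical default values for the form fields (the actual dashboard model may additionally use reversible layers and gradient checkpointing, per the toggles above):

```python
import torch
import torch.nn as nn

class BitModel(nn.Module):
    """Sketch of a bit-level transformer built from the Init form fields.

    Hypothetical defaults; chunk_size/overlap and the ACT/floor fields
    from the form are not modeled here.
    """
    def __init__(self, d_model=64, nhead=4, num_layers=2,
                 dim_feedforward=128, max_seq_len=32):
        super().__init__()
        self.embed = nn.Embedding(2, d_model)           # bit vocabulary: 0/1
        self.pos = nn.Embedding(max_seq_len, d_model)   # learned positions
        layer = nn.TransformerEncoderLayer(d_model, nhead, dim_feedforward,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.head = nn.Linear(d_model, 2)               # next-bit logits

    def forward(self, bits):                            # bits: (B, T) int64
        pos = torch.arange(bits.size(1), device=bits.device)
        h = self.embed(bits) + self.pos(pos)
        return self.head(self.encoder(h))
```

Forwarding a batch of shape (B, T) yields per-position logits of shape (B, T, 2).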
Train Step
Bits (e.g. 0 1 0 1):
Upload file:
Train
Load sample dataset:
--Select--
Wikitext-2 (train)
Wikitext-2 (validation)
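The Train form accepts a space-separated bit string such as "0 1 0 1". A small parser sketch for that field (hypothetical helper; the dashboard's real handler presumably also covers uploaded files and the sample datasets):

```python
def parse_bits(text):
    """Parse a space-separated bit string, e.g. "0 1 0 1", into a list of ints.

    Rejects any token other than "0" or "1".
    """
    bits = []
    for tok in text.split():
        if tok not in ("0", "1"):
            raise ValueError(f"invalid bit token: {tok!r}")
        bits.append(int(tok))
    return bits
```

For example, `parse_bits("0 1 0 1")` returns `[0, 1, 0, 1]`.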
Scale Up
Width Mult:
Scale Model
Collapse Submodel
Cluster Bits (JSON array of arrays):
[[0,1,0,1],[1,1,0,0]]
Target Params (JSON):
{"d_model":32,"nhead":4,"num_layers":1,"dim_feedforward":64,"max_seq_len":16}
Width Scale:
Collapse
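The Collapse form takes two JSON payloads, cluster bits and target hyperparameters, plus a width scale. A hedged sketch of how these fields might be parsed and the scale applied (the choice of which parameters count as "width-like" is an assumption):

```python
import json

def parse_collapse_request(cluster_json, target_json, width_scale=1.0):
    """Parse the Collapse form's JSON fields (hypothetical shapes).

    cluster_json: JSON array of bit arrays, e.g. [[0,1,0,1],[1,1,0,0]]
    target_json:  JSON object of submodel hyperparameters
    width_scale:  multiplier applied to the width-like parameters
    """
    clusters = json.loads(cluster_json)
    if any(b not in (0, 1) for row in clusters for b in row):
        raise ValueError("cluster bits must be 0 or 1")
    params = json.loads(target_json)
    for key in ("d_model", "dim_feedforward"):   # assumed width-like fields
        params[key] = int(params[key] * width_scale)
    return clusters, params
```

With the example payloads shown in the form and a width scale of 0.5, the target becomes d_model 16 and dim_feedforward 32.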
Inference
Bits:
Upload file:
Infer
Long Inference
Bits:
ctx_bits:
overlap:
Infer Long
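Long inference presumably splits the input into overlapping windows driven by the ctx_bits and overlap fields, so each window fits the model's context while sharing bits across boundaries. A sketch under that assumption:

```python
def chunk_windows(bits, ctx_bits, overlap):
    """Split a long bit sequence into windows of at most ctx_bits,
    with consecutive windows sharing `overlap` bits.

    Hypothetical chunking; the dashboard may stitch outputs differently.
    """
    if overlap >= ctx_bits:
        raise ValueError("overlap must be smaller than ctx_bits")
    step = ctx_bits - overlap
    return [bits[i:i + ctx_bits]
            for i in range(0, max(len(bits) - overlap, 1), step)]
```

For a 10-bit input with ctx_bits=4 and overlap=2, this yields windows starting at positions 0, 2, 4, and 6.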
Text Inference
Text:
Infer Text
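Text Inference has to convert the text field into bits before the model can consume it. One plausible encoding, 8 bits per UTF-8 byte, most significant bit first, is sketched below (an assumption; the dashboard may use a different text-to-bit mapping):

```python
def text_to_bits(text):
    """Encode text as a flat bit list, 8 bits per UTF-8 byte, MSB first."""
    return [(byte >> i) & 1
            for byte in text.encode("utf-8")
            for i in range(7, -1, -1)]

def bits_to_text(bits):
    """Inverse of text_to_bits; assumes len(bits) is a multiple of 8."""
    data = bytes(
        sum(bit << (7 - i) for i, bit in enumerate(bits[j:j + 8]))
        for j in range(0, len(bits), 8)
    )
    return data.decode("utf-8")
```

The two functions round-trip: decoding the bits of a string returns the original string.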
λ Weights
λ_K:
λ_C:
λ_S:
Update
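The λ_K, λ_C, and λ_S fields presumably weight three telemetry terms in the training objective. A minimal sketch, assuming a simple linear combination with the base task loss (the actual objective may normalize or combine these differently):

```python
def combined_loss(task_loss, k_term, c_term, s_term,
                  lambda_k=1.0, lambda_c=1.0, lambda_s=1.0):
    """Hypothetical objective: task + λ_K·K + λ_C·C + λ_S·S."""
    return task_loss + lambda_k * k_term + lambda_c * c_term + lambda_s * s_term
```

Setting a λ to 0.0 removes that term's influence; raising it penalizes that telemetry signal more strongly.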
Diffusion LM
Enable Diffusion Mode
GPU Acceleration
Enable FSDP & CUDA
Enable Compression
Compress I/O
Ratio:
1.0
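The compression ratio reported above is compressed size over original size, so 1.0 (the default) means no saving. As an illustration only, a run-length scheme over the bit stream, not necessarily the codec the dashboard uses, could be measured like this:

```python
def rle_compress(bits):
    """Run-length encode a bit sequence as [bit, run_length] pairs."""
    runs = []
    for b in bits:
        if runs and runs[-1][0] == b:
            runs[-1][1] += 1
        else:
            runs.append([b, 1])
    return runs

def compression_ratio(bits):
    """Compressed size / original size, counting 2 stored values per run.

    Hypothetical metric; ratios above 1.0 indicate expansion.
    """
    if not bits:
        return 1.0
    return (2 * len(rle_compress(bits))) / len(bits)
```

Long runs compress well (eight identical bits give a ratio of 0.25), while alternating bits expand.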
Quantization Aware Training
Enable 4-bit QAT
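In 4-bit quantization-aware training, the forward pass typically rounds weights onto a 4-bit grid and dequantizes them, so training sees the quantization error. A sketch of that fake-quantization step, assuming a symmetric signed grid of [-8, 7] (the dashboard's QAT scheme may differ in detail):

```python
def fake_quant_4bit(weights):
    """Fake-quantize a weight list onto a symmetric signed 4-bit grid.

    Hypothetical per-tensor scheme: the largest magnitude maps to ±7,
    values are rounded to integers in [-8, 7], then dequantized.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    if max_abs == 0.0:
        return list(weights)
    scale = max_abs / 7.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return [v * scale for v in q]
```

Each output value lies on one of at most 16 levels; the residual between input and output is the quantization error the model learns to tolerate.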
Model Status
Telemetry
Hugging Face Checkpoints
Repo ID:
Token:
Upload weights
Download weights