Progressive Growth Transformers (PGT) [pretrain]
Transformers grown layer-by-layer on frozen embeddings. Explores emergent capabilities with depth.
Collection by Bochkov, 5 days ago
- Bochkov/abs-bvv-6 • Text Generation • Updated 1 day ago • 14
- Bochkov/abs-bvv-5 • Text Generation • Updated 1 day ago • 14
- Bochkov/abs-bvv-4 • Text Generation • Updated 1 day ago • 12
- Bochkov/abs-bvv-3 • Text Generation • Updated 1 day ago • 14
Pro models [pretrain]
Frozen-embedding LMs for English, Russian, Chinese; demonstration & comparison with standard LM.
Collection by Bochkov, 5 days ago
- Bochkov/pro_bvv_en • Text Generation • Updated 1 day ago • 18
- Bochkov/pro_bvv_unfrozen • Text Generation • Updated 1 day ago • 15
- Bochkov/pro_bvv_ru • Text Generation • Updated 1 day ago • 10
- Bochkov/pro_bvv_zh • Text Generation • Updated 1 day ago • 11
Max models [pretrain]
Multilingual language model collection with frozen, unified Unicode-based embeddings. Includes Russian, Chinese, and their MoE fusion.
Collection by Bochkov, 5 days ago
- Bochkov/max_bvv_moe • Text Generation • Updated 1 day ago • 13
- Bochkov/max_bvv_ru • Text Generation • Updated 1 day ago • 21
- Bochkov/max_bvv_zh • Text Generation • Updated 1 day ago • 17
- Emergent Semantics Beyond Token Embeddings: Transformer LMs with Frozen Visual Unicode Representations • Paper • 2507.04886 • Published 9 days ago • 2
Best demo models [pretrain]
Frozen embedding LMs (en/ru/zh) & their MoE fusion. Baselines: frozen vs unfrozen embedding ablation.
Collection by Bochkov, 5 days ago
- Bochkov/best_bvv_moe • Text Generation • Updated 1 day ago • 16
- Bochkov/best_bvv_ru • Text Generation • Updated 1 day ago • 14
- Bochkov/best_bvv_unfrozen_ru • Text Generation • Updated 1 day ago • 18
- Bochkov/best_bvv_zh • Text Generation • Updated 1 day ago • 14
Nemo models [pretrain]
Proof-of-concept: SOTA tokenizers can be used to build compatible precomputed embeddings; industry can repeat this with their own tokenizers.
Collection by Bochkov, 5 days ago
- Bochkov/nemo_bvv_moe • Text Generation • Updated 1 day ago • 13
- Bochkov/nemo_bvv_ru • Text Generation • Updated 1 day ago • 14
- Bochkov/nemo_bvv_zh • Text Generation • Updated 1 day ago • 13
- Emergent Semantics Beyond Token Embeddings: Transformer LMs with Frozen Visual Unicode Representations • Paper • 2507.04886 • Published 9 days ago • 2
Tokenizers
This collection features frozen, precomputed token embedding tensors designed for experimentation with semantic emergence in language models.
Collection by Bochkov, 3 days ago
- Bochkov/bvv241-2-3 • Feature Extraction • Updated 1 day ago • 13
- Bochkov/bvv241-max • Feature Extraction • Updated 1 day ago • 13
- Bochkov/bvv241-nemo • Feature Extraction • Updated 1 day ago • 11
- Bochkov/bvv241-abs • Feature Extraction • Updated 1 day ago • 11