Models (19)

Active filters: custom_generate

transformers-community/custom_generate_example

0.5B • Updated 14 days ago • 2.73k • 2

transformers-community/custom_generate_bad_requirements

0.5B • Updated May 13, 2025 • 13

transformers-community/sink_cache

0.8B • Updated May 27, 2025 • 18 • 2

CMB-AI-LAB/lagkv_cache

Updated Jun 4, 2025

manueldeprada/sampling_with_kvcache

Text Generation • 0.1B • Updated Jun 27, 2025 • 1

manueldeprada/sampling_with_kvcache_hf_helpers

Text Generation • 0.1B • Updated Jun 27, 2025 • 1

manueldeprada/sampling

Text Generation • 0.1B • Updated Jun 27, 2025 • 2

ligongh/squat

Updated Jul 10, 2025 • 2

transformers-community/sep_cache

8B • Updated Aug 4, 2025 • 9 • 9

Gausson/sep_cache

8B • Updated Aug 4, 2025 • 2 • 1

Pramodith/topN_sigma_generation

Text Generation • Updated Aug 5, 2025 • 4 • 2

transformers-community/dola

Text Generation • 0.8B • Updated 14 days ago • 57 • 2

manueldeprada/dola

Text Generation • 0.8B • Updated Aug 25, 2025 • 5

transformers-community/contrastive-search

Text Generation • 0.8B • Updated 14 days ago • 61 • 2

transformers-community/group-beam-search

Text Generation • 0.8B • Updated Nov 28, 2025 • 11

transformers-community/constrained-beam-search

Text Generation • 0.8B • Updated Nov 10, 2025 • 13 • 2

kashif/DeepConf

Text Generation • Updated 18 days ago • 22 • 3

Todokete/xtc

Updated Dec 1, 2025

radia/speculative-cascades

Text Generation • 2B • Updated Jan 5 • 9