Models: 12

Active filters: preference-alignment

Txoka/GLEAM-Mixtral-8x7B-Instruct-v2.0

Text Generation • 47B • Updated Jun 19, 2024 • 2

ASethi04/meta-llama-Llama-3.1-8B-Instruct-dpo-HumanLLMs-Human-Like-DPO-Dataset-first-2-5e-06

ASethi04/meta-llama-Llama-3.1-8B-Instruct-dpo-HumanLLMs-Human-Like-DPO-Dataset-second-2-5e-06

ASethi04/meta-llama-Llama-3.1-8B-Instruct-dpo-HumanLLMs-Human-Like-DPO-Dataset-third-2-5e-06

ASethi04/meta-llama-Llama-3.1-8B-Instruct-dpo-trl-lib-ultrafeedback_binarized-first-2-5e-06

ASethi04/meta-llama-Llama-3.1-8B-Instruct-dpo-trl-lib-ultrafeedback_binarized-second-2-5e-06

ASethi04/meta-llama-Llama-3.1-8B-Instruct-dpo-trl-lib-ultrafeedback_binarized-third-2-5e-06

ASethi04/meta-llama-Llama-3.1-8B-Instruct-dpo-abacusai-MetaMath_DPO_FewShot-third-2-5e-06

ASethi04/meta-llama-Llama-3.1-8B-Instruct-dpo-abacusai-MetaMath_DPO_FewShot-second-2-5e-06

ASethi04/meta-llama-Llama-3.1-8B-Instruct-dpo-abacusai-MetaMath_DPO_FewShot-first-2-5e-06

Shekswess/trlm-stage-3-dpo-final-2

Text Generation • 0.1B • Updated 25 days ago • 91 • 1

Shekswess/trlm-135m

Text Generation • 0.1B • Updated 24 days ago • 1.45k • 43