Active filters: rlhf
sileod/deberta-v3-base-tasksource-nli • Zero-Shot Classification • 1.49M • 127
argilla/distilabeled-OpenHermes-2.5-Mistral-7B • Text Generation • 32 • 33
mlabonne/NeuralBeagle14-7B • Text Generation • 125 • 157
zhuohaoyu/RewardAnything-8B-v1 • Text Generation • 4 • 1
WisdomShell/RewardAnything-8B-v1 • Text Generation • 1 • 1
stanfordnlp/SteamSHP-flan-t5-xl • Text2Text Generation • 43 • 43
stanfordnlp/SteamSHP-flan-t5-large • Text2Text Generation • 63 • 33
trl-lib/llama-7b-se-peft
sileod/deberta-v3-large-tasksource-nli • Zero-Shot Classification • 1.72k • 36
sileod/deberta-v3-large-tasksource-rlhf-reward-model • Text Classification • 24 • 11
trl-lib/llama-7b-se-rl-peft • 103
trl-lib/llama-7b-se-rm-peft
toloka/gpt2-large-rl-prompt-writing • Text Generation • 17 • 3
AdamG012/chat-opt-1.3b-rlhf-actor-deepspeed • Text Generation • 26 • 5
AdamG012/chat-opt-1.3b-rlhf-critic-deepspeed • Text Generation • 19 • 3
AdamG012/chat-opt-1.3b-rlhf-actor-ema-deepspeed • Text Generation • 22 • 8
sileod/mdeberta-v3-base-tasksource-nli • Zero-Shot Classification • 80 • 18
agi-css/socially-good-lm • Text Generation • 11 • 5
agi-css/hh-rlhf-sft • Text Generation • 716 • 3
agi-css/better-base • Text Generation • 22 • 5
argilla/roberta-base-reward-model-falcon-dolly • Text Classification • 37 • 4
merve/peft-copy-test • Text Generation • 3
PKU-Alignment/beaver-7b-v1.0 • Reinforcement Learning • 13 • 11
lyogavin/Anima33B-DPO-Belle-1k • Text Generation • 1
lyogavin/Anima33B-DPO-Belle-1k-merged • Text Generation • 8 • 12
PKU-Alignment/beaver-7b-v1.0-reward • Reinforcement Learning • 5.9k • 17
PKU-Alignment/beaver-dam-7b • 2.38k • 9
PKU-Alignment/beaver-7b-v1.0-cost • Reinforcement Learning • 5.79k • 10
Ablustrund/moss-rlhf-reward-model-7B-zh • 5 • 23
fnlp/moss-rlhf-reward-model-7B-en