Why Not α-Entmax? — A Learnable Sparse Alternative to Softmax in Attention
#88 opened about 23 hours ago
by
mc112611
llama-4-scout
#87 opened 3 days ago
by
guptashailender
Output decode
#86 opened 8 days ago
by
kajalnegi
ConnectionReset error
#85 opened 8 days ago
by
kajalnegi
Bug: Llama4 Multimodal (Llama4ForConditionalGeneration) Fails with Optimized Attention (sdpa, eager) and KV Cache for Effective Sequence Lengths > attention_chunk_size (8192)
#84 opened 12 days ago
by
Peter-233234
ValueError
#82 opened 17 days ago
by
bijays09
Update processor_config.json
#81 opened 20 days ago
by
Akshay47

Update config.json
#80 opened 20 days ago
by
Akshay47

Please Check Your Access Requests
➕
2
8
#76 opened 26 days ago
by
omerdemirugm
Student Request for Llama-4-Scout-17B-16E-Instruct
3
#75 opened 27 days ago
by
garytzehay
Commercial license?
#74 opened 30 days ago
by
Blazgo

Student Request for Llama-4 series
2
#73 opened about 1 month ago
by
Nckuhsu
Please create a smaller reasoning model.
#72 opened about 1 month ago
by
ZeroWw
Gated Repo Permission Still Pending for Llama-4
👀
3
4
#71 opened about 1 month ago
by
brando

Gated Repo Permission Still Pending for Llama-4
👀
3
2
#70 opened about 1 month ago
by
bryka

Gated Repo Permission Still Pending for Llama-4
#69 opened about 1 month ago
by
brando

Gated Repo Permission Still Pending for Llama-4
😔
➕
3
4
#68 opened about 1 month ago
by
bitmman-nch
World's Largest Dataset
#67 opened about 2 months ago
by
UJJAWAL-TYAGI

Is it possible to reduce the number of llama4 expert models to use less memory?
#65 opened about 2 months ago
by
gukui

Does Llama4 have chunked attention in the generation phase?
1
#64 opened about 2 months ago
by
vanshils
The "force_words_ids" does not seem to be available on llama4
#63 opened about 2 months ago
by
nlp-g
Access Rejected
3
#62 opened about 2 months ago
by
ansenang

Less Knowledge Than Llama 3.3 70b?
👀
2
5
#60 opened about 2 months ago
by
phil111
No attribute `sliding_window`?
2
#59 opened about 2 months ago
by
farzadab

Any luck doing inference in 8xA100?
5
#57 opened about 2 months ago
by
taytun
Fine-tuning with BitsAndBytes
#56 opened about 2 months ago
by
arnavgrg

Update config.json -- important default parameters were left out from the config
1
#55 opened about 2 months ago
by
mdabbah-nvidia

vLLM not loading meta-llama/Llama-4-Scout-17B-16E-Instruct
🔥
1
3
#53 opened about 2 months ago
by
alokkrsahu
13B and 34B Pleeease!!! Most people cannot even run this.
👍
❤️
4
4
#52 opened about 2 months ago
by
UniversalLove333
🍭Llama4 SFT Training Script
❤️
🚀
2
#47 opened about 2 months ago
by
study-hjt

Max Output Tokens of Llama-4
#46 opened about 2 months ago
by
MengboZhou
[Issue report] missing keys in the json files
👍
9
4
#45 opened about 2 months ago
by
ShervinGhasemlou

access denied
👍
👀
8
1
#44 opened about 2 months ago
by
qulong
FP8 weights
4
#41 opened about 2 months ago
by
getfit

Update README.md
#40 opened about 2 months ago
by
mfarre

Update README.md
🔥
1
#39 opened about 2 months ago
by
mfarre

torch compile compatibility issue
👍
6
#38 opened about 2 months ago
by
axiomlab
SageMaker - How to test the image multimodal?
#37 opened about 2 months ago
by
TheSuperAgent
Access request got denied
👍
👀
6
13
#35 opened about 2 months ago
by
migtissera

Deploying production ready Llama-4 models on your AWS with vLLM
❤️
🔥
3
1
#34 opened about 2 months ago
by
agam30

Unethical comparisons with DeepSeek replacing Chinese languages by Thai/Vietnamese only
🔥
❤️
9
5
#32 opened 2 months ago
by
krustik
Request: DOI
#31 opened 2 months ago
by
ylx2ai
Couldn't connect
#30 opened 2 months ago
by
Wouze

Ridiculous demands for model gate.
🚀
🔥
9
2
#29 opened 2 months ago
by
marksverdhei

Llama 4 - open-source fine-tuning script
🔥
6
#27 opened 2 months ago
by
hiyouga

Bug in AutoModel
👍
1
3
#26 opened 2 months ago
by
random-checkin

pad error
➕
👍
7
8
#25 opened 2 months ago
by
bobber
AWQ version?
👍
17
#24 opened 2 months ago
by
devops724
Object Detection?
5
#23 opened 2 months ago
by
buckeye17-bah