Michael Goin
mgoin
AI & ML interests
LLM inference optimization, compression, quantization, pruning, distillation
Recent Activity
upvoted a paper 7 days ago: SVD-Free Low-Rank Adaptive Gradient Optimization for Large Language Models
updated a model 7 days ago: RedHatAI/DeepSeek-R1-0528-quantized.w4a16
published a model 10 days ago: RedHatAI/granite-3.1-8b-instruct-GGUF
mgoin's activity
How should I input the image? (1) — #3 opened 26 days ago by CyberWolf0
Can't start it with vllm serve (1) — #2 opened 2 months ago by VenomEY
Fix processor_class to match upstream — #4 opened about 2 months ago by zifeitong
Remove image_processor_type — #1 opened 2 months ago by pooya-davoodi-parasail
OSError: nm-testing/Llama-3_1-Nemotron-Ultra-253B-v1-FP8-dynamic does not appear to have a file named decilm.py (2) — #2 opened about 2 months ago by TheDrummer
how to deploy this model without internet connection (1) — #1 opened about 2 months ago by superahn
Why not FP8 with static and per-tensor quantization? 👍 1 (1) — #2 opened about 2 months ago by wanzhenchn
Address discrepancies in the languages supported by the Mistral Small 3.1 2503 🔥 1 (3) — #54 opened 2 months ago by fpaupier
Please update the chat template (1) — #1 opened 2 months ago by stelterlab
FP8 Dynamic/W8A16 Quants Please (4) — #44 opened 2 months ago by rjmehta
Problem hosting the model using vllm ➕ 3 (4) — #45 opened 2 months ago by ShaoServient
Remove image_processor_type — #1 opened 3 months ago by pooya-davoodi-parasail
Remove image_processor_type (1) — #1 opened 3 months ago by pooya-davoodi-parasail
Remove image_processor_type — #2 opened 3 months ago by pooya-davoodi-parasail
Use Qwen2VLImageProcessor for image_processor_type (5) — #2 opened 3 months ago by pooya-davoodi-parasail
Use Qwen2VLImageProcessor for image_processor_type — #3 opened 3 months ago by pooya-davoodi-parasail
when i use vllm v0.7.2 to deploy r1 awq, i got empty content (13) — #10 opened 4 months ago by bupalinyu
MLA is not supported with moe_wna16 quantization. Disabling MLA. (5) — #7 opened 4 months ago by AMOSE
compressed-tensors MLA support requires fp8 activations and weights in group 'group_0', (2) — #1 opened 4 months ago by samos123