Mayank Mishra

mayank-mishra

AI & ML interests

Large Language Models, Distributed Training and Inference

Organizations

IBM, BigCode, Aurora-M/MDEL, Blog-explorers, Aurora-M, IBM Granite, IBM Research

mayank-mishra's activity

New activity in ibm-granite/granite-3.1-2b-instruct 5 months ago
New activity in ibm-granite/granite-3.0-2b-instruct 8 months ago
New activity in ibm-granite/granite-3.0-8b-instruct 8 months ago

add base model metadata

#5 opened 8 months ago by davanstrien
New activity in ibm-granite/granite-3.0-1b-a400m-instruct 8 months ago

Add base model metadata

#2 opened 8 months ago by davanstrien
New activity in ibm-research/PowerMoE-3b 9 months ago
New activity in cfahlgren1/model-release-heatmap 10 months ago

Add IBM

3
#5 opened 10 months ago by mayank-mishra
New activity in ibm-granite/granite-8b-code-instruct-128k 10 months ago

Fix: link to 128k paper

1
#1 opened 10 months ago by timrbula
New activity in meta-llama/Llama-3.1-405B 11 months ago

405B or 410B ?

2
#8 opened 11 months ago by alielfilali01
New activity in ibm-granite/granite-3b-code-instruct-2k 11 months ago
New activity in ibm-granite/granite-8b-code-instruct-4k 12 months ago

Input context length

3
#6 opened 12 months ago by dyoung
New activity in ibm-granite/granite-8b-code-instruct-4k about 1 year ago

Official quants?

3
#2 opened about 1 year ago by joshuaturner
New activity in ibm-granite/granite-3b-code-instruct-2k-GGUF about 1 year ago
New activity in ibm-granite/granite-3b-code-base-2k about 1 year ago

Release GGUF models?

3
#5 opened about 1 year ago by CosmicSound
New activity in ibm-granite/granite-20b-code-base-8k-GGUF about 1 year ago