AI & ML interests

JAX, Flax, TPU, 🤗

Recent Activity

flax-community's activity

stefan-it 
posted an update 2 months ago
My latest project is the outcome of the last 2+ years working with TPUs from the amazing TPU Research Cloud (TRC) program and training Encoder-only LMs with the TensorFlow Model Garden library.

👉 Link: https://github.com/stefan-it/model-garden-lms

An overview of some features:

- Cheatsheet for setting up a TPU VM Pod (with all necessary dependencies) to pretrain LMs with TF Model Garden
- Conversion scripts that convert TF Model Garden weights to Hugging Face Transformers-compatible models
- Supported architectures include BERT, BERT with Token Dropping and TEAMS

I also released BERT-based models pretrained on the great Hugging Face FineWeb and FineWeb-Edu datasets (10BT subset). With more to come!

👉 Model Hub Link: https://huggingface.co/model-garden-lms

If you find these resources useful, please give them a like!

Made from Bavarian Oberland with ❤️ and 🥨.
christopher 
posted an update 3 months ago
The folks at Foursquare released a dataset of 104.5 million places of interest (foursquare/fsq-os-places), and here's all of them on a plot.
christopher 
posted an update 3 months ago
nbroad 
posted an update 4 months ago
hi florent and livestream!
christopher 
posted an update 6 months ago
4 million chess puzzles
morgan 
posted an update 7 months ago
Llama 3.1 405B Instruct beats GPT-4o on MixEval-Hard

Just ran MixEval for 405B, Sonnet-3.5 and 4o, with 405B landing right between the other two at 66.19

The GPT-4o result of 64.7 replicated locally but Sonnet-3.5 actually scored 70.25/69.45 in my replications 🤔 Still well ahead of the other 2 though.

Sample of one of the eval calls here: https://wandb.ai/morgan/MixEval/weave/calls/07b05ae2-2ef5-4525-98a6-c59963b76fe1

Quick auto-logging tracing for openai-compatible clients and many more here: https://wandb.github.io/weave/quickstart/