fahrizalfarid

akahana

AI & ML interests

NLP

Recent Activity

upvoted an article 1 day ago
Open R1: Update #3

Organizations

None yet

akahana's activity

upvoted an article 1 day ago: Open R1: Update #3
reacted to lewtun's post with πŸ”₯ 1 day ago
Introducing OlympicCoder: a series of open reasoning models that can solve olympiad-level programming problems πŸ§‘β€πŸ’»

- 7B open-r1/OlympicCoder-7B
- 32B open-r1/OlympicCoder-32B

We find that the OlympicCoder models outperform Claude 3.7 Sonnet, as well as models over 100x larger πŸ’ͺ

Together with the models, we are releasing:

πŸ“Š CodeForces-CoTs: a new dataset of code problems from the most popular competitive coding platform, with R1 traces in C++ and Python: open-r1/codeforces-cots

πŸ† IOI'2024: a new benchmark of VERY hard programming problems where even frontier models struggle to match human performance open-r1/ioi

For links to the models and datasets, check out our latest progress report from Open R1: https://huggingface.co/blog/open-r1/update-3
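
As a rough, unofficial sketch (the prompt and generation settings below are just illustrative, not from the post), trying the 7B model with the standard transformers chat API might look something like this:

```python
# Hedged sketch: load OlympicCoder-7B with the standard transformers API.
# The model ID comes from the post above; the prompt is a made-up example.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "open-r1/OlympicCoder-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Write a C++ function that checks whether a string is a palindrome."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```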
reacted to prithivMLmods's post with πŸ€— 1 day ago
reacted to tomaarsen's post with ❀️ 2 days ago
An assembly of 18 European companies, labs, and universities has banded together to launch πŸ‡ͺπŸ‡Ί EuroBERT! It's a state-of-the-art multilingual encoder covering 15 European and other widely spoken languages, designed to be finetuned for retrieval, classification, etc.

πŸ‡ͺπŸ‡Ί 15 Languages: English, French, German, Spanish, Chinese, Italian, Russian, Polish, Portuguese, Japanese, Vietnamese, Dutch, Arabic, Turkish, Hindi
3️⃣ 3 model sizes: 210M, 610M, and 2.1B parameters - very useful sizes in my opinion
➑️ Sequence length of 8192 tokens! Nice to see these higher sequence lengths for encoders becoming more common.
βš™οΈ Architecture based on Llama, but with bi-directional (non-causal) attention to turn it into an encoder. Flash Attention 2 is supported.
πŸ”₯ A new Pareto frontier (stronger *and* smaller) for multilingual encoder models
πŸ“Š Evaluated against mDeBERTa, mGTE, XLM-RoBERTa for Retrieval, Classification, and Regression (after finetuning for each task separately): EuroBERT punches way above its weight.
πŸ“ Detailed paper with all details, incl. data: FineWeb for English and CulturaX for multilingual data, The Stack v2 and Proof-Pile-2 for code.

Check out the release blogpost here: https://huggingface.co/blog/EuroBERT/release
* EuroBERT/EuroBERT-210m
* EuroBERT/EuroBERT-610m
* EuroBERT/EuroBERT-2.1B

The next step is for researchers to build upon the 3 EuroBERT base models and publish strong retrieval, zero-shot classification, etc. models for all to use. I'm very much looking forward to it!
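
As a small, unofficial sketch of what getting embeddings from one of the base checkpoints could look like (the trust_remote_code flag for the custom Llama-based encoder architecture is an assumption; check the model card):

```python
# Hedged sketch: encode a sentence with EuroBERT-210m.
# trust_remote_code=True is assumed here because the architecture is custom (Llama-based encoder).
from transformers import AutoModel, AutoTokenizer

model_id = "EuroBERT/EuroBERT-210m"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)

text = "EuroBERT est un encodeur multilingue."  # any of the 15 supported languages
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch_size, sequence_length, hidden_size)
```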
published a model 4 days ago
reacted to Undi95's post with ❀️ 6 days ago
Hi there!

If you want to create your own thinking model or build a better MistralThinker, I just made public my entire dataset (generated with DeepSeek R1) and the axolotl config.

Axolotl config : Undi95/MistralThinker-v1.1

The dataset : Undi95/R1-RP-ShareGPT3

You can also read everything I did in the two Discord screenshots from two days ago; I'm a little lazy to rewrite it all, kek.

Hope you will use them!
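
As a small, unofficial sketch, pulling the dataset down with the datasets library might look like this (the "train" split name is an assumption; check the dataset card for the actual splits):

```python
# Hedged sketch: inspect the released dataset with the Hugging Face datasets library.
from datasets import load_dataset

dataset = load_dataset("Undi95/R1-RP-ShareGPT3", split="train")
print(dataset)      # number of rows and column names
print(dataset[0])   # first ShareGPT-style conversation record
```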
Β·