Abdullah Abdelrhim

abdullah · abodacs

AI & ML interests

None yet

Recent Activity

liked a dataset about 1 hour ago

open-thoughts/OpenThoughts3-1.2M

upvoted a paper 4 days ago

HardTests: Synthesizing High-Quality Test Cases for LLM Coding

liked a model 6 days ago

mohamed2811/Muffakir_Embedding_V2

Organizations

abdullah's activity

upvoted a paper 4 days ago

HardTests: Synthesizing High-Quality Test Cases for LLM Coding

Paper • 2505.24098 • Published 8 days ago • 41

upvoted an article 14 days ago

Falcon-Arabic: A Breakthrough in Arabic Language Models

Article • By … and 7 others • 16 days ago • 30

upvoted a paper 14 days ago

Fixing Data That Hurts Performance: Cascading LLMs to Relabel Hard Negatives for Robust Information Retrieval

Paper • 2505.16967 • Published 15 days ago • 22

upvoted 2 collections 25 days ago

U-MATH and μ-MATH - University-level math evaluation

Paper: A UNIVERSITY-LEVEL BENCHMARK FOR EVALUATING MATHEMATICAL SKILLS IN LLMS • 4 items • Updated Jan 14 • 17

SARD: Synthetic Arabic Recognition Dataset

A large-scale synthetic Arabic OCR dataset comprising 843,622 book-style document images across 10 fonts, designed to advance VLM for Arabic Texts • 2 items • Updated 18 days ago • 3

upvoted an article 26 days ago

LeRobot Community Datasets: The “ImageNet” of Robotics — When and How?

Article • By … and 6 others • 27 days ago • 57

upvoted a paper 29 days ago

ZeroSearch: Incentivize the Search Capability of LLMs without Searching

Paper • 2505.04588 • Published 30 days ago • 64

upvoted 4 papers about 1 month ago

TF1-EN-3M: Three Million Synthetic Moral Fables for Training Small, Open Language Models

Paper • 2504.20605 • Published Apr 29 • 13

Sadeed: Advancing Arabic Diacritization Through Small Language Model

Paper • 2504.21635 • Published Apr 30 • 59

Kuwain 1.5B: An Arabic SLM via Language Injection

Paper • 2504.15120 • Published Apr 21 • 120

A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis

Paper • 2504.12322 • Published Apr 11 • 28

upvoted an article about 1 month ago

I trained a Language Model to schedule events with GRPO!

Article • Apr 29 • 76

upvoted a collection about 1 month ago

Arabic Speech Datasets

Best Datasets for Arabic Speech Tasks • 3 items • Updated May 3 • 1

upvoted 2 papers about 1 month ago

Phi-4-reasoning Technical Report

Paper • 2504.21318 • Published Apr 30 • 47

WebThinker: Empowering Large Reasoning Models with Deep Research Capability

Paper • 2504.21776 • Published Apr 30 • 56

upvoted 2 collections about 1 month ago

Phi-4 (All Versions)

Microsoft's Phi-4 models including Reasoning + Reasoning Plus & mini. Includes Dynamic 2.0 GGUF, 4-bit & 16-bit versions. Includes Unsloth's bug fixes • 20 items • Updated 7 days ago • 70

Tulu 3 Datasets

All datasets released with Tulu 3 -- state of the art open post-training recipes. • 33 items • Updated Apr 30 • 83

upvoted 2 papers about 1 month ago

SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement

Paper • 2504.07934 • Published Apr 10 • 18

The Bitter Lesson Learned from 2,000+ Multilingual Benchmarks

Paper • 2504.15521 • Published Apr 22 • 64

upvoted an article about 1 month ago

Reasoning Datasets Competition

Article • By … and 6 others • Apr 9 • 37