Anthonny OLIME

Citaman

AI & ML interests

None yet

Recent Activity

upvoted an article about 1 month ago

Open-R1: Update #1

updated a collection about 1 month ago

Keep in Mind's Model

updated a collection about 1 month ago

omni models

Citaman's activity

upvoted an article about 1 month ago

Open-R1: Update #1

Article • and 7 others • Feb 2 • 293

upvoted a paper about 1 month ago

Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models

Paper • 2501.11873 • Published Jan 21 • 63

upvoted an article about 1 month ago

Welcome to Inference Providers on the Hub 🔥

Article • Jan 28 • 417

upvoted a paper about 1 month ago

Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling

Paper • 2501.11651 • Published Jan 20 • 1

upvoted a collection about 1 month ago

ProLIP

Collection

Official ProLIP weights • 4 items • Updated Dec 9, 2024 • 6

upvoted an article 6 months ago

Token Merging for fast LLM inference : Background and first trials with Mistral

Article • Apr 30, 2024 • 4

upvoted a paper 7 months ago

Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing

Paper • 2406.08464 • Published Jun 12, 2024 • 67

upvoted an article 8 months ago

How I train a LoRA: m3lt style training overview

Article • Jul 1, 2024 • 49

upvoted a paper 8 months ago

Scaling Synthetic Data Creation with 1,000,000,000 Personas

Paper • 2406.20094 • Published Jun 28, 2024 • 97

upvoted 2 papers 9 months ago

XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning

Paper • 2406.08973 • Published Jun 13, 2024 • 87

Needle In A Multimodal Haystack

Paper • 2406.07230 • Published Jun 11, 2024 • 53

upvoted a collection 9 months ago

Universal token classification

Collection

Collection of universal token classification (UTC) models that can solve many information extraction tasks in a prompt-tuned manner. • 11 items • Updated Sep 10, 2024 • 12

upvoted a paper 9 months ago

Yuan 2.0-M32: Mixture of Experts with Attention Router

Paper • 2405.17976 • Published May 28, 2024 • 20

upvoted 2 papers 10 months ago

ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models

Paper • 2405.15738 • Published May 24, 2024 • 46

Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models

Paper • 2405.15574 • Published May 24, 2024 • 55

upvoted an article 10 months ago

GPU Poor Savior: Revolutionizing Low-Bit Open Source LLMs and Cost-Effective Edge Computing

Article • May 25, 2024 • 10

upvoted 2 articles 11 months ago

Transformers

Article • Jul 2, 2024 • 7

Diffusion Models

Article • May 19, 2024 • 15

upvoted 2 papers 12 months ago

The Unreasonable Ineffectiveness of the Deeper Layers

Paper • 2403.17887 • Published Mar 26, 2024 • 79

Gamba: Marry Gaussian Splatting with Mamba for single view 3D reconstruction

Paper • 2403.18795 • Published Mar 27, 2024 • 20