cs-fxr (fxrc)

upvoted a collection 7 months ago

GLM-4.5

GLM-4.5: An open-source large language model designed for intelligent agents by Z.ai • 8 items • Updated 1 day ago • 252

upvoted a paper 8 months ago

Harnessing the Universal Geometry of Embeddings

Paper • 2505.12540 • Published May 18, 2025 • 9

upvoted an article 8 months ago

Efficient MultiModal Data Pipeline

Article • Jul 8, 2025 • 70

upvoted a paper 9 months ago

Saffron-1: Towards an Inference Scaling Paradigm for LLM Safety Assurance

Paper • 2506.06444 • Published Jun 6, 2025 • 73

upvoted an article 9 months ago

Mixture of Experts Explained

Article • Dec 11, 2023 • 1.09k

upvoted a paper 10 months ago

Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models

Paper • 2505.10554 • Published May 15, 2025 • 120

upvoted an article 10 months ago

The Transformers Library: standardizing model definitions

Article • May 15, 2025 • 121

upvoted 2 papers 11 months ago

Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published Dec 9, 2024 • 94

Transformers without Normalization

Paper • 2503.10622 • Published Mar 13, 2025 • 170

upvoted an article 12 months ago

FastRTC: The Real-Time Communication Library for Python

Article • Feb 25, 2025 • 172

upvoted a paper about 1 year ago

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 141

upvoted 2 articles about 1 year ago

Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA

Article • May 24, 2023 • 175

Open-source DeepResearch – Freeing our search agents

Article • Feb 4, 2025 • 1.32k

upvoted a paper about 1 year ago

On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes

Paper • 2306.13649 • Published Jun 23, 2023 • 31

fxrc

AI & ML interests

Organizations


cs-fxr's activity
