tokenbender

TokenBender


https://tokenbender.com

AI & ML interests

Fine-tuning small, useful models, building datasets, and anything related to local LLM hosting and serving.

Recent Activity

updated a dataset 11 days ago

TokenBender/lin-alg-kernels-core

updated a dataset 15 days ago

TokenBender/circuit-discovery

updated a model 16 days ago

TokenBender/doramuon-trinity-nano-base-gsm8k-adamw-lora-8xh200



upvoted a collection about 1 year ago

Llama Nemotron

Collection

Open, Production-ready Enterprise Models • 12 items • Updated 18 days ago • 79

upvoted a paper about 1 year ago

Agentic Reasoning and Tool Integration for LLMs via Reinforcement Learning

Paper • 2505.01441 • Published Apr 28, 2025 • 39

upvoted an article over 1 year ago

Article

Releasing the largest multilingual open pretraining dataset

Pclanglais • Nov 13, 2024 • 108

upvoted an article almost 2 years ago

Article

Introduction to ggml

ngxson, ggerganov, slaren • Aug 13, 2024 • 295

upvoted a collection almost 2 years ago

Gemma 2 2B Release

Collection

The 2.6B parameter version of Gemma 2. • 6 items • Updated Mar 12 • 85

upvoted a paper about 2 years ago

Instruction Pre-Training: Language Models are Supervised Multitask Learners

Paper • 2406.14491 • Published Jun 20, 2024 • 96

upvoted an article about 2 years ago

Article

StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation

yuxiang630, cassanof, ganler, YifengDing, StringChaos, harmdevries, lvwerra, arjunguha, lingming • Apr 29, 2024 • 80

upvoted 2 papers almost 3 years ago

LongNet: Scaling Transformers to 1,000,000,000 Tokens

Paper • 2307.02486 • Published Jul 5, 2023 • 82

MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers

Paper • 2305.07185 • Published May 12, 2023 • 10
