Argilla

company

https://www.argilla.io/

argilla_io

argilla-io


AI & ML interests

LLMs, NLP, Alignment, DPO, RLHF, data labeling, text-classification, text-generation, token-classification

argilla's collections: 10

Synthetic Data Generator

A collection of tools and datasets related to no-code synthetic data generation.

AutoTrain Advanced 🚀

Running • 118 • Create powerful AI models without code
Transformers Pipeline Playground 🐇

Sleeping • 9 • Search, load and play with transformer pipelines
Synthetic Data Generator Argilla Reviewer 🧬

Running • 7 • Review datasets created with the Synthetic Data Generator
argilla/synthetic-sft-customer-support-single-turn

Viewer • Updated Dec 11, 2024 • 100 • 49 • 7

Open Image Generation Models

A collection of models that are open source equivalents of flux-schnell and flux-dev.

ostris/OpenFLUX.1

Text-to-Image • Updated Oct 3, 2024 • 598 • 678
OnomaAIResearch/Illustrious-xl-early-release-v0

Text-to-Image • Updated Feb 13, 2025 • 70.9k • 411
black-forest-labs/FLUX.1-schnell

Text-to-Image • Updated Aug 16, 2024 • 585k • 4.52k
TencentARC/PhotoMaker-V2

Text-to-Image • Updated Jul 22, 2024 • 12k • 152

Notus 7B v1

Notus 7B v1 models (DPO fine-tune of Zephyr SFT) and datasets used. More information at https://github.com/argilla-io/notus

argilla/notus-7b-v1

Text Generation • 7B • Updated Dec 5, 2023 • 192 • 123
argilla/ultrafeedback-binarized-preferences

Viewer • Updated Nov 30, 2023 • 63.6k • 232 • 81
TheBloke/notus-7B-v1-GGUF

Text Generation • 7B • Updated Dec 4, 2023 • 604 • 23
TheBloke/notus-7B-v1-AWQ

Text Generation • 7B • Updated Dec 4, 2023 • 30 • 3

DIBT Prompt collective SPIN

This collection contains resources related to the replication of SPIN with the DIBT Prompt Collective dataset.

argilla/zephyr-7b-spin-iter0-v0

Text Generation • Updated Mar 13, 2024 • 13 • 1
argilla/zephyr-7b-spin-iter1-v0

Text Generation • Updated Mar 13, 2024 • 7 • 1
argilla/zephyr-7b-spin-iter2-v0

Text Generation • Updated Mar 13, 2024 • 12 • 1
argilla/zephyr-7b-spin-iter3-v0

Text Generation • Updated Mar 13, 2024 • 11 • 8

Preference Datasets for KTO

This collection contains a list of curated preference datasets for KTO fine-tuning for intent alignment of LLMs through signals.

argilla/ultrafeedback-binarized-preferences-cleaned-kto

Viewer • Updated Mar 19, 2024 • 231k • 1.88k • 9
argilla/distilabel-intel-orca-kto

Viewer • Updated Mar 19, 2024 • 23.1k • 23 • 9
argilla/distilabel-capybara-kto-15k-binarized

Viewer • Updated Mar 19, 2024 • 15.1k • 26 • 5
argilla/kto-mix-15k

Viewer • Updated Apr 19, 2024 • 15.3k • 73 • 14
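
KTO-style training consumes single completions marked as desirable or undesirable rather than chosen/rejected pairs. The sketch below shows one way a paired preference record could be unrolled into that shape. The field names (`prompt`, `chosen`, `rejected`, `completion`, `label`) and the sample record are assumptions made for illustration; the datasets in this collection may use different columns.

```python
from typing import List, TypedDict


class PreferencePair(TypedDict):
    # Hypothetical paired-preference record (field names are assumptions).
    prompt: str
    chosen: str
    rejected: str


class KTOExample(TypedDict):
    # Hypothetical unpaired record: one completion plus a binary signal.
    prompt: str
    completion: str
    label: bool


def pair_to_kto(pair: PreferencePair) -> List[KTOExample]:
    """Unroll one chosen/rejected pair into two labelled completions."""
    return [
        {"prompt": pair["prompt"], "completion": pair["chosen"], "label": True},
        {"prompt": pair["prompt"], "completion": pair["rejected"], "label": False},
    ]


# Made-up record, for illustration only.
pair: PreferencePair = {
    "prompt": "Name a primary color.",
    "chosen": "Red is a primary color.",
    "rejected": "Green beans.",
}
rows = pair_to_kto(pair)
```

Each pair yields two rows that share the prompt, so a dataset built this way has twice as many rows as the paired source.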

Datasets built with ⚗️ distilabel

This collection contains some datasets generated and/or labelled using https://github.com/argilla-io/distilabel

Distilabel Synthetic Data Pipeline Finder ⚗

Runtime error • 15 • Find and view synthetic data pipelines on Hugging Face
argilla/distilabel-capybara-dpo-7k-binarized

Viewer • Updated Jul 16, 2024 • 7.56k • 1.26k • 182
argilla/distilabel-intel-orca-dpo-pairs

Viewer • Updated Aug 7, 2025 • 12.9k • 3.09k • 181
alvarobartt/HelpSteer-AIF

Viewer • Updated Feb 6, 2024 • 1k • 69 • 6

Argilla v2.0 compatible datasets

Ready for rg.Dataset.from_hub(). Each dataset includes a creation script at my_dataset_name/tree/main/creation_script.py that shows the full configuration and creation pipeline.
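
As a minimal sketch of how one of these datasets might be pulled into a running Argilla instance: the snippet assumes `argilla>=2.0` is installed, that `rg.Dataset.from_hub` accepts `repo_id` and `client` keyword arguments, and that Hub dataset pages live under `https://huggingface.co/datasets/`. The helper names, server URL, and API key are hypothetical placeholders.

```python
HUB_DATASETS_BASE = "https://huggingface.co/datasets"  # assumed base URL


def creation_script_url(repo_id: str) -> str:
    """Build the link to a dataset's creation_script.py, following the
    <dataset>/tree/main/creation_script.py pattern described above."""
    return f"{HUB_DATASETS_BASE}/{repo_id}/tree/main/creation_script.py"


def load_into_argilla(repo_id: str, api_url: str, api_key: str):
    """Import a Hub dataset into an Argilla server (hypothetical helper).

    The import is done lazily so the URL helper works without argilla installed.
    """
    import argilla as rg  # assumes argilla>=2.0

    client = rg.Argilla(api_url=api_url, api_key=api_key)
    # Assumption: from_hub takes repo_id and client keyword arguments.
    return rg.Dataset.from_hub(repo_id=repo_id, client=client)


script_url = creation_script_url("argilla/llm-chat-preference")

# Example call, commented out because it needs a live Argilla server:
# dataset = load_into_argilla(
#     "argilla/llm-chat-preference",
#     api_url="http://localhost:6900",  # placeholder
#     api_key="my-api-key",             # placeholder
# )
```

Reading the creation script first is a cheap way to check the fields and questions a dataset defines before importing it.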

argilla/multi-modal-vlm-visit-bench

Viewer • Updated Aug 7, 2024 • 575 • 41 • 4
argilla/llm-chat-preference

Viewer • Updated Jul 30, 2024 • 7.56k • 80 • 2
argilla/rag-embeddings-relevance-similarity

Viewer • Updated Jul 31, 2024 • 6.25k • 28 • 1
argilla/textcat-tokencat-pii-per-domain

Viewer • Updated Jul 30, 2024 • 2.1k • 119

Notux 8x7B v1

Notux 8x7B v1 model (DPO fine-tune of Mixtral 8x7B Instruct v0.1) and datasets used. More information at https://github.com/argilla-io/notus

argilla/notux-8x7b-v1

Text Generation • 47B • Updated Mar 4, 2024 • 25 • 164
argilla/ultrafeedback-binarized-preferences-cleaned

Viewer • Updated Dec 11, 2023 • 60.9k • 2.64k • 157
TheBloke/notux-8x7b-v1-GGUF

Text Generation • 47B • Updated Dec 29, 2023 • 270 • 5
TheBloke/notux-8x7b-v1-AWQ

Text Generation • 47B • Updated Dec 29, 2023 • 8 • 3

Preference Datasets for DPO

This collection contains a list of curated preference datasets for DPO fine-tuning for intent alignment of LLMs.

argilla/ultrafeedback-binarized-preferences

Viewer • Updated Nov 30, 2023 • 63.6k • 232 • 81
argilla/ultrafeedback-binarized-preferences-cleaned

Viewer • Updated Dec 11, 2023 • 60.9k • 2.64k • 157
argilla/ultrafeedback-multi-binarized-preferences-cleaned

Viewer • Updated Dec 11, 2023 • 158k • 73 • 7
argilla/ultrafeedback-multi-binarized-quality-preferences-cleaned

Viewer • Updated Dec 11, 2023 • 155k • 18 • 5

Domain Specific Data

This is a collection of tools for building domain-specific datasets using human domain expertise and synthetic data generation.

argilla/farming

Viewer • Updated Apr 25, 2024 • 1.7k • 374 • 8


Team members 5
