---
license: cc-by-nc-4.0
library_name: transformers
tags:
- llama3
---
![image/png](https://cdn-uploads.huggingface.co/production/uploads/65b19c1b098c85365af5a83e/FBQCFBZbgm0FpC4tBegIh.png)

[GGUF](https://huggingface.co/mradermacher/badger-nu-llama-3.1-8B-UltraLong-GGUF) | [iMat](https://huggingface.co/mradermacher/badger-nu-llama-3.1-8B-UltraLong-i1-GGUF)

# Badger ν Llama 3.1 8B UltraLong Instruct

Badger is a *recursive normalized denoised Fourier interpolation* of the following models:

```python
# Badger Nu
# Each entry pairs a fine-tuned model with the base it was trained from.
models = [
    ('Llama-3.1-Nemotron-8B-UltraLong-1M-Instruct', 'Llama-3.1-8B-Instruct'),
    ('Skywork-o1-Open-Llama-3.1-8B', 'Llama-3.1-8B-Instruct'),
    ('Dolphin3.0-Llama3.1-8B', 'Llama-3.1-8B'),
    ('Llama-3.1-Nemotron-Nano-8B-v1', 'Llama-3.1-8B-Instruct'),
    ('cogito-v1-preview-llama-8B', 'Llama-3.1-8B'),
    ('Llama-3.1-Tulu-3.1-8B', 'Llama-3.1-8B'),
    ('DeepHermes-3-Llama-3-8B-Preview', 'Llama-3.1-8B'),
    ('Fireball-R1.1-Llama-3.1-8B', 'Llama-3.1-8B'),
    ('OpenMath2-Llama3.1-8B', 'Llama-3.1-8B-Instruct'),
    ('Foundation-Sec-8B', 'Llama-3.1-8B'),
    ('Bio-Medical-Llama-3-8B', 'Meta-Llama-3-8B-Instruct'),
    ('Llama-3.1-Hawkish-8B', 'Llama-3.1-8B-Instruct'),
    ('Einstein-v6.1-Llama3-8B', 'Meta-Llama-3-8B'),
    ('Llama-3-Instruct-8B-SimPO-v0.2', 'Meta-Llama-3-8B-Instruct'),
    ('Llama-3.1_OpenScholar-8B', 'Llama-3.1-8B-Instruct'),
    ('L3-8B-Stheno-v3.2', 'Meta-Llama-3-8B-Instruct'),
    ('L3.1-EtherealRainbow-v1.0-rc1-8B', 'Llama-3.1-8B-Instruct'),
    ('Llama3.1-8B-ShiningValiant2', 'Llama-3.1-8B-Instruct'),
    ('Pantheon-RP-1.0-8b-Llama-3', 'Meta-Llama-3-8B'),
    ('SillyTilly-SlopJob-8b-RP-ForFree', 'Meta-Llama-3-8B'),
    ('opus-v1.2-llama-3-8b-base-run3.4-epoch2', 'Meta-Llama-3-8B'),
    ('llama-3-fantasy-writer-8b', 'Meta-Llama-3-8B-Instruct'),
    ('Llama-3.1-SuperNova-Lite', 'Llama-3.1-8B-Instruct'),
]

task_add = [
    ('meta-llama-3-8b-instruct-hf-ortho-baukit-2fail-128total', 'Meta-Llama-3-8B-Instruct')
]

all_models = models + task_add

model_path = "./models/l38/"
in_model = "Llama-3.1-Nemotron-8B-UltraLong-1M-Instruct"
out_model = 'Llama-3.1-SuperNova-Lite'
root_model = 'Llama-3.1-Nemotron-8B-UltraLong-1M-Instruct'
```

* With thanks to [NVIDIA](https://huggingface.co/nvidia), [Arcee](https://huggingface.co/arcee-ai), [Nous](https://huggingface.co/NousResearch), the geniuses in [SillyTilly](https://huggingface.co/SillyTilly), [Cognitive Computations](https://huggingface.co/cognitivecomputations), and all of the other AI labs and independent model creators for their hard work!

Llama 3 may be the last open model trained in the US on the highly valuable [LibGen](https://libgen.is/) dataset. While the use of this dataset has been highly controversial, there is no arguing that it represents some of the finest text that mankind has produced. In light of this, and given that the open model community has made a lot of advances since my last release, Badger Mu, I thought it was time to give Llama 3 8B another look.

One of the primary motivators for this decision was [Unsloth publishing turnkey GRPO notebooks](https://docs.unsloth.ai/basics/reasoning-grpo-and-rl), which I found quite easy to run on Paperspace A6000s using the shivamb25/unsloth-dev container. I'm really excited to try this model as the basis for my further experiments.

### Format

Use the Llama 3 Instruct format.
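As a quick reference, here is a minimal sketch of rendering that format through the tokenizer's built-in chat template; the repo id below is a placeholder assumption, so substitute whichever copy of the model you are actually loading.

```python
# Minimal sketch: render a conversation in the Llama 3 Instruct format.
# The repo id is an assumption; point it at your actual checkpoint.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("maldv/badger-nu-llama-3.1-8B-UltraLong")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello, Badger."},
]

# add_generation_prompt=True appends the assistant header tokens so the
# model knows it should begin a reply.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```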
### Models

![image/png](https://cdn-uploads.huggingface.co/production/uploads/65b19c1b098c85365af5a83e/VSKVfdRFdAVe-LaO7gnif.png)

We have a few strong clusters of models: UltraLong, the most distinct of the set, serves as the base; the reasoning models bear a lot of similarity to one another; and the remainder contributes a diversity of unique models. A conceptual sketch of the interpolation used to blend them follows.
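For readers curious what a *normalized denoised Fourier interpolation* of a single pair of weight tensors might look like, here is a minimal sketch. It is an illustration under assumptions (spectral blending for the interpolation, magnitude thresholding for the denoising, norm matching for the normalization), not the exact recipe, which additionally recurses over the full model list above.

```python
# Conceptual sketch only, NOT the published badger recipe: blend two
# weight tensors in the frequency domain, drop the weakest components,
# and rescale to the norm of a plain linear interpolation.
import torch

def fourier_interpolate(a: torch.Tensor, b: torch.Tensor,
                        t: float = 0.5, keep: float = 0.98) -> torch.Tensor:
    fa = torch.fft.rfft(a.flatten().float())
    fb = torch.fft.rfft(b.flatten().float())
    mixed = (1.0 - t) * fa + t * fb

    # "Denoised": zero out the smallest-magnitude frequency components.
    mag = mixed.abs()
    cutoff = torch.quantile(mag, 1.0 - keep)
    mixed = mixed * (mag >= cutoff)

    # Back to the parameter domain.
    out = torch.fft.irfft(mixed, n=a.numel())

    # "Normalized": match the norm of the linear interpolation so the
    # layer keeps a sensible overall scale.
    target = ((1.0 - t) * a.float() + t * b.float()).norm()
    out = out * (target / (out.norm() + 1e-8))
    return out.reshape(a.shape).to(a.dtype)

# Blend one hypothetical layer from two checkpoints.
merged = fourier_interpolate(torch.randn(1024, 1024), torch.randn(1024, 1024))
```

The recursive part, folding each model's contribution into a running merge one pair at a time, is omitted here for brevity.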
## Correspondence

Praxis Maldevide (prax@maldevide.com)

## Citation

```
@article{badger-nu,
  title={Llama 3 Is All You Need: LibGen Is The Best Source Of Human Textual Data},
  author={Praxis Maldevide},
  journal={None},
  year={2025}
}
```