Wrongly classified models

#157
by mrseeker87 - opened

Hello,

As the guy who made the following models, I want to let you know that these are NOT instruction-based finetunes (nor are they LoRAs). They are tuned on various datasets containing novels and will perform poorly on benchmarks based on executing instructions:

  • KoboldAI's "Erebus" models
  • KoboldAI's "Nerys" models (this is an exception that it also contains a CYOA-dataset)
  • KoboldAI's "Janeway" models
  • KoboldAI's "Picard" models

EDIT:
The best example of the datasets used would be "Gutenberg".

Another one would be the "Nerybus" models, which are a blend of Erebus & Nerys models.

deleted

Thanks for raising this, @mrseeker87. It does seem misleading.

@clefourrier I think it's better if the leaderboard separates instruction-tuned from (vanilla) fine-tuned :)

On a different note, I also think that merges are a different thing from these two... but maybe that's too many categories lol. Re-labeling around 400 models just to accommodate these changes might be too tedious for a small group of hardworking maintainers. The easiest fix would be to revert the label to "fine-tuned" so that no re-labeling is needed.

Open LLM Leaderboard org

Hi!
Thank you for your comments @mrseeker87 !

The category was initially just "fine-tuned" but it became "instruction-tuned" through successive edits; I should have paid closer attention. I've fixed it back to "fine-tuned"! Thanks for your vigilance!

clefourrier changed discussion status to closed
Open LLM Leaderboard org

@jaspercatapang If you want to give a hand in relabeling the models to add the difference between instruction-tuned and vanilla fine-tuned, I can change the front end to fit!

deleted

@clefourrier sure, happy to help 🙂

Open LLM Leaderboard org

Amazing! I'm going to create a separate issue for this.
