Check out the new, much improved version of this model
This model has issues (trained without BOS token), please use the following preview models instead:
But no quants :|
Yea, I'm waiting on quants as well. I can just BARELY not fit the full model in my VRAM haha.
I just spotted GGUF for one of them:
https://huggingface.co/localfultonextractor/opus-v1.2-llama-3-8b-instruct-run3.5-epoch2.5-Q8_0-GGUF
Well, I was watching this drama and wanted to wait until a more "final" version appeared. But I've put both in the queue and a full set of static quants should be available in a few hours.
Spoke too soon:
NotImplementedError: Unknown rope scaling type: dynamic
the models are not supported by llama.cpp at the moment it seems. Not without disabling rope scaling at least.
You can remove that from the config and use llama.cpp's own rope scaling.
Though I am surprised it throws an error like this.
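For anyone hitting the same error, a minimal sketch of that config edit, assuming the standard Hugging Face `config.json` layout (the sample values here are made up; your real config has many more keys):

```python
import json

# Minimal stand-in for a model's config.json with the field that
# llama.cpp's converter rejects (hypothetical example values).
config = {
    "model_type": "llama",
    "rope_scaling": {"type": "dynamic", "factor": 2.0},
}

# Drop the "dynamic" rope_scaling entry so the converter no longer sees it;
# llama.cpp can apply its own rope scaling at load time instead.
config.pop("rope_scaling", None)

print(json.dumps(config, indent=2))
```

In practice you'd load the model's actual `config.json`, pop the key, and write the file back before running the conversion script.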
llama.cpp can throw a lot of interesting errors, even with old models that did convert fine at the time :)
BTW, anybody can request quants from me at https://huggingface.co/mradermacher/model_requests in case I overlooked a model. Can save the model creators a lot of time, too :)
Awesome, thank you @mradermacher !
@DreamGenX , thanks for your work on these Llama3-8B finetunes. Any plans for a 70B finetune?
@dobs
I finished a training run of an L3 70B DreamGen model a few weeks ago. I changed the template to take full advantage of the built-in tokens, so first I need to update the documentation.
Based on user feedback it performs better than any other DreamGen model.