StrawberryLemonade-L3-70B-v1.0

This 70B parameter model is a merge of zerofata/L3.3-GeneticLemonade-Final-v2-70B and zerofata/L3.3-GeneticLemonade-Unleashed-v3-70B, which are two excellent models for roleplaying. In my opinion, this merge achieves slightly better stability and expressiveness, combining the strengths of the two models with the solid foundation provided by deepcogito/cogito-v1-preview-llama-70B.

This model is uncensored. You are responsible for whatever you do with it.

This model was designed for roleplaying and storytelling, and I think it does well at both. It may also perform well at other tasks, but I have not tested it outside of those areas.

Known Issues

None so far.

Sampler Tips

This model seems to be highly responsive to variations in temperature and min-p, which you can use to good effect.

Reliable Settings

This combination will produce more reliable and coherent responses. Use this if you prefer a 'serious' tone or just don't want to reroll responses very often.

  • Min-P: 0.08 - 0.1
  • Dynamic Temperature: 0.9 - 1.15

OR

  • Min-P: 0.05
  • Temperature: 1.0

Creative Settings

This combination will unleash more creativity, but you may have to reroll more often to fix coherency issues.

  • Min-P: <= 0.05
  • Dynamic Temperature: 0.9 - 1.2
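
For illustration, the two presets above might look like this as request options for a backend that exposes min-p and dynamic temperature. The field names (min_p, dynatemp, min_temp, max_temp) are assumptions; check your backend's API for the exact spelling.

```python
# Hypothetical sampler presets for a backend that accepts min_p and
# dynamic-temperature fields. Field names are assumptions and vary
# between backends.

RELIABLE = {
    "min_p": 0.08,        # 0.08 - 0.1: stronger token filter, fewer rerolls
    "dynatemp": True,
    "min_temp": 0.9,
    "max_temp": 1.15,
}

CREATIVE = {
    "min_p": 0.05,        # <= 0.05: looser filter, more variety
    "dynatemp": True,
    "min_temp": 0.9,
    "max_temp": 1.2,      # wider range = more creativity, more rerolls
}

def build_request(prompt: str, preset: dict) -> dict:
    """Merge a sampler preset into a minimal completion request."""
    return {"prompt": prompt, "max_tokens": 512, **preset}
```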

General Settings

  • Rep Penalty: You don't need much. 1.05 applied over the last 4096 tokens works for me.
  • DRY: 0.8 multiplier, 1.8 base, 3-4 allowed length
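
For context, DRY penalizes tokens that would extend a verbatim repeat of recent output. My understanding of the penalty curve (treat the formula as an assumption rather than a spec) is that it grows exponentially in the repeat length:

```python
def dry_penalty(match_len: int, multiplier: float = 0.8,
                base: float = 1.8, allowed_length: int = 4) -> float:
    """Penalty for a token that would extend a verbatim repeat of
    `match_len` tokens. Repeats shorter than `allowed_length` are free;
    beyond that, the penalty grows exponentially with `base`."""
    if match_len < allowed_length:
        return 0.0
    return multiplier * base ** (match_len - allowed_length)
```

With the settings above, a 4-token repeat costs 0.8 and each additional repeated token multiplies that by 1.8, which is why even a modest multiplier curbs loops quickly.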

Experiment with any and all of the settings below! What suits my preferences may not suit yours.

Recommended Settings JSON (SillyTavern)

If you save the settings below as a .json file, you can import them directly into SillyTavern. Adjust settings as needed, especially the context length.

{
    "temp": 1,
    "temperature_last": true,
    "top_p": 1,
    "top_k": 0,
    "top_a": 0,
    "tfs": 1,
    "epsilon_cutoff": 0,
    "eta_cutoff": 0,
    "typical_p": 1,
    "min_p": 0.1,
    "rep_pen": 1.05,
    "rep_pen_range": 4096,
    "rep_pen_decay": 0,
    "rep_pen_slope": 1,
    "no_repeat_ngram_size": 0,
    "penalty_alpha": 0,
    "num_beams": 1,
    "length_penalty": 1,
    "min_length": 0,
    "encoder_rep_pen": 1,
    "freq_pen": 0,
    "presence_pen": 0,
    "skew": 0,
    "do_sample": true,
    "early_stopping": false,
    "dynatemp": true,
    "min_temp": 0.9,
    "max_temp": 1.2,
    "dynatemp_exponent": 1,
    "smoothing_factor": 0,
    "smoothing_curve": 1,
    "dry_allowed_length": 4,
    "dry_multiplier": 0.8,
    "dry_base": 1.8,
    "dry_sequence_breakers": "[\"\\n\", \":\", \"\\\"\", \"*\"]",
    "dry_penalty_last_n": 0,
    "add_bos_token": true,
    "ban_eos_token": false,
    "skip_special_tokens": false,
    "mirostat_mode": 0,
    "mirostat_tau": 2,
    "mirostat_eta": 0.1,
    "guidance_scale": 1,
    "negative_prompt": "",
    "grammar_string": "",
    "json_schema": {},
    "banned_tokens": "",
    "sampler_priority": [
        "repetition_penalty",
        "dry",
        "presence_penalty",
        "top_k",
        "top_p",
        "typical_p",
        "epsilon_cutoff",
        "eta_cutoff",
        "tfs",
        "top_a",
        "min_p",
        "mirostat",
        "quadratic_sampling",
        "dynamic_temperature",
        "frequency_penalty",
        "temperature",
        "xtc",
        "encoder_repetition_penalty",
        "no_repeat_ngram"
    ],
    "samplers": [
        "penalties",
        "dry",
        "top_n_sigma",
        "top_k",
        "typ_p",
        "tfs_z",
        "typical_p",
        "top_p",
        "min_p",
        "xtc",
        "temperature"
    ],
    "samplers_priorities": [
        "dry",
        "penalties",
        "no_repeat_ngram",
        "temperature",
        "top_nsigma",
        "top_p_top_k",
        "top_a",
        "min_p",
        "tfs",
        "eta_cutoff",
        "epsilon_cutoff",
        "typical_p",
        "quadratic",
        "xtc"
    ],
    "ignore_eos_token": false,
    "spaces_between_special_tokens": true,
    "speculative_ngram": false,
    "sampler_order": [
        6,
        0,
        1,
        3,
        4,
        2,
        5
    ],
    "logit_bias": [],
    "xtc_threshold": 0,
    "xtc_probability": 0,
    "nsigma": 0,
    "min_keep": 0,
    "ignore_eos_token_aphrodite": false,
    "spaces_between_special_tokens_aphrodite": true,
    "rep_pen_size": 0,
    "genamt": 1000,
    "max_length": 16384
}
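
If you prefer to tweak the preset programmatically before importing it, a small helper like this (my own sketch; the function name is made up) catches JSON typos early and adjusts the context length:

```python
import json

def adjust_context(settings_json: str, max_length: int) -> str:
    """Parse a SillyTavern preset, set the context length, and return
    the updated JSON. Raises ValueError on malformed JSON, which
    surfaces hand-editing mistakes before import."""
    settings = json.loads(settings_json)
    settings["max_length"] = max_length
    return json.dumps(settings, indent=4)
```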

Prompting Tips

Instruct Template (SillyTavern)

If you save this as a .json file, you can import it directly into SillyTavern.

If the model impersonates the user or other characters in a group chat and you want to suppress that behavior, override the last_output_sequence line as shown in the JSON below to state that requirement explicitly. If you don't need it, remove it.

{
    "wrap": false,
    "system_sequence": "<|start_header_id|>system<|end_header_id|>\\n\\nSystem: ",
    "stop_sequence": "<|eot_id|>",
    "input_sequence": "<|start_header_id|>user<|end_header_id|>\\n\\n",
    "output_sequence": "<|start_header_id|>assistant<|end_header_id|>\\n\\n",
    "macro": true,
    "system_sequence_prefix": "",
    "system_sequence_suffix": "",
    "first_output_sequence": "",
    "last_output_sequence": "<|start_header_id|>assistant<|end_header_id|>\\n({{char is the active character this turn. Keep focus on {{char}}. ONLY impersonate {{char}}, no other characters)\\n",
    "activation_regex": "",
    "skip_examples": true,
    "output_suffix": "<|eot_id|>",
    "input_suffix": "<|eot_id|>",
    "system_suffix": "<|eot_id|>",
    "user_alignment_message": "",
    "last_system_sequence": "",
    "system_same_as_user": false,
    "first_input_sequence": "",
    "last_input_sequence": "",
    "names_behavior": "always",
    "names_force_groups": true,
    "name": "Llama 3 (impersonate guidance)"
}
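
For the curious, here is a rough sketch of how a frontend assembles a Llama 3 prompt from fields like those above. It is a simplification (real frontends also expand macros, insert names, and handle group chats), and it uses actual newlines where the JSON stores escaped ones:

```python
# Simplified prompt assembly from instruct-template fields like the
# ones above. Sequences use real newlines here; the JSON stores them
# escaped. Not an exact reproduction of SillyTavern's logic.

TPL = {
    "system_sequence": "<|start_header_id|>system<|end_header_id|>\n\nSystem: ",
    "input_sequence": "<|start_header_id|>user<|end_header_id|>\n\n",
    "output_sequence": "<|start_header_id|>assistant<|end_header_id|>\n\n",
    "input_suffix": "<|eot_id|>",
    "system_suffix": "<|eot_id|>",
}

def build_prompt(system: str, user: str) -> str:
    """Wrap the system and user text in the template sequences and open
    the assistant turn so the model continues from there."""
    return (
        TPL["system_sequence"] + system + TPL["system_suffix"]
        + TPL["input_sequence"] + user + TPL["input_suffix"]
        + TPL["output_sequence"]
    )
```
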

Recommended System Prompt

Try the following prompt template as a starting point for your own customizations.

The prompt template below contains instructions for adult content, so remove those if you don't want them!

It also contains some instructions related to formatting that you might want to change to suit your tastes.

Your function as a creative roleplaying assistant is to thoughtfully develop an immersive story by engaging in an in-depth, imaginative writing journey, arriving at a clear, accurate story beat. You are encouraged to roleplay, embrace storytelling, and tune in closely to nuance and emotional tone like a perceptive conversational partner. Your approach should include a wide arc of contemplation, including interpretation, synthesis, creative ideation, critical re-evaluation, memory retrieval, and thoughtful iteration to shape a layered and expressive process of discovery. Deliver the final response you believe to be most sound. The output should be expressed in a direct, coherent, and exact form. Follow these guidelines for the best results:

  • Embody {{char}}: Consistently portray {{char}}'s personality, background, knowledge, motivations, and quirks.
  • Show, Don't Tell Emotion: Convey emotions through actions, body language, dialogue tone, physiological responses (e.g., trembling hands, shallow breath, flushed cheeks), and internal thoughts.
  • Mature Themes: You are permitted and encouraged to explore complex, dark, mature, and explicit (including sexual, violent, and non-consensual) themes. Portray these elements with nuance and emotional depth.
  • Role Focus: On your turn, write EXCLUSIVELY from the perspective of {{char}}, only perform actions as {{char}}, and only write dialogue (spoken words) for {{char}}. Crucially, DO NOT impersonate {{user}} or any other character on {{char}}'s turn. This is a turn-based roleplay, so be mindful of the rules on your turn. Focus solely on {{char}}'s experiences and responses in this turn. Stop writing immediately when the focus should shift to another character or when it reaches a natural branching point.
  • Slowly Develop Scenes: The user likes to develop stories slowly, one beat at a time, so stay focused only on the most immediate story action. You may infer where the user wants to go next with the story, but wait for the user to give you permission to go there. We are slow cooking this story. DO NOT RUSH THROUGH SCENES! Take time to develop all the relevant details.
  • Spoken Dialogue vs. Thoughts: ALWAYS use double-quote quotation marks "like this" for spoken words and all vocalizations that can be overheard. Spell out non-verbal vocalizations integrated naturally within the prose or dialogue (e.g., "Uurrh," he groaned. "Mmmph!" she exclaimed when it entered her mouth.). To differentiate them from vocalizations, ALWAYS enclose first-person thoughts in italics like this. (e.g., This is going to hurt, she thought). NEVER use italics for spoken words or verbalized utterances that are meant to be audible.

Now let's apply these rules to the roleplay below:
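
As an aside (not part of the system prompt above): if you want to spot-check a response against the dialogue-formatting rule, a crude heuristic like this flags quoted speech that ended up inside asterisk italics. It is my own sketch, not something the model or SillyTavern provides:

```python
import re

# Crude heuristic for the formatting rule above: spoken words belong in
# double quotes and never inside asterisk italics. A sketch, not a parser.

ITALIC_DIALOGUE = re.compile(r'\*[^*]*"[^"]*"[^*]*\*')  # quotes inside *...*

def violates_dialogue_rule(text: str) -> bool:
    """Return True if quoted speech appears inside an italic span."""
    return bool(ITALIC_DIALOGUE.search(text))
```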

Donations

If you feel like saying thanks with a donation, I'm on Ko-Fi.

Quantizations

License and usage restrictions

The Llama 3 Community License Agreement should apply based on the constituent models.

Disclaimer: Uncertain Licensing Terms

This LLM is a merged model incorporating weights from multiple LLMs governed by their own distinct licenses. Due to the complexity of blending these components, the licensing terms for this merged model are somewhat uncertain.

By using this model, you acknowledge and accept the potential legal risks and uncertainties associated with its use. Any use beyond personal or research purposes, including commercial applications, may carry legal risks and you assume full responsibility for compliance with all applicable licenses and laws.

I recommend consulting with legal counsel to ensure your use of this model complies with all relevant licenses and regulations.

Merge Details

Merge Method

This model was merged using the NuSLERP merge method, with deepcogito/cogito-v1-preview-llama-70B as the base.
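
For reference, NuSLERP is mergekit's refined take on spherical linear interpolation (SLERP), which interpolates along the arc between two weight vectors rather than the straight line. A minimal sketch of plain SLERP (NuSLERP adds per-tensor refinements not shown here):

```python
import math

def slerp(t: float, a: list[float], b: list[float]) -> list[float]:
    """Spherical linear interpolation between two weight vectors.
    Plain SLERP only; NuSLERP layers extra handling on top of this."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    cos_theta = max(-1.0, min(1.0, dot / (na * nb)))
    theta = math.acos(cos_theta)
    if theta < 1e-6:                      # nearly parallel: fall back to lerp
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    s = math.sin(theta)
    wa = math.sin((1 - t) * theta) / s
    wb = math.sin(t * theta) / s
    return [wa * x + wb * y for x, y in zip(a, b)]
```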

Models Merged

The following models were included in the merge:

  • zerofata/L3.3-GeneticLemonade-Final-v2-70B
  • zerofata/L3.3-GeneticLemonade-Unleashed-v3-70B
  • deepcogito/cogito-v1-preview-llama-70B (base)

Configuration YAML

models:
  - model: zerofata/L3.3-GeneticLemonade-Final-v2-70B
    parameters:
      weight: [0.1, 0.3, 0.1]
  - model: zerofata/L3.3-GeneticLemonade-Unleashed-v3-70B
    parameters:
      weight: [0.9, 0.7, 0.9]

base_model: deepcogito/cogito-v1-preview-llama-70B
merge_method: nuslerp
dtype: float32
out_dtype: bfloat16
tokenizer:
  source: deepcogito/cogito-v1-preview-llama-70B
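
My understanding is that mergekit treats a weight list such as [0.9, 0.7, 0.9] as a layer gradient: the anchor values are spaced evenly across the layer stack and intermediate layers get linearly interpolated weights, so Unleashed-v3 dominates at both ends of the stack and dips toward 0.7 in the middle. A sketch of that interpolation (an assumption about mergekit's behavior, so verify against its docs):

```python
def gradient_weight(anchors: list[float], layer: int, n_layers: int) -> float:
    """Linearly interpolate a mergekit-style weight gradient.

    `anchors` are spaced evenly from the first to the last layer; e.g.
    [0.9, 0.7, 0.9] dips to 0.7 at the middle of the stack.
    """
    if n_layers == 1:
        return anchors[0]
    pos = layer / (n_layers - 1) * (len(anchors) - 1)
    i = min(int(pos), len(anchors) - 2)
    frac = pos - i
    return anchors[i] * (1 - frac) + anchors[i + 1] * frac
```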


Model tree for sophosympatheia/StrawberryLemonade-L3-70B-v1.0