Tifa-Deepsex-14b-CoT-Crazy
A Hugging Face-format conversion of the GGUF from here: https://huggingface.co/ValueFX9507/Tifa-Deepsex-14b-CoT
Intended for merging, requantizing, finetuning, and similar downstream use. A partial translation of the model card:
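Since the repo is in Hugging Face format, it can be loaded directly with the `transformers` library. The sketch below is a minimal, hedged example (it assumes `transformers` and `torch` are installed and that you have enough memory for a 14B model; the model id is this repo's, and the prompt is purely illustrative). The import happens inside the function so the snippet can be read or reused without the library present.

```python
# Repo id for this conversion (from the model tree on this page).
MODEL_ID = "Downtown-Case/Tifa-Deepsex-14b-CoT-Crazy-HF"


def generate_sample(prompt: str = "Once upon a time,", max_new_tokens: int = 64) -> str:
    """Load the model and generate a short continuation of `prompt`.

    A sketch only: requires transformers + torch and substantial RAM/VRAM.
    """
    # Imported lazily so this module stays importable without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="auto",   # pick the checkpoint's native dtype
        device_map="auto",    # spread across available devices
    )

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Call `generate_sample()` to run it; for merging or further finetuning, the same `MODEL_ID` works with mergekit-style configs or `Trainer`-based pipelines that accept standard Hugging Face checkpoints.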
- A number of RL strategies were used, mainly on 671B-R1-distilled data; output divergence is high, inheriting both the strengths and the harms of R1. Strong literary performance.
- Incremental training on 0.4T tokens of novel content
- 40K SFT samples generated by TifaMax, 60K SFT samples generated by DeepseekR1, 2K high-quality human-written samples
- 30K DPO reinforcement-learning samples generated by TifaMax to prevent repetition, strengthen context association, and improve political safety
- 10K PPO samples generated by TifaMax, 10K PPO samples generated by DeepseekR1
- 16K ultra-long-context training
- Random-truncation training to improve robustness
- Full-parameter fine-tuning on 8×H20 GPUs
Personal observations:
Don't let the Deepsex name fool you.
This model seems very strong at SFW, English, long-form (>32K-context) storywriting, especially for a 14B, with good comprehension of the overall plot, its details, and the current state of the story. This is notable, since it was "only" trained at 16K context and (seemingly) mostly on Chinese data.
Base model: ValueFX9507/Tifa-Deepsex-14b-CoT