---
license: apache-2.0
datasets:
- liyucheng/zhihu_rlhf_3k
language:
- zh
base_model:
- lucky2me/Dorami-Instruct
---

# Dorami-Chat

Dorami-Chat is a chat model trained with Direct Preference Optimization (DPO) on top of the Supervised Fine-Tuning (SFT) model [lucky2me/Dorami-Instruct](https://huggingface.co/lucky2me/Dorami-Instruct).

## Model description

### Training data

- [liyucheng/zhihu_rlhf_3k](https://huggingface.co/datasets/liyucheng/zhihu_rlhf_3k)

### Training code

- [dorami](https://github.com/6zeus/dorami.git)

## How to use

### 1. Download the model from the Hugging Face Hub to a local directory

```bash
git lfs install
git clone https://huggingface.co/lucky2me/Dorami-Chat
```

### 2. Load and run the downloaded model

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig

model_path = "The path of the model downloaded above"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

prompt = "Fill in any prompt you like."
inputs = tokenizer(prompt, return_tensors="pt")

# Sample up to 64 new tokens, restricting sampling to the 2 most likely
# tokens at each step and stopping at the model's end-of-sequence token.
generation_config = GenerationConfig(
    max_new_tokens=64,
    do_sample=True,
    top_k=2,
    eos_token_id=model.config.eos_token_id,
)
outputs = model.generate(**inputs, generation_config=generation_config)
decoded_text = tokenizer.batch_decode(outputs, skip_special_tokens=True)
print(decoded_text)
```
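Because Dorami-Chat is a DPO-tuned chat model, prompts formatted as a conversation may work better than raw text. Below is a minimal sketch, assuming the tokenizer ships a chat template (check `tokenizer.chat_template` first; if it is `None`, use the plain-prompt example above instead). The example message is illustrative, not from the model card.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig

model_path = "The path of the model downloaded above"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

# Assumption: the tokenizer defines a chat template. If tokenizer.chat_template
# is None, format the prompt manually as in the example above.
messages = [
    {"role": "user", "content": "你好，请介绍一下你自己。"}  # "Hello, please introduce yourself."
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
)

generation_config = GenerationConfig(
    max_new_tokens=64,
    do_sample=True,
    top_k=2,
    eos_token_id=model.config.eos_token_id,
)
outputs = model.generate(input_ids, generation_config=generation_config)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```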