This model was converted to GGUF format from [`EpistemeAI/DeepPhi-3.5-mini-instruct`](https://huggingface.co/EpistemeAI/DeepPhi-3.5-mini-instruct) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
Refer to the [original model card](https://huggingface.co/EpistemeAI/DeepPhi-3.5-mini-instruct) for more details on the model.
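
For context, the conversion that the GGUF-my-repo space automates can also be done locally with llama.cpp. A rough sketch under current llama.cpp tooling; the local paths, output filenames, and the Q4_K_M quant type below are illustrative placeholders, not details taken from this card:

```bash
# Fetch llama.cpp and its Python conversion dependencies.
git clone https://github.com/ggml-org/llama.cpp
pip install -r llama.cpp/requirements.txt

# Convert the Hugging Face checkpoint to a full-precision GGUF file.
python llama.cpp/convert_hf_to_gguf.py ./DeepPhi-3.5-mini-instruct \
    --outfile deepphi-3.5-mini-instruct-f16.gguf

# Quantize it down (Q4_K_M shown as an example).
llama-quantize deepphi-3.5-mini-instruct-f16.gguf \
    deepphi-3.5-mini-instruct-q4_k_m.gguf Q4_K_M
```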
 
---

## Model Summary

A reasoning-focused Phi model that is a top performer for its size (3.8B parameters). It is built on the Phi-3 data mix - synthetic data and filtered publicly available websites - with a focus on very high-quality, reasoning-dense data. The model belongs to the Phi-3 model family and supports a 128K token context length.

## Run locally

### 4-bit quantization

After obtaining the Phi-3.5-mini-instruct model checkpoint, use the sample code below for inference:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline, BitsAndBytesConfig

torch.random.manual_seed(0)

model_path = "EpistemeAI/DeepPhi-3.5-mini-instruct"

# Configure 4-bit quantization using bitsandbytes.
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # "fp4" is the other supported 4-bit type.
    bnb_4bit_compute_dtype=torch.float16,  # Or torch.bfloat16, depending on your hardware.
)

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True,
    quantization_config=quantization_config,
)
tokenizer = AutoTokenizer.from_pretrained(model_path)

messages = [
    {"role": "system", "content": """
You are a helpful AI assistant. Respond in the following format:
<reasoning>
...
</reasoning>
<answer>
...
</answer>"""},
    {"role": "user", "content": "Can you provide ways to eat combinations of bananas and dragonfruits?"},
    {"role": "assistant", "content": "Sure! Here are some ways to eat bananas and dragonfruits together: 1. Banana and dragonfruit smoothie: Blend bananas and dragonfruits together with some milk and honey. 2. Banana and dragonfruit salad: Mix sliced bananas and dragonfruits together with some lemon juice and honey."},
    {"role": "user", "content": "What about solving a 2x + 3 = 7 equation?"},
]

# Flatten the chat into a plain "Role: content" prompt string.
def format_messages(messages):
    prompt = ""
    for msg in messages:
        role = msg["role"].capitalize()
        prompt += f"{role}: {msg['content']}\n"
    return prompt.strip()

prompt = format_messages(messages)

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

generation_args = {
    "max_new_tokens": 500,
    "return_full_text": False,
    "temperature": 0.0,  # Ignored when do_sample=False (greedy decoding).
    "do_sample": False,
}

output = pipe(prompt, **generation_args)
print(output[0]["generated_text"])
```
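
Note that `format_messages` flattens the conversation into plain `Role: content` lines rather than using the model's chat template. If the checkpoint ships a chat template, `tokenizer.apply_chat_template` is usually the more faithful way to build the prompt. A minimal sketch, reusing the `tokenizer`, `messages`, and `pipe` objects defined above:

```python
# Build the prompt from the model's own chat template instead of
# hand-rolled "Role: content" lines, then reuse the same pipeline.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,              # return a string, not token IDs
    add_generation_prompt=True,  # append the assistant-turn marker
)

output = pipe(prompt, max_new_tokens=500, return_full_text=False, do_sample=False)
print(output[0]["generated_text"])
```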

## Uploaded model

- Developed by: EpistemeAI
- License: apache-2.0
- Finetuned from model: unsloth/phi-3.5-mini-instruct-bnb-4bit

This Phi model was trained 2x faster with Unsloth and Hugging Face's TRL library.

---
## Use with llama.cpp
Install llama.cpp through brew (works on Mac and Linux):
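
```bash
brew install llama.cpp
```

Then invoke the CLI against a GGUF file from this repo. A minimal sketch; the `--hf-repo` and `--hf-file` values below are placeholders for this repo's actual name and quant filename, which this excerpt does not specify:

```bash
# Placeholders: substitute the real repo name and GGUF filename.
llama-cli --hf-repo Triangle104/DeepPhi-3.5-mini-instruct-GGUF \
    --hf-file deepphi-3.5-mini-instruct-q4_k_m.gguf \
    -p "Solve 2x + 3 = 7 and show your reasoning."
```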