TildeSIA committed · Commit 1dda6bb · verified · 1 Parent(s): 1e768eb

Update README.md

README.md CHANGED
@@ -77,3 +77,29 @@ We train TildeOpen LLM using the [Tilde's branch](https://github.com/tilde-nlp/l

## Tokeniser details
We built the TildeOpen LLM tokeniser to ensure equitable representation across languages. Technically, we trained the tokeniser so that the same text is represented by a similar number of tokens regardless of the language it is written in. In practice, TildeOpen LLM will be more efficient and faster than other models for our focus languages, as writing out answers will require fewer steps. For more details on how TildeOpen LLM compares against other models, see **[TILDE Bench](https://tilde-nlp.github.io/tokenizer-bench.html)**!
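
As a minimal sketch of what this equitable design means in practice, the snippet below (our own illustration; the non-English sentences are rough translations, not taken from the model card) counts the tokens the tokeniser produces for the same sentence in a few languages:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("TildeAI/TildeOpen-30b", use_fast=False)

# The same sentence in three languages (illustrative rough translations)
samples = {
    "English": "The weather is very nice today.",
    "Latvian": "Šodien laiks ir ļoti jauks.",
    "Lithuanian": "Šiandien oras yra labai gražus.",
}

# An equitable tokeniser should yield similar counts across languages
for language, text in samples.items():
    token_count = len(tokenizer(text)["input_ids"])
    print(f"{language}: {token_count} tokens")
```

Exact counts will vary, but for the focus languages they should stay close to the English count.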
+
+
+ ## Running the model with HF transformers
+ When loading the tokeniser, you must set ```use_fast=False```.
+ ```python
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ # Load tokenizer + model
+ tokenizer = AutoTokenizer.from_pretrained("TildeAI/TildeOpen-30b", use_fast=False)
+ model = AutoModelForCausalLM.from_pretrained(
+     "TildeAI/TildeOpen-30b",
+     torch_dtype=torch.bfloat16,
+     device_map="auto",
+ )
+
+ # Example prompt (replace with your own input)
+ user_in = "The capital of Latvia is"
+
+ # Tokenize
+ inputs = tokenizer(user_in, return_tensors="pt").to(model.device)
+
+ # Generate (greedy, deterministic)
+ outputs = model.generate(
+     **inputs,
+     max_new_tokens=512,
+     repetition_penalty=1.2,
+     do_sample=False,
+ )
+
+ # Decode only the newly generated tokens
+ print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
+ ```
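
A note on the generation settings above: `do_sample=False` selects greedy decoding, so the output is deterministic for a given prompt, while `repetition_penalty=1.2` penalises tokens that have already been generated to reduce looping. Increase `max_new_tokens` if outputs are cut off mid-answer.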