malhajar/Llama-2-7b-chat-tr
	---
	language:
	- tr
	tags:
	- llama-2
	- turkish
	- dolly
	---
	# Model Card for malhajar/Llama-2-7b-chat-dolly-tr

	<!-- Provide a quick summary of what the model is/does. -->
	malhajar/Llama-2-7b-chat-dolly-tr is a fine-tuned version of Llama-2-7b-hf, trained with supervised fine-tuning (SFT).
	The model can answer questions in Turkish because it was fine-tuned on a Turkish dataset, [`databricks-dolly-15k-tr`](https://huggingface.co/datasets/atasoglu/databricks-dolly-15k-tr).


	### Model Description

	- Developed by: [`Mohamad Alhajar`](https://www.linkedin.com/in/muhammet-alhajar/)
	- Language(s) (NLP): Turkish
	- Finetuned from model: [`meta-llama/Llama-2-7b-hf`](https://huggingface.co/meta-llama/Llama-2-7b-hf)

	### Prompt Template

	```
	<s>[INST] <prompt> [/INST]
	```
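As a minimal sketch of how the template above could be filled in programmatically, the helper below wraps a user message in the `[INST]` markers. The function name `build_inst_prompt` and the `add_bos` flag are hypothetical and not part of the model's published code; the flag exists because Llama tokenizers usually prepend the `<s>` token themselves, in which case writing it into the string may duplicate it.

```python
def build_inst_prompt(user_message: str, add_bos: bool = True) -> str:
    """Wrap a user message in the [INST] ... [/INST] template (hypothetical helper)."""
    body = f"[INST] {user_message.strip()} [/INST]"
    # Prepend the literal BOS marker only when requested; many tokenizers add it on their own.
    return f"<s>{body}" if add_bos else body


print(build_inst_prompt("Türkiye'nin en büyük şehri hangisidir?"))
```

Passing `add_bos=False` returns the same string without the leading `<s>`, which may be preferable when the tokenizer is left to insert the token.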

	## How to Get Started with the Model

	Use the code sample below to interact with the model.
	```python
	import torch
	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_id = "malhajar/Llama-2-7b-chat-dolly-tr"
	# Load the weights in half precision and place them on the available devices
	model = AutoModelForCausalLM.from_pretrained(
	    model_id,
	    device_map="auto",
	    torch_dtype=torch.float16,
	    revision="main",
	)

	tokenizer = AutoTokenizer.from_pretrained(model_id)

	question = "what is the will to truth?"
	# Build the prompt for generating a response
	prompt = f'''
	### Instruction:
	{question}

	### Response:'''
	# Tokenize the prompt and move the input IDs to the model's device
	input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
	output = model.generate(
	    inputs=input_ids,
	    max_new_tokens=512,
	    pad_token_id=tokenizer.eos_token_id,
	    do_sample=True,
	    top_k=50,
	    top_p=0.95,
	    repetition_penalty=1.3,
	)
	response = tokenizer.decode(output[0])

	print(response)
	```
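The decoded string from the sample above contains the prompt as well as the generated answer. The sketch below shows one way to keep only the answer, assuming the `### Response:` marker from the prompt appears in the decoded text. The helper name `extract_response` is hypothetical, and the special tokens it strips (`<s>`, `</s>`) are assumed to be the Llama-2 defaults.

```python
def extract_response(decoded: str, marker: str = "### Response:") -> str:
    """Return the text after the response marker, without special tokens (hypothetical helper)."""
    # partition() yields an empty separator when the marker is absent.
    _, found, tail = decoded.partition(marker)
    text = tail if found else decoded
    # Remove the BOS/EOS markers assumed to be used by the tokenizer.
    for token in ("<s>", "</s>"):
        text = text.replace(token, "")
    return text.strip()


sample = "<s> \n### Instruction:\nEn büyük şehir?\n\n### Response: İstanbul</s>"
print(extract_response(sample))
```

If the marker is missing, the function falls back to the whole decoded string with the special tokens removed.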

	## Example Generation

	```
	<s>[INST] Türkiyenin en büyük şehir nedir? [/INST]
	İstanbul, dünyanın en kalabalık ikinci ve Turuncu kütle'de yer almaktadır. Pek çok insandaki birçok ünlüsün bulundusuyla biliniyor.
	```