|
---
datasets:
- tatsu-lab/alpaca
- ewof/alpaca-instruct-unfiltered
- databricks/databricks-dolly-15k
- teknium/GPTeacher-General-Instruct
- garage-bAInd/Open-Platypus
- Honkware/oasst1-alpaca-json
- GAIR/lima
- infCapital/viet-llama2-ft-tiny
language:
- vi
---
|
|
|
+ LLaMA2-7B Chat model, with the vocabulary extended to 44,800 tokens for better Vietnamese understanding (see the usage sketch below).
|
+ Continually pre-trained on 2B Vietnamese tokens drawn from the VnNews corpus, 10K vnthuquan books, and wikipedia_vi.
|
+ Fine-tuned on the infCapital/viet-llama2-ft-tiny dataset, a combination of various datasets translated into Vietnamese using OpenAI GPT-3.
|
|
|
+ For more information, email me at [email protected] or visit http://fb.com/hungbui2013