Update README.md
README.md
CHANGED
@@ -1,3 +1,85 @@
---
license: apache-2.0
datasets:
- kobkrit/rd-taxqa
- iapp_wiki_qa_squad
- Thaweewat/alpaca-cleaned-52k-th
- Thaweewat/instruction-wild-52k-th
- Thaweewat/databricks-dolly-15k-th
- Thaweewat/hc3-24k-th
- Thaweewat/gpteacher-20k-th
- Thaweewat/onet-m6-social
- Thaweewat/alpaca-finance-43k-th
language:
- th
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- openthaigpt
- llama
---

# 🇹🇭 OpenThaiGPT 1.0.0-beta
<img src="https://1173516064-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FvvbWvIIe82Iv1yHaDBC5%2Fuploads%2Fb8eiMDaqiEQL6ahbAY0h%2Fimage.png?alt=media&token=6fce78fd-2cca-4c0a-9648-bd5518e644ce" width="200px">

https://openthaigpt.aieat.or.th/

🇹🇭 OpenThaiGPT Version 1.0.0-beta is a 7B-parameter Thai-language LLaMA v2 Chat model, fine-tuned to follow Thai-translated instructions. Its vocabulary is extended with more than 24,554 of the most popular Thai words, which speeds up Thai text generation.

# ---- LoRA Adapter Format of OpenThaiGPT 1.0.0-beta ----

## Upgrade from OpenThaiGPT 1.0.0-alpha
- Adds more than 24,554 of the most popular Thai words to the LLM's vocabulary and re-pretrains the embedding layers, which makes Thai text generation 10 times faster than in the previous version.
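
To illustrate why a larger Thai vocabulary can speed up generation, here is a toy sketch. It is not the OpenThaiGPT tokenizer: it assumes a greedy longest-match tokenizer that falls back to one token per UTF-8 byte for anything outside its vocabulary, and both vocabularies below are hypothetical. Since an autoregressive model produces one token per decoding step, fewer tokens for the same text means fewer steps.

```python
from typing import Set


def count_tokens(text: str, vocab: Set[str]) -> int:
    """Greedy longest-match token count with a UTF-8 byte fallback (toy model)."""
    max_len = max((len(word) for word in vocab), default=0)
    tokens = 0
    i = 0
    while i < len(text):
        # Try the longest vocabulary entry that matches at position i.
        for length in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + length] in vocab:
                tokens += 1
                i += length
                break
        else:
            # No entry matched: emit one token per UTF-8 byte of this character.
            tokens += len(text[i].encode("utf-8"))
            i += 1
    return tokens


text = "ภาษาไทย"  # "Thai language"
base_vocab: Set[str] = set()       # hypothetical: no Thai entries at all
extended_vocab = {"ภาษา", "ไทย"}   # hypothetical: two whole Thai words added

before = count_tokens(text, base_vocab)
after = count_tokens(text, extended_vocab)
print(before, after)  # prints "21 2"
```

The ratio in this toy case is far from a benchmark; the 10x figure above is the README's own measurement, and real tokenizers already cover some Thai characters without byte fallback.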

## Pretrained Model
- [https://huggingface.co/ChanonUtupon/openthaigpt-merge-lora-llama-2-7B-3470k](https://huggingface.co/ChanonUtupon/openthaigpt-merge-lora-llama-2-7B-3470k)

## Support
- Official website: https://openthaigpt.aieat.or.th
- Facebook page: https://web.facebook.com/groups/openthaigpt
- Discord server for discussion and support: [here](https://discord.gg/rUTp6dfVUF)
- E-mail: [email protected]

## License
**Source Code**: Apache Software License 2.0.<br>
**Weights**: Research and **commercial use**.<br>

## Code and Weights
**Web Demo**: https://demo-beta.openthaigpt.aieat.or.th/<br>
**Colab Demo**: https://colab.research.google.com/drive/1NkmAJHItpqu34Tur9wCFc97A6JzKR8xo?usp=sharing<br>
**Finetune Code**: https://github.com/OpenThaiGPT/openthaigpt-finetune-010beta<br>
**Inference Code**: https://github.com/OpenThaiGPT/openthaigpt<br>
**Weights (LoRA Adapter)**: https://huggingface.co/openthaigpt/openthaigpt-1.0.0-beta-7b-chat<br>
**Weights (Hugging Face Checkpoint)**: https://huggingface.co/openthaigpt/openthaigpt-1.0.0-beta-7b-chat-ckpt-hf
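
A minimal sketch of loading the LoRA adapter with `transformers` and `peft` might look like the following. It rests on assumptions this README does not confirm: that the adapter applies on top of the pretrained model linked in this README, that the base repository ships the extended Thai tokenizer, and that no special prompt template is required (the official inference code linked above is the authority on that). The helper names `load_model` and `generate` are illustrative only.

```python
# Repository IDs taken from the links in this README.
BASE_MODEL_ID = "ChanonUtupon/openthaigpt-merge-lora-llama-2-7B-3470k"
ADAPTER_ID = "openthaigpt/openthaigpt-1.0.0-beta-7b-chat"


def load_model(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the LoRA adapter (downloads several GB)."""
    # Imported lazily so this module can be imported without the packages installed.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the base repository provides the tokenizer with the added Thai words.
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(base_id)
    # Assumption: the adapter was trained on top of this base model.
    model = PeftModel.from_pretrained(base, adapter_id)
    return tokenizer, model.eval()


def generate(tokenizer, model, prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a completion for a raw prompt string (no chat template applied)."""
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `tokenizer, model = load_model()` followed by `generate(tokenizer, model, "...")` would run on CPU by default; moving the model and inputs to a GPU is left out to keep the sketch short.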

## Sponsors
Pantip.com, ThaiSC<br>
<img src="https://1173516064-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FvvbWvIIe82Iv1yHaDBC5%2Fuploads%2FiWjRxBQgo0HUDcpZKf6A%2Fimage.png?alt=media&token=4fef4517-0b4d-46d6-a5e3-25c30c8137a6" width="100px">
<img src="https://1173516064-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FvvbWvIIe82Iv1yHaDBC5%2Fuploads%2Ft96uNUI71mAFwkXUtxQt%2Fimage.png?alt=media&token=f8057c0c-5c5f-41ac-bb4b-ad02ee3d4dc2" width="100px">

### Powered by
OpenThaiGPT Volunteers, Artificial Intelligence Entrepreneur Association of Thailand (AIEAT), and Artificial Intelligence Association of Thailand (AIAT)

<img src="https://1173516064-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FvvbWvIIe82Iv1yHaDBC5%2Fuploads%2F6yWPXxdoW76a4UBsM8lw%2Fimage.png?alt=media&token=1006ee8e-5327-4bc0-b9a9-a02e93b0c032" width="100px">
<img src="https://1173516064-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FvvbWvIIe82Iv1yHaDBC5%2Fuploads%2FBwsmSovEIhW9AEOlHTFU%2Fimage.png?alt=media&token=5b550289-e9e2-44b3-bb8f-d3057d74f247" width="100px">

### Authors
* Kobkrit Viriyayudhakorn ([email protected])
* Sumeth Yuenyong ([email protected])
* Thaweewat Rugsujarit ([email protected])
* Jillaphat Jaroenkantasima ([email protected])
* Norapat Buppodom ([email protected])
* Koravich Sangkaew ([email protected])
* Peerawat Rojratchadakorn ([email protected])
* Surapon Nonesung ([email protected])
* Chanon Utupon ([email protected])
* Sadhis Wongprayoon ([email protected])
* Nucharee Thongthungwong ([email protected])
* Chawakorn Phiantham ([email protected])
* Patteera Triamamornwooth ([email protected])
* Nattarika Juntarapaoraya ([email protected])
* Kriangkrai Saetan ([email protected])
* Pitikorn Khlaisamniang ([email protected])

<i>Disclaimer: Provided responses are not guaranteed.</i>