Update README.md
README.md CHANGED
@@ -13,15 +13,15 @@ language:
 - km
 - ta
 ---
-# SEA-LION-7B-
+# SEA-LION-v1-7B-IT-GPTQ
 
 SEA-LION is a collection of Large Language Models (LLMs) which has been pretrained and instruct-tuned for the Southeast Asia (SEA) region.
 The sizes of the models range from 3 billion to 7 billion parameters.
 
-SEA-LION-7B-
+SEA-LION-v1-7B-IT is a multilingual model which has been fine-tuned with **thousands of English and Indonesian instruction-completion pairs** alongside a smaller pool of instruction-completion pairs from other ASEAN languages.
 These instructions have been carefully curated and rewritten to ensure the model was trained on truly open, commercially permissive and high quality datasets.
 
-SEA-LION-7B-
+SEA-LION-v1-7B-IT-GPTQ is the quantized version of SEA-LION-v1-7B-IT, produced with a [modified version](https://github.com/caviato/AutoGPTQ) of the [AutoGPTQ](https://github.com/AutoGPTQ/AutoGPTQ) library using Wikipedia texts as calibration data.
 
 SEA-LION stands for _Southeast Asian Languages In One Network_.
 
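The quantization described in the hunk above follows the general AutoGPTQ workflow: tokenize calibration texts, define a GPTQ configuration, quantize, and save. The sketch below illustrates that flow only; the interface of the modified fork, the calibration passages, their count, and the `desc_act` setting are assumptions rather than details taken from this card (the 4-bit / group size 128 values come from the benchmark table further down).

```python
# Hypothetical sketch of a standard AutoGPTQ quantization run.
# The modified fork used for this model may differ; the calibration
# snippets below are illustrative stand-ins for the Wikipedia texts.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

base_model = "aisingapore/SEA-LION-v1-7B-IT"  # FP16 instruct model being quantized

tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)

# Placeholder calibration passages; the card states Wikipedia texts were used.
calibration_texts = [
    "Singapore is a sovereign island country and city-state in maritime Southeast Asia.",
    "Bahasa Indonesia adalah bahasa resmi Republik Indonesia.",
]
examples = [
    {"input_ids": enc["input_ids"], "attention_mask": enc["attention_mask"]}
    for enc in (tokenizer(text, return_tensors="pt") for text in calibration_texts)
]

# 4-bit weights with group size 128, matching the configuration reported
# in the benchmark table; desc_act=False is an assumption.
quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)

model = AutoGPTQForCausalLM.from_pretrained(
    base_model,
    quantize_config,
    trust_remote_code=True,
)
model.quantize(examples)  # run GPTQ calibration over the example batch
model.save_quantized("SEA-LION-v1-7B-IT-GPTQ", use_safetensors=True)
```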
@@ -33,20 +33,20 @@ SEA-LION stands for _Southeast Asian Languages In One Network_.
 
 ## Model Details
 ### Base model
-SEA-LION-7B-
+SEA-LION-v1-7B-IT-GPTQ is quantized from [SEA-LION-v1-7B-IT](https://huggingface.co/aisingapore/SEA-LION-v1-7B-IT).
 
 ### Benchmark Performance
 
 
 | Model                                            | ARC   | HellaSwag | MMLU  | TruthfulQA | Average |
 |--------------------------------------------------|:-----:|:---------:|:-----:|:----------:|:-------:|
-| SEA-LION
-| SEA-LION
+| SEA-LION-v1-7B-IT (FP16)                         | 40.78 | 68.20     | 27.12 | 36.29      | 43.10   |
+| SEA-LION-v1-7B-IT-GPTQ (4-Bit, 128 group size)   | 39.93 | 67.32     | 27.11 | 36.32      | 42.67   |
 
 ### Usage
 For the full installation, training and inference guide, please refer to the [GitHub](https://github.com/caviato/sealion-gptq).
 
-In order for SEA-LION-7B-
+In order for SEA-LION-v1-7B-IT-GPTQ to work, please install the [modified version of the AutoGPTQ](https://github.com/caviato/AutoGPTQ) library. Installation instructions can be found [here](https://github.com/caviato/AutoGPTQ#install-from-source).
 
 SEA-LION can be run using the 🤗 Transformers library
 ```python
@@ -57,7 +57,7 @@ from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
 import torch
 
 tokenizer = AutoTokenizer.from_pretrained(
-    "aisingapore/
+    "aisingapore/SEA-LION-v1-7B-IT-GPTQ",
     trust_remote_code=True
 )
 
@@ -67,7 +67,7 @@ quantize_config = BaseQuantizeConfig(
 )
 
 model = AutoGPTQForCausalLM.from_quantized( # will be loaded to GPU
-    "aisingapore/
+    "aisingapore/SEA-LION-v1-7B-IT-GPTQ",
     device = "cuda:0",
     quantize_config = quantize_config,
     torch_dtype=torch.float16,
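The hunks above show only the changed lines of the usage snippet. For readability, a consolidated sketch of the full inference flow is reproduced below; the prompt format, generation settings, and `max_new_tokens` value are illustrative assumptions, not details taken from this card, so refer to the linked GitHub guide for the actual recipe.

```python
# Consolidated sketch of the usage snippet whose changed lines appear in the
# hunks above. Prompt format and generation settings are assumptions.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
import torch

model_id = "aisingapore/SEA-LION-v1-7B-IT-GPTQ"

tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    trust_remote_code=True
)

# 4-bit / group size 128, as reported in the benchmark table above.
quantize_config = BaseQuantizeConfig(
    bits=4,
    group_size=128,
)

model = AutoGPTQForCausalLM.from_quantized(  # will be loaded to GPU
    model_id,
    device="cuda:0",
    quantize_config=quantize_config,
    torch_dtype=torch.float16,
    trust_remote_code=True,
)

# Assumed instruction format; check the SEA-LION-v1-7B-IT card for the
# exact prompt template.
prompt = "### USER:\nApa sentimen dari kalimat berikut: Saya suka nasi goreng!\n\n### RESPONSE:\n"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
output = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```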
@@ -127,10 +127,10 @@ The previous release of the commercially non-permissive SEA-LION-Instruct-Resear
 
 ## Technical Specifications
 ### Fine-Tuning Details
-
+SEA-LION-v1-7B-IT was fine-tuned on 8x A100-40GB GPUs using parameter-efficient fine-tuning in the form of LoRA.
 
 ## Data
-SEA-LION-7B-
+SEA-LION-v1-7B-IT was trained on a wide range of instructions that were manually and stringently verified by our team. A large portion of the effort was dedicated to ensuring that each instruction-completion pair seen by the model is of high quality; any pairs containing errors were either corrected and rewritten by native speakers or dropped from our mix.
 
 In addition, special care was taken to ensure that the datasets used had commercially permissive licenses through verification with the original data source.
 
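The fine-tuning line in the last hunk names only the method: LoRA-based parameter-efficient fine-tuning on 8x A100-40GB. As a rough illustration of what such a setup looks like with the 🤗 PEFT library, here is a minimal sketch; the base checkpoint name, rank, alpha, dropout, and target modules are all assumptions, not the values actually used for SEA-LION-v1-7B-IT.

```python
# Minimal PEFT/LoRA sketch. All hyperparameters below (rank, alpha, dropout,
# target modules) and the base checkpoint name are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model = "aisingapore/SEA-LION-v1-7B"  # assumed base checkpoint name

model = AutoModelForCausalLM.from_pretrained(base_model, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)

lora_config = LoraConfig(
    r=8,                      # assumed LoRA rank
    lora_alpha=16,            # assumed scaling factor
    lora_dropout=0.05,        # assumed dropout
    target_modules=["Wqkv"],  # assumed attention projection for an MPT-style model
    task_type="CAUSAL_LM",
)

# Wrap the base model so that only the LoRA adapter weights are trainable;
# instruction tuning would then proceed with a standard Trainer loop.
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```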