---
library_name: transformers
language:

extra_gated_button_content: Submit
license: llama3.3
---

# Quantization

Created with [lambda-quant](https://github.com/LambdaLabsML/lambda-quant/tree/f97108fe4a9ee061a7b969b23a9605a6d561863d) on `Python 3.10.12 (main, Nov 6 2024, 20:22:13) [GCC 11.4.0]`

Base Model: [meta-llama/Llama-3.3-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct)

Quantized using [llmcompressor==0.4.1](https://github.com/vllm-project/llm-compressor)
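In llm-compressor, a quantization run is driven by a recipe. A dynamic-FP8 recipe typically looks something like the sketch below (an illustration of the recipe format, not necessarily the exact recipe lambda-quant applies; leaving `lm_head` unquantized is a common default):

```yaml
# Sketch of an llm-compressor recipe for dynamic FP8 quantization.
quant_stage:
  quant_modifiers:
    QuantizationModifier:
      targets: ["Linear"]     # quantize all Linear layers...
      ignore: ["lm_head"]     # ...except the output head
      scheme: "FP8_DYNAMIC"   # FP8 weights, dynamic per-token activation scales
```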

Steps to create:
1. `git clone https://github.com/LambdaLabsML/lambda-quant`
2. `git checkout f97108fe4a9ee061a7b969b23a9605a6d561863d`
3. `python quantize.py -m meta-llama/Llama-3.3-70B-Instruct -q Dynamic-F8`
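`Dynamic-F8` here refers to FP8 (E4M3) quantization with scales computed on the fly from each tensor, rather than from a calibration dataset. A minimal NumPy sketch of the idea (illustration only, not lambda-quant's implementation; a real FP8 kernel would additionally round values onto the E4M3 grid):

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3

def fp8_dynamic_quantize(x: np.ndarray):
    """Per-tensor dynamic quantization: the scale is derived from the
    tensor's own max at runtime, so no calibration data is needed."""
    scale = np.abs(x).max() / FP8_E4M3_MAX
    q = np.clip(x / scale, -FP8_E4M3_MAX, FP8_E4M3_MAX)
    # A real kernel would also round q onto the E4M3 grid
    # (4 exponent bits, 3 mantissa bits); that step is omitted here.
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q * scale

x = np.array([0.1, -2.5, 7.0, -0.03])
q, scale = fp8_dynamic_quantize(x)
x_hat = dequantize(q, scale)
```

Because the scale maps the tensor's max exactly onto the FP8 range, the clip is a no-op for the tensor it was computed from; all real precision loss comes from the E4M3 rounding step omitted above.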

## Evaluation

TODO

## Benchmarks

TODO

# Base Model README.md

## Model Information

The Meta Llama 3.3 multilingual large language model (LLM) is an instruction-tuned generative model in 70B (text in/text out). The Llama 3.3 instruction-tuned, text-only model is optimized for multilingual dialogue use cases and outperforms many of the available open-source and closed chat models on common industry benchmarks.