clowman committed
Commit 68fab69 · verified · 1 Parent(s): 2705396

Update README.md

Files changed (1)
  1. README.md +20 -16
README.md CHANGED
@@ -1,19 +1,3 @@
- # Quantization
- Created with [lambda-quant](https://github.com/LambdaLabsML/lambda-quant/tree/f97108fe4a9ee061a7b969b23a9605a6d561863d) on `Python 3.10.12 (main, Nov 6 2024, 20:22:13) [GCC 11.4.0]`
-
- Base Model: [meta-llama/Llama-3.3-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct)
-
- Quantized using [llmcompressor==0.4.1](https://github.com/vllm-project/llm-compressor)
-
- Steps to create:
- 1. `git clone https://github.com/LambdaLabsML/lambda-quant`
- 2. `git checkout f97108fe4a9ee061a7b969b23a9605a6d561863d`
- 3. `python quantize.py -m meta-llama/Llama-3.3-70B-Instruct -q Dynamic-F8`
- ## Evaluation
- TODO
- ## Benchmarks
- TODO
- # Base Model README.md
  ---
  library_name: transformers
  language:
@@ -58,6 +42,26 @@ extra_gated_description: >-
  extra_gated_button_content: Submit
  license: llama3.3
  ---
+ # Quantization
+ Created with [lambda-quant](https://github.com/LambdaLabsML/lambda-quant/tree/f97108fe4a9ee061a7b969b23a9605a6d561863d) on `Python 3.10.12 (main, Nov 6 2024, 20:22:13) [GCC 11.4.0]`
+
+ Base Model: [meta-llama/Llama-3.3-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct)
+
+ Quantized using [llmcompressor==0.4.1](https://github.com/vllm-project/llm-compressor)
+
+ Steps to create:
+ 1. `git clone https://github.com/LambdaLabsML/lambda-quant`
+ 2. `git checkout f97108fe4a9ee061a7b969b23a9605a6d561863d`
+ 3. `python quantize.py -m meta-llama/Llama-3.3-70B-Instruct -q Dynamic-F8`
+
+ ## Evaluation
+ TODO
+
+ ## Benchmarks
+ TODO
+
+ # Base Model README.md
+
  ## Model Information

  The Meta Llama 3.3 multilingual large language model (LLM) is an instruction tuned generative model in 70B (text in/text out). The Llama 3.3 instruction tuned text only model is optimized for multilingual dialogue use cases and outperforms many of the available open source and closed chat models on common industry benchmarks.
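---

The added README pins `llmcompressor==0.4.1` and a `Dynamic-F8` scheme but does not reproduce the recipe itself. Below is a minimal sketch of what a data-free FP8 dynamic quantization pass typically looks like with llm-compressor; the `FP8_DYNAMIC` recipe and the output directory name are assumptions for illustration, not a copy of lambda-quant's `quantize.py`.

```python
# Minimal sketch of an FP8-dynamic quantization pass with llm-compressor.
# This approximates what `quantize.py -q Dynamic-F8` likely does; it is not
# lambda-quant's actual implementation.
from transformers import AutoModelForCausalLM, AutoTokenizer
from llmcompressor.transformers import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier

MODEL_ID = "meta-llama/Llama-3.3-70B-Instruct"
SAVE_DIR = "Llama-3.3-70B-Instruct-FP8-Dynamic"  # hypothetical output path

model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype="auto", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# FP8_DYNAMIC: per-channel FP8 weights with dynamic per-token FP8 activations.
# No calibration dataset is needed; the lm_head is left unquantized.
recipe = QuantizationModifier(
    targets="Linear", scheme="FP8_DYNAMIC", ignore=["lm_head"]
)
oneshot(model=model, recipe=recipe)

# Save in compressed-tensors format so vLLM/transformers can load the checkpoint.
model.save_pretrained(SAVE_DIR, save_compressed=True)
tokenizer.save_pretrained(SAVE_DIR)
```

Because the scheme quantizes activations dynamically at runtime, the one-shot pass above needs no calibration data, which is consistent with the three short creation steps listed in the README.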