h4shy/gemma-3-1b-it-fast-GUFF

Text Generation

h4shy committed on May 22

Commit 8f05793 · verified · 1 Parent(s): 1f4c44a

Update README.md

Files changed (1)

README.md +1 -1

@@ -5,7 +5,7 @@ base_model:
 base_model_relation: quantized
 pipeline_tag: text-generation
 ---
-This version is quantized by h4shy for the purpose of production usage on old and/or cheap hardware and CPU-only setups, the goal here is to achieve an inference-ready setup aiming for production use with considerable resource constrains, these particular quantization choices will help inferences with medium to heavy CPU constrains and low to medium RAM constrains, as well as reservations for production efficiency.
 Q5_0: Medium to fast inference, optimal RAM usage.
 Q8_0: More inference speed, more RAM usage.

 base_model_relation: quantized
 pipeline_tag: text-generation
 ---
+This version is quantized by h4shy with low-end hardware and CPU-only setups in mind. The goal is an inference-ready setup for production use under considerable resource constraints: these particular quantization choices help when running inference under medium to high CPU constraints and low to medium RAM constraints, while reserving resources for production efficiency.
 Q5_0: Medium to fast inference, optimal RAM usage.
 Q8_0: More inference speed, more RAM usage.
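
The RAM trade-off between the two variants can be roughed out from the GGML block layouts: Q5_0 stores each block of 32 weights in 22 bytes (5.5 bits per weight) and Q8_0 stores it in 34 bytes (8.5 bits per weight). The sketch below is only an estimate under stated assumptions: the parameter count of 1,000,000,000 is an assumed round figure rather than a number from this repository, the function names are hypothetical, and real GGUF files typically mix tensor types and carry metadata, so actual file sizes and runtime RAM (which also includes the KV cache) will differ.

```python
# Rough weight-memory estimate for the two quantization types in this README.
# Both formats pack weights in blocks of 32.
BLOCK_SIZE = 32

# Bytes per block: fp16 scale + packed quantized values.
#   Q5_0: 2 (scale) + 4 (high bits) + 16 (low nibbles) = 22
#   Q8_0: 2 (scale) + 32 (int8 values)                 = 34
BYTES_PER_BLOCK = {"Q5_0": 22, "Q8_0": 34}


def bits_per_weight(quant: str) -> float:
    """Average storage cost of one weight, in bits."""
    return BYTES_PER_BLOCK[quant] * 8 / BLOCK_SIZE


def estimate_weight_bytes(n_params: int, quant: str) -> int:
    """Bytes needed if every weight used the given quantization type."""
    n_blocks = -(-n_params // BLOCK_SIZE)  # ceiling division
    return n_blocks * BYTES_PER_BLOCK[quant]


if __name__ == "__main__":
    ASSUMED_PARAMS = 1_000_000_000  # assumption: round figure for a "1B" model
    for quant in ("Q5_0", "Q8_0"):
        size_mib = estimate_weight_bytes(ASSUMED_PARAMS, quant) / 2**20
        print(f"{quant}: {bits_per_weight(quant)} bits/weight, ~{size_mib:.0f} MiB")
```

Under these assumptions the Q8_0 weights take roughly 1.5 times the memory of the Q5_0 weights, which matches the README's "more RAM usage" note for Q8_0.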