Motif-Technologies/Motif-2.6B

JH-Motif committed 24 days ago

Commit 1fe9873 · verified · 1 Parent(s): e5ec424

Update README.md

Files changed (1)

README.md +22 -1

@@ -113,6 +113,27 @@ The benchmarks and metrics used are identical to those in the [Phi-3 technical r
 #### Gemma 1 & 2
 The benchmarks and metrics used are identical to those in the [Gemma 2 technical report](https://arxiv.org/abs/2408.00118).
+|Benchmark|Metric|Gemma 1 2B|Gemma 1 7B|Gemma 2 2B|Gemma 2 9B|Motif 2.6B|Improvement(over 1 1B)|Improvement(over 1 7B)|Improvement(over 2 2B)|Improvement(over 2 9B)|
+|---|---|---|---|---|---|---|---|---|---|---|
+|MMLU|5-shot||||||||||
+|ARC-C|25-shot||||||||||
+|GSM8K|5-shot||||||||||
+|AGIEval*|3-5-shot||||||||||
+|DROP|3-shot, F1||||||||||
+|BBH|3-shot, CoT||||||||||
+|Winogrande|5-shot||||||||||
+|HellaSwag|10-shot||||||||||
+|MATH|4-shot||||||||||
+|ARC-e|0-shot||||||||||
+|PIQA|0-shot||||||||||
+|SIQA|0-shot||||||||||
+|Boolq|0-shot||||||||||
+|TriviaQA|5-shot||||||||||
+|NQ|5-shot||||||||||
+|HumanEval|pass@1||||||||||
+|MBPP|3-shot||||||||||
+|||||||**Average**|**TBA**|**TBA**|**TBA**|**TBA**|
 #### Gemma 3
 The benchmarks and metrics used are identical to those in the [Gemma 3 technical report](https://arxiv.org/abs/2503.19786).
@@ -127,6 +148,6 @@ The benchmarks and metrics used are identical to those in the [Gemma 3 technical
 |MATH|4-shot|48|75.6|40.2|-16.25%|-46.83%|
 |HiddenMath*|-|15.8|43|-|-|-|
 |MMLU(val)|5-shot|-|48.8|57.93|-|+18.71%|
-|||||**Average**|+24.71%|-8.28%|
+|||||**Average**|**+24.71%**|**-8.28%**|
 \*: We were unable to find an evaluation framework for this benchmark.
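
Note on the Improvement columns: the README excerpt does not state how they are computed, but the visible Gemma 3 rows are consistent with a relative change of the Motif 2.6B score over each baseline score. The sketch below illustrates that reading. It assumes that the third and fourth columns hold baseline scores and the fifth holds the Motif 2.6B score; the function names `relative_improvement` and `format_improvement` are hypothetical and not part of the repository.

```python
from typing import Optional


def relative_improvement(baseline: float, score: float) -> float:
    """Percent change of `score` relative to `baseline` (assumed formula)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (score - baseline) / baseline * 100.0


def format_improvement(baseline: Optional[float], score: Optional[float]) -> str:
    """Render a table cell; '-' when either score is missing, as in the README."""
    if baseline is None or score is None:
        return "-"
    return f"{relative_improvement(baseline, score):+.2f}%"


# Rows taken from the Gemma 3 hunk: (benchmark, baseline 1, baseline 2, Motif 2.6B)
rows = [
    ("MATH", 48.0, 75.6, 40.2),
    ("HiddenMath*", 15.8, 43.0, None),
    ("MMLU(val)", None, 48.8, 57.93),
]

for name, base_a, base_b, motif in rows:
    print(name, format_improvement(base_a, motif), format_improvement(base_b, motif))
```

Under these assumptions the MATH row yields -16.25% and -46.83%, and the MMLU(val) row yields +18.71% against the second baseline, matching the cells in the diff. The same helper could fill the TBA cells of the Gemma 1 & 2 table once scores are available.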